.NET Remoting Authentication and Authorization Sample – Part II

 

January 2004

Michel Barnett
Microsoft Corporation

Applies to:
   Microsoft® .NET Remoting
   Microsoft® Windows®
   Windows Security Support Provider Interface

Summary: Learn about the Microsoft.Samples.Runtime.Remoting.Security assembly. This is a companion piece to Microsoft.Samples.Security.SSPI, and it describes an assembly that leverages Microsoft.Samples.Security.SSPI to provide a security solution specifically for Microsoft .NET Remoting. (49 printed pages)

Download RemSec.exe.

Contents

Introduction
Channel Sinks
Remoting Authentication
Configuration
Remoting Security
Custom Principals
Programmatic Security
Sample
Summary
Appendix: ASP.NET Supported Remoting Security
Appendix: References

Note   This article is the second in a two-part series describing a security solution for Microsoft® .NET remoting. The first part describes Microsoft.Samples.Security.SSPI, an assembly implementing a managed wrapper around the Microsoft Windows® Security Support Provider Interface. Microsoft.Samples.Security.SSPI provides the core functionality needed to authenticate a client and server as well as sign and encrypt messages sent between the two. This is the basis of a security solution that works across a remoting boundary.

This second article in the series describes Microsoft.Samples.Runtime.Remoting.Security, an assembly that leverages Microsoft.Samples.Security.SSPI to provide a security solution specifically for .NET remoting.

Note   This is an update to the original article published in the summer of 2002. Updates have been made to the contents of this article in order to support changes in the accompanying sample. Several new features have been added.

  • The Microsoft.Samples.Runtime.Remoting.Security assembly has been rewritten. The relatively monolithic implementation in the first version has been replaced with a more granular design that's easier to understand. The channel sinks now feature client and server state machines that manage the authentication handshake.
  • Client-activated objects are now supported.
  • Asynchronous and One-way calls are now supported.
  • Mutual authentication is supported for Kerberos and Negotiate.
  • All configuration parameters for the client and server channel sink providers are now optional. Default values set the channel sinks to the most secure configuration (secure by default).
  • Impersonation no longer happens automatically on the server side. Instead, the developer of the remote object now has full control over impersonation by calling Thread.CurrentPrincipal.Identity.Impersonate().
  • The security sinks now always set a Principal on the thread calling the remote object. This allows the object implementer to take advantage of declarative security regardless of whether they explicitly inject a custom principal themselves.
  • The client channel sink no longer makes assumptions about how the SPN is formed for the server host process. This change makes it easy for the developer to run the server host under any security context.
  • Due to the way the SPN is discovered, the sample no longer encourages running the server host under the SYSTEM account. The entire sample has been tested using regular domain user accounts with no special privileges. The IIS sample was tested using the out-of-the-box ASPNET account.
  • The security sinks include new sample applications which show off their various features. These include:
    • Async—Demonstrates how the security sinks work with one way and asynchronous calls.

    • Auth—Shows how to inject your own custom principal on the server side, as well as how declarative and explicit security checks work.

    • CAO—Demonstrates that the security sinks work with client activated objects.

    • IIS—Shows how the security sinks work with objects hosted in IIS.

    • OpenFile—This is an implementation of the scenario outlined in this article. Essentially, this sample consists of a class which opens a file based on a remote client's command. The client can call the remote object using any combination of settings on the security sinks. This sample demonstrates the following features:

      Programmatic Security: The sample calls the remote object with any combination of settings on IClientSecurity. These settings include:

        Security Package: NTLM, Kerberos, or Negotiate

        Authentication Level: Call, Packet Integrity, Packet Privacy

        Impersonation Level: Identify, Impersonate, Delegate

        Mutual Authentication: Turns mutual authentication on or off for Kerberos or Negotiate

        Server Principal Name: Manually sets SPN for Kerberos and Negotiate

      The server object also demonstrates use of the IServerSecurity interface:

        Impersonation: The server object optionally impersonates the client before opening the file.

        Delegation: The sample includes two host executables designed to make it easy to test Kerberos delegation.

Introduction

Let's set the stage for what we're trying to accomplish by reviewing the scenario described in the Microsoft.Samples.Security.SSPI article. It starts with a simple, non-remoted application (Figure 1).

Figure 1. Non-remoted Foo application

The scenario consists of a single .exe running under Alice's security context. Client.exe instantiates a component called Foo which implements a single public method called OpenFile:

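public bool OpenFile(string FilePath)
{
    bool success = true;
    try
    {
        // Opening the file is the access check; the using block closes
        // the stream again even if something else goes wrong.
        using (FileStream fs = File.Open(FilePath, FileMode.Open))
        {
        }
    }
    catch (UnauthorizedAccessException)
    {
        // The current security context has no access to the file.
        success = false;
    }
    return success;
}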

The ACL on Secure.txt has a single entry granting Alice full access to the file. In this case, when Alice calls Foo::OpenFile, the result is successful—so far, so good.

Now let's change the picture (Figure 2).

Figure 2. Remoted Foo application

As before, Alice instantiates Foo and calls OpenFile. But this time, Foo is instantiated in Server.exe rather than in Client.exe. The key difference between the two scenarios is that Foo is now running in a process under Bob's security context, not Alice's. When Alice calls OpenFile on the remote instance of Foo, the result is false: access is denied.

Note   It's not terribly important exactly how Foo is remoted, but for this discussion assume that communication between the two processes is via a TCP socket.

We know from the previous article that Foo can't open the file in the second picture because the file is being opened with Bob's credentials, not Alice's. What we'd like to happen is this: when Alice instantiates the remote object, Server.exe authenticates Alice, impersonates her, and then opens the file with Alice's credentials. In other words, we would like the second scenario to act like the first.
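To make that goal concrete, here is a minimal sketch of what Foo.OpenFile might look like once the caller has been authenticated. It is an illustration, not code from the sample: it assumes that by the time the method runs, the thread's principal carries a WindowsIdentity for the caller, and it simply returns false when that is not the case. WindowsIdentity.Impersonate() is a .NET Framework API.

```csharp
using System;
using System.IO;
using System.Security.Principal;
using System.Threading;

public class Foo : MarshalByRefObject
{
    public bool OpenFile(string filePath)
    {
        // Assumption: an authentication layer has already placed the
        // caller's Windows identity on the current thread.
        IPrincipal principal = Thread.CurrentPrincipal;
        WindowsIdentity caller =
            principal == null ? null : principal.Identity as WindowsIdentity;
        if (caller == null)
            return false;   // no authenticated Windows caller to impersonate

        // Run the file access under the caller's security context.
        WindowsImpersonationContext context = caller.Impersonate();
        try
        {
            using (FileStream fs = File.Open(filePath, FileMode.Open))
            {
            }
            return true;
        }
        catch (UnauthorizedAccessException)
        {
            return false;
        }
        finally
        {
            // Always revert to the server's own identity (Bob).
            context.Undo();
        }
    }
}
```

With something like this in place, the ACL on Secure.txt is evaluated against Alice rather than Bob, which is the behavior of the non-remoted scenario.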

The previous article explained some core concepts necessary to make this work. In fact, it went so far as to explain how to perform the cross process authentication, and how to impersonate the client, and by extension, how to do work with the client's credentials. It even explained how to sign & encrypt messages sent between client and server. So it seems we have all we need. What's left to discuss?

There are two key issues that this article addresses that the previous one did not. First, this article explains how to make authentication transparent to the client and server. Sure, the previous article explained how to do authentication, impersonation, signing & encryption. However, all of this was done at the application level. This article explains how to factor out these functions from the application level—so you don't have to write a single line of code on the client or server side to make these features work.

The second issue addresses those topics that aren't in the first article—that are particular to remoting security. In particular the idea of client trust is addressed—does the client want to allow the server to use its credentials? There's also the issue of how often to authenticate—whether at every method call or when the remote component is first instantiated. Also, when should signing and encryption be brought into play? Finally, while most of the discussion has been about authentication, how does this security solution play a role in authorization? All of these questions are addressed in this article.

So now that we've better defined what this article is about, what is it we should expect out of the security solution that we build?

  • A solution that's orthogonal to any particular application:
    • Plug it in and it "just works"
    • No special client or server code necessary
    • Configuration through .NET configuration files
    • Optional programmatic configuration
  • Configurable security—choice of any Windows authentication protocol (in particular Kerberos, NTLM, and Negotiate)
  • Configurable impersonation level (client's trust to the server)—Allowable settings are identify, impersonate, and delegate
  • Authentication Level (determines when authentication should take place, how messages should be protected)—Allowable settings are call, packetIntegrity, and packetPrivacy
  • Custom Principal—The ability to set a custom principal on the server side, especially to support role-based security

In this article I'll lead you step-by-step through the process of building an authentication solution for .NET remoting that meets the above requirements.

I'll start by explaining how to factor out the authentication code.

Where to Go from Here

The first section of the article, Channel Sinks, is an overview of .NET remoting, channel sinks, and channel sink providers. Since it provides the foundation for the rest of the article, it's worth reading at least as a refresher on the topic. However, if you're thoroughly familiar with how .NET remoting works, you can skip this section and move right on to Remoting Authentication.

The remaining sections incrementally explain how the remoting security solution was developed. This starts with the most basic features, adding each additional function until the solution is complete. It's recommended you read these sections from beginning to end in the order that they appear.

Channel Sinks

One of the most important features of our security solution is authentication of the client to the server. I explained how to do that in the previous article including a sample application that leveraged Microsoft.Samples.Security.SSPI to authenticate a sample client to a server using a choice of authentication protocols. The problem with this is that authentication was done at the application level—the application code had to program against Microsoft.Samples.Security.SSPI and explicitly send security tokens between the client and server.

What we'd like is a solution that does all of this handshaking "under the covers". The result should be no special code on the client side—just instantiate the object and call a method. There also shouldn't be any special code in the server component—the logon session for the client (and the corresponding token) should be generated by the time the server method is called. This section is about taking the authentication handshake and factoring it out of the application layer.

What we need to accomplish this is an out-of-band mechanism for passing data (specifically, security tokens) between client and server. As it turns out, .NET remoting offers us a way to do this through Channel Sinks. Channel Sinks are a key part of the .NET remoting architecture (Figure 3).

Figure 3. Remoting architecture

As you can see from the figure there is a lot of infrastructure between the client and the remote object: Foo. When a method is called on a remote object, there are a number of stages it passes through before it reaches that server object.

First, a client communicates with a proxy. The proxy is a local representation of the remote object that the client can call in lieu of the object itself. The primary purpose of the proxy is to take the parameters in a method call and bundle them up into a message (an object which implements the IMessage interface). It's this message that gets passed through the channel on its way to the remote object.

The proxy hands off the message to the channel. The channel is responsible for transporting messages to and from remote objects using a corresponding transport protocol. There are two channels included in the .NET platform: HttpChannel and TcpChannel, which use HTTP and TCP as the transport respectively.

A message makes its way through the channel by being passed from one channel sink to the next in the sink chain, starting with the first channel sink in the chain. The information that gets passed along is the message itself along with corresponding header information. As the message is passed, each channel sink has access to the message and may even modify the headers accompanying the message. There are normally at least two channel sinks in the chain: the Formatter Sink and the Transport Sink.

The first sink on the client side is typically a formatter sink. It serializes the message into a stream and creates appropriate headers which are then passed down the channel sink chain to the client transport sink. The transport sink then writes this stream out to the wire.

On the server side, the server transport sink reads requests from the wire and passes the request stream to the server channel sink chain. The server formatter sink at the end of this chain deserializes the request into a message. That message is then passed off to the stack builder sink.

The stack builder sink unbundles the message into the original call parameters, sets up the call stack appropriately, and calls the remote object. Any output parameters from the remote object go through this same process in the reverse order.

Custom Channel Sinks

As it turns out, we can extend this infrastructure by adding our own channel sinks to the sink chain (Figure 4). Custom channel sinks may be added anywhere in the sink chain but are typically added between the formatter and transport channel sinks.

Figure 4. Remoting architecture with custom channel sinks

A channel sink can read the message being passed along the way or can replace the message altogether. Channel sinks can also add headers to the header array (which along with the message is what the transport sink puts on the wire).

The header information is crucial to our security solution. If we add a custom channel sink to the sink chain, we can write code to add additional headers to the header array. This is information that the client and server will never see—out-of-band-data. This is the perfect place to put information needed for the authentication handshake.

Now all we have to do is figure out how to create a custom channel sink.

Creating Channel Sink Chains

To create a new channel sink, we must implement a channel sink provider. The channel sink provider is what actually creates the channel sink(s) in a chain. A channel sink provider must implement the IClientChannelSinkProvider or IServerChannelSinkProvider interface depending on whether it's for the client or server side chain.

The implementation of a simple channel sink provider is fairly straightforward. It's mainly responsible for creating channel sinks in the chain. The client and server side sink chains may have a different number of sinks and the behavior of the client and server sinks might be very different. However, a client channel sink provider always creates client channel sinks and a server provider always creates server channel sinks. We can start by looking at a client channel sink provider:

public class SecurityClientChannelSinkProvider: IClientChannelSinkProvider
{
    private IClientChannelSinkProvider _next = null;

    public SecurityClientChannelSinkProvider () { }

    public SecurityClientChannelSinkProvider (IDictionary properties,
        ICollection providerData) { }

    
    // IClientChannelSinkProvider
    public IClientChannelSink CreateSink(IChannelSender channel,
        String url, Object remoteChannelData)
    {
        IClientChannelSink nextSink = null;
        if (_next != null)
        {
            nextSink = _next.CreateSink(channel, url, remoteChannelData);
            if (nextSink == null)
                return null;
        }

        return new SecurityClientChannelSink (nextSink);
    }

    public IClientChannelSinkProvider Next
    {
        get { return _next; }
        set { _next = value; } 
    }
}

You can ignore the constructors for now (I'll explain in a later section why the parameterized constructor is useful to us). The main thing to focus on here is the IClientChannelSinkProvider implementation.

There are two members of IClientChannelSinkProvider. The first is a method called CreateSink(). This is our opportunity to create our own custom channel sinks. CreateSink() creates the new channel sink (SecurityClientChannelSink). But notice it also forwards the CreateSink call to the next sink provider in the chain (if there is one). CreateSink() is also responsible for making sure that the next sink and the one that it creates are linked together (this is how the chain gets built).

The second member of IClientChannelSinkProvider is the Next property. This property simply sets or gets the next sink provider in the chain. Normally, it is set before CreateSink() is called (so our CreateSink() method knows which provider to forward the CreateSink() call to).

The server channel sink provider doesn't look much different than its client side counterpart:

public class SecurityServerChannelSinkProvider: IServerChannelSinkProvider
{
    private IServerChannelSinkProvider _next = null;

    public SecurityServerChannelSinkProvider () { }

    public SecurityServerChannelSinkProvider (IDictionary properties,
        ICollection providerData) { }

    
    // IServerChannelSinkProvider
    public void GetChannelData(IChannelDataStore channelData) { }

    public IServerChannelSink CreateSink(IChannelReceiver channel)
    {
        IServerChannelSink nextSink = null;
        if (_next != null)
            nextSink = _next.CreateSink(channel); 

        return new SecurityServerChannelSink (nextSink);
    }

    public IServerChannelSinkProvider Next
    {
        get { return _next; }
        set { _next = value; } 
    }
}

The two key members of IServerChannelSinkProvider are still CreateSink() and Next. Each performs essentially the same job as its client side counterpart.

To make all of this clearer, let's run through an example where a client side sink chain gets created.

Although you could create channel sink provider chains programmatically, you most commonly will use configuration files to create them. A simple configuration file to create the SecurityClientChannelSinkProvider looks something like this:

<configuration>
  <system.runtime.remoting>

    <application>

       <channels>
         <channel ref="http">
           <clientProviders>
             <formatter ref="soap" />
             <provider type="Microsoft.Samples.Runtime.Remoting.Security.SecurityClientChannelSinkProvider, Microsoft.Samples.Runtime.Remoting.Security"/>
           </clientProviders>
         </channel>
       </channels>

    </application>

  </system.runtime.remoting>
</configuration>

The channel sink providers are created when the channel is created during the RemotingConfiguration.Configure() call. When this method is called, a set of channel sink providers are created and linked together—exactly as they're listed in the configuration file. So in this case, a channel sink provider is created for the SOAP formatter followed by the custom provider for our security sink. The providers then create their channel sinks and link them together. The end result is a sink chain much like the one already described (Figure 4).
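For comparison, the same client-side chain can be built in code. The sketch below is one way this might be done and is not taken from the sample: it links the framework's SoapClientFormatterSinkProvider to the security provider through the Next property, hands the head of the chain to an HttpClientChannel, and registers the channel. The helper name SecureChannelSetup.RegisterClientChannel is hypothetical, and the code assumes the sample's security assembly is referenced.

```csharp
using System.Collections;
using System.Runtime.Remoting.Channels;
using System.Runtime.Remoting.Channels.Http;
using Microsoft.Samples.Runtime.Remoting.Security;

public static class SecureChannelSetup   // hypothetical helper
{
    public static IChannel RegisterClientChannel()
    {
        // Same order as in the configuration file: the SOAP formatter
        // provider first, then the security provider.
        IClientChannelSinkProvider formatter =
            new SoapClientFormatterSinkProvider();
        formatter.Next = new SecurityClientChannelSinkProvider();

        // The channel receives the head of the provider chain and asks it
        // to create the sinks; the transport sink is appended by the channel.
        HttpClientChannel channel =
            new HttpClientChannel(new Hashtable(), formatter);

        // Later framework versions mark this overload obsolete and add
        // RegisterChannel(IChannel, bool).
        ChannelServices.RegisterChannel(channel);
        return channel;
    }
}
```

The configuration file remains the more common choice because the chain can then be changed without recompiling the client.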

Hopefully this gives you a better idea of how channel sink providers work and how they get created. The next thing to do is drill down on the channel sink itself.

Building a Channel Sink

Now that we know how channel sinks get created, the next thing to do is look at their internal implementation. A channel sink must implement IClientChannelSink or IServerChannelSink depending on which side of the remoting infrastructure it is designed to run on. As with a provider, the implementation of a basic channel sink is fairly straightforward:

public class SecurityClientChannelSink :
    BaseChannelObjectWithProperties, IClientChannelSink
{
    private IClientChannelSink _nextSink = null;

    public SecurityClientChannelSink(IClientChannelSink nextSink) : base()
    {
        _nextSink = nextSink;
    }

    public SecurityClientChannelSink(IChannelSender channel, String url, 
        Object remoteChannelData, IClientChannelSink nextSink) : base ()
    {
        _nextSink = nextSink;
    }

    // IClientChannelSink
    public void ProcessMessage(IMessage msg,
        ITransportHeaders requestHeaders, Stream requestStream,
        out ITransportHeaders responseHeaders, out Stream responseStream)
    {
        // add header information to be picked up by the server sink
        requestHeaders["ChannelSinkCallContext"] = "hello server sink";

        // now process the message as usual
        _nextSink.ProcessMessage(msg, requestHeaders, requestStream,
            out responseHeaders, out responseStream);
    }

    // . . . the rest of IClientChannelSink . . .
}

A simple implementation of the server side channel sink looks about the same:

public class SecurityServerChannelSink : 
    BaseChannelObjectWithProperties, IServerChannelSink
{
    private IServerChannelSink _nextSink = null;

    public SecurityServerChannelSink(IServerChannelSink nextSink) : base()
    {
        _nextSink = nextSink;
    }

    public SecurityServerChannelSink(IChannelReceiver channel,
        IServerChannelSink nextSink) : base()
    {
        _nextSink = nextSink;
    }

    // IServerChannelSink
    public ServerProcessing ProcessMessage(
        IServerChannelSinkStack sinkStack,
        IMessage requestMsg, ITransportHeaders requestHeaders,
        Stream requestStream, out IMessage responseMsg,
        out ITransportHeaders responseHeaders, out Stream responseStream)
    {
        // retrieve the header from the client sink
        string msg = requestHeaders["ChannelSinkCallContext"] as string;
        Debug.Assert(msg == "hello server sink");

        sinkStack.Push(this, null);
        ServerProcessing processing =
            _nextSink.ProcessMessage(sinkStack, requestMsg,
            requestHeaders, requestStream, out responseMsg,
            out responseHeaders, out responseStream);
        sinkStack.Pop(this);

        return processing;
    }

    // . . . the rest of IServerChannelSink . . .
}

The key method in IClientChannelSink and IServerChannelSink is ProcessMessage(). Since we've already identified the header array as a good transport for our out-of-band data, we're most interested in how we can modify that array. On both sides there is a requestHeaders parameter that we can use to set and retrieve header information.

Imagine on the client side that we can put additional data in the request headers and pick up that information on the server side. The sample above adds a header to the array to pass a simple message to the server. But what gets passed in this header could be anything we choose. This is a great place for putting out-of-band data like SSPI security tokens.

Recap

  • We can now move out-of-band data between client & server (like SSPI authentication tokens)
  • We can add extra information to the message without requiring client or server code at the application layer; it is transparent to the client and server.

We will now apply this to a remoting authentication solution.

Remoting Authentication

Now that we know how to send out-of-band data between client and server using channel sinks, we can leverage that knowledge to perform the authentication handshake needed for our security solution. We have a choice of authentication protocols to work with; let's start with NTLM.

We know we'll leverage Microsoft.Samples.Security.SSPI to generate and process the underlying security tokens. We also know we'll need a matched set of channel sinks (client and server side) to make this work. With this in mind, we can start to lay down a plan for our authentication solution (Figure 5).

Figure 5. Implementing an NTLM Authentication Handshake (planned)

This figure shows a simplified version of the remoting infrastructure, focusing only on what happens as control flows through our custom client and server side channel sinks:

  1. The client channel sink instantiates a ClientCredential (for NTLM) and then a ClientContext to generate the security token to send to the server.

    Note   It's assumed you're familiar with classes such as ClientCredential and ClientContext. Refer to the Microsoft.Samples.Security.SSPI article for a full explanation of what these classes are for.

  2. A new header is added to the header array, setting its value to the security token generated by the ClientContext object

  3. The server channel sink retrieves the security token from the header array

  4. A ServerCredential (for NTLM) is instantiated followed by a ServerContext, passing it the security token sent from the client

At step 4 the ServerContext has accepted the security token from the client (the NTLM Negotiate message) and generated a security token to send back to the client (the Challenge message). At this point we have a problem.

The next thing we would normally do is pass control from our server side channel sink to the next sink in the chain—eventually through to the server side object (Foo). When the method call on Foo completes, the whole process would run in reverse back to the client (for any output parameters). However, our authentication handshake isn't complete. We have yet to pass back the NTLM challenge message from the server to the client channel sink. It's not until that happens (and the client sends back an appropriate response) that we should pass control along to the server side object.

The crux of our problem is that from the client's perspective, the process of making a method call is a two-leg roundtrip; from the perspective of our authentication handshake, it's a three-leg NTLM handshake. We have a mismatch, and simply piggybacking additional information in the header array is not enough to solve our problem.

What we need to do is to have a conversation between the client and server channel sink in the middle of the (two-leg) method call. The conversation in this case consists of a multi-leg authentication handshake. If the conversation ends in success (authentication) we can allow the method call to go through.

How do we create a conversation between channel sinks in the middle of a method call? Reviewing the previous section, let's look again at our implementation of IServerChannelSink.ProcessMessage(). It simply calls ProcessMessage() on the next sink in the chain and returns the result to the caller. We know from studying the remoting architecture that the "caller" is ultimately the client and the "result" that it's returning to the caller is the return message from the server object (the output parameters of the method call). But is it really? Actually, that result is simply the response message from the next channel sink (not the server object), and it is simply returning to the previous channel sink (not the client).

Now let's look at this from the perspective of the client channel sink. It does pretty much the same thing as its server side counterpart—it calls the next sink in the chain and returns the result to the caller. But what is it returning to the caller? Certainly a return message from the next channel sink. But is this a return message from a downstream channel sink or a server-side method?

Since channel sinks have to explicitly call the next sink in the chain, they can explicitly control execution flow. Therefore a pair of client/server sinks working together can create a conversation consisting of multiple roundtrips—while the base client has made only a single method call.
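The idea can be seen in miniature in the following toy model. It is only a sketch under simplifying assumptions: a dictionary stands in for the transport headers, plain strings stand in for SSPI tokens, the "wire" is a direct method call, and the class names ToyClientSink and ToyServerSink are made up for illustration. The application makes one call; the sink pair makes two round trips underneath, and the server sink forwards the call only after the handshake succeeds.

```csharp
using System;
using System.Collections.Generic;

// Toy server sink: guards the remote object behind a handshake.
public class ToyServerSink
{
    private bool _challengeSent;
    private bool _authenticated;
    public int RoundTrips;

    // Returns the method result once authenticated. While the handshake is
    // still in progress it returns null and leaves the next token in
    // responseHeaders.
    public string Process(IDictionary<string, string> requestHeaders,
        string methodCall, IDictionary<string, string> responseHeaders)
    {
        RoundTrips++;
        string token;
        requestHeaders.TryGetValue("SecuritySinkMessage", out token);

        if (!_authenticated)
        {
            if (!_challengeSent && token == "negotiate")
            {
                _challengeSent = true;
                responseHeaders["SecuritySinkMessage"] = "challenge";
                return null;   // do not call the remote object yet
            }
            if (_challengeSent && token == "response")
                _authenticated = true;
            else
                throw new UnauthorizedAccessException("Handshake failed.");
        }
        return "result of " + methodCall;   // the real call goes through
    }
}

// Toy client sink: one call from the application, several legs underneath.
public class ToyClientSink
{
    private readonly ToyServerSink _next;

    public ToyClientSink(ToyServerSink next) { _next = next; }

    public string Invoke(string methodCall)
    {
        string token = "negotiate";
        while (true)
        {
            Dictionary<string, string> request = new Dictionary<string, string>();
            request["SecuritySinkMessage"] = token;
            Dictionary<string, string> response = new Dictionary<string, string>();

            string result = _next.Process(request, methodCall, response);
            if (result != null)
                return result;   // handshake done, method result received

            // The server sink asked for another leg of the handshake.
            string serverToken;
            response.TryGetValue("SecuritySinkMessage", out serverToken);
            if (serverToken != "challenge")
                throw new InvalidOperationException("Unexpected server token.");
            token = "response";
        }
    }
}
```

Calling Invoke("OpenFile") on a ToyClientSink wrapped around a fresh ToyServerSink returns the method result after two round trips, even though the caller sees a single call.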

Mechanics of a Conversation

To manage a conversation we need to establish a protocol between the client and server side channel sink—definitely a matched pair. First, we need to decide what information to pass between the client and server channel sink. We're trying to implement an authentication handshake so at a minimum we need to pass a security token. We also need to pass a message type (so we know if the information we're passing is a client token, a server token, or perhaps even a valid response from the server object).

To simplify things, let's pass a single object that contains the properties we need (like message type and token). That way we can add properties later for additional information we need (which will undoubtedly happen). We'll call this class ChannelSinkMessage (Figure 6).

Figure 6. ChannelSinkCallContext

Second, we need to agree how this object will be passed between the client and server channel sink. In our case, we'll add a single header to the header array called "SecuritySinkMessage". All messages between the client and server will be passed via that header. This means we'll serialize the contents of ChannelSinkMessage before putting it into this header and de-serialize it when it comes out the other side.
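As a rough illustration of such a message object, the sketch below carries a message type and a token and converts itself to and from a header-safe string. The class shape follows the names used in this article, but the wire format (one type byte followed by the token, Base64-encoded) and the ServerToken and Response enum members are assumptions made for this example; the sample's actual serialization may differ.

```csharp
using System;

// Illustrative stand-in for the article's ChannelSinkMessage class.
public class ChannelSinkMessage
{
    public enum MessageType : byte
    {
        AuthenticateRequest = 0,
        ClientToken = 1,
        ServerToken = 2,   // hypothetical member
        Response = 3       // hypothetical member
    }

    public MessageType Type;
    public byte[] Token;

    public ChannelSinkMessage(MessageType type, byte[] token)
    {
        Type = type;
        Token = token ?? new byte[0];
    }

    // Assumed wire format: [type byte][token bytes], Base64-encoded so the
    // value can travel as a plain string header.
    public static string SerializeChannelSinkMessageToString(
        ChannelSinkMessage message)
    {
        byte[] buffer = new byte[1 + message.Token.Length];
        buffer[0] = (byte)message.Type;
        Buffer.BlockCopy(message.Token, 0, buffer, 1, message.Token.Length);
        return Convert.ToBase64String(buffer);
    }

    public static ChannelSinkMessage DeserializeStringToChannelSinkMessage(
        string header)
    {
        byte[] buffer = Convert.FromBase64String(header);
        if (buffer.Length < 1)
            throw new FormatException("SecuritySinkMessage header is empty.");

        byte[] token = new byte[buffer.Length - 1];
        Buffer.BlockCopy(buffer, 1, token, 0, token.Length);
        return new ChannelSinkMessage((MessageType)buffer[0], token);
    }
}
```

Keeping everything in one object means new properties can be added later without introducing additional headers.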

Now we just need to implement the processing on either side to handle the authentication. Fortunately, we know what the handshake code looks like from the previous article. Conceptually, we need to implement a state machine in both channel sinks. Our state machine starts in an unauthenticated state and moves through to a final authenticated state or an error state (in which case an exception is thrown). The client side code looks like this:

// IClientChannelSink
public void ProcessMessage(IMessage msg, ITransportHeaders requestHeaders,
    Stream requestStream, out ITransportHeaders responseHeaders,
    out Stream responseStream)
{
    ClientCredential clientCredential = 
        new ClientCredential(Credential.Package.NTLM);

    ClientContext clientContext =
        new ClientContext(clientCredential, "",
        ClientContext.ContextAttributeFlags.None);

    // initialize the message object with the message type and token
    ChannelSinkMessage channelSinkMessage =
        new ChannelSinkMessage(
        ChannelSinkMessage.MessageType.AuthenticateRequest,
        clientContext.Token);

    // send the authenticate request to the server
    channelSinkMessage = 
        SendMessageToServerSink(channelSinkMessage,
        msg, requestHeaders, requestStream,
        out responseHeaders, out responseStream);

    // We need to handshake with the server side channel sink until
    // we've completed authentication.
    while (clientContext.ContinueProcessing == true)
    {
        // process the server token... 
        clientContext.Initialize(channelSinkMessage.Token);

        // put the security token into the message object
        channelSinkMessage.Token = clientContext.Token;
        channelSinkMessage.Type =
            ChannelSinkMessage.MessageType.ClientToken;

        if (clientContext.ContinueProcessing == true)
        {
            // we're not done with the handshake so we'll send a message
            // to the server side sink...
            channelSinkMessage = 
                SendMessageToServerSink(channelSinkMessage,
                msg, requestHeaders, requestStream,
                out responseHeaders, out responseStream);
        }
    }
}

Notice the similarity between this code and the client-side authentication code from the previous article. Essentially, both are the same except here we're passing messages to the server side sink via a message header (the details of this are in the SendMessageToServerSink() method).

The other half of the authentication handshake is taken care of on the server side:

// IServerChannelSink
public ServerProcessing ProcessMessage(IServerChannelSinkStack sinkStack,
    IMessage requestMsg, ITransportHeaders requestHeaders,
    Stream requestStream, out IMessage responseMsg,
    out ITransportHeaders responseHeaders, out Stream responseStream)
{
    // retrieve the message object
    ChannelSinkMessage channelSinkMessage = 
        ChannelSinkMessage.DeserializeStringToChannelSinkMessage(
        (string)requestHeaders["SecuritySinkMessage"]);

    // complete the authentication handshake
    CompleteAuthenticationHandshake(channelSinkMessage);

    // process the client message (the method call).  The original method
    // call is allowed to go through *only* if we've successfully 
    // authenticated
    ProcessRequest(channelSinkMessage, sinkStack, requestMsg, 
        requestHeaders, requestStream,
        out responseMsg, out responseHeaders, out responseStream);

    // add the security call context to the header (for the client)
    responseHeaders["SecuritySinkMessage"] = 
        ChannelSinkMessage.SerializeChannelSinkMessageToString(
        channelSinkMessage);
}

The server side ProcessMessage() is very simple. At the start of the method, we retrieve the serialized channel sink message from the header array and deserialize it into a ChannelSinkMessage object. At the end of the method we do the opposite: serialize it and put it back into the header array (this is so the client sees any changes the server made to it).

The interesting part of this from an authentication standpoint is the method call to CompleteAuthenticationHandshake(). This method takes care of all of the details of completing the authentication handshake on the server side.

public void CompleteAuthenticationHandshake(
    ChannelSinkMessage channelSinkMessage)
{
    // process the message
    if (channelSinkMessage.Type == 
        ChannelSinkMessage.MessageType.AuthenticateRequest)
    {
        // the client is calling us for the first time on this method call
        // and is asking to be authenticated. Create a server security 
        // context.
        ServerCredential serverCredential =
            new ServerCredential(_securityPackage);
        _serverContext = new ServerContext(serverCredential,
            channelSinkMessage.Token);

        // put the response into the channel sink message
        channelSinkMessage.Token = _serverContext.Token;
    }
    else if (channelSinkMessage.Type == 
        ChannelSinkMessage.MessageType.ClientToken)
    {
        // we're in the middle of an authentication conversation with the
        // client side channel sink.  In this case, we've received a
        // response which we need to process.

        // process the client token
        _serverContext.Accept(channelSinkMessage.Token);

        // put the response into the channel sink message
        channelSinkMessage.Token = _serverContext.Token;
    }

    // set the message type
    if (_serverContext.ContinueProcessing == true)
        channelSinkMessage.Type = 
            ChannelSinkMessage.MessageType.ServerToken;
    else 
        channelSinkMessage.Type = 
            ChannelSinkMessage.MessageType.Authenticated;
}

To process the message from the client we first need to know if the message is an authentication request (a negotiate message). If it is, this is the first time the client channel sink is calling the server so we need to create a ServerCredential and ServerContext object (we'll put a reference to the ServerContext in a member variable so we can get back to it the next time the client calls the server). Once we've created the ServerContext, we'll put the generated token back into the channel sink message object and we're done.

If this is a subsequent call by the client channel sink, then the MessageType is set to ClientToken and we're in the middle of a multi-leg authentication handshake. We don't need to create a ServerContext object (we already did that in the previous call), so we'll just retrieve the existing ServerContext and call Accept(). Once again, we'll take the generated token and stuff it back into the channel sink message object, and we're finished.

The last step just sets the message type. If ServerContext.ContinueProcessing returns false, then we've successfully authenticated the client, so the message type we return to the client should simply be Authenticated. If we're not done, then we need to pass a security token back to the client, so we set the message type to ServerToken.

The client and server code will exchange security tokens until the client has authenticated to the server. Once that's done, the next step is to allow the call to go through to the server object. That's taken care of here by the ProcessRequest() method in IServerChannelSink.ProcessMessage(). We'll skip talking about ProcessRequest() for now because we're focused here on authentication (I'll talk more about it in a later section).

So, we've successfully authenticated the client to the server but our multi-leg authentication handshake has led to another problem. While we've been handshaking, what happened to the original message (the method call to the server object)?

The answer is that it's gone, and we need to think carefully about what to do with it while we're busy exchanging security tokens.

Call Context   Reviewing the code for IServerChannelSink.ProcessMessage(), you'll notice that I save a reference to the ServerContext in a member variable of the server channel sink (the _serverContext member). This allows the server sink to process subsequent client tokens using the same ServerContext. But is this appropriate?

There may be multiple clients on different client machines connecting to the server. However, on the server, there is one channel, and one channel sink chain built to service our remote object. That means if we have more than one client, all of them are trying to store their ServerContext in the same _serverContext member variable and they'll end up stepping on each other.

The solution to this is to build some concept of call context. What we'd like is when a client calls in to the server sink for the first time, we create a new ServerContext object and place that in a session belonging only to that client. On a subsequent call, we can then retrieve the ServerContext from that session.

The question is how do we create the concept of call context? In the full sample that accompanies this article, I did it by having each client sink generate a GUID for every call. That GUID gets sent to the server in the header array. The server then maintains a list (a hash table) of client calls—keyed by their GUID. This allows the server to set and retrieve information for a particular caller. In this case, that information is a ServerContext object. When a call initially comes in, a ServerContext object is added to the hash table. When authentication is complete, it's removed. This allows multiple clients to be authenticated at the same time (even though there's only one instance of the server sink)—and the clients never step on one another.
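The bookkeeping just described can be sketched as follows. This is a minimal illustration under stated assumptions, not the code from the accompanying sample: the CallContextTable class and its member names are hypothetical, and object stands in for the ServerContext type so that the fragment compiles on its own.

```csharp
using System;
using System.Collections;

// Hypothetical helper: maps the per-call GUID (sent by the client sink in the
// header array) to the state the server sink keeps for that call.
public class CallContextTable
{
    private readonly Hashtable _contexts = new Hashtable();
    private readonly object _sync = new object();

    // Called when the first message of a call (the authenticate request) arrives.
    public void Add(Guid callId, object serverContext)
    {
        lock (_sync)
        {
            _contexts[callId] = serverContext;
        }
    }

    // Called for each subsequent client token; returns null if the call is unknown.
    public object Find(Guid callId)
    {
        lock (_sync)
        {
            return _contexts[callId];
        }
    }

    // Called once authentication completes (or fails) so entries do not pile up.
    public void Remove(Guid callId)
    {
        lock (_sync)
        {
            _contexts.Remove(callId);
        }
    }

    // Number of calls currently in the middle of a handshake.
    public int Count
    {
        get { lock (_sync) { return _contexts.Count; } }
    }
}
```

Because the single server sink instance may be entered by several clients at once, every access to the table is guarded by a lock. A server sink would call Add() when it sees an AuthenticateRequest, Find() when it sees a ClientToken, and Remove() once the handshake is finished.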

Managing the Method Call

While the authentication conversation is going on, what's happening to the client message (here, the method call)? In the sample above, this point is ignored, but we need to start paying attention to it. In IClientChannelSink.ProcessMessage() look at the call to SendMessageToServerSink(). This method takes as an argument the ChannelSinkMessage (which contains the security tokens needed for the authentication handshake) but also all of the original arguments of ProcessMessage(). This means that the method call (which is wrapped up in the requestStream parameter) is being passed to the server channel sink along with the NTLM negotiate message. Now let's look at it from the server's perspective.

When the server channel sink is initially called, it creates a new ServerCredential/ServerContext and stuffs the generated security token (the Challenge) into the channel sink message object. The channel sink then returns to the caller without calling ProcessMessage() on the next sink in the chain. This means control returns (eventually) back to the client-side sink (rather than forward to the server object). As we know, this is what allows us to hold a conversation between our channel sinks.

When the client receives the reply from the server-side sink, it processes it, generates the Response and sends it back to the server sink. The server sink then processes the response and authenticates the client. It's at this point the method call should be allowed to go through to the server. But so far we've ignored all of the parameters to ProcessMessage(). This means we allowed the client message to go along with the first call to the server (along with the Negotiate message). Also, we never initialized the output parameters when returning from IServerChannelSink.ProcessMessage().

Note   It's because of this point that the code I show for IServerChannelSink.ProcessMessage won't even compile as is.

What gets returned in responseHeaders and responseStream if we're just passing back a Challenge message to the client-side sink?

All of this means that we need to be more thoughtful about what we do with the original client message while we're busy with the authentication handshake. The client message needs to be put aside while the handshake is going on. If authentication succeeds, the method call should be allowed to go through. Otherwise, an exception should be thrown.

So, what do we do with the original message? There are basically two alternatives. First we could have the client hang on to it, and only pass it on to the next channel sink when it's done authenticating (that is, ClientContext.ContinueProcessing is false). The other option is to let the message go along with the first security token and have the server sink hang on to it until authentication is complete.

The latter solution seems flawed. What if authentication fails? Then we've sent the client message on to the server sink for nothing. This may be especially poor judgment if the message is very large. It seems more reasonable to hold it in the client until the handshake is complete. This is the solution I chose.

Saving the original message is rather straightforward. If you look at IClientChannelSink.ProcessMessage() the method call is actually contained in two parameters: requestHeaders and requestStream. Putting aside the message is really just a matter of adding two member variables to the client-side sink. We'll use these data members to store references to the original requestHeaders and requestStream objects. So right after we create the ChannelSinkMessage, we'll replace the simple call to SendMessageToServerSink() with this:

// IClientChannelSink
public void ProcessMessage(IMessage msg, ITransportHeaders requestHeaders,
    Stream requestStream, out ITransportHeaders responseHeaders,
    out Stream responseStream)
{
    // . . .

    // initialize the message object with the message type and token
    ChannelSinkMessage channelSinkMessage =
        new ChannelSinkMessage(
        ChannelSinkMessage.MessageType.AuthenticateRequest,
        clientContext.Token);

    // we're not done with the handshake so we need to put aside the
    // request until we've authenticated with the server...
    _requestHeaders = requestHeaders;
    _requestStream = requestStream;

    // ...send a message to the server side channel sink without
    // the client message...
    channelSinkMessage = SendMessageToServerSink(channelSinkMessage,    
        msg, null, null, out responseHeaders, out responseStream);

    // . . .
}

Notice that the requestHeaders and requestStream are not passed to the server. Null values are passed instead (actually empty TransportHeaders and MemoryStream objects are passed but SendMessageToServerSink() takes care of that detail).

Later on, when the authentication handshake is complete, the message is put back and sent along to the server (in our example along with the NTLM Response message). With this in mind we can modify the while loop found in IClientChannelSink.ProcessMessage():

// IClientChannelSink
public void ProcessMessage(IMessage msg, ITransportHeaders requestHeaders,
    Stream requestStream, out ITransportHeaders responseHeaders,
    out Stream responseStream)
{
    // . . .

    // put the security token into the message object
    channelSinkMessage.Token = clientContext.Token;
    channelSinkMessage.Type =
        ChannelSinkMessage.MessageType.ClientToken;

    if (clientContext.ContinueProcessing == true)
    {
        // we're not done with the handshake so we'll send a message
        // to the server side sink without the request
        channelSinkMessage = 
            SendMessageToServerSink(channelSinkMessage,
            msg, null, null,
            out responseHeaders, out responseStream);
    }
    else
    {
        // Send a message for the server side component with the
        // request...
        channelSinkMessage = SendMessageToServerSink(
            channelSinkMessage, msg, _requestHeaders, _requestStream,
            out responseHeaders, out responseStream);
    }

    // . . .
}

In this case we have the same code as before, but inside the while loop we make a decision as to whether or not to send along the original client message. Only if we're done with the authentication handshake do we send it along to the server.

On the server side a similar modification is made. If the server-side sink is sending a security token to the client side sink, then the responseMsg, responseHeaders, and responseStream parameters are initialized to null before returning to the client. These parameters are only set to valid values if the method call goes through to the server object.

Going back to our original picture, we now have a revised plan for authenticating a client to a server (Figure 7).

Figure 7. Implementing an NTLM Authentication Handshake (final)

In this diagram, when the client calls a method on Foo, the following interim steps take place:

  1. The client channel sink puts the original method call aside.
  2. A ClientContext is created and the initial security token (the Negotiate message) is sent to the server.
  3. The server channel sink sees that it is being called for the first time and creates a ServerContext. The generated security token (the Challenge) is sent back to the client sink.
  4. The security token is processed and the Response message is generated. ClientContext.ContinueProcessing is false so the original client message is sent along with the Response to the server sink.
  5. The server sink processes the Response, and authenticates the client. The original message is passed along to Foo.

Putting it All Together

The code snippets included in this section are representative of the code in the accompanying sample but the actual implementation isn't quite this simple. This is partly obvious. I mentioned at the start of this article that the accompanying sample supports Kerberos, Negotiate, and NTLM. What we've talked about in this section only covers NTLM so there must be more to the sample application. What I'd like to do in this section is to take a detour and provide more detail about how the handshake works in the sample channel sinks.

I won't go into all of the detail about how authentication works in the sample (you can look at the code yourself). But I will explain conceptually how the handshake works.

I mentioned earlier how the handshake is managed by something that conceptually resembles a state machine. In the finished sample, there is an actual implementation of a client and server state machine that manages the authentication process. The state machines exchange messages (which are implemented as .NET classes). Each of these messages is transported from one channel sink to the other when it is serialized in the header array (the same idea that we've already discussed).

The states in each state machine and the message exchange that occurs between the two are shown in the following diagram (Figure 8).

Figure 8. The states in each state machine and the message exchange that occurs between the two

You can see right away that there's more going on here than what we've talked about just getting NTLM to work. Both state machines start off in an initial state and may move through a number of other states before the method call completes. Note that each state machine always resets to the initial state for every method call. So the figure gives you some sense that there may be many exchanges between the client and server channel sinks for the simple two-legged method call that the client is making.

The circles in the figure represent the states in each state machine. The solid lines represent the allowable transitions between states. The dotted lines represent the messages that are exchanged between the state machines (and conceptually identify which state processes each message).

The Generate SPN Request and Process SPN Response client states are used for discovery of the Server Principal Name (required for Kerberos). Like all other messages the SPN Request and SPN Response are implemented as .NET classes. The SPN Response contains the Server Principal Name of the server (as looked up by the Process SPN Request server state).

The Generate Client Token and Process Server Token client states handle client side processing of the actual authentication handshake. Generate Client Token sets up the ClientCredential and ClientContext objects (the types from Microsoft.Samples.Security.SSPI) and generates the initial token to be processed by the server. The client token is transmitted to the Process Client Token state on the server side and the resulting server token (if any) is transmitted back to the client. The Process Server Token state then processes the token and either loops back to Generate Client Token or moves on to Generate Method Call if the handshake is complete. The way the state machines are designed, the authentication may have any number of legs to the handshake. This makes it possible to support NTLM, Kerberos, and Negotiate (with any choice of options) with the same program code.

Once authentication is complete, the Generate Method Call state generates a message that accompanies the original method call (the one that the client security sink put aside). This is processed by the Process Method Call server state. The response is processed by the Process Method Response client state. These last two states mostly handle clean up before both state machines go back to their initial state, ready for the next method call.

Actually implementing state machines that implement this conceptual model breaks up the problem into manageable pieces and makes the sample easier to understand. If you're running the security sinks in a debugger, you can set a breakpoint in each state and watch how each machine transitions from one state to another.
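To make the client-side walk through the states concrete, here is a small sketch of a transition function. It is an illustration only and is not taken from the accompanying sample: the state names follow the text above, but the ClientState enum, the ClientStateMachine class, the needsSpn flag, and the exact set of transitions (in particular, that SPN discovery is optional and runs before the first client token) are assumptions.

```csharp
using System;

// Client-side states named after the states described in the text.
public enum ClientState
{
    Initial,
    GenerateSpnRequest,
    ProcessSpnResponse,
    GenerateClientToken,
    ProcessServerToken,
    GenerateMethodCall,
    ProcessMethodResponse
}

public static class ClientStateMachine
{
    // needsSpn: hypothetical flag, true when the Server Principal Name must be
    //           discovered first (for example, when Kerberos is in use).
    // handshakeComplete: stands in for ClientContext.ContinueProcessing == false.
    public static ClientState Next(ClientState current, bool needsSpn,
        bool handshakeComplete)
    {
        switch (current)
        {
            case ClientState.Initial:
                return needsSpn
                    ? ClientState.GenerateSpnRequest
                    : ClientState.GenerateClientToken;
            case ClientState.GenerateSpnRequest:
                return ClientState.ProcessSpnResponse;
            case ClientState.ProcessSpnResponse:
                return ClientState.GenerateClientToken;
            case ClientState.GenerateClientToken:
                return ClientState.ProcessServerToken;
            case ClientState.ProcessServerToken:
                // Loop for another leg of the handshake, or move on to the call.
                return handshakeComplete
                    ? ClientState.GenerateMethodCall
                    : ClientState.GenerateClientToken;
            case ClientState.GenerateMethodCall:
                return ClientState.ProcessMethodResponse;
            case ClientState.ProcessMethodResponse:
                // Every method call starts again from the initial state.
                return ClientState.Initial;
            default:
                throw new InvalidOperationException("Unknown state: " + current);
        }
    }
}
```

Keeping the transitions in one function makes the loop between Generate Client Token and Process Server Token explicit, which is what lets the same code handle handshakes with any number of legs.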

Recap

  • We can put a message on hold, have a private conversation between the client/server channel sinks, and only on successful authentication allow the message to go through.
  • The solution we've put together is transparent to the client and server.

So far we've talked about how we can authenticate with different protocols, but how does the developer choose which one to use?

Configuration

One of the most important goals of this security solution is to provide authentication without requiring code on the client or server side. The previous section explained in some detail how to do this with NTLM. But we know that the accompanying sample supports multiple authentication protocols: NTLM, Kerberos, and Negotiate. Somehow we need to tell the channel sinks which authentication protocol we'd like to use. It would be nice to be able to configure it.

We talked earlier about how channel sinks get created. Essentially there are two ways—through configuration files or through program code. We looked at a configuration file needed to create our custom security sink on the client side. This time, let's look at a complete server configuration file used to host our component, Foo, in Microsoft ASP.NET.

<configuration>
  <system.runtime.remoting>
    <application>

      <service>
        <wellknown mode="SingleCall" 
            type="Microsoft.Samples.Runtime.Remoting.Security.Sample.Server.Foo,
                Microsoft.Samples.Runtime.Remoting.Security.Sample.Server" 
            objectUri="Foo.rem" />
      </service>

      <channels>
        <channel ref="http">
          <serverProviders>        
            <provider 
               type="Microsoft.Samples.Runtime.Remoting.Security.SecurityServerChannelSinkProvider, 
                   Microsoft.Samples.Runtime.Remoting.Security" 
               securityPackage="negotiate" />
            <formatter ref="soap" />
          </serverProviders>
        </channel>
      </channels>

    </application>
  </system.runtime.remoting>
</configuration>

Note the difference in the way the provider type is specified versus the formatter type. The provider element specifies the type name of the provider while the formatter element simply contains a ref attribute. The ref attribute is just a simple way to refer to a type that's already defined elsewhere. In this case, the references to "http" and "soap" refer to fully qualified type names found in the machine.config file.

Unlike the client-side configuration file, this file has an entry for a wellknown type—this is the type we want to remote: Foo. Similar to its client side counterpart we define the channel we're going to use (HTTP) and the formatter (SOAP). We also define a provider; this is our custom channel sink.

This time, look at the extra attribute that appears in the provider element: the securityPackage attribute. This is something we haven't seen before. The provider tag defines the provider we're going to use: the type is SecurityServerChannelSinkProvider and the securityPackage is "negotiate". Presumably what this means is that our server side channel sink initializes the ServerCredential it creates with the Negotiate authentication protocol. This kind of configuration is exactly what we need. But how does this attribute make its way into our channel sink?

Recall the constructors in IXXXChannelSinkProvider. Both the client and server derivative have a parameterized constructor. One of the parameters to that constructor is an IDictionary reference. As it turns out, this is a dictionary of attributes passed into the provider element in the configuration file. Each attribute we add to that element gets passed into this constructor as an entry in the dictionary. This allows us to write code in our channel sink provider like this:

public class SecurityServerChannelSinkProvider: IServerChannelSinkProvider
{
    // . . .

    public SecurityServerChannelSinkProvider (IDictionary properties,
        ICollection providerData)
    {
        string securityPackage = properties["securityPackage"] as string;
        if (securityPackage == null)
            _securityPackage = Credential.Package.Negotiate;   // default
        else if (securityPackage == "negotiate")
            _securityPackage = Credential.Package.Negotiate;
        else if (securityPackage == "kerberos")
            _securityPackage = Credential.Package.Kerberos;
        else if (securityPackage == "ntlm")
            _securityPackage = Credential.Package.NTLM;
    }

    // . . .

    // data members
    private Credential.Package _securityPackage;
}

So now we can extract the name of the security package from the configuration file and set that in our channel sink provider. Once there, it's a simple matter to pass that to the channel sink in its constructor (remember the provider is what creates the channel sink).
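The constructor above matches the attribute value exactly and silently ignores anything it doesn't recognize. As a hedged variation (not the code from the accompanying sample), the lookup could be made case-insensitive and could reject unknown values. In this sketch the Package enum is a stand-in for Credential.Package, and the ProviderSettings class and ReadSecurityPackage method are hypothetical names.

```csharp
using System;
using System.Collections;

// Stand-in for Credential.Package from Microsoft.Samples.Security.SSPI.
public enum Package { Negotiate, Kerberos, NTLM }

public static class ProviderSettings
{
    // Reads the securityPackage attribute from the dictionary that the
    // remoting infrastructure hands to the provider's constructor.
    public static Package ReadSecurityPackage(IDictionary properties)
    {
        string value = properties["securityPackage"] as string;
        if (value == null)
            return Package.Negotiate;   // same default as the constructor above

        switch (value.Trim().ToLowerInvariant())
        {
            case "negotiate": return Package.Negotiate;
            case "kerberos":  return Package.Kerberos;
            case "ntlm":      return Package.NTLM;
            default:
                // Fail loudly rather than run with an unintended protocol.
                throw new ArgumentException(
                    "Unrecognized securityPackage value: " + value);
        }
    }
}
```

Failing at provider construction time means a typo in the configuration file surfaces when the channel is set up rather than on the first method call.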

Now that we know how to add custom configuration parameters, we can add any other attributes to our configuration file that might be useful to us.

Recap

  • We can now configure the channel sinks with whatever parameters we want, just by modifying the client or server configuration file.
  • We now have our choice of authentication protocols by setting the securityPackage attribute.
  • We still don't have to write any client or server code to make the authentication solution work.

Next we'll look at some additional considerations for remoting security.

Remoting Security

We can now authenticate the client to the server using a choice of authentication protocols. But we now need to consider some additional security issues: impersonation level and authentication level.

Impersonation Level

To this point we've talked about how to authenticate the client to the server and even how the server can impersonate the client. But this raises a question that we haven't addressed yet. How far should the server be able to take the client's credentials? At a minimum, the server should be able to identify the caller, but not be able to do anything with the caller's credentials. Past that, it might be acceptable to allow the server to impersonate the client but limit how much work can be done with those credentials. The ultimate would be for the client to fully delegate its credentials to the server, so the server could go anywhere performing work with the client's credentials.

All of these questions relate to an issue of trust—how much does the client trust the server with its credentials? The level of trust that the client bestows on the server is the impersonation level.

For our security solution we'll make this a configuration issue. We'll set the impersonation level in the configuration file (much like we did with the security package). Since the impersonation level is the client's trust for the server, it will only be set on the client side (and therefore only set in the client configuration file). The server will only be allowed to carry the client's credentials as far as the impersonation level dictates.

Our security solution will have three impersonation levels:

  • Identify—Server can only determine identity of caller
  • Impersonate—Server can impersonate caller (cannot carry credentials across more than one network hop)
  • Delegate—Server can impersonate caller and carry credentials across unlimited network hops, at least until the Kerberos ticket expires (only supported by Kerberos).

So how do we implement the impersonation level? From a configuration standpoint, we've already seen how to add provider parameters to the configuration files. We'll do the same here, to add the impersonation level.

Here's a portion of a client configuration file with the new impersonation level attribute added:

<provider 
    type="Microsoft.Samples.Runtime.Remoting.Security.SecurityClientChannelSinkProvider, 
        Microsoft.Samples.Runtime.Remoting.Security" 
    securityPackage="negotiate" impersonationLevel="impersonate"/>

Once in the configuration file, we already know how to get this parameter to the client side channel sink when it's created. So I won't explain the configuration solution further.

The next issue is what to do with the impersonation level in the channel sink implementation. The impersonation level controls how we set up the ClientCredential and ClientContext objects on the client side. This is shown by the following code fragment.

// create a credential for the configured security package
ClientCredential clientCredential = new 
  ClientCredential(_securityPackage);


// set the delegate context attribute flag if the impersonation
// level demands it
ClientContext.ContextAttributeFlags contextAttributeFlags = 
  ClientContext.ContextAttributeFlags.None;
if (_impersonationLevel == SecuritySink.ImpersonationLevel.Delegate)
    contextAttributeFlags = contextAttributeFlags | 
        ClientContext.ContextAttributeFlags.Delegate;


// set the identify flag if the impersonation level demands it
if (_impersonationLevel == SecuritySink.ImpersonationLevel.Identify)
    contextAttributeFlags = contextAttributeFlags |
        ClientContext.ContextAttributeFlags.Identify;


// create the client context
clientContext = new ClientContext(clientCredential, _serverPrincipalName, 
    contextAttributeFlags);

All we have to do is look at the impersonation level which we read in from the configuration file and set the context attribute flags appropriately. When we authenticate with the server, SSPI will ensure that the server side token that's created for the authenticated caller has the appropriate capabilities. If the impersonation level is Identify then the server will be able to obtain the identity of the caller but not perform any useful work while impersonating the caller. If the impersonation level is Impersonate, then the server will be able to impersonate the caller and perform work on the caller's behalf. If the impersonation level is Delegate then the server will be able to carry the client's credentials across multiple network hops.

In any case, we don't have to do much work. We just set the flags and SSPI takes care of the rest.

That's about it for the impersonation level. Next we'll look at the authentication level.

Anonymous Impersonation   Some people will note that the concept of impersonation level is borrowed from COM. This was intentional to give users a familiar experience when it comes to remoting security. The difference is that there is an impersonation level missing in this solution: Anonymous.

The Anonymous impersonation level in COM allows the client to grant no trust to the server; so the server cannot even determine the identity of the caller. Even in COM the anonymous impersonation level is only allowed between a client and server running on the same machine (this level is silently promoted to identify for cross machine calls).

Because of its limited use, it wasn't included in this remoting solution. After all, if you really want anonymous access, just don't install the security channel sinks.

Authentication Level

If you look closely at the authentication solution we put together, you'll notice that every time the client makes a method call, the client re-authenticates with the server. Whether you think this is overdoing or underdoing it, it raises a basic question about our authentication solution. Specifically, how tightly should we protect the messages that are sent between client and server?

The strategy we've taken so far (authenticate on every method call) certainly helps guarantee that any method call the server object receives is indeed from the authenticated caller. After all, if we didn't authenticate on every call, what's to prevent a bad guy on our network from taking advantage of the fact that we only authenticate on some method calls (like the first one) and not others?

Different approaches aren't necessarily right or wrong, but there may be cases where we would want to throttle this protection level up or down. We've faced this general issue before. Like other features of this security solution, we'll make this option configurable through an option called the authentication level.

Our security solution offers three authentication levels:

  • Call—The client is authenticated to the server on every method call.
  • PacketIntegrity—The client is authenticated on every method call. Every sender of a message generates an accompanying signature. Every receiver of a message verifies that signature. Messages are still sent in clear text.
  • PacketPrivacy—The client is authenticated on every method call. All messages are sent in encrypted form and are decrypted by the receiver.

Essentially, the authentication level defines the level of protection the client and server wish to afford the messages passed on the network. Each level provides an incremental increase in security of message traffic. Depending on your need, higher levels of protection may be required.

In our security solution, the authentication level must be set on the client and server side. So what happens if the authentication level on the client doesn't match the authentication level on the server? Borrowing semantics from COM, the behavior is as follows:

  • If the authentication level on the client is lower than the server, then an exception is thrown. This is not allowed (Call is the lowest level, PacketPrivacy is the highest).
  • If the authentication level on the client is higher than the server, then the client authentication level is used.

In other words, the server authentication level sets a low water mark. The client can promote that by using a higher authentication level but cannot go below it. Logically, this makes sense. If the client wants to use Call but the server demands PacketPrivacy, the call should fail (the alternative is to demote the server authentication level and open a security hole). On the other hand, if the client demands PacketPrivacy and the server is happy with Call, it makes sense to promote the authentication level on the server to a higher protection level.

So how do we implement the authentication level? From a configuration standpoint, we've already seen how to add provider parameters to the configuration files. Just like with the impersonation level, we'll do the same here and add the authentication level.

Here's a portion of a client configuration file with the new authentication level attribute added:

<provider 
    type="Microsoft.Samples.Runtime.Remoting.Security.
        SecurityClientChannelSinkProvider, 
        Microsoft.Samples.Runtime.Remoting.Security" 
    securityPackage="negotiate" impersonationLevel="impersonate" 
    authenticationLevel="call"/>

Once in the configuration file, we already know how to get this parameter to the client side channel sink. So we won't re-address those details here.

The authentication level impacts code on both the client and server side. Let's look at the client side first. If the authentication level is Call, then effectively we don't have to do anything. We've already implemented call level authentication semantics.

If the authentication level is PacketIntegrity or PacketPrivacy, then we have work to do. As we've already seen, in IClientChannelSink.ProcessMessage() I use a helper function called SendMessageToServerSink() to actually send the message to the next sink in the chain. One of the responsibilities of this function is to sign or encrypt the message being sent if the authentication level is set to PacketIntegrity or PacketPrivacy respectively. The relevant portions of SendMessageToServerSink() look like this:

protected ChannelSinkCallContext SendMessageToServerSink(
    ClientContext clientContext, IClientChannelSink nextSink,
    ChannelSinkMessage channelSinkMessage, IMessage msg,
    ITransportHeaders requestHeaders, Stream requestStream,
    out ITransportHeaders responseHeaders, out Stream responseStream)
{
    // . . .

    // encrypt/sign as appropriate
    requestStream = EncodeStream(clientContext, _authenticationLevel, 
        requestStream);

    // . . .

    // process the message
    nextSink.ProcessMessage(msg, requestHeaders, requestStream,
        out responseHeaders, out responseStream);

    // decrypt/verify as appropriate
    responseStream = DecodeStream(clientContext, _authenticationLevel, 
        responseStream);

    // . . .
}

The heart of this function is the call to ProcessMessage() on the next sink in the chain. But note the code right before and right after this method call. If the authentication level is PacketIntegrity, for example, we need to sign the message before we send it on to the next sink in the chain. When it comes back, we need to verify it. Likewise, if the authentication level is PacketPrivacy, we need to encrypt it on the way in and decrypt it on the way out. That's what EncodeStream() and DecodeStream() are for.

EncodeStream ensures that the stream is properly signed or encrypted:

public static Stream EncodeStream(Context context, 
    SecuritySink.AuthenticationLevel authenticationLevel, Stream inStream)
{
    // if the authentication level doesn't call for special encoding,
    // then just return the original stream
    if (authenticationLevel == SecuritySink.AuthenticationLevel.Call)
        return inStream;

    // convert the stream to a byte array
    Byte[] arStream = ConvertStreamToByteArray(inStream);

    // encode the byte array
    Byte[] encodedMsg = null;
    if (authenticationLevel == 
      SecuritySink.AuthenticationLevel.PacketIntegrity)
        encodedMsg = context.SignMessage(arStream);
    else if (authenticationLevel == 
      SecuritySink.AuthenticationLevel.PacketPrivacy)
        encodedMsg = context.EncryptMessage(arStream);

    // convert the byte array to a stream
    return ConvertByteArrayToStream(encodedMsg);
}

If the authentication level is Call, the function does nothing and simply returns the stream that was passed in. Note the use of the Context object to sign or encrypt the message. If the authentication level is PacketIntegrity, the message is signed. If the authentication level is PacketPrivacy, it's encrypted.

DecodeStream does the opposite of this function:

public static Stream DecodeStream(Context context, 
    SecuritySink.AuthenticationLevel authenticationLevel, Stream inStream)
{
    // if the authentication level doesn't call for special decoding,
    // then just return the original stream
    if (authenticationLevel == SecuritySink.AuthenticationLevel.Call)
        return inStream;

    // convert the stream to a byte array
    Byte[] arStream = ConvertStreamToByteArray(inStream);

    // decode the byte array
    Byte[] decodedMsg = null;
    if (authenticationLevel == 
      SecuritySink.AuthenticationLevel.PacketIntegrity)
        decodedMsg = context.VerifyMessage(arStream);
    else if (authenticationLevel == 
      SecuritySink.AuthenticationLevel.PacketPrivacy)
        decodedMsg = context.DecryptMessage(arStream);

    // convert the byte array to a stream
    return ConvertByteArrayToStream(decodedMsg);
}

Again, if the authentication level is Call, the function does nothing. But if the authentication level is PacketIntegrity, the message is verified. If the authentication level is PacketPrivacy, the message is decrypted.
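EncodeStream() and DecodeStream() both rely on the helpers ConvertStreamToByteArray() and ConvertByteArrayToStream(), which are not listed here. The following is only a minimal sketch of how such helpers might look. The class name StreamConversionSketch and the 4096-byte chunk size are assumptions made for illustration; the code in the sample may differ.

```csharp
using System;
using System.IO;

// Hypothetical stand-ins for the conversion helpers used by
// EncodeStream() and DecodeStream(); not the sample's own code.
public static class StreamConversionSketch
{
    // Reads the stream from its current position to the end and
    // returns the bytes that were read.
    public static byte[] ConvertStreamToByteArray(Stream inStream)
    {
        MemoryStream buffer = new MemoryStream();
        byte[] chunk = new byte[4096];
        int bytesRead;

        // Read() returns 0 once the end of the stream is reached
        while ((bytesRead = inStream.Read(chunk, 0, chunk.Length)) > 0)
            buffer.Write(chunk, 0, bytesRead);

        return buffer.ToArray();
    }

    // Wraps the byte array in a new stream positioned at the start.
    public static Stream ConvertByteArrayToStream(byte[] data)
    {
        return new MemoryStream(data);
    }
}
```

Copying through a MemoryStream avoids depending on Stream.Length, which not every stream handed to a channel sink can report.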

Now we have to go through a similar procedure on the server side. This works essentially the same as on the client, so I won't go into the server side details here.

Recall that the authentication level that actually gets used is the higher of the client and server authentication levels (as long as the client authentication level is equal to or higher than the server level). This means that the server channel sink must know not only what authentication level was configured on the server side, but also what authentication level was set on the client.

So to get the authentication level to the server side sink, we'll add to the ChannelSinkMessage object (Figure 9).

Figure 9. Adding to the ChannelSinkMessage object

When the call context object is passed to the server, the authentication level automatically flows with it.

On the server side, the channel sink must use an authentication level that's the greater of the client and server authentication level. This is a process I call normalizing. This gets done in a helper function called NormalizeAuthenticationLevel().

private SecuritySink.AuthenticationLevel NormalizeAuthenticationLevel(
    SecuritySink.AuthenticationLevel clientAuthenticationLevel)
{
    // verify that the client's authentication level is equal to or
    // greater than ours.
    if (clientAuthenticationLevel < _authenticationLevel)
        throw new Exception(
            "Client auth level can't be less than server auth");

    // return a normalized server authentication level equal 
    // to max(client auth level, server auth level)
    if (clientAuthenticationLevel == _authenticationLevel)
        return _authenticationLevel;
    else
        return clientAuthenticationLevel;
}
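The comparison in NormalizeAuthenticationLevel() only works if the enumeration values are declared in ascending order of protection. The sketch below makes that assumption explicit. The standalone AuthenticationLevel enum, its numeric values, and the AuthenticationLevelSketch class are hypothetical stand-ins for SecuritySink.AuthenticationLevel and the server sink, not the sample's own definitions.

```csharp
using System;

// Assumed ordering: Call is the lowest level, PacketPrivacy the highest.
public enum AuthenticationLevel
{
    Call = 0,
    PacketIntegrity = 1,
    PacketPrivacy = 2
}

public static class AuthenticationLevelSketch
{
    // Returns the level to use for the call, or throws if the client
    // asks for less protection than the server requires.
    public static AuthenticationLevel Normalize(
        AuthenticationLevel clientLevel, AuthenticationLevel serverLevel)
    {
        if (clientLevel < serverLevel)
            throw new InvalidOperationException(
                "Client auth level can't be less than server auth");

        // Once the check passes, the client level is never below the
        // server level, so it is already max(client, server).
        return clientLevel;
    }
}
```

For example, Normalize(AuthenticationLevel.PacketPrivacy, AuthenticationLevel.Call) returns PacketPrivacy, while Normalize(AuthenticationLevel.Call, AuthenticationLevel.PacketIntegrity) throws.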

Once we've calculated a normalized level, the process of encoding and decoding is much like on the client. For the server channel sink, the code we have to modify is in the ProcessRequest() helper method that I introduced in an earlier section.

void ProcessRequest(ChannelSinkMessage channelSinkMessage, 
    IServerChannelSinkStack sinkStack, IMessage requestMsg,
    ITransportHeaders requestHeaders, Stream requestStream,
    out IMessage responseMsg, out ITransportHeaders responseHeaders,
    out Stream responseStream)
{
    // . . .

    // call the server object
    sinkStack.Push(this, null);
    requestStream = DecodeStream(serverContext, _normalizedAuthLevel, 
        requestStream);

    ServerProcessing processing = _nextSink.ProcessMessage(sinkStack, 
        requestMsg, requestHeaders, requestStream, 
        out responseMsg, out responseHeaders, out responseStream);

    responseStream = EncodeStream(serverContext, _normalizedAuthLevel, 
        responseStream);

    // . . .
}

Notice that we decode the message before passing it on to the next sink and encode it after. This may seem backwards. But assume for example that the authentication level is PacketPrivacy. This means we need to decode (decrypt) the message before passing it on to the server object and encode (re-encrypt) the response before sending it across the wire back to the client. This ensures that the message is encrypted on the wire but appears in clear text when it gets to the server object.

Why not other authentication levels?   Like the impersonation level, the concept of the authentication level is taken from COM. Those familiar with COM security will notice that there are some authentication levels that are missing here. Namely, None, Connect, and Packet.

The reason None isn't included is much the same reason as for the Anonymous impersonation level. If you didn't want to authenticate, you just shouldn't use the security sinks.

Packet is not used because it doesn't make sense for .NET remoting. In COM, Packet means that every network packet sent between the client and server is authenticated (a single message may span more than one network packet). Since we don't have control down to the packet level in .NET remoting, the issue of supporting this authentication level is moot.

The Connect authentication level is more interesting. Why not support the Connect authentication level? After all, it just means that we should authenticate the first time we call into a server object (from a given proxy)—but not on subsequent method calls.

The key problem with supporting Connect is with certain .NET activation options that were never available with COM. In particular there's an issue with server activated objects. For example, what does it mean to support the Connect authentication level with a SingleCall server activated object? The notion of authenticating on the first call for an object where there is no notion of a connection is somewhat gray.

It's certainly possible to implement the Connect authentication level, but I chose not to. In COM, the Connect authentication level is typically chosen for performance (you don't have to go through the authentication handshake on every call). But experience has shown that the security sinks perform well even when authenticating every method call. So I've left the lowest supported authentication level at the more secure Call level.

Recap

  • We can now configure the client's trust for the server through the impersonation level.
  • We can manage how tightly we want to secure messages sent between client and server using the authentication level.
  • We still don't need special code on the client or server side.

Next we'll look at adding a final configuration option allowing the user to define custom principals.

Custom Principals

There is one more configuration parameter we can add to our remoting solution, and for the first time it deals with authorization rather than authentication.

If you're familiar with ASP.NET security then you've probably worked with one of the ASP.NET authentication providers (Windows, Forms, or Passport). Each of these providers gives the user a way to hook into the authentication process primarily to attach a custom IPrincipal object to the context. In ASP.NET you would typically do this by providing an authentication handler in the Global.asax file (here, you would define an event handler called WindowsAuthentication_OnAuthenticate in the case where the Windows authentication provider is used).

Why would you want to attach your own IPrincipal object to the context? You might do it so that developers can directly retrieve a reference to your custom principal and perform identity or role membership checks. But more importantly, setting your own custom principal means that you can implement declarative security based on roles that you define.

For our remoting solution, a mechanism is provided whereby you can attach your own custom principal to the context, much as it is done in ASP.NET.

So how do we go about providing a mechanism to allow users to define their own custom principal? The first issue is configuration. Like we've done several times before, we'll allow the user to add a configuration parameter to the provider element in the server side configuration file.

<provider 
    type="Microsoft.Samples.Runtime.Remoting.Security.
        SecurityClientChannelSinkProvider, 
        Microsoft.Samples.Runtime.Remoting.Security" 
    securityPackage="negotiate" impersonationLevel="impersonate"
    authenticationLevel="call" 
    authenticationHandler="MyAuthenticationHandler, Server"/>

This configuration file fragment is similar to those seen before but the authenticationHandler attribute has been added. In this example, it specifies a type called MyAuthenticationHandler (found in the Server assembly). The type specified in the authenticationHandler attribute is dynamically created by the server side channel sink just before the server side object is called. In our case, that would be in the ProcessRequest() method:

void ProcessRequest(ChannelSinkMessage channelSinkMessage, 
    IServerChannelSinkStack sinkStack, IMessage requestMsg,
    ITransportHeaders requestHeaders, Stream requestStream,
    out IMessage responseMsg, out ITransportHeaders responseHeaders,
    out Stream responseStream)
{
    // . . .

    // determine the client identity
    clientSecurityToken = _serverContext.ClientSecurityToken;
    WindowsIdentity clientIdentity = new 
        WindowsIdentity(clientSecurityToken);

    // call the user's authentication handler if there is one
    if (_authenticationHandlerTypeName != null)
        CallAuthenticationHandler(clientIdentity, 
            _authenticationHandlerTypeName);

    // call the server object
    sinkStack.Push(this, null);
    ServerProcessing processing = _nextSink.ProcessMessage(sinkStack, 
        requestMsg, requestHeaders, requestStream,
        out responseMsg, out responseHeaders, out responseStream);


    // . . .
}

The call to CallAuthenticationHandler() does all the work. Note that this method is only called if the authentication handler name is not null (meaning there is an authenticationHandler attribute in the configuration file). The presence of that attribute in the config file is optional, and if it isn't found the call to this method is skipped:

private static void CallAuthenticationHandler(
    WindowsIdentity clientIdentity, string authenticationHandlerTypeName)
{
    // extract the assembly and type name from the fully qualified
    // type name
    Type authenticationHandlerType = 
        Type.GetType(authenticationHandlerTypeName);
    string assemblyName = authenticationHandlerType.Assembly.FullName;
    string typeName = authenticationHandlerType.FullName;

    // create the authentication handler
    ObjectHandle objHandle = Activator.CreateInstance(assemblyName, 
        typeName);
    IWindowsAuthentication windowsAuthentication =
        objHandle.Unwrap() as IWindowsAuthentication;

    // call the handler
    IPrincipal user = null;
    windowsAuthentication.OnAuthenticate(clientIdentity, out user);

    // set the thread principal
    Thread.CurrentPrincipal = user;
}

The implementation of CallAuthenticationHandler is simple. It takes the name of the authentication handler type as an argument (this is the type name right out of the configuration file). It then instantiates the type and calls into the OnAuthenticate() method, passing in the identity of the client. On return, Thread.CurrentPrincipal is set equal to the custom principal instantiated by the authentication handler.

Note that the server sink calls into the authentication handler via the well defined interface: IWindowsAuthentication:

public interface IWindowsAuthentication
{
    void OnAuthenticate(IIdentity identity, out IPrincipal user);
}

The single method of this interface, OnAuthenticate, provides the callee with the identity of the authenticated caller. The implementation of this interface has the option of setting a custom principal on the context via the user output parameter. This can be done as follows:

public class MyAuthenticationHandler : IWindowsAuthentication
{
    public MyAuthenticationHandler()
    {
    }

    // IWindowsAuthentication
    public void OnAuthenticate(IIdentity identity, out IPrincipal user)
    {
        // create a custom principal and pass back to the caller
        // (the caller will set that principal for the current call)
        user = new MyCustomPrincipal(identity);
    }
}
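The handler above constructs a MyCustomPrincipal, whose definition is not listed. As a rough illustration only, a custom principal needs to implement the two members of IPrincipal: the Identity property and the IsInRole() method. In the sketch below, the role assignment (every caller is a "Clerk", and the account Quux\Alice is also an "Administrator") is purely hypothetical; a real implementation would look roles up in whatever store the application uses.

```csharp
using System;
using System.Collections;
using System.Security.Principal;

// Illustrative custom principal; the role rules are assumptions.
public class MyCustomPrincipal : IPrincipal
{
    private readonly IIdentity _identity;
    private readonly ArrayList _roles = new ArrayList();

    public MyCustomPrincipal(IIdentity identity)
    {
        _identity = identity;

        // hypothetical role lookup, hard-coded for the sketch
        _roles.Add("Clerk");
        if (string.Equals(identity.Name, @"Quux\Alice",
                StringComparison.OrdinalIgnoreCase))
            _roles.Add("Administrator");
    }

    // the identity of the authenticated caller
    public IIdentity Identity
    {
        get { return _identity; }
    }

    // role names are compared case-sensitively in this sketch
    public bool IsInRole(string role)
    {
        return _roles.Contains(role);
    }
}
```

Because declarative and explicit role checks both end up calling IsInRole() on Thread.CurrentPrincipal, this one method is where application-defined roles take effect.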

The advantage of this is that the server object can now take advantage of role based security:

public class Foo : MarshalByRefObject
{
    public Foo()
    {
    }

    [PrincipalPermissionAttribute(SecurityAction.Demand, 
        Role = "Administrator")]
    public void AdministratorMethod()
    {
        // access custom principal directly
        IPrincipal currentPrincipal = Thread.CurrentPrincipal;

        // find out if the caller is the given role
        bool isInRole = currentPrincipal.IsInRole("Clerk");
    }
}

Note that the sample method leverages role based security in two ways. The first is declaratively through the PrincipalPermissionAttribute. The second is explicit by directly retrieving and interrogating the current principal.

Recap

  • Role based security is based on allowing the user to set custom principals—which our solution now provides for.
  • Setting a custom principal is just another configuration issue. You specify the authentication handler in the .config file.
  • Replacing the principal with our remoting security solution works just like with Passport, Forms, or Windows authentication in ASP.NET.

Next we'll look at programmatic security.

Programmatic Security

I've spent this entire article bragging about how all of the features we set out to implement can be provided through configuration—with no client or server side code. So far, that's been true. This section is about handling those times when you want programmatic access to these features, both in the client and in the server class.

The ability to configure the features of our security solution is great, but there are times when we might want to programmatically override these settings. This is done via interfaces implemented on the channel sinks that allow access to these parameters.

The client channel sink implements an interface called IClientSecurity which allows access to parameters particular to the client:

public interface IClientSecurity
{
    Credential.Package SecurityPackage{ get; set; }
    SecuritySink.AuthenticationLevel AuthenticationLevel{ get; set; }
    SecuritySink.ImpersonationLevel ImpersonationLevel{ get; set; }
    string ServerPrincipalName{ get; set; }
    bool MutualAuthentication{ get; set; }
}

As you can see, this interface allows the client to set or get values for the security package, authentication level, impersonation level, server principal name, and mutual authentication. When the client channel sink is instantiated, it attempts to pull the values for all of these parameters directly from the configuration file (we've already seen in detail how that works). If the values aren't found in the configuration file, defaults are set. If these same values are set through IClientSecurity, they override the values originally read from the configuration. Of course, this interface can also be used to retrieve these values, so it's possible to programmatically get the values of the parameters set in the configuration file.

The first three properties in IClientSecurity are probably obvious. The last two deserve some explanation.

If the security package is Kerberos (or Negotiate on a machine that supports Kerberos), then on the client side we must identify the server principal to which we wish to authenticate. This process was described in detail in the previous article. Recall that the server principal name is essentially the name of the principal for which we must obtain a Kerberos ticket.

The obvious question that comes up is, what should we set the server principal name equal to? If the remote object we're accessing is hosted in a process running under the security principal Quux\Bob, then the server principal name is "Bob@Quux". This shouldn't be too surprising after studying the Kerberos protocol in the previous article.

The interesting question is, if you don't set the ServerPrincipalName property on IClientSecurity, then how does the client channel sink figure it out? After all, programmatic access is optional, and the only way to explicitly set the server principal from the client is through IClientSecurity (or through the client configuration file).

To solve this problem, the client channel sink asks the server sink for the server principal name. This exchange was introduced earlier when we talked about the client and server state machines. This allows the SPN to be discovered without any special intervention by the client. However, the caller can override this by setting the server principal name in the client side configuration file. To override that, the caller can programmatically set the ServerPrincipalName property on IClientSecurity.
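The order of precedence just described (a programmatic setting wins over the configuration file, which wins over the name discovered from the server sink) can be sketched as a small helper. The class and method names here are hypothetical and do not come from the sample; the sketch also assumes that a null or empty string means "not set".

```csharp
using System;

// Hypothetical helper illustrating the precedence of the three
// possible sources of the server principal name.
public static class ServerPrincipalNameSketch
{
    public static string Resolve(string programmaticValue,
        string configuredValue, string discoveredValue)
    {
        // 1. a value set through IClientSecurity overrides everything
        if (!IsEmpty(programmaticValue))
            return programmaticValue;

        // 2. otherwise use the value from the client configuration file
        if (!IsEmpty(configuredValue))
            return configuredValue;

        // 3. otherwise fall back to the name reported by the server sink
        return discoveredValue;
    }

    private static bool IsEmpty(string value)
    {
        return value == null || value.Length == 0;
    }
}
```

With nothing set on the client, Resolve(null, null, "Bob@Quux") yields the discovered name "Bob@Quux".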

The IClientSecurity.MutualAuthentication property allows the caller to signal that they want mutual authentication (this only applies if Kerberos or Negotiate is chosen as the security package). Like the other values, this can be set in the configuration file and is overridden by setting this property on IClientSecurity.

That takes care of IClientSecurity, which naturally leads us to its server side counterpart.

public interface IServerSecurity
{
    Credential.Package SecurityPackage{ get; }
    SecuritySink.AuthenticationLevel AuthenticationLevel{ get; }
    SecuritySink.ImpersonationLevel ImpersonationLevel{ get; }
} 

Note that in IServerSecurity all of the properties are read only. Like its client side equivalent, IServerSecurity also has properties for the security package, authentication level, and impersonation level. It wouldn't make sense to allow these properties to be set on the server side, since by the time IServerSecurity is called, it is too late to change their effective values anyway.

Recap

  • IClientSecurity allows users to set security package, authentication level, impersonation level, server principal name, and mutual authentication. Setting these properties overrides settings in the configuration files.
  • IServerSecurity allows the user to get (but not set) Security Package, Authentication level, and Impersonation Level.

That's it for explaining the security solution. The next step is to walk through the sample application.

Sample

Throughout this document, I've explained the details of implementing the Microsoft.Samples.Runtime.Remoting.Security assembly. This section uses that assembly in a sample application, so we finally get to show off the features promised at the start of this article.

The Microsoft.Samples.Runtime.Remoting.Security assembly includes several samples that show off its various features. In this section we'll focus on the OpenFile sample.

In the simplest configuration, the OpenFile sample application consists of a client application, a server component, and a server host. The server component is the example Foo component discussed in the overview. It primarily implements the OpenFile() method which is used to open the Secure.txt file already described. Only successful authentication and impersonation of the client allows the caller to successfully open the file.

In our sample, Foo is instantiated in a console .exe which is accessed via the TCP channel with the binary formatter. Any combination of formatters/channels is supported (and other samples are included which show off these capabilities).

The last component of our sample is the client application. The client is essentially a control panel—allowing the user to instantiate Foo in a choice of hosts, as well as calling different methods on Foo with different security settings (Figure 10).


Figure 10. Sample client

Control of the client application is broken into two major areas: Configuration and Foo.

Configuration

The Client Security section contains all of the controls used to configure the security package, authentication level, and impersonation level for the proxy to Foo (recall this is done through the IClientSecurity interface). Each parameter has a user interface control for each allowable value and the security package contains an extra control area for setting the server principal name; this area is enabled if Negotiate or Kerberos is chosen. You may also select Mutual Authentication (which is only valid for Kerberos or Negotiate).

Foo

The section labeled Foo allows the user to call the OpenFile() method on the remote object. This method attempts to open the secure.txt file at the given path (the correct path is already filled in). The result is either success (access granted) or a failure message (access denied). Whether the attempt is successful or not, the security package, caller identity, and thread identity in effect when the call is made are displayed in the client. This allows the user to see the effect of different security settings when making this method call.

The Impersonate Caller check box determines if Foo impersonates the caller when the OpenFile() method is called. This allows you to try different security settings with or without impersonation.

Alice and Bob Revisited

Now that we've come to the end of this article, we can revisit the scenario we described at the start. Recall in our scenario that the remote object, Foo, is instantiated by the base client and the OpenFile method is called.

If you start the client application and press the Open File button with the Impersonate Caller box checked (with default configuration settings), Foo attempts to open Secure.txt with the client's credentials. If the ACL on secure.txt is set so that only the client's security principal has access to the file, then the call to OpenFile() succeeds. It may be interesting to note that if you change the impersonation level to identify, OpenFile() fails, since the server is no longer allowed to do useful work with the client's credentials.

After all the work done to build this security solution, we've achieved the result we were originally after.

The readme file accompanying the program code contains a detailed walkthrough of the sample.

Summary

This article was prompted by a simple scenario—making a component that opens a file work the same whether it's local or remote. We saw how such a simple configuration change can have such a dramatic effect on behavior.

In building the solution, we were guided by the idea of providing a familiar security experience. Namely,

  • Simple configuration—security package, authentication level, impersonation level
  • Authorization—declarative security through custom principals
  • Any combination of channels and formatters
  • No client or server code required, but supported

Basically, we got everything we asked for.

One of the most interesting parts of the solution is how neatly it fits into the .NET remoting architecture. We needed to authenticate the client to the server transparently, and we have channel sinks. We needed a way to configure custom parameters for our custom providers and we have configuration files.

If the assemblies described for this solution were installed in the GAC, you would have no idea as a developer that remoting security wasn't already built in. You'd just make a couple of changes to your configuration files and you'd be finished.

It's because of the extensibility of .NET remoting that the integration went so smoothly.

Appendix: ASP.NET Supported Remoting Security

This article describes a security solution for .NET remoting. This solution is based on the notion of authenticating a client to a server—transparently, without passing clear text credentials across the wire. There's more than one way to solve this problem. It's worth noting before this article ends that there is a solution "in-the-box" that (in a special case), allows a client to authenticate to a remote object.

For the case where your remote component is hosted in ASP.NET, you can build a client that authenticates to a remote server. The requirement is that you pass explicit credentials on the client side and Windows authentication is enabled in IIS.

Here's how it works.

Say you have a simple client that calls a remote component, Foo, hosted in ASP.NET. To make this security solution work, enable Integrated Windows Authentication on the virtual directory in which Foo is hosted, and disable Anonymous access.

If you were to stop here and attempt to call this object from your client, you would get an error: "System.Net.WebException: The remote server returned an error: (401) Unauthorized." This is because we've now set up the virtual directory so that clients must authenticate to access it, and without further modification our client does not.

To avoid this error, you must specify explicit credentials on the client side so that the client authenticates with the server:

// create the proxy
Foo foo = new Foo();

// get a reference to channel sink properties
IDictionary prop = ChannelServices.GetChannelSinkProperties(foo);

// set specific credentials
prop["username"] = "Alice";
prop["password"] = "AlicesPassword";
prop["domain"] = "Quux";

// call the remote method
foo.SomeMethod();

If you run code similar to that above, the client successfully authenticates with ASP.NET and the call is allowed to go through to the server component.

So what does this buy us? The key objective of Microsoft.Samples.Runtime.Remoting.Security is to not only authenticate the client, but allow the server to impersonate the client so work can be done with the client's credentials. If we stop here, and call WindowsIdentity.GetCurrent() in our remote method we'll see "NT AUTHORITY\SYSTEM" or whatever security principal ASP.NET is running under (not the client's credentials).

We can fix this by going into the server side configuration file and setting a flag which tells ASP.NET that we want the server to impersonate the client. For example, here's the server side configuration file (Web.Config) I used in my test:

<configuration>

  <system.web>
    <identity impersonate="true" />
  </system.web>


  <system.runtime.remoting>
    <application>

      <service>
        <wellknown mode="SingleCall" type="Server.Foo, Server" 
            objectUri="Foo.rem" />
      </service>

      <channels>
        <channel ref="http">
          <serverProviders>        
            <formatter ref="soap" />
          </serverProviders>
        </channel>
      </channels>

    </application>
  </system.runtime.remoting>
</configuration>

Note the identity tag at the top. The impersonate attribute lets ASP.NET know that it should impersonate the client before allowing the method call to go through to the server object. If we now call WindowsIdentity.GetCurrent() in Foo, it will return the identity of the caller.

This solution doesn't have all of the features of the security solution described in this article. It depends on ASP.NET and requires explicit credentials be passed on the client (which some may find too cumbersome). But it's worth bringing up for those that will find this adequate.

Appendix: References

Microsoft Platform SDK

Microsoft .NET Framework Software Development Kit Version 1.1