Identity and Access Management


Frederick Chong
Microsoft Corporation

July 2004

Summary: Frederick Chong discusses the principles and benefits of Service Oriented Architecture (SOA), specifically as they relate to the technical challenges in identity and access management, and helps the reader gain an understanding of the commonly encountered issues in identity management. (20 printed pages)


Anatomy of a Digital Identity
Identity and Access Management Framework
Challenges in Identity and Access Management
Entitlement Management


To date, many technical decision makers in large IT environments have heard about the principles and benefits of Service Oriented Architecture (SOA). Despite this fact, very few IT organizations are yet able to translate the theoretical underpinnings of SOA into practical IT actions.

Over the last year, a few individual solution architects on my team have attempted to distill the practical essence of SOA into the following areas: Identity and Access Management, Service Management, Entity Aggregation and Process Integration. These four key technical areas present significant challenges to overcome, yet they provide the critical IT foundations that help businesses realize the benefits of SOA.

Note that it is our frequent interactions with enterprise architects that enable us to collate, synthesize and categorize the practical challenges of SOA into these areas. Our team organizes the Strategic Architect Forums, which are held several times a year around the world. At these events, we conduct small discussion groups to find out customers' pain points and the technical guidance they are looking for. The feedback from our customers has been very consistent: issues in managing identities, aggregating data, managing services, and integrating business processes have been cited over and over again as major roadblocks to realizing more efficient and agile organizations.

In addition, our team also conducts proof-of-concept projects with customers to drill deeper into the real world requirements and implementation issues. It is through these combinations of broad and deep engagements with customers that we on the Architecture Strategy team derived our conclusions on the four significant areas for IT to invest in.

The key focus of this paper is to provide an overview of the technical challenges in one of those areas, namely identity and access management; and secondarily, to help the reader gain an understanding of the commonly encountered issues in this broad subject.


Identity and access management (I&AM) is a relatively new term that means different things to different people. Frequently, IT professionals have tended to pigeonhole its meaning into certain identity and security related problems that they are currently faced with. For example, I&AM has been perceived to be a synonym for single sign-on, password synchronization, meta-directory, web single sign-on, role-based entitlements, and similar ideas.

The primary goal of this paper is to provide the reader with a succinct and comprehensive overview of what I&AM means. In order to accomplish this purpose, we have structured the information in this paper to help answer the following questions:

  • What is a digital identity?
  • What does identity and access management mean?
  • What are the key technology components of I&AM?
  • How do the components of I&AM relate to one another?
  • What are the key architecture challenges in I&AM?

Anatomy of a Digital Identity

Personal identification in today's society can take many different forms. Some examples of these forms are driver licenses, travel passports, employee cardkeys, and club membership cards. These forms of identification typically contain information that is somewhat unique to the holder, for example, names, addresses and photos, as well as information about the authorities that issued the cards, for example, an insignia of the local department of motor vehicles.

While the notion of identities in the physical world is fairly well understood, the same cannot be said about the definition of digital identities. To help lay the groundwork for the rest of the discussions in this paper, this section describes one notion of a digital identity, as illustrated in Figure 1. Our definition of digital identity consists of the following parts:

  • Identifier   A piece of information that uniquely identifies the subject of this identity within a given context.1 Examples of identifiers are email addresses, X.500 distinguished names and Globally Unique Identifiers (GUIDs).
  • Credentials   Private or public data that can be used to prove the authenticity of an identity claim. For example, Alice enters a password to prove that she is who she says she is. This mechanism works because only the authentication system and Alice should know Alice's password. A private key and the associated X.509 public key certificate are another example of credentials.
  • Core Attributes   Data that help describe the identity. Core attributes may be used across a number of business or application contexts. For example, addresses and phone numbers are common attributes that are used and referenced by different business applications.
  • Context-specific Attributes   Data that help describe the identity, but which are only referenced and used within the specific context where the identity is used. For example, within a company, the employee's preferred health plan information is a context-specific attribute that is interesting to the company's health care provider but not necessarily to the financial services provider.


Figure 1. Anatomy of a digital identity
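
As a rough illustration, the four parts of a digital identity described above can be pictured as a simple record. The following Python sketch is only one possible representation; the class name, field names and sample values are hypothetical and are not part of any product or standard.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class DigitalIdentity:
    """Illustrative record with the four parts of a digital identity."""
    identifier: str  # unique within a given context, e.g. an email address
    credentials: Dict[str, str] = field(default_factory=dict)
    core_attributes: Dict[str, str] = field(default_factory=dict)
    # Context-specific attributes, keyed by the name of the context.
    context_attributes: Dict[str, Dict[str, str]] = field(default_factory=dict)

    def view_for(self, context: str) -> Dict[str, str]:
        """Return core attributes plus the attributes of one context only."""
        view = dict(self.core_attributes)
        view.update(self.context_attributes.get(context, {}))
        return view


alice = DigitalIdentity(
    identifier="alice@example.com",
    credentials={"password_hash": "<stored hash, never the plain password>"},
    core_attributes={"address": "1 Main St", "phone": "555-0100"},
    context_attributes={"health_care": {"preferred_plan": "PPO"}},
)
```

In this sketch, the health care provider's view of Alice includes her preferred plan, while a view requested for any other context contains only the core attributes.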

What is Identity and Access Management?

The Burton Group defines identity management as follows: "Identity management is the set of business processes, and a supporting infrastructure for the creation, maintenance, and use of digital identities."2

In this paper, we define identity and access management (I&AM) as follows:

"Identity and access management refers to the processes, technologies and policies for managing digital identities and controlling how identities can be used to access resources."

We can make a few important observations from the above definitions:

  • I&AM is about the end-to-end life cycle3 management of digital identities. An enterprise-class identity management solution should not be made up of isolated silos of security technologies, but should rather consist of a well-integrated fabric of technologies that address the spectrum of scenarios in each stage of the identity life cycle. We will talk more about these scenarios in a later section of this paper.
  • I&AM is not just about technology, but rather is composed of three indispensable elements: policies, processes and technologies. Policies refer to the constraints and standards that need to be followed in order to comply with regulations and business best practices; processes describe the sequences of steps that lead to the completion of business tasks or functions; technologies are the automated tools that help accomplish business goals more efficiently and accurately while meeting the constraints and guidelines specified in the policies.
  • The relationships between elements of I&AM can be represented as the triangle illustrated in Figure 2. Of significant interest is the fact that there is a feedback loop that links all three elements together. The lengths of the edges represent the proportions of the elements relative to one another in a given I&AM system. Varying the proportion of one element will ultimately vary the proportion of one or more other elements in order to maintain the shape of a triangle with a sweet spot (shown as an intersection in the triangle).
  • The triangle analogy also captures the relationships and interactions of policies, processes and technologies in a healthy I&AM system. Every organization is different, and the right mix of technologies, policies and processes for one company may not necessarily be the right balance for a different company. Therefore, each organization needs to find its own balance, represented by the uniqueness of its triangle.
  • An organization's I&AM system does not remain static over time. New technologies will be introduced and adopted; new business models and constraints will change corporate governance and the processes used to carry out work. As we mentioned before, when one of the elements changes, it is time to find a new balance. It is consequently important to understand that I&AM is a journey, not a destination.


Figure 2. Essential elements of an identity and access management system

Identity and Access Management Framework

As implied in the previous sections, identity and access management is a very broad topic that covers both technology and non-technology areas. We will focus the rest of this paper around the technology aspects of identity and access management.

Because the technical scope of this topic is still broad, it is useful to impose some structure on our discussion. We will use the framework shown in Figure 3, which illustrates several key logical components of I&AM, to lead the discussion of this subject.

This particular framework highlights three key "buckets" of technology components:

  • Identity life cycle management
  • Access management
  • Directory services

The components in these technology buckets are used to meet a set of recurring requirements in identity management solutions. We will describe the roles that these components play in the next few sections.

Directory Services

As mentioned previously, a digital identity consists of a few logical types of data—the identifier, credentials and attributes. This data needs to be securely stored and organized. Directory services provide the infrastructure for meeting such needs. Entitlements and security policies often control the access and use of business applications and computing infrastructure within an organization. Entitlements are the rights and privileges associated with individuals or groups. Security policies refer to the standards and constraints under which IT computing resources operate.

A password complexity policy is an example of a security policy. Another example is the trust configuration of a business application which may describe the trusted third party that the application relies upon to help authenticate and identify application users. Like digital identities, entitlements and security policies need to be stored, properly managed and discovered. In many cases, directory services provide a good foundation for satisfying these requirements.

Access Management

Access management refers to the process of controlling and granting access to satisfy resource requests. This process is usually completed through a sequence of authentication, authorization, and auditing actions. Authentication is the process by which identity claims are proven. Authorization is the determination of whether an identity is allowed to perform an action or access a resource. Auditing is the accounting process for recording security events that have taken place. Together, authentication, authorization, and auditing are also commonly known as the gold standards of security. (The name comes from the chemical symbol for gold, 'Au', which is also the prefix of all three terms.)
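
The sequence of authentication, authorization and auditing can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a production design: the in-memory stores, the fixed salt, the user name and the action names are all hypothetical.

```python
import hashlib
import hmac
from typing import Dict, List, Set, Tuple

_SALT = b"demo-salt"  # hypothetical; real systems use a random per-user salt


def _hash(password: str) -> bytes:
    """Derive a password hash so the plain password is never stored."""
    return hashlib.pbkdf2_hmac("sha256", password.encode(), _SALT, 100_000)


# Hypothetical in-memory stores standing in for a directory service.
CREDENTIALS: Dict[str, bytes] = {"alice": _hash("correct horse")}
PERMISSIONS: Dict[str, Set[str]] = {"alice": {"read:report"}}
AUDIT_LOG: List[Tuple[str, str, str]] = []


def authenticate(user: str, password: str) -> bool:
    """Authentication: prove the identity claim."""
    stored = CREDENTIALS.get(user)
    return stored is not None and hmac.compare_digest(stored, _hash(password))


def authorize(user: str, action: str) -> bool:
    """Authorization: decide whether the identity may perform the action."""
    return action in PERMISSIONS.get(user, set())


def request_access(user: str, password: str, action: str) -> bool:
    """Run the three steps in order and record every outcome (auditing)."""
    if not authenticate(user, password):
        AUDIT_LOG.append((user, action, "authentication failed"))
        return False
    allowed = authorize(user, action)
    AUDIT_LOG.append((user, action, "granted" if allowed else "denied"))
    return allowed
```

Note that the audit record is written for failed as well as successful requests, since both are security events.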

There are several technical issues that solutions architects may encounter when designing and integrating authentication, authorization, and auditing mechanisms into the application architecture:

  • Single Sign-On
  • Trust and Federation
  • User Entitlements
  • Auditing

We will describe these challenges and their solutions in more detail later on in this document.


Figure 3. Logical components of I&AM

Identity Life Cycle Management

The life cycle of a digital identity can be framed in similar stages to the life cycles of living things:

  • Creation
  • Utilization
  • Termination

Every stage in an identity's life cycle has scenarios that are candidates for automated management. For example, during the creation of a digital identity, the identity data needs to be propagated and initialized in identity systems. In other scenarios, an identity's entitlements might need to be expanded when the user represented by the identity receives a job promotion.

Finally when the digital identity is no longer put to active use, its status might need to be changed or the identity might need to be deleted from the data store.

All events during the life cycle of a digital identity need to be securely, efficiently, and accurately managed, which is exactly what identity life cycle management is about.


Figure 4. Levels of identity life cycle management requirements

The requirements for identity life cycle management can be discussed at several levels, as represented in Figure 4. The types of data that need to be managed are shown at the identity data level. Based on our previous definitions of digital identity, the relevant data includes credentials, such as passwords and certificates, and user attributes, such as names, addresses and phone numbers. In addition to credentials and attributes, there is also user entitlement data to manage. Entitlements are described in more detail later on, but for now, they should be considered as the rights and privileges associated with identities.

Moving up a level in the illustration, the requirements listed reflect the kinds of operations that can be performed on identity data. Create, Read, Update, and Delete (CRUD) are data operation primitives coined by the database community. We reuse these primitives here as they provide a very convenient way for classifying the kinds of identity management operations. For example, we can classify changes to account status, entitlements, and credentials under the Update data primitive.

The next level in the illustration shows two identity life cycle administration models: self-service and delegated. In traditional IT organizations, computer administration tasks are performed by a centralized group of systems administrators. Over time, organizations have realized that there may be good economic and business reasons to enable other kinds of administration models as well. For example, it is often more cost effective and efficient for individuals to be able to update some of their personal attributes, such as address and phone number, by themselves. The self-service administration model enables such individual empowerment. The middle ground between the self-service and centralized administration models is delegated administration. In the delegated model, the responsibilities of identity life cycle administration are shared between decentralized groups of administrators. The common criteria used to determine the scope of delegation are organization structure and administration roles. An example of delegated administration based on organization structure is the hierarchy of enterprise, unit and department level administrators in a large organization.

The above life cycle administration models can be used to support a variety of business scenarios, some of which are listed in Figure 4. For instance, new employees often require accounts to be created and provisioned for them. Conversely, when an employee is no longer employed, the existing account status might need to be changed. Job change scenarios can also have several impacts on the digital identities. For example, when Bob receives a promotion, his title might need to be changed and his entitlements might need to be extended. Now that we have a better understanding of identity life cycle management requirements, we are ready to drill into the challenges involved in meeting those requirements.


Figure 5. Current state of affairs: multiple identity systems in the enterprise

The illustration in Figure 5 speaks to the fact that a typical enterprise user has to deal with multiple digital identities, which might be stored and managed independently of one another. This current state of affairs is due to the ongoing evolution that every business organization goes through. Events such as mergers and acquisitions can introduce incompatible systems into the existing IT infrastructure, and evolving business requirements might have been met through third-party applications that are not well integrated with existing ones.

One may presume that it would have been much easier for enterprises to deprecate the current systems and start over. However, this solution is seldom a viable option. We have to recognize that existing identity systems might be around for a long time, which leads us to find other solutions for resolving two key issues arising from managing data across disparate identity systems:

  1. Duplication of information. Identity information is often duplicated in multiple systems. For example, attributes such as addresses and phone numbers are often stored and managed in more than one system in an environment. When identity data is duplicated, it can easily get out of sync if updates are performed in one system but not the others.
  2. Lack of integration. The complete view of a given user's attributes, credentials and privileges is often distributed across multiple identity systems. For example, for a given employee, the human resource related information might be contained in an SAP HR system, the network access account in an Active Directory, and the legacy application privileges in a mainframe. Many identity life cycle management scenarios require identity information to be pulled from and pushed into several different systems.

We refer to the above issues as the identity aggregation challenge, which we will describe in more detail in a later section.

Challenges in Identity and Access Management

Single Sign-On

A typical enterprise user has to log in multiple times in order to gain access to the various business applications used on the job. From the user's point of view, multiple logins and the need to remember multiple passwords are some of the leading causes of bad application experiences. From the management point of view, forgotten password incidents increase management costs and, when combined with bad user password management habits (such as writing passwords down on yellow sticky notes), can often lead to increased opportunities for security breaches. Because of the seemingly intractable problems that multiple identities present, the concept of single sign-on (SSO), the ability to log in once and gain access to multiple systems, has become the 'Holy Grail' of identity management projects.

Single Sign-On Solutions

Broadly speaking, there are five classes of SSO solutions. No one type of solution is right for every application scenario. The best solution is very much dependent on factors such as where the applications requiring SSO are hosted, limitations placed by the infrastructure (e.g. firewall restrictions), and the ability to modify the applications. These are the five categories of SSO solutions:

  1. Web SSO
  2. Operating System Integrated Sign-On
  3. Federated Sign-On
  4. Identity and Credential Mapping
  5. Password Synchronization

Web SSO solutions are designed to address web application sign-on requirements. In these solutions, unauthenticated browser users are redirected to a login website to enter their user identifiers and credentials. Upon successful authentication, HTTP cookies are issued and used by web applications to validate authenticated user sessions. Microsoft Passport is an example of a Web SSO solution.
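
The cookie step can be sketched as follows. This is a simplified illustration that assumes the login site and the web applications share a secret key and that the cookie is protected with an HMAC signature; the text above does not prescribe this particular mechanism, and the function names and cookie layout are hypothetical. A real deployment would also add expiry times and transport protection.

```python
import hashlib
import hmac
from typing import Optional

# Hypothetical secret shared by the login site and the web applications.
SECRET = b"shared-secret-known-to-login-site-and-apps"


def _sign(user_id: str) -> str:
    return hmac.new(SECRET, user_id.encode(), hashlib.sha256).hexdigest()


def issue_cookie(user_id: str) -> str:
    """Called by the login site after a successful authentication."""
    return f"{user_id}|{_sign(user_id)}"


def validate_cookie(cookie: str) -> Optional[str]:
    """Called by each web application; returns the user id or None."""
    user_id, separator, signature = cookie.rpartition("|")
    if not separator:
        return None  # malformed cookie
    expected = _sign(user_id)
    if hmac.compare_digest(signature.encode(), expected.encode()):
        return user_id
    return None
```

An application that receives a request without a valid cookie would redirect the browser to the login site; one that receives a valid cookie can accept the user without prompting again.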

Operating system integrated sign-on refers to authentication modules and interfaces built into the operating system. The Windows security subsystem provides such capability through system modules such as the Local Security Authority (LSA) and Security Support Providers (SSPs). SSPI, the Security Support Provider Interface, refers to the programming interfaces into these SSPs. Desktop applications that use the SSPI APIs for user authentication can then 'piggyback' on the Windows desktop login to help achieve application SSO. GSSAPI on various UNIX implementations provides the same application SSO functionality.

Federated sign-on requires the application authentication infrastructures to understand trust relationships and interoperate through standard protocols. Kerberos and the future Active Directory Federation Service are examples of federation technologies. Federated sign-on means that the authentication responsibility is delegated to a trusted party. Application users need not be prompted to sign-on again as long as the user has been authenticated by a federated (i.e. trusted) authentication infrastructure component.

Identity and credential mapping solutions typically use credential caches to keep track of the identities and credentials to use for accessing a corresponding list of application sites. The cache may be updated manually or automatically when a credential (for example, a password) changes. Existing applications may or may not need to be modified to use identity-mapping solutions. When the application cannot be modified, a software agent may be installed to monitor application login events. When the agent detects such an event, it finds the user credential in the cache and automatically enters the credential into the application login prompt.

The password synchronization technique is used to synchronize passwords across the application credential databases so that users and applications do not have to manage multiple password changes. Password synchronization as a siloed technology does not really provide single sign-on, but it results in some conveniences that applications can take advantage of. For example, with password synchronization, a middle-tier application can assume that the password for an application user is the same at the various systems it needs access to, so that the application does not have to look up different passwords when accessing resources at those systems.

Entitlement Management

Entitlement management refers to the set of technologies used to grant and revoke access rights and privileges to identities. It is closely associated with authorization, which is the actual process of enforcing the access rules, policies and restrictions that are associated with business functions and data.

Today's enterprise applications frequently use a combination of role-based authorization and business rules-based policies to determine what a given identity can or cannot do.

Within a distributed n-tiered application, access decisions can be made at any layer in the application's architecture. For example, the presentation tier might only present UI choices that the user is authorized to make. At the service layer of the architecture, the service might check that the user meets the authorization condition for invoking the service. For example, only users in the manager role can invoke the 'Loan Approval' service. Behind the scenes at the business logic tier, there might need to be fine grain business policy decisions such as 'Is this request made during business hours'; at the data layer, the database stored procedure might filter returned data based on the relationship between the service invoker's identity and the requested data.

Given the usefulness, and often intersecting use, of both role-based and rule-based authorization schemes, it is not always clear to application architects how to model an entitlement management framework that integrates both schemes cleanly. Many enterprises have separate custom engines for each.


Figure 6. Integrating role and rule-based authorization

Figure 6 illustrates a representation of how both schemes might be combined and integrated.

In this representation, we can envision a role definition (which typically reflects a job responsibility) with two sets of properties. One set of properties contains the identities of people or systems that are in the given role. For example, Alice, Bob and Charlie may be assigned to the Manager role. The second set of properties contains the set of rights that a given role has. Rights can represent business functions or actions on computing resources. For example, transfer fund defines a business function and read file refers to an operation on a computing resource. Furthermore, we can assign a set of conditional statements (business rules) to each right. For example, the transfer fund right may have a conditional statement that allows the action only if the current time is within business hours. Note that the conditional statement might base its decision on dynamic input data that can only be determined at application runtime.
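
The role definition described above might be modeled as in the following Python sketch. It is an illustration under stated assumptions only: the class and function names are hypothetical, and the business hours of 09:00 to 17:00 are an arbitrary choice made for the example.

```python
from dataclasses import dataclass, field
from datetime import time
from typing import Callable, Dict, List, Set

# A business rule takes runtime context and returns True if the action is allowed.
Rule = Callable[[dict], bool]


@dataclass
class Role:
    name: str
    members: Set[str] = field(default_factory=set)           # who is in the role
    rights: Dict[str, List[Rule]] = field(default_factory=dict)  # right -> rules


def within_business_hours(context: dict) -> bool:
    """Rule evaluated at runtime; 09:00-17:00 is an assumed definition."""
    return time(9, 0) <= context["now"] <= time(17, 0)


def is_allowed(roles: List[Role], identity: str, right: str, context: dict) -> bool:
    """Role check first (membership and right), then every rule on that right."""
    for role in roles:
        if identity in role.members and right in role.rights:
            if all(rule(context) for rule in role.rights[right]):
                return True
    return False


manager = Role(
    name="Manager",
    members={"Alice", "Bob", "Charlie"},
    rights={"transfer_fund": [within_business_hours], "read_file": []},
)
```

With this model, Alice may transfer funds at 10:30 but not at 22:00, while the read file right, which carries no rules, depends on role membership alone. Keeping the rules attached to rights inside the role definition is what allows a single store to report every right an identity holds.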

It is also important for most organizations to have a consolidated view of all the rights that a given identity possesses. To meet this requirement, entitlement management applications typically leverage a centralized policy store to help facilitate centralized management and reporting of users' rights.

Identity Aggregation

Enterprise IT systems evolve organically over the course of an organization's history. This is often due to reasons such as mergers and acquisitions, or changes in IT leadership and its preferences. The consequences often show up as hodge-podges of disconnected IT systems with undesirable architectural artifacts. Identity-related systems are no exception to such IT evolution.

Frequently, the enterprise will have not just one identity system, but several, each serving different business functions but storing duplicated and related data. Applications that need to integrate with those business functions are then forced to reconcile the differences and synchronize the duplications.

For example, a banking customer service application might need to obtain customer information from an IBM DB2 database, an Oracle-based authorization database and a homegrown CRM. In this case, the application's concept of 'Customer' is defined by three different systems. Attributes that describe the customer, such as customer name, address, and social security number, might be stored and duplicated in multiple systems. On the other hand, non-duplicated data, such as the financial products that the customer has purchased and the customer's bank balance, might be kept in separate systems. The application will need to aggregate this data from different systems to get the necessary view of the customer.

Moving the underlying identity data into one giant identity system might seem like an obvious answer to this problem. However, there are many real-world issues (for example, the risk of breaking legacy applications) that prevent such a solution from being broadly adopted any time soon. Identity aggregation therefore refers to the set of technologies that help applications aggregate identity information from different identity systems, while reducing the complexity of data reconciliation, synchronization and integration.

There are several technical challenges that identity aggregation technologies should help address:

  • Maintaining relationships for data transformations.
  • Optimizing data CRUD operations.
  • Synchronizing data.

The next few sub-sections provide an overview of these design issues.

(a) Maintaining Relationships for Data Transformations

An identity aggregation solution can provide several benefits to applications with disparate views of identity information that manipulate data in different systems. The first benefit involves providing applications with a consolidated view or an aggregated view of the data in the individual systems. In order to transform and represent data in different views, the identity aggregation solution needs to maintain meta-data describing the schema representing the consolidated view and its relationship with the data schema in the various identity systems.

Let's look at a specific example of an identity life cycle management application that is managing data across a few existing systems. The consolidated view of the management application is represented by a new schema which is made up of new and existing identity attributes.


Figure 7. Reconciling identity schemas

Figure 7 illustrates an example where a new identity schema is defined for the application. The new schema refers to attributes in two existing identity schemas and also defines new ones that are not currently specified (Account Number, Cell phone and Preferences).

In order for applications to query and update data in existing stores, we will need to maintain data relationships between the attributes defined in the new schema and the corresponding attributes in the existing schemas. For example, the following relationships will need to be maintained:

  • Reference   A reference is a piece of information that unambiguously identifies an instance of data as represented by a particular data schema. For example, the 'Customer ID' attribute allows a data instance as represented by the application schema in Figure 7 to find its corresponding data instance as represented by existing schema 1. Different schemas may use different references to identify their own data instances.
  • Ownership   A data attribute may be defined in more than one existing schema. Using the same example shown in Figure 7 again, we can see that the 'Name' and 'Address' attributes are defined in both existing schemas. In the event of a data conflict, the application needs to know which version holds the authoritative copy to keep. In addition, there may be scenarios where an attribute can get its value from a prioritized list of owners. In those scenarios, when the value for an attribute is not present in the first authoritative source, the aggregation service should query the next system in the prioritized list of owners.
  • Attribute Mapping   Attributes defined in multiple data stores may have the same semantic meaning but different syntactic representations. When querying or updating a data attribute, the identity aggregation service needs to know which attributes are semantically equivalent to one another. For example, although customer id is defined in all three schemas, it is named CustID in identity schema 2. When the application performs an update of an identity's customer id number, the system must also know to update the CustID attribute for the data instance represented by schema 2.
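
To make the three relationships concrete, the following Python sketch shows meta-data an aggregation service might keep and how it could use it. Everything here is hypothetical: the store contents, the schema names and the function names are invented for illustration, and the sketch simplifies matters by assuming that both existing stores use the customer id as their reference.

```python
from typing import Dict, List, Optional

# Hypothetical backend stores, each keyed by its own reference (the customer id).
STORE_1 = {"C-100": {"Customer ID": "C-100", "Name": "Alice Smith", "Address": None}}
STORE_2 = {"C-100": {"CustID": "C-100", "Name": "A. Smith", "Address": "1 Main St"}}
STORES: Dict[str, dict] = {"schema1": STORE_1, "schema2": STORE_2}

# Attribute mapping: application attribute -> attribute name in each schema.
ATTRIBUTE_MAP: Dict[str, Dict[str, str]] = {
    "Customer ID": {"schema1": "Customer ID", "schema2": "CustID"},
    "Name": {"schema1": "Name", "schema2": "Name"},
    "Address": {"schema1": "Address", "schema2": "Address"},
}

# Ownership: prioritized list of authoritative sources for each attribute.
OWNERS: Dict[str, List[str]] = {
    "Name": ["schema1", "schema2"],
    "Address": ["schema1", "schema2"],
}


def read_attribute(customer_id: str, attribute: str) -> Optional[str]:
    """Return the value from the first owner in priority order that has one."""
    for schema in OWNERS[attribute]:
        record = STORES[schema].get(customer_id)
        if record is None:
            continue
        value = record.get(ATTRIBUTE_MAP[attribute][schema])
        if value is not None:
            return value
    return None


def update_customer_id(old_id: str, new_id: str) -> None:
    """Update the semantically equivalent attribute in every store."""
    for schema, store in STORES.items():
        record = store.pop(old_id, None)
        if record is not None:
            record[ATTRIBUTE_MAP["Customer ID"][schema]] = new_id
            store[new_id] = record
```

Here a conflicting Name resolves to the copy in schema 1, the first owner, while Address falls through to schema 2 because the first owner holds no value. Updating the customer id also rewrites CustID in schema 2.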

(b) Optimizing Data CRUD Operations

As previously identified, CRUD is a database acronym that stands for Create, Read, Update and Delete. CRUD defines the basic primitive operations for manipulating data. CRUD is raised as a technical challenge because the performance of an aggregation-related activity that involves CRUD operations across multiple backend data systems can vary significantly depending on the data relationships.

In the best-case scenario, CRUD operations can be parallelized. This is mostly true in situations where the data instances can be resolved using the same reference and the data reference always resolves to a unique instance. For example, if the social security number is the only key used for querying and aggregating data across data stores, that particular query can be issued in parallel.

In the worst-case scenario, the CRUD operations are serialized across data stores. Serialized operation is common in situations where the references used for resolving data instances have dependencies on other data instances. As a simple illustration, let's suppose we need to aggregate data from the HR and Benefits databases and the instance references are employee ID and social security ID respectively. If the only initial key we have for the query is the employee ID, and the social security ID can only be obtained from the HR database, then we will need to serialize the query in the following order:

  1. Query the HR database using the employee ID as the key.
  2. Extract the social security number from the above query results.
  3. Query the benefits database using the social security number as the key.
  4. Aggregate the query results.
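
The four steps above can be sketched as follows. The in-memory dictionaries and field names are assumptions for illustration; the point is that step 3 cannot start until step 2 has produced its key.

```python
# Hypothetical stores: HR is keyed by employee ID, Benefits by social
# security number.
HR_DB = {"E100": {"name": "Alice", "ssn": "123-45-6789"}}
BENEFITS_DB = {"123-45-6789": {"plan": "Gold"}}


def aggregate_serial(employee_id):
    hr_record = HR_DB[employee_id]           # 1. query HR by employee ID
    ssn = hr_record["ssn"]                   # 2. extract the social security number
    benefits_record = BENEFITS_DB[ssn]       # 3. query Benefits by that number
    return {**hr_record, **benefits_record}  # 4. aggregate the results


record = aggregate_serial("E100")
```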

Replication is a common technique used to address performance degradation due to CRUD operations across data stores. To address the performance issue, a portion of the backend data attributes may be replicated to a store maintained by the identity aggregation service. In addition, the local copy of the replicated data can be further de-normalized to help improve the CRUD performance.
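
One way to picture a de-normalized replica is a single local table, keyed by employee ID, that already contains the attributes that would otherwise require the serialized lookup. The sketch below is a minimal illustration under assumed store contents and field names; it does not address how the replica is kept current.

```python
# Hypothetical backend stores, as in the HR and Benefits example.
HR_DB = {"E100": {"name": "Alice", "ssn": "123-45-6789"}}
BENEFITS_DB = {"123-45-6789": {"plan": "Gold"}}


def build_replica(hr_db, benefits_db):
    """Copy a subset of attributes into one de-normalized local store."""
    replica = {}
    for employee_id, hr in hr_db.items():
        benefits = benefits_db.get(hr["ssn"], {})
        replica[employee_id] = {
            "name": hr["name"],
            "plan": benefits.get("plan"),
        }
    return replica


# Reads by employee ID now need a single local lookup.
REPLICA = build_replica(HR_DB, BENEFITS_DB)
```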

(c) Synchronizing Data

Data synchronization is needed in situations when one or both of the following conditions are true:

  • Duplicate identity attributes exist in multiple backend stores.
  • Data is replicated to an intermediate identity aggregator store.

However, the use of data synchronization may also introduce other design issues:

  • Data conflict resolution. In situations where the data can be updated from more than one source, it is easy to introduce conflicting data. Some common practices that help mitigate the situation are as follows:
    • Assign data ownership priority so that, in the event of a conflict, the value from the designated authority is used.
    • In a conflict, the last writer wins.
  • Synchronization triggers. Two common approaches are scheduled updates and event-based notification.
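
The two conflict resolution practices can be compared side by side in a short sketch. The Update record, the source names, and the priority table are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Update:
    source: str       # system that wrote the value
    value: str
    timestamp: float  # time of the write, in seconds


# Hypothetical ownership ranking: a lower number means higher authority.
SOURCE_PRIORITY = {"hr": 0, "crm": 1, "self_service": 2}


def resolve_by_priority(updates):
    """The update from the most authoritative source wins."""
    return min(updates, key=lambda u: SOURCE_PRIORITY[u.source])


def resolve_last_writer_wins(updates):
    """The most recent update wins, regardless of source."""
    return max(updates, key=lambda u: u.timestamp)


conflict = [
    Update("crm", "alice@new.example", 200.0),
    Update("hr", "alice@old.example", 100.0),
]
```

For the same conflict, the priority rule keeps the HR value while last-writer-wins keeps the newer CRM value, so the choice of rule is a policy decision rather than a technical detail.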

Trust and Federation

As mentioned in the single sign-on section, federation offers a form of single sign-on solution. However, federation is more than just single sign-on. Federation implies delegation of responsibilities honored through trust relationships between federated parties. Authentication is just one form of delegated responsibility. Authorization, profile management, pseudonym services, and billing are other forms of identity-related functions that may be delegated to trusted parties.

There are three technology elements that are crucial to the concept of federation:

  • A federation protocol that enables parties to communicate.
  • A flexible trust infrastructure that supports a variety of trust models.
  • An extensible policy management framework that supports differing governance requirements.

Federation protocols are the 'languages' that are used by federating parties to communicate with each other. Since federation implies that a responsibility is delegated to and performed by a different party, the protocol must allow individuals to obtain 'capabilities'—essentially tamper-proof claims that a given identity has successfully completed an action or is entitled to a collection of privileges. For example, in the case of federated sign-on, an authenticated identity obtains a capability that proves that the individual has successfully authenticated with an approved authentication service.
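
To make the notion of a tamper-proof claim concrete, here is a minimal sketch that protects a set of claims with a keyed hash (HMAC) from the Python standard library. It is a simplification under stated assumptions: the shared secret, the token layout, and the function names are invented for this example, and real federation protocols generally rely on public-key signatures so that the relying party does not need the issuer's secret.

```python
import base64
import hashlib
import hmac
import json

# Hypothetical key shared by the issuer and the relying party.
SECRET = b"demo-shared-secret"


def issue_capability(claims):
    """Serialize the claims and append an HMAC tag over them."""
    payload = base64.urlsafe_b64encode(
        json.dumps(claims, sort_keys=True).encode()
    ).decode()
    tag = hmac.new(SECRET, payload.encode(), hashlib.sha256).hexdigest()
    return payload + "." + tag


def verify_capability(token):
    """Return the claims if the tag matches, otherwise None."""
    payload, _, tag = token.rpartition(".")
    expected = hmac.new(SECRET, payload.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(tag, expected):
        return None
    return json.loads(base64.urlsafe_b64decode(payload))


token = issue_capability({"sub": "alice", "authenticated": True})
```

Any change to the payload after issuance invalidates the tag, so the relying party can detect tampering before it acts on the claims.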

In a complex business world, it is possible to have relatively complex trust schemes involving multiple business parties. Federation technology must be able to capture the essence of those real world trust relationships into simple to understand but powerful trust models that will help enable various business scenarios. Some common trust models, illustrated in Figure 8, are as follows:

  • Hub-and-spoke
  • Hierarchical
  • Peer-to-peer Web of Trust


Figure 8. Common trust models

The hub-and-spoke model is the simplest to understand. In this model, there is a central broker that is directly trusted by the federating parties. The European Union is an example of this federation model where the EU countries directly trust the EU body to provide common economic guidelines and trade opportunities to federated parties.

In the hierarchical model, two parties have an indirect trust relationship if they both have a trust path in their respective branches in the hierarchical tree to a common root authority. The American political system demonstrates this trust model. This political system has federal, state, county and local city political bodies, each existing at various levels of the hierarchy.
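
The trust path idea in the hierarchical model can be sketched as a walk up a tree until a shared authority is found. The party names and the parent table below are hypothetical and loosely follow the federal, state, county and city example.

```python
# Hypothetical hierarchy: each party points to the authority it trusts directly.
PARENT = {
    "city_a": "county_1",
    "city_b": "county_2",
    "county_1": "state_x",
    "county_2": "state_x",
    "state_x": "federal",
    "state_y": "federal",
    "federal": None,
}


def trust_path(party):
    """Return the chain of authorities from a party up to the root."""
    path = []
    while party is not None:
        path.append(party)
        party = PARENT[party]
    return path


def common_authority(a, b):
    """Return the nearest authority on both trust paths, or None."""
    authorities_of_b = set(trust_path(b))
    for node in trust_path(a):
        if node in authorities_of_b:
            return node
    return None
```

Two parties have an indirect trust relationship when common_authority returns a value; in this table, city_a and city_b meet at state_x, while city_a and state_y meet only at the root.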

The peer-to-peer model represents a collection of ad-hoc direct trust relationships. Personal relationships between friends in the physical world are good examples of this trust model. Note that it is also possible to extend and form new networks of federations that consist of different trust models, or hybrid models.

A basic policy management framework must allow policies to be created, deleted, modified and discovered. In order to promote federated systems that enable new business models and partnerships to be quickly integrated, the policy management framework must also be extensible to reflect the dynamicity of the environment it aims to support.

Some examples of application policies that are relevant in the federated world are:

  • Trusted issuer of identity-related capabilities.
  • The types of capabilities required to invoke an application's operation.
  • The kinds of identity information that the application expects in capabilities.
  • The kinds of privileges that an identity must demonstrate in order to invoke a service.
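
As an illustration of how such policies might be evaluated, the sketch below checks a presented capability against one application policy. The policy fields, the capability layout, and the example issuer name are all assumptions; an actual policy framework would define its own schema and discovery mechanism.

```python
# Hypothetical policy for one application operation.
POLICY = {
    "trusted_issuers": {"sts.partner.example"},
    "required_capability_type": "authentication",
    "required_claims": {"email"},
    "required_privileges": {"orders:submit"},
}


def is_allowed(capability, policy=POLICY):
    """Check issuer, capability type, identity claims and privileges."""
    return (
        capability.get("issuer") in policy["trusted_issuers"]
        and capability.get("type") == policy["required_capability_type"]
        and policy["required_claims"] <= set(capability.get("claims", {}))
        and policy["required_privileges"] <= set(capability.get("privileges", []))
    )


good = {
    "issuer": "sts.partner.example",
    "type": "authentication",
    "claims": {"email": "alice@example.com"},
    "privileges": ["orders:submit", "orders:read"],
}
untrusted = dict(good, issuer="sts.unknown.example")
```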


Auditing

Auditing, in the context of I&AM, is about keeping records of 'who did what, when' within the IT infrastructure. Federal regulations such as the Sarbanes-Oxley Act are key drivers of the identity-related auditing requirements.

(a) IT Audit Process

The IT audit process typically involves the following phases as illustrated in Figure 9:

  • Audit generation
  • Data collection and storage
  • Analysis and feedback


Figure 9. IT audit process

Audit trails can be generated by different infrastructure and application components for different purposes. For example, firewalls and VPN servers can generate events to help detect external intrusions; middleware components can generate instrumentation data to help detect performance anomalies; and business applications can produce audit data to aid debugging or comply with regulatory audit requirements.

After the audit data has been produced, it needs to be collected and stored. There are two main models to consider here: distributed and centralized. In the distributed model, audit data typically remains in the system where the data is generated. With the centralized approach, data is sent to a central collection and data storage facility.

Once the data is collected, it may be processed and analyzed automatically or manually. The audit analysis is designed to lead to conclusions on what corrective actions, if any, are needed to improve the IT systems and processes.

(b) Audit Systems Design Considerations

Given the typical auditing process described in the previous sections, there are several considerations that are important to the design of auditing systems:

  • Locality of audit generation and storage.
  • Separation of auditor's role.
  • Flow of audited events.

Once an audit trail is generated, the audit data can be stored locally on the same system or transferred to another storage location. This consideration is important from a security perspective, as audit data can be easier to change and modify if it is stored on the same system that generates it. This point can be illustrated by a simple example: in the event that a system has been compromised, the attacker might be able to modify the audit trail on the local system to cover up the hacking attempt. Therefore, for higher assurance against such attacks, you might want the audit system to store the data separately on a remote machine.

Role separation is a common best practice to help minimize the occurrence of illegal activities resulting from actions that might circumvent accountability checks. For example, a corporate acquisition officer should not be able to approve purchasing requests. In the field of IT audit, it is common practice to separate the system administrator role from the auditor's role. Doing so prevents the system administrator, who usually has 'god status' on computers, from covering up audit trails of unauthorized activities.

In a distributed design model, where audit data is generated and stored in different systems, only allowing data to flow out of the audit generation systems can add another level of safeguard. This preventive measure reduces the chances of tampered audit data replacing actual trails.

In addition, identity auditing infrastructures are also expected to be:

  • Efficient (typically means a message-based asynchronous communication channel).
  • Available (typically means distributed, clustered and fault tolerant).
  • Accurate (keep accurate records and traces).
  • Non-repudiable (admissible in a court of law as evidence, which typically means the records need to be digitally signed).
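
A small sketch can show how audit records of 'who did what, when' might be made tamper-evident. The example below links each record to the previous one with a SHA-256 hash, so a later modification breaks the chain. This is an assumption-laden illustration: the record layout and function names are invented, and a hash chain alone is not a digital signature, so it does not by itself provide non-repudiation.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder hash that precedes the first record


def _digest(body):
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()


def append_event(log, who, what, when):
    """Append a record that includes the hash of the previous record."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    body = {"who": who, "what": what, "when": when, "prev": prev_hash}
    log.append({**body, "hash": _digest(body)})


def verify_chain(log):
    """Recompute every hash and confirm that each link is intact."""
    prev_hash = GENESIS
    for entry in log:
        body = {key: entry[key] for key in ("who", "what", "when", "prev")}
        if entry["prev"] != prev_hash or entry["hash"] != _digest(body):
            return False
        prev_hash = entry["hash"]
    return True


audit_log = []
append_event(audit_log, "alice", "approved purchase 42", "2004-07-01T09:00:00Z")
append_event(audit_log, "bob", "changed role of carol", "2004-07-01T09:05:00Z")
```

Shipping such records to a remote store, as discussed above, makes it harder for an attacker on the generating system to rewrite the whole chain consistently.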


Conclusion

Organizations are made up of people and systems, represented within the IT systems as digital identities. Everything that occurs in businesses is the consequence of actions initiated by and for those identities. Without identities (even anonymous identities) there would be no activities and businesses would be lifeless constructs.

At the highest level, SOA can be seen as a way to organize the business IT infrastructure that executes and manages business activities. Therefore, enterprises seeking to realize SOA must figure out how to resolve the identity and access management challenges they face today. We have provided an overview of a few key technology areas:

  • Achieving user and application single sign-on.
  • Aggregating, transforming, synchronizing, and provisioning identity data from various identity systems.
  • Managing access to business functions and data using roles and rules.
  • Federating with business partners.
  • Auditing identity-related activities.

We also hope that the technical discussions on challenges have helped the readers gain some appreciation of the products and solutions in this space.

Our final conclusion is: Identity and access management is a cornerstone to realizing SOA. Show me an enterprise that claims to be "SOAccessful" and I'll show you an enterprise that has a good handle on its identity and access management.


1. Context defines the boundary inside of which an identity is used. The boundary could be business or application related. For example, Alice may use a work identity with the identifier to identify herself at her Wall Street employer (Wall Street Ace) as well as to execute a stock trade in NYSE. Her Wall Street employer and the NYSE would be two different business contexts where the same identifier is used. SOA Challenges: Entity Aggregation, Ramkumar Kothandaraman, .NET Architecture Center, May 2004 (URL:

2. Enterprise Identity Management: It's About the Business, Jamie Lewis, The Burton Group Directory and Security Strategies Report, v1, July 2nd, 2003

3. The term "life cycle" when used in the context of digital identities is somewhat inappropriate since digital identities are typically not recycled. However, since the phrase "identity life cycle management" is well entrenched in the IT community, we will stick with the "life cycle" terminology. Microsoft Identity and Access Management Series, Microsoft Corporation, May 14th, 2004 (URL:

4. Enterprise Identity Management: Essential SOA Prerequisite Zapflash, Jason Bloomberg, Zapflash, June 19th, 2003 (URL:


About the author

Frederick Chong is a solutions architect in the Microsoft Architecture Strategy Team where he delivers guidance on integrating applications with identity and access management technologies. He discovered his interest in security while prototyping an electronic-cash protocol in the network security group at the IBM T J Watson Research Center. Since then, he has implemented security features and licensing enforcement protocol in Microsoft product teams and collaborated with various enterprise customers to architect and implement a variety of security solutions including web single sign-on, SSL-VPN and web services security protocol. He can be reached at

This article was published in the Architecture Journal, a print and online publication produced by Microsoft. For more articles from this publication, please visit the Architecture Journal website.

© Microsoft Corporation. All rights reserved.