SQL Server Technical Article

Writers: Mary Deyo (Unisys Corporation)
Technical Reviewer: Dave Wickert
Project Editor: Jeannine Nelson-Takaki
Designer: Nate Gunderson
Applies To: SQL Server 2005

Summary: This paper describes the system and network instrumentation and monitoring used for Project REAL, focusing on the tools used, their installation and configuration, and the lessons learned.

On This Page

About Project REAL
Introduction
Tools for Monitoring and Instrumentation
Using the Tools
Conclusion
Appendix A. Hosts Files
Appendix B. Unisys ES7000 Management
Appendix C:  Storage Architecture
Appendix D. Application Consolidation with 32-bit Processors
Appendix E:  Resources

About Project REAL

Project REAL is an attempt to discover best practices for creating business intelligence applications that are based on Microsoft® SQL Server™ 2005. To work through the same issues that customers face during deployment, we have created reference implementations that are based on actual customer scenarios and actual customer data.

The following are some of the issues that Project REAL addresses:

  • Design of schemas, both relational and for Analysis Services.

  • Implementation of a data extraction, transformation, and loading (ETL) process.

  • Design and deployment of client front-end systems, both for reporting and for interactive analysis.

  • Sizing of systems for production.

  • Management and maintenance of the systems on an ongoing basis, including incremental updates to the data.

By working with real deployment scenarios, we gain a fuller understanding of how to work with SQL Server 2005 business intelligence (BI) tools. Our goal is to address the gamut of concerns that a large company would face during its own deployment.

Project REAL is a cooperative effort between Microsoft and a set of partner companies known for their expertise in their respective fields. The following partners committed resources to the project and agreed to perform technical work that focused on developing general best practices, not on distributing marketing information.

  • Apollo Data Technologies

  • Barnes & Noble

  • EMC

  • Emulex

  • Intellinet

  • Panorama

  • ProClarity

  • Scalability Experts

  • Unisys

This white paper presents a summary of the monitoring and instrumentation infrastructure used by Project REAL, with a particular focus on the monitoring tools that were used. This paper describes some deployment issues experienced during this project, and provides some examples of how these tools have been used to prevent or solve problems.

For an overview of Project REAL, see the white paper at Project REAL: Technical Overview (https://www.microsoft.com/technet/prodtechnol/sql/2005/projreal.mspx). Project REAL is an ongoing effort and additional papers, tools, and samples may be produced over the lifetime of the project. To find the latest information, see the Project REAL Web site at https://www.microsoft.com/sql/bi/ProjectReal/

Introduction

Project REAL is a reference business intelligence (BI) implementation that uses SQL Server 2005 with actual large-scale data from a customer. Project activities include the following:

  • Data extraction, transformation, and loading (ETL), using SQL Server 2005 Integration Services, to initialize the SQL Server 2005 data warehouse.

  • OLAP analysis and data mining, using SQL Server 2005 Analysis Services.

  • Presentation, using SQL Server 2005 Reporting Services.

  • Ongoing system and data management.

  • Performance measurement and analysis.

Figure 1 shows a high-level overview of the Project REAL network midway through the project; a later section describes server changes that were made during the project.

Figure 1. Project REAL network overview

Figure 1 is a composite: the Distributed Architecture Servers and the Unisys ES7000 Consolidated Architecture Servers are not separated logically or physically, but in practice they are used at different times. This separation in use was deliberate, so that we could explore the differences between distributed and consolidated architectures, between 32-bit and 64-bit server performance, and eventually between two 64-bit CPU architectures (IA-64 and x64) when running BI applications. That is, three basic server configurations were used to run the same tests at different times: distributed, consolidated on a single 32-bit server, and consolidated on a single 64-bit server.

The servers that are used for each of these three configurations are described in Table 1. The BI-REAL-ES32 server was replaced during the project with the BI-REAL-EX64 server as the team became more interested in the differences between the two 64-bit CPU architectures.

Table 1. Project REAL Servers

| Role | Server name | Model | CPU | Cache | Memory |
|---|---|---|---|---|---|
| Distributed relational DW | BI-REAL-DW | Unisys ES3040L | 4x 2.2 GHz (hyperthreaded) | 2 MB | 8 GB |
| Distributed Integration Services | BI-REAL-IS | Unisys ES3040L | 4x 2.2 GHz (hyperthreaded) | 2 MB | 4 GB |
| Distributed Analysis Services | BI-REAL-AS | Unisys ES3040L | 4x 2.2 GHz (hyperthreaded) | 2 MB | 4 GB |
| Distributed Reporting Services | BI-REAL-RS | Unisys ES3040L | 4x 2.2 GHz (hyperthreaded) | 2 MB | 4 GB |
| Consolidated 32-bit server | BI-REAL-ES32 | Unisys ES7000/540 | 16x 3.0 GHz Xeon MP | 4 MB | 32 GB |
| Consolidated 64-bit IA-64 server | BI-REAL-ES64 | Unisys ES7000/420 | 16x 1.3 GHz Itanium-2 (IA-64) | 3 MB | 32 GB |
| Consolidated 64-bit x64 server | BI-REAL-EX64 | Unisys ES7000/600 | 16x 3 GHz EM64T (dual core) | 2 MB | 128 GB |

Figure 1 also shows network communications between tiers going through the Cisco PIX 515E firewalls. This is one of three network scenarios used by Project REAL. Each server has additional network adapters that bypass the firewalls so that intranet scenarios can also be tested. The network adapters that are not required for a given scenario can be disabled or disconnected during testing. In addition to the private connections used for the test scenarios, all the Project REAL servers are connected to the Microsoft corporate network (abbreviated as “Corpnet” in this paper) and are members of an Active Directory domain. Table 2 describes these network scenarios in more detail.

Table 2:  Project REAL Network Scenarios

Network scenario

Description

Corpnet access

All servers have Corpnet links to allow for easy software installation and remote access by domain users through the Remote Desktop application. The Project REAL servers belong to the Microsoft SYS-SQLSVR domain, which has a one-way trust relationship with the REDMOND domain; therefore, the Project REAL users on the REDMOND domain can be authenticated by using Active Directory.

Originally the servers were in the REDMOND domain, but were moved to a separate domain so that Project REAL could control the timing of updates to the operating system and software. IP addresses are assigned by using DHCP. Corpnet links are primarily used during the development phases of the project and the private links are reserved for high data volumes.

Three-tier Internet simulation

Firewalls are used to separate the Client tier (Internet), Web Server tier, and the Data tier from each other. Each tier resides on a separate VLAN and IP addresses are static. Corpnet links are bypassed.

Client/server Intranet simulation

Firewalls are bypassed. Corpnet links are bypassed or disabled. All servers reside on the same VLAN and use static IP addresses.

Figure 2 shows a high-level logical representation of the different tiers and network connections. Physically, all local network links, including the tiered network links, go through one of two Extreme Summit switches. The Storage Area Network (SAN) is actually a complex network that has multiple switches and storage arrays.

Figure 2. Network architecture

Figure 3 shows a high-level view of the SAN. Like the distributed and consolidated servers, the two storage arrays are used at different times. This enabled us to explore differences between the distributed and both consolidated server architectures. To support the three different architectures, each array has a complete instance of all data, which can be moved between any of the servers. The storage arrays used in the SAN are listed in Table 3. Two 32-port, 2-Gbps McDATA Fibre Channel switches are also used.

Figure 3. SAN diagram

Table 3:  Project REAL SAN Components

| Component | Model | Physical disks | Capacity (raw) | Cache |
|---|---|---|---|---|
| EMC CLARiiON | CX700 | 244 | 20 TB | 8 GB |
| EMC Symmetrix | DMX1000 | 144 | 18 TB | 64 GB |

Monitoring System Health

Although the Project REAL data center is complex, it is not a production system. However, similar problems can appear in either environment, including unusual loads, software upgrades, configuration issues and human error. The main differences are that the Project REAL infrastructure has been deliberately stressed through scalability and performance test scenarios, and failure of Project REAL BI applications will have less impact on business operations than a similar failure in a production system. The combination of multiple architectures and complex test scenarios still makes the Project REAL data center a rapidly changing and complex test and development environment that needs essentially the same types of preventive maintenance processes as a production environment.

The Project REAL partners depend on the health of the underlying hardware and system software in the data center to meet their testing goals and timelines. Because the system is complex, to assess the health of the system, we must ask questions with potentially complicated answers:

  • What characterizes a healthy system?

  • How do we build a healthy system, and how can the health of the hardware and software be measured?

On a more practical level, we must ask these questions:

  • Can problems be caught and reported quickly enough to avoid or minimize unproductive time?

  • How can this be done with minimal resources?

In this section, we answer the first two questions. In the next section, we provide some details on our implementation that help answer the second set of questions.

It is tempting to think that the health of a system can be characterized just by determining whether it is successfully completing its function. However, a data center is a system with many parts. It is possible that individual parts are working properly but the overall system is not—such as when a network is incorrectly configured and components cannot communicate. It is also possible to have a non-functioning component but a functioning system—for example, when a redundant component fails. Neither system would be considered completely healthy. An analogy might be an apparently healthy individual who has high blood pressure but no other symptoms of ill health. The individual might be alive, which is the basic function, but not necessarily healthy.

Ideally, each component in a complex system should be healthy before the overall system can be considered healthy; additionally, interactions between components should be correct. In addition to this static definition of health, we might also require that each component should work over time and under varying conditions. High loads, power fluctuations and human error should have minimal impact; in effect, the data center should have a healthy immune system.
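The distinction between a system that functions and a system that is healthy can be made concrete with a toy model. The following Python sketch is illustrative only: the DataCenter class, its fields, and the component names are hypothetical and are not part of the Project REAL tooling. It assumes that a component marked as redundant can fail without stopping the system.

```python
from dataclasses import dataclass, field


@dataclass
class DataCenter:
    """Toy model (hypothetical): component -> healthy?, link -> working?"""
    components: dict[str, bool] = field(default_factory=dict)
    links: dict[tuple[str, str], bool] = field(default_factory=dict)
    # Assumption: components listed here are spares, so the system
    # keeps functioning if one of them fails.
    redundant: set[str] = field(default_factory=set)

    def is_functioning(self) -> bool:
        """All links work and every non-redundant component is up."""
        required_ok = all(ok for name, ok in self.components.items()
                          if name not in self.redundant)
        return required_ok and all(self.links.values())

    def is_healthy(self) -> bool:
        """Stricter test: every component and every interaction is correct."""
        return all(self.components.values()) and all(self.links.values())


# A failed redundant part: the system still functions but is not healthy.
failed_spare = DataCenter(
    components={"dw": True, "as": True, "spare-psu": False},
    links={("dw", "as"): True},
    redundant={"spare-psu"},
)

# Healthy parts, misconfigured network: the system does not function.
bad_network = DataCenter(
    components={"dw": True, "as": True},
    links={("dw", "as"): False},
)

print(failed_spare.is_functioning(), failed_spare.is_healthy())  # True False
print(bad_network.is_functioning(), bad_network.is_healthy())    # False False
```

Neither example passes the stricter is_healthy test, which matches the two cases described above.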

A healthy data center starts from a good architectural design relying on tested best practices, and the data center thrives when best practices are implemented in the ongoing operations. Microsoft and several partners, including some of those involved in Project REAL, have devoted significant resources to defining and documenting best practices for both data center design and operations. More information about best practices is available at the following Web sites:

  • Windows Server System Reference Architecture

    (www.microsoft.com/technet/itsolutions/wssra/raguide/default.mspx).

    This set of best practices evolved from an earlier project that was based on Windows 2000. For more information about the original Windows 2000 project, see Microsoft TechNet: MSA: Enterprise Data Center (www.microsoft.com/resources/documentation/msa/edc/all/solution/en-us/default.mspx). Both of these initiatives provide in-depth guidance about data center design and implementation.

  • Microsoft Operations Framework (MOF) 

    www.microsoft.com/technet/itsolutions/cits/mo/mof/default.mspx

    MOF provides operational guidance that enables organizations to achieve mission critical system reliability, availability, supportability, and manageability of Microsoft products and technologies.

To help you assess your current IT service management maturity, prioritize processes, and apply proven principles and best practices to optimize management of the Windows Server operating system, MOF includes documentation in the following areas:

  • MOF Team Model

  • MOF Process Model for Operations

  • MOF Risk Management Discipline for Operations

  • Service Management Functions

  • Microsoft Operations Framework Operations Management Reviews

The Project REAL team considered many of the issues documented in previous references and developed a monitoring and instrumentation infrastructure that fits the needs of the project. Because both the project needs and the available tools have changed over time, this paper presents a snapshot of the monitoring and instrumentation infrastructure that is used by Project REAL as of April 2006. In this paper, we focus on the tools used, describe deployment issues peculiar to the project, and provide examples of how these tools were used to prevent or solve actual problems.

Tools for Monitoring and Instrumentation

Some of the tools used for Project REAL are listed here:

  • Tools for accessing the servers. Includes Terminal Services and KVM applications that let users access and manage the Project REAL applications remotely.

  • Tools for monitoring and diagnostics. Includes Microsoft Operations Manager 2005 (MOM 2005) and Event Viewer. Also includes vendor-specific tools such as Unisys Server Sentinel, EMC NaviSphere, and EMC Control Center.

  • Tools for performance monitoring and measurement. Includes MOM 2005, PerfMon, Task Manager, and the SQL Server 2005 Profiler.

Tools for Accessing the Servers

The Project REAL servers are located on two separate floors of an environmentally controlled lab on the Microsoft campus. Physical access to either floor of the lab is controlled by badge readers. Although members of the Project REAL team working in or near this lab can access the servers directly, it is more convenient for users to access the servers remotely, and remote access is the only option for users who are not working on the Microsoft site. Remote access to the Project REAL servers is accomplished by using various remote access tools and the Microsoft corporate network (Corpnet).

Terminal Services and Remote Access

For Project REAL team members, the standard method of operating the system is to use a workstation or laptop that is connected to Corpnet, and log on to the Project REAL servers by using the Remote Desktop application. This method of remote access works as long as the Project REAL servers are also connected to Corpnet. This method also requires that the user have a local account on the target server. Team members who are not in or near Redmond can access the Project REAL servers by using a VPN connection to Corpnet.

As described in the Introduction, each Project REAL server is provided with three different network connections for the three network architectures that are being used by the project. When you run a test in any one of the simulation modes, the Corpnet link must be disabled on the servers involved in the tests. This disconnects any users who are using Remote Desktop over Corpnet to access these servers.

We tested this scenario and worked around the problem of Remote Desktop session disconnects in one of two ways. One way was to select a gateway server in the lab that was not disconnected from Corpnet but was connected to the simulated network being tested. We would log on to the gateway server (usually one of the servers in the Web Server tier) by using a Remote Desktop connection and then open a Remote Desktop connection from the gateway server to one of the isolated servers, either by using a static IP address or by using the name in the Hosts file for this static IP address (see Appendix A:  Hosts Files for these names). The second way that we bypassed the lack of a Corpnet link to a server was to use a standard KVM application. A third mechanism, a secure remote console card provided with Unisys ES7000 servers, was not used.

There are two Terminal Services modes: Remote Desktop for Administration, which allows only two remote users to connect to the server at one time, and Terminal Server (formerly called Application Server when released with Windows 2000), which provides application-sharing, multi-user capabilities and process scheduling features that are not provided by Remote Desktop for Administration. Terminal Server allows more users to connect at one time but requires a separate licensing server and is more of a resource drain, especially when a large number of remote sessions are active at the same time.

The Project REAL servers are configured to use Terminal Server so that many team members can access the same server at the same time. However, during performance testing, the servers are stressed to their theoretical limits and the additional server resources required by Terminal Server can prevent some tests from running. For this reason it is preferable during load testing to disable Terminal Server and use Remote Desktop for Administration instead, even though this method limits access to two users at a time.

For more information about Terminal Services, including a detailed overview of the Terminal Services architecture, see How Terminal Services Works (https://technet2.microsoft.com/windowsserver/en/library/2cb5c8c9-cadc-44a9-bf39-856127f4c8271033.mspx?mfr=true).

Tools for Health Monitoring and Diagnostics

The Project REAL network of servers is in heavy use by the Project REAL team and its partners, who often stress the systems through use of pre-release software, untried queries, occasional configuration errors, and high loads by simulated users. One of the goals of Project REAL was to determine best practices for similar systems, and we did this by trial and error, knowing that the “error” part especially stresses the system. Because we could not anticipate the problems that would arise, it was critical that at a very early stage we establish a way to catch system problems and capture enough information to quickly determine the underlying causes.

To monitor the health of the Project REAL servers and applications, we installed Microsoft Operations Manager 2005 (MOM 2005) on a designated management server in the Project REAL lab. We then configured MOM 2005 to use a combination of existing product management packs and customized rules that let us capture information about REAL and about the health of MOM 2005 itself. An early version of a management pack for SQL Server 2005 was also installed to catch problems in the pre-release code and problems in the management pack design. This early version of the management pack was replaced by the released version of the management pack when it became available.

For further exploration into problem causes, the project and product teams depend heavily on such administrative tools as Event Viewer and the Windows Debugger. Hardware vendor partners rely on their own tools to detect and resolve possible problems relating to their systems.

Microsoft Operations Manager 2005 (MOM 2005)

A detailed description of the MOM 2005 product is beyond the scope of this document, but a helpful overview can be found at Microsoft Operations Manager 2005 Product Overview (www.microsoft.com/mom/evaluation/overview/default.mspx).

The MOM 2005 Server application was deployed on the BI-MORDOR server, which has connections to all three tiers of Project REAL servers as well as to Corpnet. Specifications of the BI-MORDOR server, which exceeds the minimum requirements for MOM 2005 servers described at Microsoft Operations Manager 2005 System Requirements (https://www.microsoft.com/mom/evaluation/sysreqs/default.mspx), are shown in Table 4:  MOM 2005 Server Specifications. The complete contents of the Hosts file are listed in Appendix A. Hosts Files.

Table 4:  MOM 2005 Server Specifications

| Server name | RAM | CPUs | IP addresses | Hosts alias | Disk space |
|---|---|---|---|---|---|
| BI-MORDOR | 4 GB | 4 x 2.40 GHz Intel Xeon | 192.168.3.206 | BI-MORDOR-OUT | 546.91 GB |
| BI-MORDOR | 4 GB | 4 x 2.40 GHz Intel Xeon | 192.168.2.206 | BI-MORDOR-IN | 546.91 GB |
| BI-MORDOR | 4 GB | 4 x 2.40 GHz Intel Xeon | 192.168.1.206 | BI-MORDOR-PRI | 546.91 GB |
| BI-MORDOR | 4 GB | 4 x 2.40 GHz Intel Xeon | 10.xxx.xx.xx (DHCP-assigned) | (corpnet link) | 546.91 GB |

Deploying MOM 2005 in the Project REAL Lab

There were five basic steps in the MOM 2005 deployment in the Project REAL lab:

  1. Install the MOM 2005 Server components.

  2. Deploy the MOM 2005 Agents.

  3. Install MOM 2005 Reporting.

  4. Import Management Packs.

  5. Customize the Management Packs.

These steps are covered in detail in the following sections. We also provide some tips about how to manually install MOM 2005 agents to fail over to alternative network links.

Installing MOM 2005

The installation process for MOM 2005 is documented in the MOM 2005 Deployment Guide (https://www.microsoft.com/technet/prodtechnol/mom/mom2005/Library/b7b0c768-64d1-486e-b9ed-7292c9e545f9.mspx). Prerequisites included SQL Server 2000 SP3a Enterprise Edition and SQL Server 2000 Reporting Services, which is used by MOM 2005 Reporting. Later, MOM 2005 Service Pack 1 (SP1) was released, and the Project REAL system was upgraded to use MOM 2005 SP1 and SQL Server 2000 SP4. Because the MOM Server needed to be able to operate without Corpnet access, we created the following local user accounts for the various MOM 2005 roles and for database administration:

  • MOMadmin

  • DASuser

  • SQLadmin

REDMOND domain user accounts for Project REAL team members were added to the local Administrators group on the MOM Server just to simplify the installation process. However, later it became clear that domain accounts were required to support some of the desired functionality, such as access to shared resources and support for authenticated e-mail notifications.

By default, MOM 2005 uses communications ports 1270-1272 for agent-to-server communications through firewalls, but these port numbers are configurable. The Project REAL lab was set up to use the defaults.
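Before agents are deployed across the firewalls, it can help to confirm that the management server is reachable on these ports from a managed computer. The following Python sketch is one way to do that under stated assumptions: the function names are hypothetical, and the check covers TCP reachability only, so it says nothing about UDP heartbeat traffic or about whether the MOM service itself is responding correctly.

```python
import socket


def tcp_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


def check_ports(host: str, ports: range, timeout: float = 3.0) -> dict[int, bool]:
    """Map each port in `ports` to whether a TCP connection succeeded."""
    return {port: tcp_port_open(host, port, timeout) for port in ports}
```

For example, calling check_ports("BI-MORDOR", range(1270, 1273)) from a managed computer would report, for each port in the default range, whether the firewall and the server accepted a TCP connection.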

For additional guidance about the security requirements for MOM 2005, see MOM 2005: Microsoft Operations Manager 2005 Security Guide (https://www.microsoft.com/technet/prodtechnol/mom/mom2005/Library/3e039637-4639-46f7-9f5f-518e0c04795e.mspx).

Server Discovery and Agent Deployment

MOM 2005 was designed to be easy to deploy in an Active Directory domain, and Project REAL took advantage of this for the initial installation of the MOM 2005 Server and Agents, including a “push” install of the MOM Agents to the managed servers. Unfortunately, when the link to Corpnet was severed for load testing with the alternative network architectures described in the Introduction, the MOM 2005 Server lost contact with the managed servers. This was unacceptable; therefore, we sought a way to preserve MOM 2005 functionality with or without Active Directory authentication.

The solution was to configure the MOM Agents to fail over to MOM 2005 Server aliases that represent alternative network paths to the server, BI-MORDOR. The MOM 2005 server has four network adapters, each representing a different virtual LAN (VLAN) with a different static IP address. Each Project REAL server resides on two of the private VLANs and has a Hosts file linking the MOM 2005 Server aliases to the appropriate static IP addresses (see Appendix A:  Hosts Files for the contents of the Hosts files on the Project REAL servers).

The Project REAL team used the following process to manually install agents with failover paths:

  1. Update the Hosts file on each managed computer. Add a name/IP address pair for each of the network interfaces to BI-MORDOR. Although the Hosts file is not required because you can fail over to an IP address instead, the Hosts file is a convenient repository for keeping track of static IP address configurations.

  2. Change the state of the managed computer to unmanaged. This step is required only if an automatic installation of the agent has already occurred. To change the state of a computer, use the Administrator Console of the MOM Server.

  3. Configure MOM agents for failover. Manually configure the MOM agent on each managed computer to fail over to one or more of the BI-MORDOR aliases. Each alias represents a different network link. For more information, see the next section.

  4. Verify the failover paths. First, disable network links on the managed computers and use the Network Monitor tool to verify UDP heartbeat traffic on the failover links. Next, re-enable the network links, and verify that traffic returns to the Corpnet link.
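Step 1 of this process can be partly automated. The following Python sketch generates Hosts file entries for the BI-MORDOR aliases and reports any alias that is missing from, or wrong in, an existing Hosts file. The alias/IP pairs are taken from Table 4; the function names are hypothetical, and the sketch assumes the usual Hosts format of an IP address followed by one or more names. Because each managed computer resides on only two of the private VLANs, a real deployment would pass only the aliases that apply to that computer.

```python
import ipaddress

# Alias -> static IP pairs for BI-MORDOR, as listed in Table 4.
MOM_SERVER_ALIASES = {
    "BI-MORDOR-OUT": "192.168.3.206",
    "BI-MORDOR-IN": "192.168.2.206",
    "BI-MORDOR-PRI": "192.168.1.206",
}


def render_hosts_entries(aliases: dict[str, str]) -> list[str]:
    """Return one 'IP<TAB>name' Hosts line per alias."""
    lines = []
    for name, ip in aliases.items():
        ipaddress.ip_address(ip)  # raises ValueError on a malformed address
        lines.append(f"{ip}\t{name}")
    return lines


def parse_hosts(text: str) -> dict[str, str]:
    """Map each host name found in Hosts-file text to its IP address."""
    mapping = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        ip, *names = line.split()
        for name in names:
            mapping[name.upper()] = ip  # host names are case-insensitive
    return mapping


def missing_aliases(hosts_text: str, aliases: dict[str, str]) -> list[str]:
    """Return the aliases that are absent or mapped to a different IP."""
    present = parse_hosts(hosts_text)
    return [name for name, ip in aliases.items()
            if present.get(name.upper()) != ip]
```

On Windows, the Hosts file is normally found at %SystemRoot%\System32\drivers\etc\hosts; its text can be read and passed to missing_aliases to confirm that a managed computer has the entries it needs before the agent is configured for failover.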

Manual Installation of MOM Agents

A manual installation of the MOM Agent on a managed computer can be performed by running the MOMAgent.msi installer on the managed computer. You can run the MOMAgent.msi installer interactively, and use the wizard to define most configuration parameters, or you can use the msiexec.exe utility, a client command-line interface that is included with the Windows Installer, to pass parameters to the MOMAgent.msi installer. For more information about the command-line syntax, see MOM 2005: MOMAgent.MSI (https://www.microsoft.com/technet/prodtechnol/mom/mom2005/Library/e830c5cb-8a68-4c61-8ac2-9edbc69a315e.mspx). Because failover parameters cannot be entered in the interactive wizard, we used the command-line utility to configure the failover paths. However, the initial installation of the agent was performed by using the interactive wizard, and the failover servers were entered later by using the command-line interface.

Table 5, which is excerpted from the MOM documentation described in the previous paragraph, describes the command-line options of particular importance to Project REAL, with additional notes on the Project REAL implementation shown in bold. In addition to using these parameters, we had to disable mutual authentication for all agents by using the MOM Administrator Console.

Table 5:  MOM Agent Command-Line Options

Option

Description

CONFIG_GROUP

The management group name. For Project REAL, this is “MOM Administrator Scope”.

MANAGEMENT_SERVER

The primary Management Server for the agent. For Project REAL, this is BI-MORDOR.

AM_CONTROL

The agent control level.

  • Full for full control.

  • Group for no control.

The default is “Group” and the value is case sensitive.

For Project REAL, this option is set to “Group” so that the MOM Server cannot undo the work of the manual installation by pushing default parameters out to the agent. “Group” is equivalent to specifying “None” in the interactive wizard.

CONFIG_GROUP_OPERATION

Specifies the operation to perform on the agent's management group membership: remove, add, or modify. Set this value to one of the following:

“RemoveConfigGroup”

“AddConfigGroup”

“ModifyConfigGroup”

For the Project REAL installations this option was set to “ModifyConfigGroup”.

REQUIRE_AUTH_COMMN

To enable mutual authentication, set this value to 1. If this value is set to 0, mutual authentication is not enabled.

The default value is 0.

When the Project REAL servers were moved to the SYS-SQLSVR domain but the MOM server remained on the REDMOND domain, mutual authentication had to be disabled or heartbeats could not be sent from the managed servers to the MOM server. Despite the documented default being 0, at least one of the servers had mutual authentication enabled when the agent was installed. To correct this, we manually set this option to 0.

ALT_MANAGEMENT_SERVER

The name of another Management Server in the same management group. The agent contacts this Management Server only if the primary Management Server (MANAGEMENT_SERVER) is unavailable during the installation process.

Note: After the agent has established communication with the Management Server, the value of this option will be determined by the primary Management Server.

Because this option lets you specify multiple servers, for Project REAL we used this option to designate the failover paths. The specific values depended on the VLAN that the managed computer resided on, but all values represented alternative paths to a single MOM Management Server. Notice that the wizard and the MOM Operator Console will display only the last failover path designated.

The alternate management server names that were used as values for the ALT_MANAGEMENT_SERVER option were taken from the Hosts file on the managed computer and represent alternate names for BI-MORDOR. For an example, see Appendix A:  Hosts Files.

Table 6 shows a version of the command that was used to configure a managed computer to fail over first to the BI-MORDOR-CS alias and then to the BI-MORDOR-IN alias of BI-MORDOR.

Table 6:  Sample MOM Agent Command

Note: The command has been split across multiple lines for readability. When you run it on a system, enter it as a single line without breaks.

msiexec /i MOMAgent.msi CONFIG_GROUP="MOM Administrator Scope"
MANAGEMENT_SERVER="BI-MORDOR" AM_CONTROL="Group"
CONFIG_GROUP_OPERATION="ModifyConfigGroup" REINSTALL="ALL"
REQUIRE_AUTH_COMMN=0 ALT_MANAGEMENT_SERVER="BI-MORDOR-CS"
ALT_MANAGEMENT_SERVER="BI-MORDOR-PRI" /q

Note:  Instead of using aliases for BI-MORDOR, we could have used static IP addresses.
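Because the command must be entered as one unbroken line and several managed computers need slightly different failover aliases, it can be convenient to generate the command rather than type it. The following Python sketch assembles the command from the property values shown in Table 6. The function name and the PROJECT_REAL_PROPERTIES list are hypothetical; the sketch quotes every property value, which is an assumption made for simplicity, and it only builds the text of the command without running it.

```python
def build_agent_command(properties, msi="MOMAgent.msi", quiet=True):
    """Assemble a single-line msiexec command.

    `properties` is a list of (NAME, value) pairs. A list is used instead
    of a dict because ALT_MANAGEMENT_SERVER appears more than once.
    """
    parts = ["msiexec", "/i", msi]
    for name, value in properties:
        value = str(value)
        if '"' in value or "\n" in value or "\r" in value:
            raise ValueError(f"unsupported character in value of {name}")
        # Straight quotes only; typographic quotes break the command.
        parts.append(f'{name}="{value}"')
    if quiet:
        parts.append("/q")
    return " ".join(parts)


# Values as shown in Table 6; the failover aliases vary by VLAN.
PROJECT_REAL_PROPERTIES = [
    ("CONFIG_GROUP", "MOM Administrator Scope"),
    ("MANAGEMENT_SERVER", "BI-MORDOR"),
    ("AM_CONTROL", "Group"),
    ("CONFIG_GROUP_OPERATION", "ModifyConfigGroup"),
    ("REINSTALL", "ALL"),
    ("REQUIRE_AUTH_COMMN", 0),
    ("ALT_MANAGEMENT_SERVER", "BI-MORDOR-CS"),
    ("ALT_MANAGEMENT_SERVER", "BI-MORDOR-PRI"),
]

command = build_agent_command(PROJECT_REAL_PROPERTIES)
print(command)
```

Keeping the properties as an ordered list preserves the failover order, and changing the last two entries is enough to produce the command for a managed computer on a different VLAN.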

Installing MOM 2005 Reporting

Deployment of MOM 2005 Reporting requires prior installation of SQL Server 2000 Reporting Services, which requires that the Enterprise Edition of SQL Server 2000 be installed first.

Note:  Service Pack 2 of SQL Server 2000 Reporting Services is incompatible with the MOM 2005 Reporting installer. For a description of the problem and one suggested workaround, see article KB902804, “You may receive an "Incorrect version of Microsoft SQL Server Reporting Services is installed" error message when you try to install MOM 2005 Reporting,” in the Microsoft Knowledge Base at https://support.microsoft.com. In the Project REAL lab, we uninstalled SQL Server 2000 Reporting Services and reinstalled it without the service pack.

The Project REAL lab opted for the following additional installation options:

  • Visual Studio .NET was installed so that we could customize MOM reports.

  • To avoid authentication problems when the Project REAL network was isolated from Corpnet during performance testing, Reporting Services setup was performed by using the local Administrator account on BI-MORDOR. After setup, ownership of the Reporting databases defaulted to this user account. Because of this decision, all MOM Reporting Subscriptions must be created under the local Administrator account.

In one of our initial experiments installing SQL Server 2000 Reporting Services, the report server failed to open a connection to the report server database. This was fixed by running the rsconfig utility to reset the user account to the login designated as the dbo for the report server database.

Note:  MOM 2005 Reporting uses a domain account and previously stored credentials for unattended reports to enable reports to access domain resources. However, Microsoft corporate policies require that the password for domain accounts be changed periodically, and any password change causes subscriptions to stop working. You can update the domain account password stored by MOM 2005 Reporting by running the rsconfig utility with the -e option.

Importing MOM 2005 Management Packs

Several management packs are available as part of MOM 2005 server setup. These management packs are not imported automatically unless you install the Workgroup edition of MOM 2005. To import a management pack, determine which pack you want, open the MOM 2005 Administrator Console, and then run the wizard by clicking the link to Import/Export Management Packs under Setup and Configuration Tasks.

For Project REAL, we imported the management packs that are listed in the following table.

Table 7:  Management Packs Used in the Project REAL Lab

| Management pack | Description |
| --- | --- |
| Microsoft Operations Manager | Monitors MOM 2000 (SP1), MOM 2005, and required subsystems. This management pack monitors the MOM Server and its agents. |
| Microsoft SQL Server | Monitors instances of SQL Server version 7.0, SQL Server 2000, and SQL Server 2005. The only server in the Project REAL network that uses SQL Server 2000 is the MOM Server. Early in Project REAL, a pre-beta version of a separate management pack was provided by the development team for SQL Server 2005. |
| Microsoft Windows Internet Information Services | Monitors the performance and availability of Microsoft Internet Information Services (IIS) version 5.0, or a later version. This management pack is used to monitor the middle-tier Web servers and the SQL Server 2005 Reporting Services servers in the Project REAL network and to monitor SMTP on the MOM Server. |
| Microsoft Windows Servers Base Operating System | Monitors the performance and availability of Microsoft® Windows® Operating Systems version 4.0 and later versions. Every server in the Project REAL network is running Windows Server 2003 SP1, Enterprise Edition or Datacenter Edition. Rules for other operating system versions have been disabled. |
| Unisys Server Sentinel | Monitors events from ES7000 service processors where Unisys Server Sentinel is being used. This management pack is available to Unisys customers at support.unisys.com. For more information, see the section of this paper on Unisys ES7000 Management Tools. |

These management packs also include report definitions for use with MOM 2005 Reporting.

Customizing Management Packs

MOM 2005 management packs are released with some rule groups disabled and with parameters that are designed to meet the needs of typical data centers. The Project REAL team tuned these rule groups and added other rule groups to meet the needs of the project. Some particular changes are listed here:

  • E-mail notification groups were configured so that team members would receive real-time notification of critical problems. More information about real-time alert notifications is provided in the next section.

  • Notification responses were added for some of the rules that had lacked notifications, such as the rules that detect missing MOM agent heartbeats. Adding new notifications was workable for Project REAL only because of the small number of servers.

  • Custom computer groups were defined so that notifications and reports could be restricted to particular computers of interest.

  • Custom rules were created for particular events that the Project REAL team found interesting, such as events specific to the Unisys ES7000 hardware platform.

  • MOM 2005 Report subscriptions were generated to periodically send reports, such as Disk Performance Analysis, to members of the Project REAL team by using e-mail as described in the following section.

The process of tuning management packs is ongoing as the needs of the project change.

Real-Time Alert Notifications

Because Project REAL team members were focused on preparing the environment for performance testing, no one had time to watch for system problems by regularly examining MOM Alerts. Occasionally, problems such as full volumes were not noticed until something stopped working. Although MOM periodically checks for full volumes and posts alerts that could have prevented the down time associated with these problems, the alerts are useful only when they come to the attention of the team.

To help in the prompt communication of possible problems to the members of the team who are most likely to be affected by the issue, we spent some time implementing two features of MOM 2005: e-mail notifications of alerts, and automatic generation of MOM Reports.

For security reasons, the Microsoft IT group requires any applications that generate e-mail messages, such as MOM 2005, to send these messages through a tightly secured gateway server. The IT group provided instructions on how to use their “smart host” on the corporate intranet. As instructed, we configured IIS on the MOM Server to forward authenticated messages through the smart host. Additionally, all messages must be “FROM” a valid domain account. Using the MOM 2005 Administrator Console, we added team member e-mail addresses to the predefined notification Groups. After this was done, Project REAL team members started receiving e-mail notifications of the most critical alerts as soon as they were detected by the MOM Server.

Not all the e-mail notifications indicate errors. As the sample e-mail content in Table 8 illustrates, if you use the default alert settings, many “critical error” alerts that are not really errors are generated during load testing:

Table 8:  Example of E-Mail Notification of Critical Error

Note: Some of the lines in the following code have been displayed on multiple lines for better readability.

Severity:  Critical Error 
Status:  New    
Source:  Memory:  % Committed Bytes In Use:   
Name:  Performance Threshold: Memory\% Committed 
bytes In Use threshold exceeded. 
Description:  Extremely high percentage of physical 
memory in use. Memory:  % Committed Bytes In Use:    
value = 81.5063003778612. The average over last 3  
samples is 81.5063. 
Domain:  REDMOND 
Agent:  BI-REAL-AS  
Time:  12/5/2005 22:25:00 
Owner:

When these notification e-mail messages started arriving, we began weeding out the unhelpful notifications and adding other alerts that would be more useful. You can add alerts by using the Administrator Console of MOM 2005. Creating and monitoring these notifications is an ongoing activity, because the focus of activity in the lab frequently changes. The section titled Tuning MOM 2005 Management Packs describes this tuning in the Project REAL lab in more detail.
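When deciding which notifications to weed out, it can help to tally the messages by agent and alert name. The following Python fragment is a rough sketch of one way to do that; it was not used in the Project REAL lab. It assumes that each message body has the "Field: value" layout shown in Table 8 and that long values may wrap onto following lines. The function names are hypothetical.

```python
from collections import Counter

# Field labels taken from the sample notification in Table 8.
FIELDS = ("Severity", "Status", "Source", "Name", "Description",
          "Domain", "Agent", "Time", "Owner")


def parse_notification(text):
    """Parse one notification body into a dict, rejoining wrapped lines."""
    record = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        key, sep, rest = line.partition(":")
        if sep and key in FIELDS:
            current = key
            record[current] = rest.strip()
        elif current is not None:
            # Not a known field label, so treat it as a wrapped continuation.
            record[current] = (record[current] + " " + line).strip()
    return record


def tally(notifications):
    """Count notifications by (agent, alert name)."""
    return Counter((n.get("Agent", ""), n.get("Name", "")) for n in notifications)


# The sample message from Table 8, including its wrapped lines.
SAMPLE = r"""Severity:  Critical Error
Status:  New
Source:  Memory:  % Committed Bytes In Use:
Name:  Performance Threshold: Memory\% Committed
bytes In Use threshold exceeded.
Description:  Extremely high percentage of physical
memory in use. Memory:  % Committed Bytes In Use:
value = 81.5063003778612. The average over last 3
samples is 81.5063.
Domain:  REDMOND
Agent:  BI-REAL-AS
Time:  12/5/2005 22:25:00
Owner:
"""

if __name__ == "__main__":
    for (agent, name), count in tally([parse_notification(SAMPLE)]).items():
        print(count, agent, name)
```

Pairs that appear with a high count during load testing are natural candidates for a closer look at the corresponding rule.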

MOM 2005 Reports

MOM 2005 performs daily maintenance on its alerts database and moves the historical data on alerts and performance counters into a separate reports database. MOM 2005 Reporting can generate summary reports that are helpful for detecting configuration or performance problems. Other reports list the current software versions, which is particularly helpful in a development environment where software versions frequently change. This makes it possible to correlate a particular behavior such as unusually high memory usage with a particular software level, if the historical data is preserved.

The Project REAL team decided that it could be valuable to have historical versions of selected MOM 2005 Reports available on demand. Therefore, we explored the SQL Server 2000 Reporting Services feature that lets users subscribe to reports on a timed schedule. Because we had earlier made the decision to set up the MOM Server for database access using the local Administrator account, to prevent authentication problems when disconnected from Corpnet during performance testing, we also had to configure subscriptions under the local Administrator account. To make the reports available to other users, some of these subscriptions generate the daily reports and then export them to a file share on the MOM Server or send them in e-mail to specific users.

Event Viewer

Microsoft Windows Event Viewer is a powerful tool for diagnosing problems on Project REAL servers. Any user who has appropriate authority and access to the domain to which the Project REAL servers belong can display Event Viewer on his or her computer, change the destination computer to the Project REAL server, and view the contents of that server’s event logs.

Because the filtering capabilities of Event Viewer are limited, it can be helpful to export the contents of one or more event log files to a tab-delimited text file or a comma-delimited (.csv) file by using the Export List... option in the Action menu. You can then import this file into an Excel spreadsheet, manipulate the event data in more advanced ways, or correlate events on different servers.
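As an alternative to a spreadsheet, exported lists from several servers can be merged with a short script. The following Python fragment is a minimal sketch under stated assumptions: it assumes comma-delimited exports with a header row that contains Type, Date, Time, and Computer columns, and a U.S.-style date and time layout. Both assumptions should be checked against an actual export. The sample rows and server names are invented for illustration.

```python
import csv
import io
from datetime import datetime

# Assumed date/time layout of the exported list; adjust it to match the
# regional settings of the computer that produced the export.
STAMP_FORMAT = "%m/%d/%Y %I:%M:%S %p"


def load_events(csv_text):
    """Read one exported event list and attach a parsed timestamp to each row."""
    events = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        row["When"] = datetime.strptime(row["Date"] + " " + row["Time"], STAMP_FORMAT)
        events.append(row)
    return events


def merged_timeline(*event_lists, types=("Error", "Warning")):
    """Merge events from several servers into one list ordered by time."""
    merged = [e for events in event_lists for e in events if e["Type"] in types]
    return sorted(merged, key=lambda e: e["When"])


# Invented sample exports for two hypothetical servers.
SERVER_A = """Type,Date,Time,Source,Category,Event,User,Computer
Error,12/5/2005,10:25:03 PM,ExampleSource,None,1001,N/A,SERVER-A
Information,12/5/2005,10:26:00 PM,ExampleSource,None,1002,N/A,SERVER-A
"""
SERVER_B = """Type,Date,Time,Source,Category,Event,User,Computer
Warning,12/5/2005,10:24:40 PM,ExampleSource,None,2001,N/A,SERVER-B
Error,12/5/2005,10:25:30 PM,ExampleSource,None,2002,N/A,SERVER-B
"""

if __name__ == "__main__":
    for e in merged_timeline(load_events(SERVER_A), load_events(SERVER_B)):
        print(e["When"], e["Computer"], e["Type"], e["Event"])
```

Sorting the merged list by time places events from different servers side by side, which makes it easier to see whether an error on one server closely follows a warning on another.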

Figure 4. Connecting to another computer in Event Viewer

Performance Counters

The SQL Server 2005 product teams added many new performance counters for SQL Server 2005, and many of these performance counters are regularly captured during Project REAL performance testing. We also captured many of the base OS counters, such as CPU usage counters, disk usage statistics, and memory statistics. The Project REAL team defined several counter logs, and these logs are enabled as needed to collect performance data during test runs.

The data captured during performance testing can be analyzed either by manually importing it into a spreadsheet and using Excel features to create custom graphs, or through use of other performance tools. One such tool used by Project REAL is an enhanced version of System Monitor, which is named PerfMonPlus and was developed and used internally by Unisys personnel. Figure 5 shows a sample graph generated by PerfMonPlus.

Figure 5. PerfMonPlus

PerfMonPlus was written to save its author time and effort in analyzing performance data. As the scale of computer systems has changed, the default scaling factors that developers chose for many of the performance counters have become obsolete to the point that most counters have to be re-scaled to make the graphs visible in the graphical display of System Monitor (also known as PerfMon). PerfMonPlus wraps the System Monitor in logic that adjusts the scaling factors automatically. After the wrapper was built, the author added some other functions that he performed very frequently, such as deleting all the insignificant instances.

This application was implemented in Visual Basic 6.0 by putting a System Monitor control on a form and manipulating its methods and properties. For more information about how to use the System Monitor control, see the MSDN Library at https://msdn2.microsoft.com/library/
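The scaling logic that PerfMonPlus uses is not described in this paper. The following Python fragment is only a sketch of one plausible approach: choose the largest power-of-ten factor that keeps the highest sample on a vertical axis of 0 to 100. The function name, the axis maximum, and the exponent limits are assumptions made for illustration.

```python
import math


def auto_scale(max_value, axis_max=100.0, exponent_range=(-7, 7)):
    """Return a power-of-ten scale factor for a counter (hypothetical helper).

    The factor is the largest power of ten for which max_value * factor
    still fits on the axis, limited to an assumed range of exponents.
    """
    if max_value <= 0:
        return 1.0  # nothing to scale
    exponent = math.floor(math.log10(axis_max / max_value))
    low, high = exponent_range
    exponent = max(low, min(high, exponent))
    return 10.0 ** exponent


if __name__ == "__main__":
    # Example peak values for three counters with very different magnitudes.
    for peak in (0.5, 81.5, 250000):
        print(peak, auto_scale(peak))
```

With this rule, a counter that peaks at 250000 is scaled down so that it plots at 25, while a percentage counter that peaks near 80 is left unscaled.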

Task Manager

Task Manager is convenient for viewing real-time performance. For Project REAL, we frequently enabled Task Manager during performance runs to give the users a feel for what was going on or to demonstrate activity during presentations.

SQL Server 2005 Profiler

Microsoft SQL Server 2005 Profiler is a tool available with SQL Server 2005 that can be used to trace SQL Server 2005 activity and replay it later. The Project REAL team uses this tool together with Performance Monitor to identify performance bottlenecks and optimize overall performance.

Network Monitoring

Project REAL servers and network equipment reside in two access-controlled labs on the Microsoft campus and coexist with equipment being used for several other projects. These labs have full-time staff to monitor and maintain the network infrastructure and the labs use their own suite of tools. The following two applications are used for managing Project REAL network performance:

  • Extreme Networks EPICenter Management Suite

  • SolarWinds Network Management

Extreme Networks EPICenter Management Suite

Project REAL uses two Summit 400-48t network switches (www.extremenetworks.com/libraries/prodpdfs/products/Summit400_DS.asp), provided by Extreme Networks (www.extremenetworks.com), for 10 gigabit Ethernet connectivity. The tool suite used to manage these switches is the EPICenter network management software. This management suite lets the lab staff monitor information such as:

  • Whether a switch is alive

  • When a switch reboots

  • Fan failures

  • Power supply failures

Figure 6 shows a page from this tool that lists some of the alarms that this tool can report. For more information about the EPICenter management suite, see http://www.extremenetworks.com.

Figure 6. EPICenter Network Management tool

SolarWinds Network Management Software

The lab staff monitors bandwidth utilization to determine when more capacity should be provided. SolarWinds Network Management Software is used to take periodic snapshots of network activity so that the lab staff can chart the data and monitor trends. Information about this tool can be found at www.solarwinds.net

Partner Tools

The Project REAL partners who provided hardware for the project have also provided proprietary software tools to manage these systems.

EMC Storage Management Applications

The EMC storage products are managed by using two EMC tools: Navisphere Analyzer for CLARiiON SANs, and EMC ControlCenter (ECC). EMC Workload Analyzer, a component of ECC, is used for Symmetrix/DMX performance monitoring. ECC can also link SAN management to MOM 2005. Additional information about EMC products and tools can be found at www.emc.com/

Figure 7 is a screenshot from ECC, and shows details about one of the disks that is attached to server BI-REAL-ES32.

Figure 7. EMC ControlCenter tool

Emulex

The HBAnyware application was used in the Project REAL lab to manage Host Bus Adapters (HBAs) provided by Emulex. For more information about this application, see Emulex Products (http://www.emulex.com/products/hba/software.jsp).

Unisys ES7000 Management Tools

Each Unisys ES7000 server is delivered with an external or embedded service processor that is used to configure and monitor the hardware environment. For example, by using the service processor, you can partition a 32-CPU server into two 16-CPU partitions, or into a 24-CPU partition and an 8-CPU partition. The Unisys Server Sentinel management solution is designed to provide comprehensive monitoring and automatic control of the service processors and, through these processors, management of the ES7000 partitions.

A full installation of Unisys Server Sentinel as a freestanding management product requires a Sentinel Management Server that plays a role similar to that of the MOM 2005 Server used by Project REAL. For more information about the features provided by the Server Sentinel product, see Appendix B:  Unisys ES7000 Management. Although the Project REAL team could have installed the full Unisys Server Sentinel solution to integrate with MOM 2005, instead the team decided to use MOM 2005 for management of the ES7000 servers, and add the newly-released Unisys Server Sentinel Management Pack for MOM 2005 to monitor the ES7000 service processors. This management pack includes rules that generate MOM 2005 alerts based on SNMP traps generated by the service processors when there are system problems.

For more information about the Unisys systems software, see Unisys: Server Sentinel and Self-Managing Server Software (http://www.unisys.com/products/enterprise__servers/high_d_end__servers/system__software/unisys__server__sentinel/features.htm).

Using the Tools

The Project REAL environment was intended to simulate a realistic data center that gathers and interprets business data, but it is actually closer to a research facility or development environment in its day-by-day use. Some of the more obvious differences include the following:

  • Until the release of SQL Server 2005, Project REAL used pre-release versions that still contained bugs that have since been fixed. Consequences included occasional crashes and frequent upgrades to the software. Sometimes different servers were running different levels of the software.

  • Project REAL intentionally stresses the servers and software by running high loads to test the limits of the systems.

  • Because one of the main goals of Project REAL is to assess trade-offs between configuration choices, the hardware and software are frequently reconfigured. Some of the configurations that were tested include the following:

    • Distributed versus consolidated server architectures

    • 32-bit processors versus 64-bit processors

    • Different SAN architectures

    • Table partitioning methods

  • Queries and scripts that run the tests used to stress the system were also developed and debugged by the Project REAL team on the same servers.

Generally, monitoring tools such as MOM 2005 are used to monitor production environments instead of development environments; therefore, Project REAL has seen an unusual mix of alerts, many of which could be ignored because they represented an intentional restart or configuration change. In such an environment, the challenge is to determine which alerts reflect real problems and which are safe to ignore. When problems do occur, the ongoing monitoring by different tools can be valuable for tracking down the causes or verifying that the problems have been fixed. The following sections describe some of the ways in which these tools have been used for Project REAL. Some particular scenarios include the following:

  • Monitoring ongoing operations

  • Tuning MOM 2005 management packs

  • Performance testing

  • Capturing system configurations in preparation for flattening and rebuilding the environment

Monitoring Ongoing Operations

When not running high-load performance tests, Project REAL team members are usually working on various development projects, reloading software or upgrading to a newer version, running non-load tests, or using the Project REAL network as a development environment. Many different pre-release versions of SQL Server 2005 were installed and configured, including both publicly released versions and development builds. MOM was especially helpful in quickly flagging the occasional configuration error, such as when tempdb was accidentally put on drive C. E-mail notifications were also useful for quickly finding the problem when something went wrong. Periodic MOM Reports can include regular snapshots of the software levels that are running on the various servers, which can help tie problems to particular software levels.

Tuning MOM 2005 Management Packs

The MOM 2005 management packs incorporate detailed knowledge of the health models of the products they are designed to monitor, but because software products are used in different ways in different environments, Microsoft expects that customers will customize the management packs to suit their needs. This is done through the authoring features of MOM 2005, which allow management pack rules to be created, deleted, enabled, disabled or modified. You can also create new management packs from scratch to manage custom software products.

Microsoft provides a free Alert Tuning Solution Accelerator (https://www.microsoft.com/downloads/details.aspx?FamilyID=f6ac090e-a594-4eb5-96d9-2a5feb827bcc&displaylang=en) that provides prescriptive guidance and tools for minimizing alert “noise”, or excessive alert volume. We installed this tool, which provides MOM 2005 Reports that show alert volumes by alert type. Table 9 shows an example of one of these reports. After viewing this report, we would probably want to investigate the processing rule, “SQL Server Database Health – High impact database is unhealthy”, to see why so many alerts are being generated. We might want to disable this rule if there is nothing we can correct.

Table 9:  Alert Tuning Solution Accelerator Report

Alert Counts By Processing Rule

| Processing Rule | Raised Alerts | Total Alerts |
| --- | --- | --- |
| SQL Browser Services Instance Health Check | 3 | 3 |
| An error occurred while the query log table was being created. | 3 | 6 |
| The service cannot be started. | 3 | 6 |
| SQL Integration Services Instance Health Check | 2 | 2 |
| Error reported from Message Manager. | 2 | 2 |
| SQL Server Instance Health Check | 2 | 4 |
| SQL Reporting Services Instance Health Check | 1 | 1 |
| SQL Server Database Health - High impact database is unhealthy | 1 | 107 |
| An error occurred while starting the query log. | 1 | 2 |
| SQL Agent Instance Health Check | 1 | 1 |
| SQL Full-Text Search Instance Health Check | 1 | 2 |
| The OLE DB provider reported an error. | 1 | 2 |
| A SQL job failed to complete successfully. | 1 | 59 |
| SQL Server terminating because of system shutdown | 1 | 1 |
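When a report like Table 9 is exported, the noisiest rules can be picked out with a few lines of code instead of by eye. The following Python fragment is a minimal sketch that uses the counts from Table 9; the function name and the tuple layout are assumptions made for illustration.

```python
# (processing rule, raised alerts, total alerts), copied from Table 9.
ALERT_COUNTS = [
    ("SQL Browser Services Instance Health Check", 3, 3),
    ("An error occurred while the query log table was being created.", 3, 6),
    ("The service cannot be started.", 3, 6),
    ("SQL Integration Services Instance Health Check", 2, 2),
    ("Error reported from Message Manager.", 2, 2),
    ("SQL Server Instance Health Check", 2, 4),
    ("SQL Reporting Services Instance Health Check", 1, 1),
    ("SQL Server Database Health - High impact database is unhealthy", 1, 107),
    ("An error occurred while starting the query log.", 1, 2),
    ("SQL Agent Instance Health Check", 1, 1),
    ("SQL Full-Text Search Instance Health Check", 1, 2),
    ("The OLE DB provider reported an error.", 1, 2),
    ("A SQL job failed to complete successfully.", 1, 59),
    ("SQL Server terminating because of system shutdown", 1, 1),
]


def noisiest(rows, top=3):
    """Return the rules with the highest total alert counts (hypothetical helper)."""
    return sorted(rows, key=lambda row: row[2], reverse=True)[:top]


if __name__ == "__main__":
    grand_total = sum(total for _, _, total in ALERT_COUNTS)
    for rule, raised, total in noisiest(ALERT_COUNTS):
        print(f"{total:4d} of {grand_total}  {rule}")
```

For the data in Table 9, two rules (the database health rule and the failed SQL job rule) account for 166 of the 198 total alerts, which is why they are the first candidates for tuning.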

Tuning the MOM 2005 Management Pack rules to the Project REAL environment is an ongoing task. We disabled some obviously inappropriate rule groups such as those for Windows NT 4.0 or Windows 2000, because there are no servers in the Project REAL network that are running either of these operating systems, but the tuning process also requires periodic review of the alerts generated by MOM to determine which alerts are uninteresting in the Project REAL context. Unnecessary alerts are disabled whenever they are detected. Several performance rules for Microsoft SQL Server 2005 were disabled by default when we imported this management pack, and we later enabled them.

Some alerts appear regularly but are not useful and cannot be disabled. One example is a critical alert that complains about a full volume on one of our test servers. This volume is not currently used by Project REAL and is left over from a previous project. However, if we disabled this rule, we would lose the ability to detect other full volumes. For example, on rare occasions a user can unintentionally configure a tempdb database on the C: volume and quickly fill up the volume.

Because so much of the Project REAL work focuses on performance analysis, we have added performance rules that are designed to capture some especially interesting performance counters at intervals more frequent than the usual 15 minutes. The tradeoff is that we generate much more data—60 times as many data points for the same rule when the collection interval is changed from 15 minutes to 15 seconds—but have a more detailed picture of rapidly changing performance characteristics. We have also added some performance rules to capture processor usage of processes specific to Microsoft SQL Server 2005.
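The data-volume tradeoff is simple arithmetic, sketched here in Python. The function name is hypothetical; the intervals are the ones discussed above.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def samples_per_day(interval_seconds):
    """Number of data points one rule collects per server per day."""
    return SECONDS_PER_DAY // interval_seconds


default_rate = samples_per_day(15 * 60)  # one sample every 15 minutes
fast_rate = samples_per_day(15)          # one sample every 15 seconds

if __name__ == "__main__":
    print(default_rate, fast_rate, fast_rate // default_rate)
```

A single rule therefore grows from 96 to 5,760 data points per server per day, which is the factor of 60 mentioned above. The total grows further with each additional rule and each additional monitored server.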

Another kind of alert tuning focused on adding or removing notifications depending on the alert and the server that was the subject of the alert. For example, before we modified the settings, the loss of an agent’s heartbeat generated only a generic MOM Server alert saying that the heartbeat status had changed, and the MOM alert did not state which managed server’s agent had stopped communicating. However, the more specific alerts did not generate notifications. Therefore, we modified alerts individually, by using the MOM 2005 Administrator Console, to generate e-mail notifications that had more specific information. Other changes that we made included enabling e-mail notifications for some computer groups but not for other less critical groups, such as the computers in the client and web server network tiers. We also took advantage of the ability to define notification groups and directed alerts to different groups depending on the interests of team members—for example, some groups received notification of OS alerts, while other groups received the alerts generated by SQL Server 2005 applications.

Performance Testing

Much of the work performed by the Project REAL team involves performance analysis for various functional scenarios. In order to compare the results of different test runs, large amounts of data are collected and analyzed. This data includes performance counters that were collected by enabling counter logs, performance statistics from MOM 2005, and data collected by the SQL Server 2005 Profiler. Task Manager is frequently used to provide real-time performance displays while tests are running.

Using MOM 2005 for Performance Analysis

The MOM 2005 management packs installed for Project REAL collected a number of performance counters by default, but performance was averaged over relatively long time intervals, such as 15 minutes. For Project REAL, we were interested in collecting a few select performance counters over much shorter intervals—15 seconds instead of 15 minutes. We used the MOM 2005 Administrator Console to customize the management packs and add performance rules as needed.

For SQL Server 2005 Analysis Services, instead of modifying existing management packs, we decided to create a new rule group that has four performance processing rules:

  • Free System Page Table Entries.   Sampled at 1-minute intervals.

  • Processor-% Processor Time-_Total.   Sampled at 15-second intervals.

  • MSAS 2005 Rows read/sec.   Sampled at 15-second intervals.

  • msmdsrv (AS) process %processor time.   Sampled at 15-second intervals.

To limit the application of these rules to only those servers actually running Analysis Services, we created a computer group, named SQL Server 2005 Analysis Services, that contained the following servers:

  • BI-REAL-AS

  • BI-REAL-ES32

  • BI-REAL-ES64

  • BI-REAL-EX64 (after this server was added to the configuration)

The SQL Server 2005 Analysis Rules were configured to apply only to this computer group.

In MOM 2005, you can collect and work with performance data interactively, and display performance graphs for recent data by using the Operator Console interface. The operational database is partitioned daily, and the performance data is moved to the archive where it can be displayed using MOM 2005 Reporting. The following figures illustrate the differences between the two displays.

Figure 8 shows an interactive performance graph displayed on the Operator Console. This figure shows activity after the BI-REAL-ES32 server was added to the SQL Server 2005 Analysis Services computer group, and illustrates the effect of collecting performance counters at more frequent intervals. The graph initially shows data collected every 15 minutes, followed by data collected every 15 seconds.

The following counters are shown:

  • Processor-% Processor Time-_Total (yellow line).   Sampled at 15-second intervals.

  • msmdsrv (AS) process %processor time (green line).   Sampled at 15-second intervals.

    Figure 8. MOM 2005 operator console

Figure 9 shows a performance graph that was generated from historical data by using MOM 2005 Reporting Services. The peaks in this graph show the high processor usage by the msmdsrv service during Analysis Services load testing on the BI-REAL-AS server.

Figure 9. MOM 2005 report

MOM 2005 Reporting Services is helpful for monitoring long-term trends that can indicate performance problems. For example, Figure 10 shows the amount of nonpaged pool memory that was used by servers BI-REAL-ES32 and BI-REAL-RS over time. During load testing in the Project REAL lab, we saw some very strange symptoms that were eventually attributed to a lack of sufficient nonpaged pool memory for kernel-mode processing.

Figure 10. MOM 2005 performance report

Capturing System Configurations

Before performance testing started, the Project REAL team decided that the servers should be flattened and rebuilt from scratch to provide a consistent baseline for the testing. Many of the servers had applications or data that were outdated or no longer necessary. However, before we started to rebuild the servers, we needed to collect configuration data to help make the rebuild as quick and smooth as possible.

Microsoft provides tools that allow automation of much of the system build process, and some of these were used to rebuild the servers. To “clean house” on the servers, we first captured and studied the current configurations, including installed applications and services, networking configurations, and boot parameters. From this information, we determined what had to be reproduced and what did not. Tools used to collect this information included MOM 2005 and Windows Server 2003 tools, such as the Netsh and IPCONFIG utilities. We also exported lists of active services.

MOM 2005

The MOM 2005 management packs that were installed in the Project REAL lab include reports that conveniently list much of the configuration information that the team was interested in analyzing. Some MOM 2005 reports of particular help to the analysts are described in the following sections.

Operating System Configuration

This report summarizes the configurations of multiple servers in a single table. We exported the report into an Excel spreadsheet for convenience. A part of this spreadsheet appears in Table 10. Notice that this report was generated before the BI-REAL-AS server was moved from the REDMOND domain to the SYS-SQLSVR domain.

Table 10. Operating System Configuration (Excerpt)

| Property | Value |
| --- | --- |
| Server Name | REDMOND\BI-REAL-AS |
| Operating System | Microsoft(R) Windows(R) Server 2003, Enterprise Edition |
| Operating System Version | 5.2.3790 |
| Service Pack Version | 1.0 |
| Processor Manufacturer | Genuine Intel |
| Processor Speed | 2192 MHz |
| Processor Identifier | x86 Family 15 Model 2 Stepping 6 |
| Number of Processors | 8 |
| Total Physical Memory | 3583 MB |
| Serial Number | 69713-071-6226766-42793 |
| Time Zone Bias | -8:00 |
| BIOS Date | 4/20/2004 |
| BIOS Manufacturer | UNISYS Corp. |
| System Drive | C: |
| Operating System Language | 1 |
| Locale | English (United States) |
| Install Date | 9/21/2004 5:26:48 PM |
| BIOS Version | DELL   - 1 |

Software and Application Installations (by Server)

This report lists software and applications that are installed on the servers in the selected group, including hot fixes and security updates. One useful exercise is to export this report to an Excel spreadsheet and sort by application version to see which servers are out of date. This report can also be used to create a list of the applications that must be included in the clean build but might otherwise be forgotten. This report was generated before the BI-REAL-AS server was moved from the REDMOND domain to the SYS-SQLSVR domain.

Table 11. Software and Application Installations by Server (Excerpt)

| Server | Application Name | Application Version | Application Vendor | Installation Date (YYYYMMDD) |
|---|---|---|---|---|
| REDMOND\BI-REAL-AS | Emulex Fibre Channel HBAnyware Version 2.0A13 | 2.00.19 | Emulex Corporation | 20041022 |
| REDMOND\BI-REAL-AS | Windows Media Player 9 Hotfix [See KB885492 for more information] | Data Unavailable | Microsoft Corporation | Data Unavailable |
| REDMOND\BI-REAL-AS | SMS Advanced Client | 2.50.3174.1152 | Microsoft Corporation | 20050909 |
| REDMOND\BI-REAL-AS | Microsoft Operations Manager 2005 Agent | 5.0.2749.0 | Microsoft Corporation | 20050621 |
| REDMOND\BI-REAL-AS | EMC PowerPath 4.4.0 (32bit) | 4.4.0 | EMC Corporation | 20050714 |
| REDMOND\BI-REAL-AS | Microsoft SQL Server 2005 Analysis Services (MT_STD_SUBJPART) | 9.00.1399.06 | Microsoft Corporation | 20051208 |
| REDMOND\BI-REAL-AS | Navisphere Agent | Data Unavailable | Data Unavailable | Data Unavailable |
| REDMOND\BI-REAL-AS | Microsoft .NET Framework 2.0 | 2.0.50727 | Microsoft Corporation | 20051111 |
| REDMOND\BI-REAL-AS | SQLXML4 | 9.00.1399.06 | Microsoft Corporation | 20051128 |
| REDMOND\BI-REAL-AS | Microsoft XML Parser | 8.80.1006.0 | Microsoft Corporation | 20050302 |
| REDMOND\BI-REAL-AS | Windows Server 2003 Service Pack 1 | 20050324.163929 | Microsoft Corporation | Data Unavailable |
| REDMOND\BI-REAL-AS | MSXML 6.0 Parser | 6.00.3883.8 | Microsoft Corporation | 20051128 |
| REDMOND\BI-REAL-AS | Microsoft SQL Server 2005 | Data Unavailable | Microsoft Corporation | Data Unavailable |
| REDMOND\BI-REAL-AS | Security Update for Windows Server 2003 (KB905414) | 1 | Microsoft Corporation | 20051127 |
| REDMOND\BI-REAL-AS | Microsoft SQL Server 2005 Analysis Services | 9.00.1399.06 | Microsoft Corporation | 20051128 |
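
The export-and-sort exercise suggested for this report can also be scripted. The following is a minimal Python sketch, assuming the report has already been exported to rows of (server, application, version). The function names, the server names SERVER-A and SERVER-B, and the older version number are hypothetical and are not part of the Project REAL tooling. The sketch compares dotted version strings numerically and lists the servers whose copy of an application trails the highest version seen.

```python
from collections import defaultdict


def version_key(version):
    """Turn a dotted version such as '9.00.1399.06' into a tuple of ints.

    Non-numeric parts (for example the '0A13' in '2.0A13') are ignored.
    """
    return tuple(int(part) for part in version.split(".") if part.isdigit())


def find_outdated(rows):
    """Return {application: [(server, version), ...]} for servers that trail
    the highest version of that application found in the report."""
    by_app = defaultdict(list)
    for server, app, version in rows:
        # Rows reported as 'Data Unavailable' cannot be compared, so skip them.
        if version and version[0].isdigit():
            by_app[app].append((server, version))

    outdated = {}
    for app, installs in by_app.items():
        newest = max(version_key(v) for _, v in installs)
        behind = [(s, v) for s, v in installs if version_key(v) < newest]
        if behind:
            outdated[app] = behind
    return outdated


# Hypothetical rows: server names and the older version are illustrative only.
rows = [
    ("SERVER-A", "Microsoft .NET Framework 2.0", "2.0.50727"),
    ("SERVER-B", "Microsoft .NET Framework 2.0", "2.0.50215"),
    ("SERVER-A", "Navisphere Agent", "Data Unavailable"),
]
print(find_outdated(rows))
```

Comparing tuples of integers avoids the pitfall of a plain text sort, in which "9.00.1399.06" would sort after "10.0.0.0".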

Windows Server 2003 Utilities

There are too many useful tools available with Windows Server 2003 to document them all here, but some that have been particularly useful for capturing configuration data include Netsh and ipconfig.

Netsh

The Netsh tool can be used to record configuration data in a text file. This text file can then be “replayed” on a rebuilt system to restore the network configuration. For example, the following command was used to capture the IP configuration script shown in Table 12:

> netsh interface ip dump > netcnfg.nsh

To configure the newly rebuilt server, we would execute this script by using the following command:

> netsh -f netcnfg.nsh

Additional information on the Netsh tool is available on TechNet. See The Netsh Command-Line Utility (technet2.microsoft.com/WindowsServer/en/Library/fd1e2fbe-15a6-413b-b712-28afb312c92f1033.mspx).

Table 12. Sample IP Configuration

Note: Some of the lines in the following code have been displayed on multiple lines for better readability.

# ----------------------------------  
# Interface IP Configuration          
# ----------------------------------  
pushd interface ip 
# Interface IP Configuration for "SIM-PRI"
set address name="SIM-PRI" source=static  
addr=192.168.1.104 mask=255.255.255.0 
set dns name="SIM-PRI" source=static  
addr=none register=PRIMARY 
set wins name="SIM-PRI" source=static addr=none 
# Interface IP Configuration for "SIM-CS"
set address name="SIM-CS" source=static  
addr=192.168.10.104 mask=255.255.255.0 
set dns name="SIM-CS" source=static  
addr=none register=PRIMARY 
set wins name="SIM-CS" source=static addr=none 
# Interface IP Configuration for "Corpnet"
set address name="Corpnet" source=dhcp  
set dns name="Corpnet" source=static  
addr=10.193.8.10 register=PRIMARY 
add dns name="Corpnet" addr=10.193.20.10 index=2 
add dns name="Corpnet" addr=157.55.254.211 index=3 
add dns name="Corpnet" addr=157.54.5.109 index=4 
set wins name="Corpnet" source=static  
addr=157.55.254.201 
add wins name="Corpnet" addr=157.54.7.34 index=2 
popd 
# End of interface IP configuration
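
Because the listing in Table 12 wraps some commands across two lines, text copied from the table has to be rejoined before it can be replayed with Netsh. The following Python sketch shows one way to do that. It rests on the assumption that every command in the dump begins with `#`, `pushd`, `popd`, `set`, `add`, or `delete`, and that any other line continues the previous command. The function name is hypothetical, and the heuristic is illustrative rather than a complete Netsh parser.

```python
# Assumed set of tokens that begin a new command in an "interface ip" dump.
COMMAND_STARTS = ("#", "pushd", "popd", "set ", "add ", "delete ")


def rejoin_wrapped_lines(text):
    """Merge continuation lines (such as 'addr=... mask=...') back onto the
    command they belong to, and drop blank lines."""
    joined = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith(COMMAND_STARTS) or not joined:
            joined.append(line)
        else:
            # Anything else is treated as the tail of the previous command.
            joined[-1] += " " + line
    return joined


listing = """pushd interface ip
set address name="SIM-PRI" source=static
addr=192.168.1.104 mask=255.255.255.0
popd
"""
for command in rejoin_wrapped_lines(listing):
    print(command)
```

A file produced directly by `netsh interface ip dump` is not wrapped and does not need this step.
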
IPCONFIG

The ipconfig utility is especially useful for capturing IP configuration data in a text file. We used the command ipconfig /all > textfile and then imported the resulting text file into an Excel spreadsheet that we sorted by IP address or by VLAN. An excerpt from this spreadsheet is shown in Table 13.

Table 13. IPCONFIG Data

| Host Name | Adapter Name | Description | Physical Address | DHCP Enabled | IP Address | Subnet Mask |
|---|---|---|---|---|---|---|
| bi-real-ds | Ethernet adapter SIM-PRI | Intel(R) PRO/100 S Server Adapter | 00-02-B3-BB-65-A5 | No | 192.168.1.101 | 255.255.255.0 |
| bi-real-es32 | Ethernet adapter SIM-PRI-1000 | Intel(R) PRO/1000 MT Dual Port Server Adapter #2 | 00-07-E9-11-7A-41 | No | 192.168.1.102 | 255.255.255.0 |
| bi-real-es64 | Ethernet adapter SIM-PRI-1000 | Intel(R) PRO/1000 MT Dual Port Server Adapter #3 | 00-07-E9-11-7A-AC | No | 192.168.1.103 | 255.255.255.0 |
| bi-real-dw | Ethernet adapter SIM-PRI | Intel(R) PRO/1000 MT Server Adapter | 00-07-E9-11-54-4D | No | 192.168.1.104 | 255.255.255.0 |
| bi-real-is | Ethernet adapter SIM-PRI | Intel(R) PRO/100 S Server Adapter | 00-02-B3-B5-A1-CA | No | 192.168.1.105 | 255.255.255.0 |
| bi-real-as | Ethernet adapter SIM-PRI | Intel(R) PRO/100+ Server Adapter (PILA8470B) | 00-03-47-0C-EC-50 | No | 192.168.1.106 | 255.255.255.0 |
| bi-real-rs | Ethernet adapter SIM-PRI | Intel(R) PRO/100 S Server Adapter | 00-02-B3-97-7D-B9 | No | 192.168.1.107 | 255.255.255.0 |
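
The import into Excel was done by hand for Project REAL. As a hedged alternative, the following Python sketch turns the text captured by `ipconfig /all` into CSV rows with the same columns as Table 13. It assumes the Windows Server 2003 output layout, in which adapter sections start with an unindented header ending in a colon and each setting is an indented `Name . . . : value` line. The function names are hypothetical, and the sample text is a shortened illustration built from the bi-real-dw row of Table 13, not a verbatim capture.

```python
import csv
import io
import re

FIELDS = ["Host Name", "Adapter Name", "Description", "Physical Address",
          "DHCP Enabled", "IP Address", "Subnet Mask"]

# Matches indented lines such as "   Physical Address. . . : 00-07-E9-11-54-4D".
KEY_VALUE = re.compile(r"^\s+(.+?)[ .]*:\s*(.*)$")


def parse_ipconfig(text):
    """Return one dict per adapter section found in `ipconfig /all` output."""
    host, rows, current = "", [], None
    for line in text.splitlines():
        if not line.strip():
            continue
        if not line[0].isspace():
            # Unindented lines are section headers, e.g. "Ethernet adapter SIM-PRI:".
            if line.rstrip().endswith(":"):
                current = {"Host Name": host,
                           "Adapter Name": line.rstrip().rstrip(":")}
                rows.append(current)
            else:
                current = None
            continue
        match = KEY_VALUE.match(line)
        if not match:
            continue  # continuation lines (extra DNS servers and so on)
        key, value = match.group(1), match.group(2).strip()
        if key == "Host Name":
            host = value
        elif current is not None and key in FIELDS:
            current[key] = value
    return rows


def ip_sort_key(row):
    """Sort numerically by octet so that 192.168.1.9 precedes 192.168.1.101."""
    address = row.get("IP Address", "")
    return tuple(int(p) for p in address.split(".")) if address else ()


def to_csv(rows):
    """Render the rows as CSV text that Excel can open."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, restval="")
    writer.writeheader()
    writer.writerows(sorted(rows, key=ip_sort_key))
    return buffer.getvalue()


SAMPLE = """Windows IP Configuration

   Host Name . . . . . . . . . . . . : bi-real-dw

Ethernet adapter SIM-PRI:

   Description . . . . . . . . . . . : Intel(R) PRO/1000 MT Server Adapter
   Physical Address. . . . . . . . . : 00-07-E9-11-54-4D
   DHCP Enabled. . . . . . . . . . . : No
   IP Address. . . . . . . . . . . . : 192.168.1.104
   Subnet Mask . . . . . . . . . . . : 255.255.255.0
"""
print(to_csv(parse_ipconfig(SAMPLE)))
```

Running the parser over one capture file per server and concatenating the rows would give a single sheet covering all hosts, sorted by IP address.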

Conclusion

This white paper provides a quick tour of the tools used by Project REAL to instrument and monitor the servers and other infrastructure used by the project team. We described the unusual architecture deployed by Project REAL to allow testing of distributed versus consolidated server deployments, different storage architectures, and different networking scenarios. We explained how this architecture affected tool installation and usage, particularly for Microsoft Operations Manager 2005. We also provided overviews of other useful monitoring tools that were used either for ongoing operations or for performance analysis, and explained how we used tools to capture configuration data in order to rebuild the data tier from scratch.

For more information about the Hosts files that were used for the unusual network architecture of Project REAL, see Appendix A.

For more information about Project REAL partner tools, see Appendix B and Appendix C.

For a description of an actual system problem involving memory allocation on a consolidated 32-bit system and how it was resolved by the Project REAL team, see Appendix D.

Links to additional resources can be found in Appendix E.

Appendix A. Hosts Files

Table 14 lists the contents of the Hosts file for the two ES7000 servers in the Project REAL network. All other servers have the same contents except for the private links to the ES7000 service processors at the end.

Table 14. Sample Hosts File

# Copyright (c) 1993-1999 Microsoft Corp. 
# 
# This is a sample HOSTS file used by Microsoft TCP/IP for Windows. 
# 
# This file contains the mappings of IP addresses to host names. Each 
# entry should be kept on an individual line. The IP address should 
# be placed in the first column followed by the corresponding host name. 
# The IP address and the host name should be separated by at least one 
# space. 
# 
# Additionally, comments (such as these) may be inserted on individual 
# lines or following the machine name denoted by a '#' symbol. 
# 
# For example: 
# 
#      102.54.94.97     rhino.acme.com          # source server 
#       38.25.63.10     x.acme.com              # x client host 
 
127.0.0.1       localhost 
 
# This is the Project REAL HOSTS file 
# Subnet mask is always 255.255.255.0 
# 
############################################# 
 
# simulated private network 
 
192.168.1.1     BI-FW-FRODO-PRI 
192.168.1.8     BI-VLAN-PRI-1367 
192.168.1.9     BI-VLAN-PRI-2367 
192.168.1.101   BI-REAL-DS-PRI      
192.168.1.102   BI-REAL-ES32-PRI 
192.168.1.103   BI-REAL-ES64-PRI 
192.168.1.104   BI-REAL-DW-PRI 
192.168.1.105   BI-REAL-IS-PRI 
192.168.1.106   BI-REAL-AS-PRI 
192.168.1.107   BI-REAL-RS-PRI 
192.168.1.201   BI-BAGEND-PRI 
192.168.1.202   BI-SHIRE-PRI 
192.168.1.203   BI-HELMS-DEEP-PRI 
192.168.1.204   BI-RIVENDELL-PRI 
192.168.1.205   BI-MORIA-PRI 
192.168.1.206   BI-MORDOR-PRI 
192.168.1.211   BI-CLIENT-01-PRI 
192.168.1.212   BI-CLIENT-02-PRI 
192.168.1.213   BI-CLIENT-03-PRI 
192.168.1.214   BI-CLIENT-04-PRI 
192.168.1.215   BI-CLIENT-05-PRI 
192.168.1.220   BI-GANDALF-PRI 
192.168.1.221   BI-ARWEN-PRI 
 
# simulate DMZ (IN-side, i.e. between the DMZ and the Private subnet) 
 
192.168.2.1     BI-FW-FRODO-DMZ 
192.168.2.8     BI-VLAN-IN-1367 
192.168.2.9     BI-VLAN-IN-2367 
192.168.2.101   BI-REAL-DS-IN 
192.168.2.102   BI-REAL-ES32-IN 
192.168.2.103   BI-REAL-ES64-IN 
192.168.2.104   BI-REAL-DW-IN 
192.168.2.105   BI-REAL-IS-IN 
192.168.2.106   BI-REAL-AS-IN 
192.168.2.107   BI-REAL-RS-IN 
192.168.2.201   BI-BAGEND-IN 
192.168.2.202   BI-SHIRE-IN 
192.168.2.203   BI-HELMS-DEEP-IN 
192.168.2.204   BI-RIVENDELL-IN 
192.168.2.205   BI-MORIA-IN 
192.168.2.206   BI-MORDOR-IN 
192.168.2.211   BI-CLIENT-01-IN 
192.168.2.212   BI-CLIENT-02-IN 
192.168.2.213   BI-CLIENT-03-IN 
192.168.2.214   BI-CLIENT-04-IN 
192.168.2.215   BI-CLIENT-05-IN 
192.168.2.220   BI-GANDALF-IN 
192.168.2.221   BI-ARWEN-IN 
 
# simulate DMZ (OUT-side, i.e. between the DMZ and the Public subnet) 
 
192.168.3.1     BI-FW-SAM-DMZ 
192.168.3.8     BI-VLAN-OUT-1367 
192.168.3.9     BI-VLAN-OUT-2367 
192.168.3.101   BI-REAL-DS-OUT 
192.168.3.102   BI-REAL-ES32-OUT 
192.168.3.103   BI-REAL-ES64-OUT 
192.168.3.104   BI-REAL-DW-OUT 
192.168.3.105   BI-REAL-IS-OUT 
192.168.3.106   BI-REAL-AS-OUT 
192.168.3.107   BI-REAL-RS-OUT 
192.168.3.201   BI-BAGEND-OUT 
192.168.3.202   BI-SHIRE-OUT 
192.168.3.203   BI-HELMS-DEEP-OUT 
192.168.3.204   BI-RIVENDELL-OUT 
192.168.3.205   BI-MORIA-OUT 
192.168.3.206   BI-MORDOR-OUT 
192.168.3.211   BI-CLIENT-01-OUT 
192.168.3.212   BI-CLIENT-02-OUT 
192.168.3.213   BI-CLIENT-03-OUT 
192.168.3.214   BI-CLIENT-04-OUT 
192.168.3.215   BI-CLIENT-05-OUT 
192.168.3.220   BI-GANDALF-OUT 
192.168.3.221   BI-ARWEN-OUT 
 
# simulate public network  
 
192.168.4.1     BI-FW-SAM-PUB 
192.168.4.8     BI-VLAN-PUB-1367 
192.168.4.9     BI-VLAN-PUB-2367 
192.168.4.101   BI-REAL-DS-PUB 
192.168.4.102   BI-REAL-ES32-PUB 
192.168.4.103   BI-REAL-ES64-PUB 
192.168.4.104   BI-REAL-DW-PUB 
192.168.4.105   BI-REAL-IS-PUB 
192.168.4.106   BI-REAL-AS-PUB 
192.168.4.107   BI-REAL-RS-PUB 
192.168.4.201   BI-BAGEND-PUB 
192.168.4.202   BI-SHIRE-PUB 
192.168.4.203   BI-HELMS-DEEP-PUB 
192.168.4.204   BI-RIVENDELL-PUB 
192.168.4.205   BI-MORIA-PUB 
192.168.4.206   BI-MORDOR-PUB 
192.168.4.211   BI-CLIENT-01-PUB 
192.168.4.212   BI-CLIENT-02-PUB 
192.168.4.213   BI-CLIENT-03-PUB 
192.168.4.214   BI-CLIENT-04-PUB 
192.168.4.215   BI-CLIENT-05-PUB 
192.168.4.220   BI-GANDALF-PUB 
192.168.4.221   BI-ARWEN-PUB 
 
# simulate client-server network (no firewalls) 
 
192.168.10.8    BI-VLAN-CS-1367 
192.168.10.9    BI-VLAN-CS-2367 
192.168.10.101  BI-REAL-DS-CS 
192.168.10.102  BI-REAL-ES32-CS 
192.168.10.103  BI-REAL-ES64-CS 
192.168.10.104  BI-REAL-DW-CS 
192.168.10.105  BI-REAL-IS-CS 
192.168.10.106  BI-REAL-AS-CS 
192.168.10.107  BI-REAL-RS-CS 
192.168.10.201  BI-BAGEND-CS 
192.168.10.202  BI-SHIRE-CS 
192.168.10.203  BI-HELMS-DEEP-CS 
192.168.10.204  BI-RIVENDELL-CS 
192.168.10.205  BI-MORIA-CS 
192.168.10.206  BI-MORDOR-CS 
192.168.10.211  BI-CLIENT-01-CS 
192.168.10.212  BI-CLIENT-02-CS 
192.168.10.213  BI-CLIENT-03-CS 
192.168.10.214  BI-CLIENT-04-CS 
192.168.10.215  BI-CLIENT-05-CS 
192.168.10.220  BI-GANDALF-CS 
192.168.10.221  BI-ARWEN-CS 
 
############################################# 
# This is the private HOSTS for the ES7000's 
# Subnet mask is always 255.255.255.0 
# 
192.168.222.200 BI-REAL-ES32-MS 
192.168.222.1   BI-REAL-ES32-SP
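
The server entries in Table 14 follow a regular pattern: each host keeps the same final octet on every simulated subnet, and the subnet is identified by a suffix on the host name. The following Python sketch generates the server entries from that pattern. It covers only the seven BI-REAL servers. The firewall and ES7000 service processor entries do not follow the pattern and would still be added by hand. The function names are hypothetical, and this is an illustration of the naming scheme, not the method the Project REAL team used to build the file.

```python
# Subnet suffix and network prefix, as listed in Table 14.
SUBNETS = [("PRI", "192.168.1"), ("IN", "192.168.2"), ("OUT", "192.168.3"),
           ("PUB", "192.168.4"), ("CS", "192.168.10")]

# Final octet and base host name for the BI-REAL servers in Table 14.
SERVERS = [(101, "BI-REAL-DS"), (102, "BI-REAL-ES32"), (103, "BI-REAL-ES64"),
           (104, "BI-REAL-DW"), (105, "BI-REAL-IS"), (106, "BI-REAL-AS"),
           (107, "BI-REAL-RS")]


def hosts_entries(subnets=SUBNETS, servers=SERVERS):
    """Return (ip_address, host_name) pairs, one per server per subnet."""
    return [(f"{prefix}.{octet}", f"{name}-{suffix}")
            for suffix, prefix in subnets
            for octet, name in servers]


def render(entries):
    """Format the pairs as Hosts file lines with the address in the first column."""
    return "\n".join(f"{ip:<15} {name}" for ip, name in entries)


print(render(hosts_entries()))
```

Generating the entries from two short lists makes it harder for one subnet block to drift out of step with the others when a server is added.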

Appendix B. Unisys ES7000 Management

Basic management functions for Unisys ES7000 servers are provided by the Server Sentinel management tool. This tool interacts with the service processor for each server and carries out basic configuration functions, but can also be enhanced to carry out more advanced management functions. Some of the basic system commands are shown in Figure 11. This figure shows the version of Server Sentinel that is used on an ES7000/600 server.

Figure 11. Server Sentinel system commands

The ES7000/600 system resources can be allocated to various partitions as shown in Figure 12.

Figure 12. Server Sentinel system inventory

Table 15 lists the advanced functions provided by the Unisys Server Sentinel management solution, with a brief description of each.

Table 15. Unisys ES7000 Server Sentinel Functions

| Capability | Description |
|---|---|
| System Health Monitoring | Provides a real-time view of hardware and software conditions and properties throughout the ES7000 server. |
| System Health Advisor | Provides a historical view of system availability and performance information. |
| Self Healing | Provides configuration options that enable you to automate corrective actions in response to hardware and software faults, thereby allowing the system to operate without human intervention. |
| Unattended Operations | Provides configuration options that enable you to automate manual tasks, thereby allowing the system to operate without human intervention. |
| Anytime/Anywhere Remote Management | Provides ability to remotely control the system from any location. This includes system hardware control, boot, and normal operations. |
| Hardware and Software Inventory | Provides views of various hardware, software, and firmware inventories throughout the ES7000 server. |
| Call Home | Provides capability to detect and automatically report system service events to the Unisys Support Center for rapid problem resolution. |
| Configuration Management and Setup | Provides intuitive configuration and setup procedures for the server platform and Server Sentinel software. |

Appendix C:  Storage Architecture

Figure 13 shows a detailed view of part of the storage area network used for Project REAL, including the fault-tolerant data paths between the arrays and servers. This figure does not include the consolidated ES7000 servers, which access the same data as the distributed servers shown here. Also not shown in this figure are the Emulex Host Bus Adapters (HBAs) in the servers. The SPx and FAxx objects represent data ports on the EMC CLARiiON CX700 and Symmetrix DMX1000, respectively. The McData objects are fiber-optic switches. Further information on the EMC products can be found at www.emc.com.

Figure 13. Distributed SAN architecture

Appendix D. Application Consolidation with 32-bit Processors

Although 64-bit CPUs are the processors of choice for large database and BI applications, one of the configurations explored by Project REAL was a consolidated scenario with sixteen 32-bit x86 CPUs and up to 32 GB of memory (memory was configurable). In this scenario, the same server was used for SQL Server 2005 relational database processing and for Analysis Services, Integration Services, and Reporting Services. Lessons learned from these experiments included the limits imposed by the memory constraints of a 32-bit architecture and some mechanisms for working around these limits.

A basic constraint on the use of 32-bit processors with large amounts of memory is that with 32 address bits, it is possible to directly address only the first 4 GB of memory. Of these 4 GB, the operating system normally reserves 2 GB for kernel space, leaving the other 2 GB available for user-mode applications. SQL Server 2005 and SQL Server 2005 Analysis Services both need more than the default 2 GB of user space for reasonable performance with a data set as large as the one used by Project REAL. Each application has a mechanism in place to obtain additional user space:

  • SQL Server 2005 uses Physical Address Extension (PAE) through a set of application programming interfaces known as Address Windowing Extensions (AWE) to extend addressing beyond the 4 GB limit. This feature is managed using kernel space.

  • SQL Server 2005 Analysis Services works within the 4 GB limit, but the user documentation recommends that servers be booted with the /3GB boot switch in the BOOT.INI file to reserve 3 GB for user space, restricting kernel space to 1 GB. Additional fine-tuning of user space reserves can be done through the /USERVA boot switch, which limits user space to the number of megabytes specified via this switch.

In the Project REAL lab, we configured the BI-REAL-ES32 server for early experiments to use 16 GB of memory and set the /3GB switch so that Analysis Services would run properly. However, this configuration did not leave enough kernel space to manage all 16 GB of PAE-managed memory, so we reduced memory to 8 GB by using the following boot options:

/3GB /PAE /USERVA=2990 /MAXMEM=8192 /NOEXECUTE=OPTOUT

Table 16 lists the BOOT.INI switches that were used, with a description of what each switch does:

Table 16. BI-REAL-ES32 Boot Switches

| Switch | Purpose |
|---|---|
| /3GB | Reserves 3 GB for user space. |
| /PAE | Enables Physical Address Extension (PAE). PAE is enabled by default in systems that support hot-add memory. |
| /USERVA=2990 | Restricts user space to 2990 MB. |
| /MAXMEM=8192 | Restricts total memory to 8 GB. |
| /NOEXECUTE=OPTOUT | Windows Server 2003 SP1 sets this switch by default. If the Project REAL team needed to further reduce kernel memory requirements, the value of this switch could be set to AlwaysOff. For further information, see the article “A detailed description of the Data Execution Prevention (DEP) feature in Windows XP Service Pack 2, Windows XP Tablet PC Edition 2005, and Windows Server 2003” (support.microsoft.com/kb/875352). |
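
The trade-off between user space and kernel space behind these switches is simple arithmetic on the 4 GB virtual address space. The following Python sketch works through the three cases discussed in this appendix: the default split, /3GB alone, and /3GB with /USERVA=2990. The function name is hypothetical, and the kernel figure is simply the remainder of the address space after the user portion, as implied by the switch descriptions in Table 16.

```python
MB = 1024 * 1024


def virtual_address_split(user_mb=2048, address_bits=32):
    """Return how a virtual address space divides between user and kernel space.

    user_mb is the user-mode portion in megabytes: 2048 by default,
    3072 with /3GB, or the value given to /USERVA.
    """
    total_mb = 2 ** address_bits // MB
    return {"total_mb": total_mb,
            "user_mb": user_mb,
            "kernel_mb": total_mb - user_mb}


for label, user_mb in [("default", 2048),
                       ("/3GB", 3072),
                       ("/3GB /USERVA=2990", 2990)]:
    split = virtual_address_split(user_mb)
    print(f"{label:<20} user={split['user_mb']} MB  kernel={split['kernel_mb']} MB")
```

Under these assumptions, /USERVA=2990 hands 82 MB of the 3 GB user allocation back to the kernel, raising kernel space from 1024 MB to 1106 MB.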

To further tune memory resources, we examined the role of the hot-add memory capability of the ES7000/540 server. This feature allows users to add memory to a running system, and requires the operating system to reserve enough kernel space at boot time to manage whatever hot-add memory might be added.

During the boot process, the BIOS passes system information to the operating system that indicates whether hot-add memory capabilities are enabled. The operating system uses this information to determine how much kernel space to reserve to accommodate memory that might be hot-added later. Windows Server 2003 can be configured to release the kernel resources reserved for hot-add memory by using the DynamicMemory registry parameter shown in Table 17.

Table 17. Dynamic Memory Registry Key

| Setting | Value |
|---|---|
| Path | HKLM\SYSTEM\CurrentControlSet\Control\Session Manager\Memory Management |
| Value | DynamicMemory |
| Type | REG_DWORD |
| Data | 0x1 |

The value for this configuration parameter is the number of gigabytes for the system to allow for hot-add memory. For example, by setting the value to 1, the operating system reserves kernel space to support either 1 GB of system memory or the total amount of physical memory installed at boot time, whichever is greater. For Project REAL, the DynamicMemory parameter was set to a value of 0x1. Because the physical memory was always greater than 1 GB, kernel space was reserved to support just the memory available at boot time.
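
The rule described above can be written as a one-line calculation, and the setting in Table 17 can be expressed as a registry import file. The following Python sketch does both. The function names are hypothetical. Producing a .reg file is an assumption made for illustration; the paper does not say how the Project REAL team applied the value. The 8 GB figure is the memory size used on BI-REAL-ES32.

```python
KEY_PATH = (r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control"
            r"\Session Manager\Memory Management")


def memory_supported_gb(dynamic_memory_gb, physical_gb_at_boot):
    """Amount of memory the kernel reserves space to manage: the DynamicMemory
    value or the physical memory present at boot, whichever is greater."""
    return max(dynamic_memory_gb, physical_gb_at_boot)


def reg_file_text(dynamic_memory_gb):
    """Render the Table 17 setting in registry import (.reg) file format."""
    return ("Windows Registry Editor Version 5.00\n\n"
            f"[{KEY_PATH}]\n"
            f'"DynamicMemory"=dword:{dynamic_memory_gb:08x}\n')


# With DynamicMemory set to 1 and 8 GB installed, only the 8 GB is covered.
print(memory_supported_gb(1, 8))
print(reg_file_text(1))
```

Because the installed memory always exceeded 1 GB, setting the value to 1 had the effect of reserving no kernel space beyond what the boot-time memory required.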

For further information on hot-add memory configuration, see the article titled “Hot-add memory configuration” on Microsoft TechNet (https://www.microsoft.com/technet/prodtechnol/exchange/Analyzer/2b0d4c6e-92b7-410b-876b-. Although this article applies specifically to the Microsoft Exchange Server Analyzer, the discussion is applicable to the problem faced by Project REAL and SQL Server 2005.

Although the boot switches and registry key described above allowed some of the Project REAL tests to run, the system was still unstable under greater loads or when other applications, like Terminal Services, required additional kernel resources.

Appendix E:  Resources

Project REAL

Microsoft SQL Server 2005

Operations and Management

Terminal Services and Remote Access

Microsoft Operations Manager 2005 (MOM 2005)

Installing MOM 2005

Manual Installation of MOM Agents

MOM 2005 Reporting

MOM 2005 Management Packs

Customizing MOM 2005 Management Packs

MOM 2005 Operations

EMC Storage Management Applications

Emulex Products

Unisys ES7000 and Management Applications

For more information:

https://www.microsoft.com/technet/prodtechnol/sql/default.mspx
