How Do They Do It?
A Look Inside the Security Development Lifecycle at Microsoft
This article discusses:
- Leadership and Education
- The Design Phase
- The Development Phase
- Starting a Security Push
- Final Security Reviews
- The Security Response
- Does SDL Work?

This article uses the following technologies:
Security Development Lifecycle
The goals of the Security Development Lifecycle (SDL), now embraced by Microsoft, are twofold: to reduce the number of security-related design and coding defects, and to reduce the severity of any defects that remain. This follows our oft-cited motto, "Secure by Design, Secure by Default, Secure in Deployment and Communication" (also known as SD3+C). SDL focuses mainly on the first two elements of this motto. Secure by Design means getting the design and code secure from the outset, and Secure by Default is a recognition that, realistically, you never will get the code 100 percent correct; more on this later when I discuss attack surface reduction.
This article outlines how to apply the SDL to your own software development processes, drawing on some of the lessons we have learned at Microsoft while implementing it. But before I get started, I want to make clear that SDL is agnostic as to how you go about developing software. Whether you use a waterfall model, a spiral model, or an agile model, it really doesn't matter; you can still use the process improvements that come from SDL. SDL involves modifying a software development organization's processes by integrating measures that lead to improved software security. The really great news is that SDL also improves overall software quality by reducing security defects.
SDL adds security-specific checks and measures to any existing software development process. Figure 1 shows how SDL maps onto a "generic" process. If it makes you happy, wrap the SDL around a spiral or down a waterfall.
Figure 1 Mapping SDL onto a Generic Process
I'll take a look at each major phase and outline what you can do within your own organization to implement SDL.
Leadership and Education
I'm often asked why SDL has been so successful at Microsoft. The answer is very simple: executive support, education, and awareness. Getting Bill Gates and Steve Ballmer committed to SDL was critical, but just as critical is an educated engineering workforce.
For leadership, you need to nominate one or more individuals to be the point people for security. Their jobs include staying on top of security issues, pushing security practices on the development organization, and being the voice of reason when it comes to making tough security decisions. (If you're reading this, that person is probably you.) The leadership person or people should monitor the various security-related mailing lists, such as Bugtraq (www.securityfocus.com).
If your engineers know nothing about the basic security tenets, common security defect types, basic secure design, or security testing, there really is no reasonable chance they can produce secure software. I say this because, on average, software engineers don't pay enough attention to security. They may know quite a lot about security features, but they need a better understanding of what it takes to build and deliver secure features. It's unfortunate that the term security can imply both meanings, because these are two very different realms. Security features are about how things work, for example the inner operations of the Java or common language runtime (CLR) sandbox, or how encryption algorithms such as DES or RSA work. While these are all interesting and useful topics, knowing that the DES encryption algorithm is a 16-round Feistel network isn't going to help people build more secure software. Knowing the limitations of DES, and the fact that its key size is woefully small for today's threats, is very useful, and this kind of detail is at the core of building secure features.
The real concern is that most schools, universities, and technical colleges teach security features, and not how to build secure software. This means there are legions of software engineers being churned out by these schools year after year who believe they know how to build secure software because they know how a firewall works. In short, you cannot rely on anyone you hire necessarily understanding how to build security defenses into your software unless you specifically ask about their background and knowledge on the subject.
A good source for online and instructor-led security education is Microsoft eLearning (www.microsoftelearning.com/security); its security guidance for developers is derived from some of the security basics material we present at Microsoft.
You should also build a library of good security books, such as those listed in Figure 2.
Figure 2 Recommended Security Books
- Writing Secure Code, 2nd Edition by Michael Howard and David LeBlanc
- 19 Deadly Sins of Software Security by Michael Howard, David LeBlanc, and John Viega
- Building Secure Software by John Viega and Gary McGraw
- Gray Hat Hacking by Shon Harris, et al.
- How to Break Software Security by James Whittaker and Herbert Thompson
Some groups at Microsoft have set up book clubs where they each read a chapter of a given book and then discuss it with the group. To further their knowledge, they search for examples of defects or design issues from common security stomping grounds, like Bugtraq.
Consider having security folks within your company hold presentations on security topics, including common security code defects (buffer overflow, cross-site scripting, SQL injection, integer arithmetic, weak crypto, and so on), secure design, threat modeling, and security testing. To add more depth and relevance to the discussions, find defects in your own code and use them as examples for the rest of the developers.
Finally, engineering staff should be encouraged to update their skills at least once a year—whether on their own or by attending staff development events—and this should be tracked so people don't fall between the cracks. It's really important that people stay on top of the security landscape, as it changes rapidly. A common bug today may be a significant security vulnerability tomorrow.
The Design Phase
The best opportunity to influence the security design of a product is early in the product lifecycle. This is the phase in which architects, developers, and designers traditionally do the bulk of the feature design. Design typically begins with a very short description of the intended functionality (sometimes in the form of a short functional summary), followed by a functional specification that describes the customer's view of the feature, which is, in turn, followed by a design specification that describes the technical details outlining how the feature will be implemented.
If you're using an agile development method, you may just opt for the shorter one-page summary, but this page should include the security aspects of the component. Functional specifications may need to describe security features that will be directly exposed to customers, such as requiring end user authentication to access specific data or advanced functionality. Design specifications will need to describe how to implement security features, as well as to ensure that all functionality is implemented as secure features. Notice the use of the word secure rather than security. Secure features are defined as ensuring that all functionality is well engineered with respect to security, such as robust use of Crypto APIs, use of managed code wherever possible, rigorously validating all data before processing it, and several other considerations. When designing features, it is crucial to carefully consider security concerns in order to avoid trying to bolt security onto the product near the end of the design process.
All functional and design specifications, regardless of document size, should contain a section describing how the component impacts security. To get some ideas on what to add to this section, you should review RFC 3552 "Guidelines for Writing RFC Text on Security Considerations".
An important part of the design process is to understand how you will reduce the attack surface of your application or component. Open UDP ports anonymously accessible from the Internet represent a larger attack surface than, say, an open TCP port accessible only to a restricted set of IP addresses. I don't want to spend too much time on the subject here; instead you should refer to my article on attack surface reduction, "Mitigate Security Risks by Minimizing the Code You Expose to Untrusted Users," in MSDN® Magazine November 2004. Figure 3 should give you a quick reference for reducing the attack surface of your code.
Figure 3 Attack Surface Reduction
| Higher Attack Surface | Lower Attack Surface |
| --- | --- |
| Code executing by default | Code off by default |
| Open socket | Closed socket |
| Anonymous access | User access |
| Internet access | Local subnet access |
| Running as SYSTEM or admin | Not running as SYSTEM or admin |
| Weak ACLs | Strong ACLs |
Threat modeling must be completed during the product design process. A team cannot build a secure product unless it understands the assets the product is trying to protect (customers' personal information such as credit card numbers, not to mention their computers), the threats and vulnerabilities introduced by the product, and details of how the product will mitigate those threats. It is also important to consider threats and vulnerabilities present in the environment in which the product is deployed, or those that arise due to interaction and interfacing with other products or systems in end-to-end real-world solutions. To this end, the design phase of a product cannot be considered complete until a threat model is in place. Threat models are critical components of the design phase and will reference both a product's functional and design specifications to describe both vulnerabilities and mitigations.
Understanding the threats to your software is a critical step toward creating a secure product. Too many people bolt security technology onto their app and declare it secure, but the code is not secure unless the countermeasures address real-world threats. That's the goal of threat modeling. To get a good feel for the process in less than 30 minutes, I recommend you read the blog entry "Guerrilla Threat Modeling" written by my colleague Peter Torr.
The Development Phase
During the development phase you should employ security tools, security checklists, and secure coding best practices to help implement a secure design. Remember, a secure design can quite easily be rendered insecure by a weak implementation.
Before I get started on this, I want to point out something very important: security tools will not make your software secure. They will help, but tools alone do not make code resilient to attack. There is simply no replacement for a knowledgeable workforce that uses the tools to enforce policy. The new Visual Studio® 2005 Team System Developer's Edition includes some very useful security tools:
PREfast PREfast is a static analysis tool for C/C++ code. It can find some pretty subtle security defects, and some egregious bugs, too. This is lint on security steroids.
Standard Annotation Language (SAL) Of all the tools we have added to Visual Studio 2005, this is the technology that excites me the most because it can help find some hard to spot bugs. Imagine you have a function like this:
void *function(char *buffer, DWORD cbBufferLength);
You know that buffer and cbBufferLength are tied at the hip; buffer is cbBufferLength bytes long. But the compiler does not know that—all it sees is a pointer and a 32-bit unsigned integer. Using SAL, you can link the two. So the header that includes this function prototype might look like the following:
void *function(__in_bytecount(cbBufferLength) char *buffer, DWORD cbBufferLength);
Please note the final syntax used for SAL may change before Visual Studio 2005 ships.
FxCop You may already know of FxCop—it's a tool to find defects, including security defects in managed code. It's available as a download from www.gotdotnet.com, but the version in Visual Studio 2005 is fully integrated, and includes some new issues to watch out for.
Application Verifier AppVerifier is a runtime tool that operates on a running application. It can be used to trap memory-related issues at run time, including heap-based buffer overruns.
Other tools and requirements at Microsoft include:
- All unmanaged C/C++ code must be compiled with the /GS stack overrun detection capability.
- All unmanaged C/C++ code must be linked using the /SafeSEH option.
- All RPC code must be compiled with the MIDL /robust flag.
- Security issues flagged by FxCop and PREfast must be fixed.
- The functions shown in Figure 4 are banned for new code, and should be removed over time for legacy code.
Figure 4 Sample Banned Functions
| Banned API | Strsafe Replacement | Safe C and C++ Libraries |
| --- | --- | --- |
| strcpy, wcscpy, _tcscpy, _mbscpy, lstrcpy, lstrcpyA, lstrcpyW, strcpyA, strcpyW | String*Copy or String*CopyEx | strcpy_s |
| strcat, wcscat | String*Cat or String*CatEx | strcat_s |
| wnsprintf, wnsprintfA, wnsprintfW | String*Printf or String*PrintfEx | sprintf_s |
| _snwprintf, _snprintf | String*Printf or String*PrintfEx | _snprintf_s or _snwprintf_s |
| wvsprintf, wvsprintfA, wvsprintfW, vsprintf | String*VPrintf or String*VPrintfEx | _vstprintf_s |
| _vsnprintf, _vsnwprintf | String*VPrintf or String*VPrintfEx | _vsntprintf_s |
| strncpy, wcsncpy | String*CopyN or String*CopyNEx | strncpy_s |
| strncat, wcsncat | String*CatN or String*CatNEx | strncat_s |
| strlen, wcslen, _mbslen, _mbstrlen | String*Length | strnlen_s |
You can read about the Strsafe string replacement code in "Strsafe.h: Safer String Handling in C". The Safe C library is the new C runtime library replacement built into Visual Studio 2005. You can read about it at "Safe! Repel Attacks on Your Code with the Visual Studio 2005 Safe C and C++ Libraries".
A very useful testing technique for finding security defects is "fuzzing," which means taking valid data, morphing that data, and then observing an application that consumes the data. In its simplest form, you could build a library of valid files that your application consumes, and then use a tool to systematically corrupt a file and have your application play or render it. Run the application under AppVerifier with heap checking enabled to help uncover more errors. Examples of morphing data include:
- Exchanging random bytes in the file
- Writing a random series of bytes of a random size at a random location in the file
- Changing the sign of known integer values, or making them too large or too small
- Finding ASCII or Unicode strings and setting the trailing NULL character to non-NULL
Michael Sutton and Adam Greene gave an interesting session at Blackhat USA 2005 about fuzzing. You can read it at The Art of File Format Fuzzing. Ejovi Nuwere and Mikko Varpiola also gave an interesting presentation on fuzzing the VoIP networking protocol, available at The Art of SIP Fuzzing and Vulnerabilities Found in VoIP.
Starting a Security Push
A security push is a team-wide focus on threat model updates, code review, testing, and documentation scrub. Note that the push is not a quick fix for a process that lacks security discipline; rather, it is a concerted effort to confirm the validity of the information in your security architecture documentation, to uncover changes that may have occurred during the development process, and to identify and remediate any remaining security vulnerabilities you may discover. It is not possible to build security into software with a security push alone.
There is no easy way to determine the length of time needed for a security push; the push duration is ultimately determined by the amount of code that needs to be reviewed for security, as all pushes to date have been gated by code quantity. Teams are strongly encouraged to conduct security code reviews throughout the development process, once the code is fairly stable, because the quality of code reviews suffers when too many reviews are condensed into too short a period.
The rule of thumb is to identify critical code, using heuristics such as exposure to the Internet or handling of sensitive or personally identifiable information, and mark it as priority-one code. That code must be reviewed during the push, and the push cannot be considered complete until it has been. Assign an owner to every code file, and assign a priority to all code before the push begins.
Final Security Reviews
As the end of the project draws near, a very important question must be answered: from a security viewpoint, is the software ready to ship? The Final Security Review (FSR) answers this question. It's performed by the central security team with help from the product team, not by the product team alone. It's usually done a few months before the software is complete and includes:
- Completing a questionnaire, which helps the security team focus its effort. Questions might include:
- Do you have any ActiveX® controls marked Safe for Scripting?
- List all the protocols you fuzzed.
- Does your component listen on unauthenticated connections?
- Does your component use UDP?
- Does your component use an ISAPI application or filter?
- Does any of your code run in System context? If yes, why?
- Do you set ACLs in setup code?
- Review bugs that are deemed "won't fix" and make sure they are not mismarked security defects.
- Analyze security defects from other versions of the product and even competitors' products to make sure you have addressed all these issues. One common question we ask is, "How have you mitigated this broad category of security issues?"
- Perform penetration testing of high-risk components, perhaps by a third-party company.
At the end of the FSR process, the findings are written up and a decision is made about releasing the software or reworking sections.
The Security Response
There are two major parts to this process. One is responding to security defects and working with people who find security issues in your code. The other aspect is learning from these mistakes. At Microsoft, we have dedicated staff whose job it is to perform root-cause analysis of defects. These documents are then used to influence future iterations of the SDL. Each postmortem doc includes:
- Name and version of file.
- Was this a design issue?
- Was this a coding issue?
- Source code diff if it's a coding bug.
- Does the bug affect other versions of the product? If yes, why? If not, why not? (They are both very valid questions.)
- What tests could have found this defect?
- What tools could have found this defect?
- Do we need to change our education, tools, or process to find these issues?
The answers are then used to modify SDL. The SDL at Microsoft is updated twice a year, January 1 and July 1.
Does SDL Work?
So the big question is, "Does SDL work? Does the employment of Security Development Lifecycle techniques result in software that is more secure?" The answer is a resounding yes. We have seen the number of security defects drop by approximately 50 to 60 percent when we follow SDL. The simple fact is that every product touched by SDL has fewer security defects. Period. And that certainly makes it worth pursuing.
Michael Howard is a senior security program manager at Microsoft focusing on secure process improvement and best practice. He is the coauthor of 19 Deadly Sins of Software Security (McGraw-Hill Osborne, 2005) and Processes to Produce Secure Software (Dept. of Homeland Security National Cyber Security Division).