Trustworthy Computing

Lessons Learned from Five Years of Building More Secure Software

Michael Howard

This article discusses:

  • Prioritizing code by age
  • Using analysis tools and automation
  • Looking at threats from multiple angles
  • The importance of education

Contents

It's Not Just the Code
Fix Old Code First
Deprecate! Eliminate! Eradicate!
Tools Are Critical ... to a Point
Automate!
You'll Never Reach Zero Security Vulnerabilities
Security Is a Never-Ending Battle
There Is No Security Silver Bullet
The "Many Eyeballs" Mantra Is Right!
Today's Denial of Service Is Tomorrow's Exploit
Final Thoughts

Five years ago, Bill Gates issued a memo to all Microsoft employees explaining the importance of building more secure software. Since then, many people across Microsoft have worked to improve the security of their products. In doing so, we've learned a lot about what it takes to build more secure software.

Security is not a static field. It constantly evolves as attackers attack, defenders defend, and each party learns more about the other's techniques. Security is an arms race. To stay ahead of and anticipate the attackers, we the defenders must learn from our mistakes and find better ways to protect users from compromise.

So what have we learned in the past five years? Most of these lessons are actually quite obvious, but like so many obvious things, it sometimes helps to have someone point them out.

It's Not Just the Code

The software industry, or more accurately the software quality industry, is fixated on getting the code right. I really don't have a problem with that, but many security vulnerabilities are not coding issues at all. Many are design issues. If you focus solely on finding security issues in the code, you'll miss an entire class of vulnerabilities. This is one of the reasons Microsoft mandates threat modeling and attack surface analysis as part of the Security Development Lifecycle (SDL) process. Threat modeling is an analysis technique that helps identify and mitigate design weaknesses in a product. Attack surface analysis focuses on which portions of a software product are exposed to untrusted users, be they local or remote. A product with a large attack surface has more code exposed to untrusted users than a product with a small attack surface. (Read more at msdn.microsoft.com/msdnmag/issues/04/11/AttackSurface.)
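
To make the attack surface point concrete, here is a minimal sketch, mine rather than anything taken from the SDL or a Microsoft product, of one common reduction: binding a service endpoint to the loopback interface so that remote, untrusted users cannot reach it at all. The function name and port number are illustrative, and the caller is assumed to have initialized Winsock with WSAStartup.

#include <winsock2.h>
#include <ws2tcpip.h>
#pragma comment(lib, "ws2_32.lib")

// Listen on the loopback interface only, so the endpoint is simply
// unreachable from the network. Remote attackers can't probe code
// they can't connect to.
SOCKET CreateLocalOnlyListener()
{
    SOCKET s = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
    if (s == INVALID_SOCKET) return INVALID_SOCKET;

    sockaddr_in addr = {};
    addr.sin_family = AF_INET;
    addr.sin_port = htons(5000);                   // illustrative port
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK); // 127.0.0.1, not INADDR_ANY

    if (bind(s, (sockaddr*)&addr, sizeof(addr)) == SOCKET_ERROR ||
        listen(s, SOMAXCONN) == SOCKET_ERROR) {
        closesocket(s);
        return INVALID_SOCKET;
    }
    return s;
}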

The DNS RPC buffer overrun vulnerability documented in Microsoft Security Bulletin MS07-029 (see microsoft.com/technet/security/Bulletin/MS07-029.mspx) could allow an attacker to take complete control of an affected system. There was certainly a code issue in this case. However, the code was accessible to anonymous and remote users when it should really have been restricted to administrators. Thus, the problem was a combination of a code vulnerability and a design vulnerability. I analyzed this vulnerability on the SDL blog (see blogs.msdn.com/sdl/archive/2007/06/28/lessons-learned-from-ms07-029-the-dns-rpc-interface-buffer-overrun.aspx).

Lesson Learned It's essential to build threat models to uncover potential design weaknesses and determine your software's attack surface. You need to make sure that all material threats are mitigated and that the attack surface is as small as possible.

Fix Old Code First

When it comes to code review, I like to rank code by potential for vulnerability. I wrote an article for IEEE Security & Privacy, titled "A Process for Performing Security Code Reviews," that explains the metrics I use to prioritize code review (you can find a link to the paper on my blog at blogs.msdn.com/michael_howard/archive/2006/08/01/686029.aspx).

The first priority is old code because old code is far more likely to contain security vulnerabilities than newer code. Threats are constantly evolving, and old code, even code built just a few years ago, was written when the threats were different than they are today. Furthermore, old code predates the latest defensive techniques and best practices, and legacy code was often built on older, less secure libraries. Finally, old code was written with less situational awareness, at a time when most developers had little to no security expertise.

My advice is simple: all old code must be hand-reviewed for security vulnerabilities. In fact, this is the purpose of the security push phase of the SDL here at Microsoft. While the primary goal of the SDL is to reduce the chance of developers adding new vulnerabilities to a product, the security push is designed to force the development team to look for issues in old code.

No other metric we measure is as valuable when prioritizing code review: not code complexity, not lines of code, not code churn. The number-one indicator of potential vulnerability density is simply the age of the code.

Lesson Learned Identify all your source code files and rank them by age, where age is the code's "born on" date. Perform static analysis and manually review the code, starting with the oldest code first.
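
As a starting point, here is a minimal sketch of the ranking step. It uses each file's last-write time as a stand-in; a true "born on" date would come from your version control history, such as each file's first check-in date.

#include <algorithm>
#include <filesystem>
#include <iostream>
#include <vector>

namespace fs = std::filesystem;

// Enumerate C++ source files under the current directory and print
// them oldest first, producing a review order for the team.
int main()
{
    std::vector<fs::directory_entry> files;
    for (const auto& entry : fs::recursive_directory_iterator(".")) {
        if (entry.is_regular_file() && entry.path().extension() == ".cpp")
            files.push_back(entry);
    }
    std::sort(files.begin(), files.end(),
        [](const fs::directory_entry& a, const fs::directory_entry& b) {
            return a.last_write_time() < b.last_write_time();
        });
    for (const auto& f : files)
        std::cout << f.path().string() << '\n'; // review this list top-down
}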

Deprecate! Eliminate! Eradicate!

Sometimes a feature simply isn't secure enough for the current threat landscape. The feature may have been fine a few years ago, but it is insecure today, not because of code vulnerabilities but because of changes in the computing environment.

A good example of this is the Alerter service in Windows®, which would show print status and send little messages that would pop up on another user's screen. This turned into a spam mechanism pretty quickly, so we made the hard decision to disable the service by default in Windows XP SP2, and then we removed it entirely from Windows Vista®.

Another good example is the legacy IPX and SPX protocols. (Yes, I know that IPv4 is ancient too, and it has its own issues.) In Windows Vista, we simply removed support for the Microsoft® Client for NetWare Networks because the code was old and unused by most users.

Over time, you will see Microsoft carefully remove older features. Since some users rely on these features, it's important to balance risk against usefulness.

Lesson Learned Identify old features that may be long-term security headaches and come up with a plan to deprecate the features. Perhaps the first step is to continue to ship the feature, but disable it by default. Then, in the next version of the product, the feature can be removed entirely, but made available as a Web download for those who absolutely depend on it. Finally, stop support for the feature. Just remember to keep your customers informed.
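
Here is a minimal sketch of the disable-by-default step. The registry key and value names are hypothetical; the point is the secure default, where the legacy feature runs only if a customer explicitly opts in.

#include <windows.h>
#pragma comment(lib, "advapi32.lib")

// If the (hypothetical) registry value is absent or the read fails,
// 'enabled' stays 0 and the legacy feature stays off. Only an explicit
// administrator opt-in turns it back on.
bool IsLegacyFeatureEnabled()
{
    DWORD enabled = 0; // secure default: disabled
    DWORD size = sizeof(enabled);
    RegGetValueW(HKEY_LOCAL_MACHINE,
                 L"SOFTWARE\\Contoso\\LegacyFeatures", L"Alerter",
                 RRF_RT_REG_DWORD, nullptr, &enabled, &size);
    return enabled != 0;
}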

Tools Are Critical ... to a Point

In the past, I have been highly critical of tools. Actually, not of the tools themselves, but of the over-reliance some developers have on them. By tools, I mean static code analysis, binary analysis, and similar tools that can help pinpoint security vulnerabilities. In my old age, I've softened somewhat on this opinion.

If you have a lot of code, say, over a million lines, it becomes very hard to review all of it by hand. Tools are handy because they can analyze great swaths of code rapidly. Tools, however, are no replacement for human intellect, and many tools miss vulnerabilities because they are tuned to keep the false positive rate as low as possible. And, to be totally honest, many security analysis tools still produce such a vast quantity of errors and warnings that it can be very hard to determine which bugs are real and which are noise.

Of course, if you do see a large number of issues, this doesn't mean you can simply ignore the tool output! When we perform a root cause analysis of a security vulnerability, we always ask why the issue wasn't discovered by our tools. There are three possible reasons: the tool did not find the vulnerability; the tool found it but mistakenly triaged it as low priority; or the tool found it and a human mistriaged it. This analysis allows us to fine-tune our tools and education over time.

Analysis tools are also very good at gauging the potential for security vulnerabilities in code. Say you have two products, each comprising about 100,000 lines of C++ code. You run your tools on each code base; for this example, assume your tools are the /W4 warning level and the /analyze switch of the Visual C++ compiler. The first code base yields 121 /W4 warnings and 19 /analyze warnings, while the second yields 235 /W4 warnings and 65 /analyze warnings. Which set of code do you think needs more review?
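
For a sense of what such warnings look like, here is a tiny example of the class of defect /analyze flags; the exact warning numbers it emits vary by compiler version.

// Compile with: cl /W4 /analyze /c example.cpp
void ZeroBuffer()
{
    int buf[4];
    for (int i = 0; i <= 4; i++)  // off-by-one: valid indices are 0..3
        buf[i] = 0;               // /analyze flags the write to buf[4]
}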

Finally, tools are excellent when run on new or modified code prior to check-in, as they can act as code cops and find certain classes of bugs early in the process.

Lesson Learned Analysis tools can help you determine how much care you need to give your code. You can also use the output of the analysis to determine overall code riskiness. Use tools at check-in time to catch bugs early on. And run the tools often so you can deal with any new issues quickly—if you run the tools only every few months, you may end up having to deal with hundreds of warnings at a time.

Automate!

At the outset of a security improvement regimen, there is a great deal of manual work: manual code review, manual design reviews, and so on. To really scale this work, you need to automate as much of the process as possible.

On our team, we built many bespoke tools and created an internal Web site that gathers their output so the central security team can review it. When an SDL improvement is proposed, the proposal must include a way to automate the improvement. The prime motivators for automation are scalability and constant use: if you have a great deal of code, you need to automate, and if you want parts of a process repeated constantly, automation is obviously the way to go.

Lesson Learned Always strive for automation where possible. Build or buy tools that scan code and upload the results to a central site for analysis by security experts.

You'll Never Reach Zero Security Vulnerabilities

It's sad but true: you'll never get to zero security vulnerabilities. I remember when we issued one of the first security updates for Windows Vista. Some users were surprised because they thought Microsoft had claimed to have solved the security problem with Windows Vista. First, I don't know of anyone who made that claim, and second, zero security vulnerabilities just isn't achievable.

While zero security vulnerabilities would be nice, thinking you can reach such a state is folly. The fact is, the technology landscape is always in flux, threats are a moving target, and security research is ongoing. As I said earlier, security is an arms race: we add defenses to our products, and the attackers adapt.

Your code might seem utterly vulnerability-free today, but that could all change tomorrow when a new type of vulnerability is discovered. For instance, on October 15, 2003, Microsoft issued a security bulletin that fixed a cross-site scripting (XSS) vulnerability in Outlook® Web Access included with Microsoft Exchange 5.5. On March 4 the following year, Sanctum (since purchased by Watchfire, which is now part of IBM) released a paper that outlined a new class of vulnerability akin to cross-site scripting, called HTTP response splitting. Six months later, Microsoft issued another security update for Outlook Web Access in Microsoft Exchange 5.5, this time to fix an HTTP response splitting vulnerability. So what happened? Simply put, at the time the first bulletin was issued, response splitting issues were unheard of, but the landscape changed.
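
Here is a generic sketch of the vulnerability class, not the Outlook Web Access bug itself: a redirect built from a user-supplied value, where an embedded CRLF lets the attacker terminate the header block and forge a second response. The function names are hypothetical.

#include <algorithm>
#include <string>

// VULNERABLE: if 'target' comes from the request and contains "\r\n",
// the attacker can inject extra headers or an entire second response
// that downstream caches and browsers will treat as legitimate.
std::string BuildRedirect(const std::string& target)
{
    return "HTTP/1.1 302 Found\r\nLocation: " + target + "\r\n\r\n";
}

// Mitigation: strip CR and LF so user input can't terminate the header.
std::string BuildRedirectSafe(std::string target)
{
    target.erase(std::remove_if(target.begin(), target.end(),
                     [](char c) { return c == '\r' || c == '\n'; }),
                 target.end());
    return "HTTP/1.1 302 Found\r\nLocation: " + target + "\r\n\r\n";
}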

Lesson Learned Make sure people within the organization realize that your goal is to reduce security vulnerabilities, but that you will never reach zero security issues as long as there are still attackers looking for new techniques and vulnerabilities.

Security Is a Never-Ending Battle

There will never come a time when you can say, "we're done with security; the problem is now solved." This is an extension of the previous point, but with a slightly different angle. It's critically important that you maintain ongoing training for your engineers; if you don't, skills fade and urgency dissipates over time. Protecting systems is paramount, and as I said in the previous point, new threats and vulnerabilities are constantly being discovered, so you must always treat security as a task that is never completed.

You should also realize that there is nothing special about security anymore. Security is simply part of getting the job done.

Lesson Learned Continue to provide ongoing training for your engineers, and ensure that they are always aware of the importance of addressing security issues.

There Is No Security Silver Bullet

People often ask me, "What is the most important element of the SDL?" The answer: everything. If a portion of the SDL proved not to be useful, it would be cut. Every portion of the SDL leads to a reduction in security vulnerabilities. So when you hear someone claim there is a single, magical task that addresses security, such as running static analysis tools or educating people, know that he isn't covering all his security bases. Sure, static analysis tools have their place, and education is essential, but neither is sufficient on its own. Security must be thorough, and it must be part of the process.

Lesson Learned Make sure you are addressing security from every angle available. If not, you need to change your process!

The "Many Eyeballs" Mantra Is Right!

Famed open source advocate Eric Raymond has said, "given enough eyeballs, all bugs are shallow." He's right. But I think this simplified statement misses a key point. It should continue, "so long as the eyeballs are incented and educated."

At Microsoft, we can require people to adopt process improvements since these requirements are a part of the job. That's incentive. We offer education in many forms, such as live training, labs, and online training. Thus, there's plenty of material to keep engineers attuned to what is happening on the security front.

Lesson Learned The more eyes on your code the better, but this must happen within a certain framework. You must ensure that the developers reviewing code have an incentive to follow your current process, and they must be educated about the latest security threats.

Today's Denial of Service Is Tomorrow's Exploit

Many software vendors, including Microsoft, have learned this lesson the hard way. Engineers might look at a code issue and quickly dismiss it because the issue is "just a DoS." Remember that the security research world is constantly evolving and assumptions about certain classes of vulnerability can change overnight.
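
Here is a hypothetical example of how that triage can go wrong. The parser below crashes on hostile input, so it looks like "just a DoS," but the arithmetic makes it an attacker-controlled heap overflow on a 32-bit platform.

#include <cstdint>
#include <cstdlib>
#include <cstring>

struct Item { uint32_t id; uint32_t value; }; // 8 bytes

// 'count' comes straight from untrusted input. On a 32-bit platform,
// count * sizeof(Item) wraps for count >= 0x20000000, yielding a tiny
// allocation. The loop then writes far past the end of it, which is
// an exploitable heap overflow, not a mere crash.
Item* LoadItems(const uint8_t* data, uint32_t count)
{
    Item* items = (Item*)malloc(count * sizeof(Item)); // may wrap to a small size
    if (items == nullptr) return nullptr;
    for (uint32_t i = 0; i < count; i++) {
        memcpy(&items[i], data + i * sizeof(Item), sizeof(Item)); // out-of-bounds writes
    }
    return items;
}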

Lesson Learned Don't be quick to dismiss a denial of service issue. With a little work, malicious users can turn some DoS vulnerabilities into real code execution vulnerabilities.

Final Thoughts

If there is one thing I've learned in the past few years, it's that when it comes to security, you need to be prepared to have your ideas and viewpoint constantly challenged and you must be proactive about staying abreast of the latest issues. If you take the lessons learned in this article to heart, you should do just fine.

Michael Howard is a Principal Security Program Manager at Microsoft focusing on secure process improvement and best practices. He is the coauthor of many security books, including Writing Secure Code for Windows Vista, The Security Development Lifecycle, Writing Secure Code, and 19 Deadly Sins of Software Security.