Best Practices: Code Reviews

Josh Poley

Microsoft Corporation

October 2007

Contents

Foreword

Preface

Setting It Up

Patterns to Look For

Maintainability and Robustness

Additional References

About the Author

Foreword

The first version of this document was issued in September 2006 to all of the testers in the Microsoft Xbox organization. Its intent was to help increase the productivity during code reviews (of test code and product code), as well as highlight some common mistakes that should be watched for. After it was sent out, it quickly spread through the development teams as a good reference.

This document has helped our testers learn some nuances of the C and C++ languages, and the Microsoft Windows APIs that can lead to hard-to-diagnose bugs. It is much cheaper to find bugs during a code review, so it makes it worthwhile to arm yourself with as many tools and as much knowledge as possible.

Preface

This document points out some of the common areas in C and C++ code that frequently have defects that are easy to spot or need special attention during reviews. When you see these patterns in the source, you should give them special consideration and verify that the intention is coded correctly.

This is by no means intended to be an exhaustive list of issues. Use this document as a basic primer or guide to aid you in finding common issues, and to help you think about additional defects. The goal is to improve your odds of finding defects in a formal or informal review.

Setting It Up

When you are setting up a formal code review, be sure to invite both new team members and more senior developers/testers. This way, you can help maximize the learning and bug-finding processes with a better distribution of experiences.

Hand out hard copies several days before the walkthrough meeting. This gives everybody a chance to go through the code and look for issues before the meeting. The walkthrough meeting should be spent going over problems and answering specific questions about the code; it shouldn't be spent reading. The printouts must contain page numbers, line numbers, and the file name; otherwise, everyone will get lost, as you jump around. Printing out two pages per side works very well in landscape mode. This allows for longer code lines, without wasting paper.

When you hand out the printouts, be sure also to send a link to the actual source code. This will allow reviews to search online and run additional tools: The compiler's /W4 switch is the easiest way to catch potential problems, and should be the first thing performed. The next step would be to run through Microsoft's PREfast or your favorite static-code analyzer, such as Gimpel Software's PC-Lint. These analysis packages can find an amazing number of problems.

(Note that PREfast is currently available in Microsoft Visual Studio 2005 Team Suite and Microsoft Visual Studio 2005 Team Edition for Software Developers. Driver developers also have access to it through the Windows Driver Development Kit. Additional information on PREfix and PREfast can be found by clicking here.)

After you've given out the source code, you must prevent edits to the files. Merging-in changes becomes more difficult if there have been large-scale changes. Also, it wastes the reviewers' time, as they look for bugs that either are obsolete because of code changes or have been fixed already.

Patterns to Look For

First of all, don't review for code style; there are much better things on which to spend your time, than arguing over the placement of white spaces. Looking for bugs and performance issues is the primary goal and is where the majority of your time should be spent (during both the review and the walkthrough). Suggestions on maintenance best practices (covered in the "Maintainability and Robustness" section) can be brought up after the bugs have been discussed.

Memory

Memory Leaks

Memory leaks are the obvious scenario to which to pay close attention. Be sure to match up every allocation with a deallocation. Also, be sure to match calls to bracketed new[] with bracketed delete[]. It is also very important to do this at every point in the function in which it returns or can possibly generate an exception; memory leaks are very common during error handling.

VirtualFree Leaks

Watch out for VirtualFree calls that do not specify 0 for the size when also passing in MEM_RELEASE. The size must be 0 when using this form; otherwise, the memory is not freed and you will end up with a leak, even though it looks as if the memory is being deallocated.

NULL vs. throw

char *foo = new char[3];
if(!foo) return;

This code looks good, except for when new doesn't return NULL on error, but instead throws a C++ exception. It is recommended always to use the placement new syntax to enforce a behavior:

char *foo = new(std::nothrow) char[3];

This code tells the compiler to ensure that the new call will return a NULL and not throw an exception. The other solution is to control which version of libc is used at link time, which will define the behavior globally. But this can be very bad if you are using libraries (such as the STL) that require the throwing version of new.

If this code is in a library, it must use the placement syntax explicitly to enforce a behavior, as there is no guarantee which version of libc will be linked in with the final application.

Is alloca NULL?

char *foo = (char*)_alloca(3);
if(!foo) return;

As with the "NULL vs. throw" pattern, this code is wrong, because _alloca does not return NULL on failure. Instead, it throws an OS exception (not a C++ exception). If you must handle out of stack conditions, allocations should be wrapped in an appropriate exception handler. Also, note that (depending on the platform) _resetstkoflw() must be called if an exception is thrown; otherwise, your stack's guard page will be left in a bad state. Depending on your application, it may be better to architect it such that you can guarantee that enough space is available.

Integer Overflow to Buffer Overflow

if(a+b < 10) return;
char *buffer = new char[a+b];
if(buffer)
    {
    strncpy(buffer, input, a);
    buffer[a-1] = '\0';
    }

Here, it looks as if the code is doing the right thing. We check to make sure that our sizes are within expected limits. Then, after the allocation, we copy some data into the buffer. Unfortunately, if 'a' is significantly large, we can overflow the addition, so we allocate a buffer fewer than 10 characters, but then we overflow that buffer during the string copy.

When checking sizes for validity, they should be done separately. Also, using the "safe" version of strncpy is a better way to perform the copy (strncpy_s or StringCbCopyN).

During reviews, every time that you see an allocation in which the size goes through a series of mathematical operations, be sure to check for possible integer overflows.

Integer Overflow Inside new

Be aware that there is a multiply operation that happens inside the call to the new operator. Make sure to look for code checks, to ensure that sizeof(type) * number-of-objects does not overflow. The following code produces an overflow to zero (on a 32-bit platform), which is a valid allocation amount, but one that will always lead to a buffer overflow:

int size = 1073741824;
int *buffer = new int[size];

At this point, "buffer" is a valid pointer, but one that points to a zero-sized chunk of memory. Any write attempts to buffer[n] will result in memory corruption.

Note that some compilers (such as Microsoft Visual Studio 2005) will cap the buffer size, preventing an internal overflow of the value.

Array Indexing and Pointer Arithmetic

When you are working with pointers and arrays, it is almost too easy to walk off the end, or even the beginning, of a buffer. When a pointer is being indexed, look at the ranges that are being used, and verify (if possible) against the size of the buffer. You should be wary also of open-ended loops:

for(;;)
for(i=0; ; i++)
while(1)
while(*ptr)

With the last sample, we terminate based on the data itself, and not a buffer size. If the input is untrusted, this can cause a problem if the program is being handed corrupt or malicious data.

For and while loops that do have bounds are often sources of "off-by-one" errors. Check to make sure that the correct usage of "less than/greater than" versus "less than or equal/greater than or equal" is used. Be sure to watch also for integer overflows, as these will lead to reading or writing to the wrong memory address.

IsBadPtr

The general consensus is that the IsBad family of functions (IsBadReadPtr, IsBadWritePtr, and so forth) is broken and should not be used to validate pointers. For one thing, they are not thread-safe, which means that the APIs themselves can cause bugs if the address is in use by another thread at the same time. They only try to find one class of "bad" pointers; these APIs can tell you that everything is fine, when in fact it isn't.

It is not recommended to use this type of parameter validation, as it will often hide bugs. Any code that relies on these APIs for validation probably must be rewritten to be safer in the first place. However, if you must probe for access using a __try / __except pattern, it is better to do it yourself. In this way, you can catch guard-page exceptions and handle them correctly. The IsBad APIs do not do this, and they can leave your stack's guard page in a bad state.

Initializing sizeof(pointer)

When zeroing out a structure via a pointer, it is all too easy to specify the wrong size.

PBITMAPFILEHEADER bmp;
...
ZeroMemory(bmp, sizeof(bmp));

By passing sizeof(bmp) instead of sizeof(*bmp), we zero out only the size of a pointer instead of the struct. This simple bug can be hard to track down, as stale data on the stack can look valid. See also the "Initialized by Disjunct Size" section.

Handles

Close

As with memory allocations, handles must be freed when done. Every time that a handle is created, be sure to verify that there is a matching close. Also, be sure to check every place that the code returns or can possibly throw an exception.

Handles that are created by CreateThread must be closed. Note, however, that they can be closed while the thread is still running.

Related to this, you should watch also for double closes. Closing a handle twice can cause bad side effects, if the second close actually freed a valid handle that was created by somebody else.

INVALID_HANDLE_VALUE vs. NULL

Unfortunately, some Win32 APIs return INVALID_HANDLE_VALUE on error, and some return NULL. Be sure to read the documentation for each API that returns a handle, to see which value is used to represent an error. A common mistake is to check for the wrong identifier when looking for failures.

The following table shows a list of typical Win32 functions and what they return on error.

CreateEvent

NULL

CreateFile

INVALID_HANDLE_VALUE

CreateMutex

NULL

CreateThread

NULL

FindFirstFile

INVALID_HANDLE_VALUE

Macros

Parenthesizing

When looking at macros, the first thing to check is proper parenthesizing.

#define MUL(a, b) a*b
#define ADD(a, b) a+b

These are broken because the input parameters must be wrapped in parentheses to preserve proper evaluation. The entire expression also should be wrapped in parentheses, to ensure that more complex expressions are evaluated correctly.

Multiple Expansion

A macro that expands and evaluates an input multiple times is a common source of problems. These are good candidates for promotion to functions, whether inline or otherwise.

Multithreading

Critical Sections

With critical sections, you want to pay attention to the scope and the order in which resources are protected. Scope is important, because the resource must be protected for a sufficient amount of time to prevent concurrent accesses, but you also must keep the time that is spent in a lock to a minimum. The order in which locks are obtained and released is very important in preventing dead locks. Be sure to review critical sections carefully, and look for ways that multiple threads can end up blocking each other.

It is also a good idea to verify that a critical section is needed. Other mechanisms, such as Mutexes or the Interlocked APIs, may be better suited.

Unprotected Resources

Be sure to verify that all accesses to shared resources are synchronized. Memory is the most common, but read and writes to files, the network, and so forth may need to be controlled, too.

Freed Resources

A common problem with multithreading comes from using a resource on one thread that has been released by another. During reviews, look for places in which a thread can keep using memory, handles, and so forth, after they have been released.

Orphaned Threads

Another issue for which to watch out is when a program exits (or a DLL gets unloaded) without waiting for all of its worker threads to finish first. Be sure to review the thread-management scheme, and look for places in which the main application does not wait on worker threads to finish.

Single-Threaded CRT

If the application is using threads, be sure to check the project settings, to ensure that it is not using the single-threaded CRT library. Using the non-thread-safe CRT functions in a multithreaded environment can lead to some very difficult-to-diagnose bugs. It is much easier to find and fix this through a review.

Note that this is not an issue on Visual C++ 2005 and Microsoft Windows projects, as the single-threaded CRT no longer exists.

Logic

AND OR

When you see an expression that is using logical or bitwise operators, be sure to investigate the intent. It is common to use & accidentally, when && was meant, or to interchange && and ||.

Overloaded AND OR

If a class overloads the && or || operator, be aware that expression short-circuiting rules will be broken. In the following example, if 'string' is a character pointer, strlen will not be called if the pointer is equivalent to NULL:

if(string && strlen(string))

Now, if 'string' is actually a class that happens to overload the && operator, the resulting code will actually look to the compiler as the following:

if(string.operator&&( strlen(string) ))

In this case, strlen(string) is always evaluated, so the semantics of this code have completely changed.

Precedence

People often mistakenly interpret a given operator as having a lesser or greater importance, depending on the context. Unfortunately, this can lead to precedence errors when expressions are evaluated. Most of the time, it is better to have too many parentheses than too few. For example, the following two lines are equivalent:

(x * 2) + 1;
(x << 1) + 1;

However, the following two are not:

x * 2 + 1;
x << 1 + 1;

This problem is often more frequent in C++, when a class overloads the operators as expectations of operator importance may change. For example, the "streaming" operators that are used with the I/O classes often give users a different impression of the precedence:

cout << x&0xFFFF;

The preceding line is incorrect, because the shift operator will be evaluated before the bitwise AND operator.

Sequence Points

In the following code, there is no guarantee which function will be called first:

DoSomething() + DoSomethingElse()

The following statement, however, guarantees that DoSomething will be called first:

DoSomething() , DoSomethingElse()

This is so, because the standard does not define (outside of "sequence points") which substatements/expressions are evaluated first. The following operators and conditions create sequence points:

  • && – The left-hand side is fully evaluated before the right-hand side.
  • || – The left hand side is fully evaluated before the right-hand side.
  • , – The comma operator when used in an expression. It is not a sequence point when used to separate function parameters, which means that function inputs are not evaluated in any specific order.
  • Function call – All inputs are evaluated before entering the function.
  • ? : – The expression before the question mark is fully evaluated before any of the conditionals.
  • ; – The statement end is a sequence point.
  • if, switch, for, while, and do while – The "conditional" portion is fully evaluated before the body.

Note that operator overloading will convert these operators into the "Function call" semantics. Note also that parentheses do not introduce sequence points.

What this boils down to is: Watch out for expressions with side effects (function calls and assignments) that happen between sequence points. If you see these, the code is buggy, and the order of evaluation may change with compiler updates. The following are common examples of code to watch for:

Function inputs are not evaluated in any specific order, so the first parameter in this sample may be zero, or it may be something else.

x = 0;
f(x++, x++, x++);

In the following expression, there is no guarantee that the first value that is read from the stream will be multiplied by 256:

x = ReadFromStream() * 256 + ReadFromStream();

Multiple assignments that use the same variable between sequence points are bad; these should always be avoided:

x = x++;
array[x] = ++x;

Miscellaneous

Resource Use After Release

Watch out for cases in which resources are used after they have been released. This can happen with pretty much any type of resource: memory, handles, critical sections, and so forth. It is good defensive practice to invalidate the association to the resource after it has been released. Doing so will help catch these instances without trying to debug something that fails only a percentage of the time (because the data that is being reused "looks" correct).

Calling Convention

As new public APIs are added to a library, it is not uncommon to forget to state explicitly which calling convention is used. When you review the header file, ensure that all the public functions have an explicit declaration of the calling convention.

Formatting

When an application uses sprintf style formatting, be sure to validate the number of arguments and the types. The following are some of the common issues:

64-Bit Values

When you write out a 64-bit value, you must use the proper type modifier:

printf("%u", val64bits);

This should be the following:

printf("%I64u", val64bits);
Untrusted Inputs

If the input buffer comes from an untrusted source, this can result in a security attack, as the input can contain formatting specifiers that will pull data off the stack. Additionally, it can cause buffer overflows by expanding the resulting string beyond expected limits:

printf(buffer);

These should always be converted to the following:

printf("%s", buffer);
NULL Termination

Also, be sure to verify the size that is passed to snprintf. This function does not terminate the output buffer with a NULL character, if the resulting formatted text maxes out the buffer. It should be a habit always to add in a NULL terminator after calling this family of functions:

char output[SPECIAL_NUMBER];
snprintf(output, ARRAYSIZE(output), "%s", input);
output[ARRAYSIZE(output)-1] = '\0';

strncmp

When you compare the beginning of a buffer with a static string, be sure to validate that the passed-in size is correct. Also, watch for hard-coded values or improper uses of sizeof:

char prefix[] = "HELLO";
strncmp(string, prefix, 5);
strncmp(string, prefix, sizeof(prefix));

Using the hard-coded value of 5 is dangerous, as the prefix may change later. The sizeof statement is incorrect, as it will return 6.

Using strlen is often the safest approach, although it may incur a performance loss:

strncmp(string, prefix, strlen(prefix));

Safe String Functions

Moving away from the standard CRT string-handling functions to the safer StringCb functions is a good practice. However, be sure to review the sizes that are passed to these functions. They are no safer than the original versions, if they are lied to.

Switch Fall-Through

Review all switch cases for accidental fall-through conditions. Every place that is explicitly designed to fall through should be commented as such.

Equals TRUE

When dealing with pseudo Booleans (ones represented as integers under the covers), be aware of code that checks equivalence to TRUE:

result = Funct();
if(result == TRUE)

The problem with this is that any nonzero value is "true," so that comparing to a single value is probably not correct. Instead, the code should be written as follows:

if(result != FALSE)
if(result)

Code that uses the C++ bool built-in type does not suffer from this, as the value can be only true or false. When reviewing code, be sure that the source does not try to mix and match between representations (BOOL vs. bool), as they are different types and shouldn't be cross-pollinating.

Also be aware that sizeof(BOOL) is 4 (on 32-bit platforms), whereas sizeof(bool) is always 1.

Equals S_OK

HRESULT result = Funct();
if(result == S_OK)

The preceding code suffers the same problems that were outlined in the previous pattern. Successful HRESULTs can be represented with more than one numerical value. Use the SUCCEEDED or FAILED macro to check for result status on returned HRESULT types.

Return S_FALSE

If a function returns S_FALSE, be sure to trace through the call paths and review how the return values are interpreted. It may not have the same meaning as "the function completed through to success," and so checking only for SUCCEEDED or FAILED may not be the correct thing to do. It may be better to check for S_FALSE explicitly.

Also of note is that S_FALSE does not equal zero and, thus, does not equal the definition for FALSE (or, for that matter, lowercase "false").

__assume and __restrict

These keywords allow the compiler to make some very powerful optimizations. Unfortunately, if you lie to the compiler, it can go and generate some very wrong code. These bugs can be hard to find, as a cursory look through the C source code will not indicate a bug; the algorithm can be coded correctly, but the output is incorrect.

When you see the __restrict keyword being used on input parameters, make sure to review everywhere that the function is called, to ensure that any buffers can never overlap. If they do overlap, any operations that are performed on the memory may be performed with stale and incorrect data.

The __assume keyword can be just as dangerous. When you see the code author making an assumption about the value of a variable, be sure that you can validate and confirm the theory.

__assume(counter <= 10);

If 'counter' can ever be greater than 10, the generated code may or may not do the expected thing. Trying to debug when the "wrong" thing happens can be very difficult. It is also good practice to use compile-time asserts to help validate the assumptions: Everywhere that you use an __assume, you should use an assert.

Signed vs. Unsigned

When creating local variables, especially ones that are used to index through a loop, look for instances in which the developer used a signed integer instead of an unsigned type. Most often than not, the value has no reason to go negative.

Rethrowing Objects by Name

If you catch and then rethrow the caught object, it is improper form to use the name of the object in the throw statement:

catch(MyClass &b)
    {
    throw b;
    }

Doing so will create a temporary of the caught type. If the thrown object is actually that of a derived class, but caught by the type of the base class, you will lose any information that is associated with the derived object.

By using the "throw" keyword by itself, you will no longer rethrow a temporary and, thus, retain the full identity and data of the original object:

catch(MyClass &b)
    {
    throw;
    }

Maintainability and Robustness

Here, we border on some style issues, but ones that can affect the maintainability of the code as it is passed down the line to different developers. Patterns here don't necessarily represent immediate issues, but instead potential bugs.

/GS

Make sure that the project is being compiled with the /GS flag. This option will add some basic stack overflow checks to the code. There is no reason not to run with this extra bit of application protection always.

Multilined Macros

Macros that get so complex that they contain multiple statements probably should be turned into functions. Many developers use complex macros to "simplify" their code or to make it "easier to read." Unfortunately, doing so will increase the learning curve for others who now have to dig through a bunch of header files in order to understand the program flow. It is often better to have lengthier code, in which it is easier to follow the actual execution flow, instead of having macros that make the source look "cleaner." Another drawback with macros is that they make debugging harder. It is much better to have a function that can be traced through, instead of a bunch of instructions that get inserted and don't allow source-code stepping.

Array-Size Macros

Many projects have an ARRAYSIZE or NUMELEMENTS macro defined, which is used to generate (at compile time) the number of elements that a static array has. The common implementation looks something like the following:

#define ARRAYSIZE(x) (sizeof(x)/sizeof(x[0]))

This macro works great, assuming that you don't pass in a pointer accidently. The better solution is to use the following code, which will generate a compiler error if you don't actually pass in an array:

template <typename T, size_t N>
char (& SAFE_ARRAYCNT(T (&)[N]))[N];
#define ARRAYSIZE(x) sizeof(SAFE_ARRAYCNT(x))

Delayed GetLastError

If a call to GetLastError is not invoked immediately following the API call that failed, the error returned may be overwritten by someone else.

hFile = CreateFile(...);
if(hFile == INVALID_HANDLE_VALUE)
    {
    ErrorEvent("Could not open file, err: %u",
        GetLastError());
    return GetLastError();
    }

This pattern is very common in test code. If any additional calls are added to the 'if' statement or the ErrorEvent() grows in complexity, there is a greater chance that the second GetLastError() will not return the failure from CreateFile. It is much better to call GetLastError once, immediately following the main API, and cache the value (if necessary).

Hiding Errors

Watch out for code that obfuscates errors:

hr = OtherFunction();
if(FAILED(hr))
    return E_FAIL;

On error conditions, functions should provide as much detail about the failure as possible. This pattern is also very common when developers call Win32 APIs, look for code that does not call GetLastError, and percolate this value back to the caller.

Also, make sure that the error codes that are returned from a function are correct with respect to what happened. Especially, watch out for functions that return a "success" code when they actually failed.

Initialized by Disjunct Size

When you are initializing a buffer or struct, it is always best to use the name of the variable, instead of a size that later can become unrelated to the buffer:

#define BUFF_SIZE 1024
char buffer[BUFF_SIZE];
...
memset(buffer, 0, BUFF_SIZE);

For example, in this sample, if someone changes the buffer length to BUFF_SIZE+1 to allow for a terminating NULL character, the memset won't set that last byte. Instead, the code should be the following:

memset(buffer, 0, ARRAYSIZE(buffer));

Changing the underlying type during a rewrite is less common, but it does happen. For consistency and stability, it is probably better just to use the variable's name:

MyStruct item;
memset(&item, 0, sizeof(MyStruct));

The preceding should be the following:

memset(&item, 0, sizeof(item));

Be sure to watch out for instances of the "Initializing sizeof(pointer)" bugs.

Where appropriate, it is best to use the compiler to initialize objects and buffers:

MyStruct item = {0};

Additional References

Maguire, Steve. Writing Solid Code. Redmond, Washington: Microsoft Press, 1993.

McConnell, Steve. Code Complete 2nd Edition. Redmond, Washington: Microsoft Press, 2004.

Meyers, Scott. Effective C++ Second Edition. Boston, Massachusetts: Addison-Wesley, 1998.

Meyers, Scott. More Effective C++. Boston, Massachusetts: Addison-Wesley, 1996.

About the Author

Josh Poley has been a tester at Microsoft since 1998. He initially worked on the very first version of the Passport authentication service (currently called Windows Live ID). Then, in the spring of 2000, Josh moved over and joined a small handful of people who were starting to work on a project code-named "Xbox." His initial responsibilities covered various pieces of the low-level operating system (file systems, peripheral communication, and so on). Shortly after the Xbox game console launched in 2001, Josh took over as lead of the Kernel Test Team and remained in charge of validating the core operating system throughout the development and launch of the Xbox 360.