Code reviews. Stay awake at the back there

Article
07/11/2007

Code reviews. What could be duller? It is very easy to put the brain in neutral and read the code in a daze. The eyes move but no information reaches the brain. Testing is also dull. The good thing about these dull things is that they avoid exciting times. Being called at 3 AM because your code has caused a power station‘s monitoring system to crash is very exciting. Having allowed a hacker in to a bank’s system is even more thrilling. Excitement is not all that it is cracked up to be.

The least effective type of code review is when you read your own code back to check it is correct. I write and edit fiction as a sideline and reviewing code is a lot like proofreading a document. If you read it while you can remember it, you read what you thought that you wrote. The only time when a developer should proof read his own code is when there is no-one else available to do it. The original developer is least likely to spot errors and least likely to see where implementation could be improved – and also the least inclined to fix issues found. About all that can be said for self review is that it is better than no review at all. It can be improved by letting some time elapse between writing the code and reading it and improved further by writing other code before reviewing your own code. It is still only a good choice when the only other option is no review at all.

So, ideally code should be peer reviewed. Some of the features that you are looking for in a reviewer are not necessary those that you would want in a colleague. Who wants to work with someone suspicious, fussy and pedantic who always sees the glass as half empty? If you need a reviewer, you could do worse. So, what to look for in a review:

1. Does the code do what it is supposed to do? Bizarrely, this is in some ways the least important thing because you will be testing for this as well. A lot of places only test for this, which is a shame.

2. Does the code cover all cases? I like to comment code with a block at function start saying what the function does AND the acceptable range of inputs. These should be clear from the design. If the comment says that the third parameter is in the range -100 to +100, does the code check? If so, what does it do when out of range? Have the boundary conditions been handled well? That is where most errors lurk.

3. Is the code quality good? Well written code is more likely to be reliable code. Dull code is generally an asset as well. The code that dazzles you with its cleverness is the hardest to understand, both for you and the person who wrote it. It is likely to be the hardest to design tests for and hardest to spot bugs in. Incidentally, it is also the hardest for the compiler to optimize so it may not be terribly efficient.

4. Does the code do too much? Here be dragons. There was a lovely feature of the VBx date functions – they were actually implemented in OLEAUT32.dll but people called them from VB. If you called (for example) IsDate with a string, it would try to evaluate it as a date in local format which is DD/MM/YYYY here in the UK. If that didn’t work, it would try US format. If either worked then it was fine. This results in otherwise perfectly good applications that in Europe will accept the 10^th day the 14^th month.

5. Does the code do too little validation? Not just covering all cases but validating everything that can be validated at the entry point to the component – if your component is at a trust boundary. I will talk more about trust boundaries in future entries but essentially, a trust boundary is a point where data comes from somewhere you don’t control or goes to a point that you don’t control. These are the points where you need to focus your paranoia. Typically you will have little checking deeper in your code.

6. Does the code expose too much? All modern languages support some sort of visibility modifier on functions, variables and types. DLLs don’t have to expose all that they can do. Classes can have public, private and friend members. The visibility of these should have been defined at least to some degree in the design but there is a good chance that the compiler will not mind if you are more liberal with the specifications. Debug code is especially to be watched. It is great to have debug code. It is perfectly legitimate to have hooks in place to allow a test harness to manipulate internal state. Of course, conditional compilation should be used to prevent this code being present in the release code. It is the easiest thing in the world to forget to bracket some code that was added during debugging with the correct pre-processor commands. A review should check.

7. Are all exceptions handled? It may not be a bad thing if some are not – exceptions bubble up after all. However, there is something very important to consider. Do they bubble up to the user? If they do, it is not just sloppy. It can be a security hole. Exception handling is part of the design as much as the coding but a lot of software houses blur the lines – and with code that is being maintained rather than written fresh, the code review may be the only check that there is. Now, why would it be a security hole if you throw a raw exception at the user? In managed code, they will get a stack from you. You have just told them something about the structure of your application and that a routine that you really didn’t expect to throw an exception can do so. That isn’t good on the desktop and it is a very bad thing indeed over the web where you want your application to be as much of a black box as possible. Hackers love getting information about implementation. It tells them where to look for holes. ASP.NET tries to help you with this since you can and should disable rich error information by setting <customerrors> in Web.config appropriately.

8. When modifying code, has a new path into the internals of the code been created? As I mentioned in point 5, you need good validation at your borders. Sometimes, modifications allow access to functionality that was previously hidden – normally because another component wants to use yours in a way that you had not considered. Even if you trust that component, it is very hard to know who can call that interface. Caution is advised. The new code should have the same level of validation as the old.

9. This last might seem like a silly point but I have seen code fail so many times because of it. You often see a comment saying what the code will do and then the line of code below that does what the comment says… well, that is the theory. Often what the code does is subtly different. When that happens, the code or the comment is wrong. This could be misleading or it could break the application. Either way, it is not best practice

So, I hope that these points help next time you have to review some code. As always, if you think that I have missed something, please let me know. Comments are welcome

Until next time

Mark

Code reviews. Stay awake at the back there

Additional resources