Chicken and the Egg... (aka Reading vs. Writing Code)
So I have been asked to deliver training for a group of people on how to “read” source code. I guess I should frame this request a bit. With a very large product such as Exchange, there are millions of lines of code, and no one knows everything that is happening in the source. Most developers are very focused on the pieces of the puzzle they own. As part of the Escalation Team for Exchange, we have to “reverse engineer on the fly,” so to speak, to understand customer issues and develop steps towards resolving them. This typically involves jumping quickly from one code base to another, depending on where the investigation takes us. A large portion of our time is spent simply reading source code, not writing it.
So how do you teach people this “art” of digging deep, very quickly, into unfamiliar code that you had no hand in writing? I come from a very traditional way of learning how to code: sitting down and writing it. I am struggling with how to tailor a delivery that focuses on reading rather than writing source code. To me, the only way you can be truly efficient at this is by having written code yourself.
Thoughts?
==== Update ====
Great comments...
So boy do I agree about good comments, but to me comments are really geared towards explaining a particular block of code at the implementation-detail level. But how does one know where to begin looking for that particular block of code? I think that comes from great engineering documentation about the object model itself and how things relate at a high level.
I guess I should elaborate on my intention: it is really to help individuals without a lot of understanding of C/C++/C# begin to understand how things fit together, and how they can begin using source code to determine what to look for that is wrong in the customer environment. To me, if you are attempting to read source with sparse comments, then you need some practical understanding of the language itself.
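To make the "where do I begin looking" question a little more concrete: when there is no documentation to point the way, a plain text search of the source tree for something seen in the customer environment (an error string, say) is often the first foothold. Below is a minimal sketch in Python, not a real tool. The function name `find_in_tree`, the list of file suffixes, and the example error text are all assumptions made up for illustration.

```python
import re
from pathlib import Path

# Assumed set of source file extensions; adjust for the code base at hand.
SOURCE_SUFFIXES = {".c", ".cpp", ".h", ".cs"}


def find_in_tree(root, pattern, suffixes=SOURCE_SUFFIXES):
    """Return (path, line_number, stripped_line) for every source line
    under `root` that matches the regular expression `pattern`."""
    regex = re.compile(pattern)
    hits = []
    for path in sorted(Path(root).rglob("*")):
        # Skip directories and anything that is not a source file.
        if not path.is_file() or path.suffix.lower() not in suffixes:
            continue
        # errors="replace" keeps the scan going past odd encodings.
        text = path.read_text(encoding="utf-8", errors="replace")
        for number, line in enumerate(text.splitlines(), start=1):
            if regex.search(line):
                hits.append((str(path), number, line.strip()))
    return hits


if __name__ == "__main__":
    # Hypothetical error text; substitute whatever the customer actually sees.
    for path, number, line in find_in_tree(".", r"store\s+dismounted"):
        print(f"{path}:{number}: {line}")
```

Each hit is only a starting point. From there you still have to read outward to the callers to see how the code got there.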
Comments
Anonymous
June 14, 2004
I believe that when it comes to reading and understanding a large body of code, the reader must lean heavily on object browsers and documentation. Not only that, but readers must be competent searchers. They gotta have good searching tools for their source code, and regular expression experience might be helpful. It also helps to have a version of the code which can be commented.
Anonymous
June 14, 2004
Only read the comments! That should be enough for a "good source code" :-).
But, unfortunately, most programmers are too lazy to write good comments...
Anonymous
June 14, 2004
A process that has worked well for me is to understand what level of abstraction the current problem is dealing with (architectural, algorithmic or implementation detail), imagine what steps the developer had to take to implement his approach, then start looking for alignment between theory and practice. A conceptual roadmap is key since you may need to understand what is MISSING, not what is there. Having SDK and other docs open for immediate reference is invaluable because developers understand what they are doing but they may not understand what other routines do at their request.
I guess this is the same process used in any object-oriented design -- identify key data and what operations must be done on the data, then hide everything except what is needed to support external interactions. It works in reverse, too.
Anonymous
June 14, 2004
It's magic as best as I've been able to figure out.
I can do it, but others on my team can't.
My one recommendation is: Be fearless when debugging. If a problem steps out of your area of code, chase after it - read the code at the destination and try to understand it. In general, most code follows some form of standards, so it's usually not difficult to figure out what's going on.
Anonymous
June 14, 2004
I guess it depends on what you mean by "without a lot of understanding of C/C++/C#". Myself, I set a breakpoint at every method I'm interested in and start debugging. I look at the call stack a lot to see who's calling the method I'm interested in, and that helps me figure out the program flow. I also reverse engineer at least a partial UML diagram of the classes I'm interested in. Between those two I can usually figure out what's going on, even with bad variable and method names.
Anonymous
June 14, 2004
I agree that being able to write code yourself is vitally important to reading code.
Whenever I have to read code that I did not write, I find myself looking for the iterative structures and control structures (If...Then, For Each...Next, Do...While, etc.) to break the module into more easily digestible chunks.
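A rough sketch of that habit in Python, assuming C-like source: it lists the lines that open a branch or loop, so the module can be skimmed as an outline first. The keyword list and the function name `outline_control_flow` are illustrative assumptions, and the match is purely textual rather than a real parse.

```python
import re

# Keywords that open a branch or loop in C-like languages. This is an
# assumed, non-exhaustive list; adjust it for the language being read.
CONTROL_KEYWORDS = ("if", "else", "for", "foreach", "while", "do", "switch")

# Match a keyword at the start of a line, optionally after a closing brace
# (as in "} else {"). The trailing \b keeps "do" from matching "double".
_PATTERN = re.compile(r"^\s*(?:\}\s*)?(" + "|".join(CONTROL_KEYWORDS) + r")\b")


def outline_control_flow(source):
    """Return (line_number, keyword, stripped_line) for each line of
    `source` that starts with a control keyword. No parsing is done, so a
    comment line that happens to begin with a keyword is reported too."""
    outline = []
    for number, line in enumerate(source.splitlines(), start=1):
        match = _PATTERN.match(line)
        if match:
            outline.append((number, match.group(1), line.strip()))
    return outline


SAMPLE = """\
int count_positive(int *values, int n) {
    int total = 0;
    for (int i = 0; i < n; i++) {
        if (values[i] > 0) {
            total++;
        } else {
            continue;
        }
    }
    return total;
}
"""

if __name__ == "__main__":
    for number, keyword, text in outline_control_flow(SAMPLE):
        print(f"{number:4d}  {keyword:<8} {text}")
```

For the sample function, the outline picks out the `for` on line 3, the `if` on line 4, and the `else` on line 6, which is roughly the skeleton a reader would want before looking at the details.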
But I write code for a living. It sounds as though you are trying to teach non-coders how to read code. And that may not be realistic. I wish you luck, though.
Anonymous
June 14, 2004
I explained how I do this, last month:
http://blogs.msdn.com/mattwar/archive/2004/05/20/135962.aspx
Anonymous
June 14, 2004
There are degrees of "readingness", and degrees of problems that can be solved by different levels of source analysis.
I, for example, probably wouldn't be able to spot a leak in a given application (of the complexity of Exchange) just from reading the source (my stack isn't big enough), but I could probably tell you the intent of a particular block of code and follow the branches and the logic.
So, if the troubleshooting process can get the individual to the point where they know roughly what code to be looking at, having a rough idea of how it works is better than no idea at all.
Still, at this point having a "rough idea" basically means "understanding the language syntax", which tends to come from "writing stuff that doesn't work" :)
Anonymous
June 14, 2004
I feel I am quite good at this, probably from switching jobs (and therefore code bases) a fair amount over the years.
Something I find invaluable when understanding someone else's code is source control check-in comments. I can often learn a lot about a piece of code by looking at what changed between revisions and why. If the comments aren't helpful, I can at least find out who made a specific change and question them about it.
As far as source code comments go, I'm of the camp that believes well-written source code does not need (many) comments. The problem with comments is (1) that they are not applied in a consistent manner across modules written by different developers and (2) that they are not maintained, or often reflect an individual's thoughts about a piece of code without a timestamp or username to go with those thoughts. Comments are best used only when necessary.
When using clear, unabbreviated variable names and small blocks of well-structured code, comments usually aren't necessary. I only provide comment blocks when doing something complex enough or unintuitive enough to warrant the comment. If you need heavily commented code, your code probably has a lot of room for improvement.
Anonymous
August 05, 2004
One of the problems I've encountered is people writing documentation from the code. So, they click something which generates the UML or something, or just take their code and convert it into a written spec of the code.
The best thing is a really good overview that then breaks down the design approach to the code. Whenever I've had in effect "programmer's notes", it's helped me understand why a coder did something in a particular way which helped with my understanding.
Discreet commenting of code rarely achieves much, except when a particularly tricky block needs some explanation. The harder part is people understanding how all the code works together.
Anonymous
October 11, 2007
Escalation Engineer JeremyK asks in his blog this morning: how do you teach people this “art”