How I Learned To Review My Own Code

I recently discovered a new technique for reviewing my own code. Previously, I had flowcharted algorithms for two different patent applications. In both cases, converting the code to a different form while producing the diagrams led me to find bugs. The problem with flowcharts, though, is the constant fiddling with the graphical representation: adjusting things to fit the space, or working out how best to split a diagram that does not fit in a viewable area. Because flowcharting had been so successful in forcing me to review my code more thoroughly, I have been on the lookout for something similar, a way to put the code through an essentially isomorphic transformation and get the same benefit without the complexity and overhead of a graphical layout.

I came across a paper by Maarten H. van Emden that uses a matrix to describe the state transitions that take place in the code. The matrix represents a dual-state machine in which both the control flow and the data state are represented. I modified his method somewhat to adapt it for use in a spreadsheet. He added the states starting from the upper right and going down and to the left; for a spreadsheet, it works much better to start from the upper left and go down and to the right. Maarten showed in his paper how to generate working C or Java code from the matrix, but I went the other way and converted existing C code to a matrix. The effort of converting my code to another form forced me to think it through to a degree where I found numerous errors I had previously missed. These included errors in code that passed the unit tests and worked correctly in a single thread but was entirely wrong from a concurrency standpoint.
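To make the idea of a transition matrix concrete, here is a minimal sketch in C. It is not taken from the paper or from my spreadsheets: the subtraction-based GCD algorithm, the state names, the `struct cell` layout, the choice of rows as source states and columns as target states, and the restriction to one guard and one action per cell are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Control states (hypothetical names for this sketch). */
enum state { S_START, S_LOOP, S_HALT, N_STATES };

/* Data state: the variables the algorithm operates on. */
struct data { unsigned a, b; };

typedef int  (*guard_fn)(const struct data *);
typedef void (*action_fn)(struct data *);

/* One matrix cell: if the guard holds, run the action and move on. */
struct cell { guard_fn guard; action_fn action; };

static int both_nonzero(const struct data *d) { return d->a != 0 && d->b != 0; }
static int either_zero(const struct data *d)  { return d->a == 0 || d->b == 0; }
static int a_ne_b(const struct data *d)       { return d->a != d->b; }
static int a_eq_b(const struct data *d)       { return d->a == d->b; }

static void nop(struct data *d) { (void)d; }
/* One operand is zero here, so the sum is the nonzero one (or zero). */
static void take_nonzero(struct data *d) { d->a += d->b; }
static void subtract_smaller(struct data *d)
{
    if (d->a > d->b) d->a -= d->b; else d->b -= d->a;
}

/* Rows are source states, columns are target states (an assumption).
 * Cells that are not listed stay zeroed, meaning "no such transition". */
static const struct cell matrix[N_STATES][N_STATES] = {
    [S_START][S_LOOP] = { both_nonzero, nop },
    [S_START][S_HALT] = { either_zero,  take_nonzero },
    [S_LOOP][S_LOOP]  = { a_ne_b,       subtract_smaller },
    [S_LOOP][S_HALT]  = { a_eq_b,       nop },
};

/* Drive the machine: scan the current row for an enabled transition. */
unsigned run_gcd(unsigned a, unsigned b)
{
    struct data d = { a, b };
    enum state s = S_START;

    while (s != S_HALT) {
        int moved = 0;
        for (int to = 0; to < N_STATES && !moved; to++) {
            const struct cell *c = &matrix[s][to];
            if (c->guard != NULL && c->guard(&d)) {
                assert(c->action != NULL); /* every guarded cell has an action */
                c->action(&d);
                s = (enum state)to;
                moved = 1;
            }
        }
        if (!moved)
            break; /* no enabled transition: the machine is stuck */
    }
    return d.a;
}
```

Laid out this way, each row can be checked on its own: the guards in a row should cover every possible data state, and each action should leave the data in a state the target row can handle. That row-by-row check is the kind of review the matrix form invites.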

The other advantage was that it allowed me to review the code in terms of its complexity, namely the number of states and state transitions involved. I found that reducing the number of states to a reasonable number (one that would fit on a single spreadsheet tab) required refactoring the functions, which improved the level of abstraction and greatly simplified the code. The results can be seen in my current GitHub repository, and the spreadsheets, which are in Google Sheets, can be seen in sheet1, sheet2, and sheet3.

I must say that this technique has helped me produce some of the best code I have ever written, in both aesthetics and correctness, and code in which I can justifiably have a lot of confidence.

To TDD, Or Not?

One of the things bouncing around the Internet around the time I began to consider starting a blog was the debate over test-driven development (TDD) as a measure of professionalism. After so many years of doing development, I find I am an empiricist when it comes to arguing about development methodology, which means I consider much of what is debated to be vain efforts to convince people to have the “correct opinion.” Yet, as with most things, I do have to make a decision about what I am going to practice in my own effort to produce high-quality code.

In addition, because of the type of code I tend to write, I have issues with some practices such as code reviews, since reviewing the multithreaded, concurrent programs and data structures I write requires a skill set and level of expertise that is not common among my peers or developers in general. It isn’t that code reviews don’t help, but I find that most of the value comes not from others’ input, comments, or error finding, but rather from the effort I have to undertake to explain the code, which causes me to examine it in depth in order to explicate it to others. So I am constantly looking for techniques that help me examine and analyze my own code and that force me to walk through it thoroughly so that it is as correct as possible. With my current project, I am even more on my own since it hasn’t attracted much attention, and so I am even more dependent on my own self-review.

So, as I previously alluded, I find the arguments over TDD largely evidence free and, therefore, largely arguments over opinions and preferences. Making TDD a hallmark of professionalism strikes me as a ridiculous assertion that reduces the practice of programming to mere dogmatic religion. The only measure of professionalism that makes sense to me is the actual production of high-quality, error-free code, regardless of how it was produced. That does not mean I don’t find value in unit testing, and some value in writing tests before writing code, but there is very little actual evidence that TDD itself produces well-designed and correct code over and above any other practice. Frankly, the more I listen to some of Leslie Lamport’s recent presentations about specifying systems and thinking before coding, the more compelling I find his views on what should be practiced as a measure of professionalism, even though I question how well they would scale, as formal methods haven’t proven economical in practice for most development efforts.

As it stands, my current practice is to mix writing unit tests both before and after coding. I find that writing some simple, initial unit tests to exercise the major functions on their “happy path” is the fastest way to get something working, which gives that initial, positive feedback that I think programmers find so motivating. I also find, though, a lot of value in writing most of my unit tests after writing the majority of the code, as it forces me to examine the various paths through the code in a thorough fashion. Writing thorough unit tests, especially ones that exercise the error paths, is a valuable exercise as a form of self-directed code review.
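As an illustration of that split, here is a minimal sketch in C. The function under test, `parse_port`, and both test functions are hypothetical and exist only for this example. The happy-path test is the kind written first, and the error-path test is the kind written afterwards, with one assertion per failure branch in the code.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical function under test: parse a TCP port number (1..65535).
 * Returns 0 on success and stores the port in *out; returns -1 on error
 * and leaves *out untouched. */
int parse_port(const char *text, int *out)
{
    char *end = NULL;
    long value;

    if (text == NULL || out == NULL || *text == '\0')
        return -1;                      /* missing argument or empty input */

    errno = 0;
    value = strtol(text, &end, 10);
    if (errno != 0 || *end != '\0')
        return -1;                      /* overflow, or non-numeric text */
    if (value < 1 || value > 65535)
        return -1;                      /* outside the valid port range */

    *out = (int)value;
    return 0;
}

/* Written first: one simple test showing that the main path works. */
void test_happy_path(void)
{
    int port = 0;
    assert(parse_port("8080", &port) == 0);
    assert(port == 8080);
}

/* Written afterwards: walk each error branch of parse_port in turn. */
void test_error_paths(void)
{
    int port = 42;
    assert(parse_port(NULL, &port) == -1);                    /* null text */
    assert(parse_port("80", NULL) == -1);                     /* null out */
    assert(parse_port("", &port) == -1);                      /* empty */
    assert(parse_port("abc", &port) == -1);                   /* no digits */
    assert(parse_port("80x", &port) == -1);                   /* trailing junk */
    assert(parse_port("0", &port) == -1);                     /* below range */
    assert(parse_port("65536", &port) == -1);                 /* above range */
    assert(parse_port("99999999999999999999", &port) == -1);  /* overflow */
    assert(port == 42);                 /* failures must not modify *out */
}
```

Writing the second test means reading `parse_port` branch by branch to enumerate its failure cases, which is where the review value comes from.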