Posted on 17 October 2010

The Joel Test is a set of twelve simple yes/no questions, written by Joel Spolsky, that is meant to measure how good a software team is. The questions include “Do you have a spec?”, “Do you have testers?”, and “Do you use source control?”. According to Spolsky, a software team that can answer “yes” to all twelve questions probably produces excellent software. Since its publication in 2000, the test has become well known among software engineers in industry.

But what about academia? For example, how well does my research group perform on the Joel Test? After all, we write software, too. Should academic researchers be held to the same standard as full-time software engineers in industry? I asked Stack Overflow and got one response that said no, and I agree. Academia and industry have different goals and different resources. Most research groups I know don’t have full-time testers. (Duh.) They don’t make daily builds, and they certainly don’t have the best tools money can buy.

Nevertheless, item 2 of this test reads, “Can you make a build in one step?” Spolsky elaborates:

… how many steps does it take to make a shipping build from the latest source snapshot? … If the process takes any more than one step, it is prone to errors. And when you get closer to shipping, you want to have a very fast cycle of fixing the “last” bug, making the final EXEs, etc. If it takes 20 steps to compile the code, run the installation builder, etc., you’re going to go crazy and you’re going to make silly mistakes.

I definitely did not have a one-step build for my research projects. And as Spolsky predicted, without a one-step build, I went crazy. I could not remember the entire process for producing my results and incorporating them into my paper, especially after a few days away from the code. Whenever my numerical results changed, I had to repopulate the tables in the paper. All of these problems could have been avoided by using a one-step build.

So I wrote one in Python. It is a single file named build.py. As shown in Figure 1, this build accepts three inputs: source code, data, and a paper template. The primary output is a paper — the ultimate goal of any research project.
Figure 1: One-step build.

The build has three internal components:

  1. Compile the source code to generate an executable. This step matters mainly for compiled languages such as C or C++. In interpreted languages such as Python or Matlab, you can skip it.
  2. Feed data into the executable to generate numerical results. I save these results to a file either as text or a serialized format (e.g., using pickle in Python).
  3. Feed results into a paper template to generate the paper. I have a paper_template.tex file that contains placeholders to be filled in. I use the string.Template class in Python to fill in the results. (Perl is a good choice, too.) Finally, I use pdflatex to compile the resulting .tex file into a PDF document.
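
To make step 3 concrete, here is a minimal sketch of filling a template with string.Template. It is an illustration, not my actual build.py: the TexTemplate subclass, the fill_template function, and the placeholder names error and n_samples are all hypothetical. One assumption deserves a note: string.Template uses "$" for placeholders by default, which collides with LaTeX math mode, so this sketch switches the delimiter to "@".

```python
from string import Template


class TexTemplate(Template):
    # Assumption: use "@" instead of the default "$" for placeholders,
    # because "$" already marks math mode in LaTeX.
    delimiter = "@"


def fill_template(template_text, results):
    """Return template_text with every @{name} replaced by its result."""
    formatted = {
        # Round floats for display; leave everything else as plain text.
        name: f"{value:.3f}" if isinstance(value, float) else str(value)
        for name, value in results.items()
    }
    # substitute() raises KeyError if the template names a missing result.
    return TexTemplate(template_text).substitute(formatted)


# Hypothetical template line and results, for illustration only.
template_text = r"The error is $@{error}$ on @{n_samples} samples."
results = {"error": 0.12345, "n_samples": 100}
print(fill_template(template_text, results))
```

In a real build, the filled-in text would be written to a .tex file and then passed to pdflatex. Because substitute() fails loudly on a missing placeholder, a stale template cannot silently produce a paper with holes in it.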

To run the build, I simply type python build.py at the command line. That is the beauty of the one-step build: at the single push of a button, you generate a new program, new results, and a new paper. If you are really lazy, you could even tell Linux to automatically run this script when you turn on the computer.
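
For readers who want a starting point, a build script along these lines could chain the three components as a list of commands. This is a rough sketch under stated assumptions, not my actual build.py: the commands (make, ./experiment, the file names) and the run_build function are placeholders you would replace with your own, and the template-filling part of step 3 is left out for brevity.

```python
import subprocess

# Hypothetical commands, one per build component; replace with your own.
STEPS = [
    ("compile", ["make"]),
    ("results", ["./experiment", "data.csv", "results.txt"]),
    ("paper", ["pdflatex", "-interaction=nonstopmode", "paper.tex"]),
]


def run_build(steps=STEPS, runner=subprocess.run):
    """Run each step in order and return the names of the completed steps.

    With check=True, subprocess.run raises CalledProcessError when a
    command exits with a nonzero status, so the build stops at the
    first failure instead of producing a paper from stale results.
    """
    completed = []
    for name, command in steps:
        print(f"[build] {name}: {' '.join(command)}")
        runner(command, check=True)
        completed.append(name)
    return completed
```

The script would end with a call to run_build(), so that python build.py runs everything. The runner parameter exists only so the function can be tried out with a stand-in that records commands instead of executing them.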

If you are having trouble managing the software for your research projects, write a one-step build. Good luck, and let me know how it goes.