Studying how genomes express themselves

Org-mode for reproducible research

Now that I am over the learning curve, I am very impressed with Emacs Org-mode as a tool for reproducible research. I started out wanting to document the statistical work I am doing for collaborators Michael Spivey, Stephanie Huette and Teenie Matlock here at U.C. Merced, a very exciting and interesting project on physical behavior during speech audition. I am doing my analysis in Perl and R, and I wanted to document the analysis completely, down to every critical line of code, for reproducibility, extensibility and integrity.

Well, Sweave is good for integrating R into your document: you can even generate your figures on the fly. This is the Literate Programming approach first advocated by Don Knuth, who also coined the name. But Sweave only works with R, and I am using R, Perl, and shell. And Sweave only generates LaTeX.

Well, Org-mode, which at first glance looks just like a fancy outliner, is the only game in town for multilingual literate programming. It’s really not that hard to start using, either, if you don’t mind grappling with your Emacs a little to update Org-mode (I put a tip on updating Aquamacs in the Aquamacs customization wiki).
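To give a feel for what "multilingual" means here, below is a minimal sketch of an Org file that mixes shell, Perl and R source blocks in one document. The heading, the file name trials.csv and the numbers are made up for illustration and do not come from my actual analysis. It also assumes that each of these languages has been enabled in your Emacs setup through org-babel-do-load-languages.

```org
* Reaction-time analysis (hypothetical example)

Count the rows in a made-up data file with the shell.

#+BEGIN_SRC sh :results output
  # trials.csv is a placeholder file name
  wc -l trials.csv
#+END_SRC

Build a lower-cased header line in Perl.

#+BEGIN_SRC perl :results output
  # Column names are invented for illustration
  print join(",", map { lc } qw(Subject Condition RT)), "\n";
#+END_SRC

Summarize a small made-up vector in R.

#+BEGIN_SRC R :results output
  # Illustrative values only
  rt <- c(512, 498, 530)
  summary(rt)
#+END_SRC
```

With point inside a block, C-c C-c evaluates it and inserts the output just below the block, so the prose, the code and the results all live in the same file.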

Besides using Org-mode for literate programming and reproducible research, I can see myself growing to love its other facilities for project management: TODOs, table editing, even work clocking. Org-mode is something I can really grow into.