Automatic generation of a makefile for R and LaTeX

Vignettes are useful supplemental documentation for R packages. A vignette often mixes standard text, typically formatted in LaTeX, with chunks of R code. When the vignette is built, the code chunks are actually run to generate output that is printed in the final document. The file document.rnw is a toy example of what the source file of a vignette looks like. This and the other files used in this post can be found at the end of the post.

Sweave is a useful tool for generating a vignette. Given a source file similar to 'document.rnw', Sweave generates a LaTeX file where the input and the output of the R chunks are inserted within appropriate LaTeX environments. It is also possible to hide the input or output of some of the R chunks. This file can then be processed as usual to get the final PDF, for example using 'pdflatex'.
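As a rough illustration of this two-step build, the following Python sketch lists the commands involved and runs them in order. The function names 'build_commands' and 'build' are made up for this example; 'R CMD Sweave' and 'pdflatex' are the usual command-line entry points, but any extra options a real document needs are not shown.

```python
import subprocess
from pathlib import Path


def build_commands(rnw="document.rnw"):
    """Return the commands for the plain Sweave + pdflatex build."""
    # Sweave writes a .tex file with the same base name in the working directory.
    tex = Path(rnw).with_suffix(".tex").name
    return [["R", "CMD", "Sweave", rnw], ["pdflatex", tex]]


def build(rnw="document.rnw"):
    """Run every step; stop at the first command that fails."""
    for cmd in build_commands(rnw):
        subprocess.run(cmd, check=True)
```

Calling 'build()' requires R and a LaTeX distribution on the path. Note that every call re-runs all the R chunks, which is the inefficiency discussed next.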

As with any other document, writing a vignette requires generating several versions of the document to check that everything works as expected and to see how it is shaping up. Running all the scripts whenever we want a preview of the document is not efficient. Running everything from scratch just to see whether a small tweak had the desired result can be very frustrating. Some parts of the document may not have been modified at all, so it is worth avoiding running them again and again.

So we would like to run only those R chunks that have been modified since the last time the document was generated. We must also be aware that some chunks usually depend on other chunks. Sometimes a code chunk has not been modified, but a change made in another chunk affects its input and, hence, it should be evaluated again as well. This is the typical scenario for which 'make' is perfectly suited. 'Make' originated in the context of software development as a tool to facilitate the process of building and compiling programs, but it can also be used to generate vignettes or any other kind of dynamic report. This Makefile processes the input file 'document.rnw' in a more efficient way.
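The idea of re-running a chunk when either it or something it depends on has changed can be sketched in a few lines of Python. This only illustrates the logic and is not how 'make' is implemented (make compares file timestamps); the chunk names and the 'deps' mapping are hypothetical.

```python
def stale_chunks(modified, deps):
    """Return the set of chunks that must be re-run.

    modified: set of chunk names whose code changed since the last build.
    deps: dict mapping each chunk name to the chunks it depends on.
    """
    stale = set(modified)
    changed = True
    # Keep propagating until no new chunk becomes stale.
    while changed:
        changed = False
        for chunk, parents in deps.items():
            if chunk not in stale and any(p in stale for p in parents):
                stale.add(chunk)
                changed = True
    return stale


# Hypothetical document: 'fit' uses data from 'load', 'plot' uses 'fit'.
deps = {"load": [], "fit": ["load"], "plot": ["fit"], "summary": ["load"]}
to_run = stale_chunks({"fit"}, deps)  # 'fit' and 'plot', but not 'load' or 'summary'
```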

Although the bulk of a Makefile can be reused for any document of this type, it requires some maintenance. In particular, the rules stating which files depend on each other must be updated whenever a code chunk is added, removed or modified. In practice this is not convenient, and updating the Makefile may cause more hassle than running everything from scratch. Besides, it is easy to overlook something and make a mistake. A few lines of Python can alleviate the burden of keeping the Makefile up to date. The scripts described below automatically create the Makefile shown above.
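To give an idea of what generating the rules involves, here is a minimal sketch that turns a dependency mapping into Makefile rules. It is not the code of the scripts in this post: the function name, the '.Rout' extension and the 'R CMD BATCH' recipe are assumptions made for the example, and passing R objects between chunks that run in separate R sessions is not handled.

```python
def makefile_rules(deps, r_prefix="Rchunks"):
    """Build Makefile rules from a chunk -> prerequisites mapping.

    The target naming and the recipe are placeholders for illustration.
    """
    lines = []
    for chunk in sorted(deps):
        source = f"{r_prefix}/{chunk}.R"
        target = f"{r_prefix}/{chunk}.Rout"
        # A chunk depends on its own code and on the output of its parents.
        prereqs = [source] + [f"{r_prefix}/{p}.Rout" for p in deps[chunk]]
        lines.append(f"{target}: {' '.join(prereqs)}")
        # Recipe lines in a Makefile must start with a tab character.
        lines.append(f"\tR CMD BATCH --no-save {source} {target}")
        lines.append("")
    return "\n".join(lines)


# Hypothetical chunks: 'fit' needs the objects created in 'load'.
print(makefile_rules({"load": [], "fit": ["load"]}))
```

Regenerating the rules this way after every edit removes the manual bookkeeping, which is the point of the scripts described next.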

The first script, mk.py, copies each chunk tagged with the option eval=TRUE to a file in the 'Rprefix' directory. If no label is provided, a name of the type 'chunk__i' is given to the file. An auxiliary file 'rnw.aux' is created summarizing the options of all the chunks (to be used in 'write.py'). The variable 'varsOrigin' tracks the file where each variable was created, and from this the dependency rules are determined. Finally, the Makefile is created.
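The following sketch shows one way these steps could look. It is not the actual content of 'mk.py': the function names, the naive comma-splitting of options, and the regular expressions used to spot assignments and variable uses are all assumptions, and they will miss many valid R constructs. Only the 'chunk__i' naming and the idea behind 'varsOrigin' are taken from the description above.

```python
import re

CHUNK_START = re.compile(r"^<<(.*)>>=\s*$")   # e.g. <<load, eval=TRUE>>=
CHUNK_END = re.compile(r"^@\s*$")
ASSIGN = re.compile(r"^\s*([A-Za-z.][A-Za-z0-9._]*)\s*<-", re.MULTILINE)
NAME = re.compile(r"[A-Za-z.][A-Za-z0-9._]*")


def parse_options(header, index):
    """Split a chunk header into options; unlabeled chunks get 'chunk__i'."""
    opts = {"label": f"chunk__{index}"}
    for item in header.split(","):  # naive: breaks on commas inside values
        item = item.strip()
        if "=" in item:
            key, value = item.split("=", 1)
            opts[key.strip()] = value.strip()
        elif item:
            opts["label"] = item
    return opts


def extract_chunks(text):
    """Return a list of (options, code) pairs found in the source text."""
    chunks, current = [], None
    for line in text.splitlines():
        start = CHUNK_START.match(line)
        if current is None and start:
            current = (parse_options(start.group(1), len(chunks) + 1), [])
        elif current is not None and CHUNK_END.match(line):
            chunks.append((current[0], "\n".join(current[1])))
            current = None
        elif current is not None:
            current[1].append(line)
    return chunks


def chunk_dependencies(chunks):
    """Map each chunk label to the earlier chunks whose variables it uses."""
    vars_origin = {}  # variable name -> label of the chunk that created it
    deps = {}
    for opts, code in chunks:
        used = set(NAME.findall(code))
        deps[opts["label"]] = sorted({vars_origin[v] for v in used if v in vars_origin})
        for name in ASSIGN.findall(code):
            vars_origin[name] = opts["label"]
    return deps


sample = """Some text.
<<load, eval=TRUE>>=
x <- rnorm(10)
@
More text.
<<eval=TRUE, Echo=TRUE>>=
mean(x)
@
"""
chunks = extract_chunks(sample)
deps = chunk_dependencies(chunks)  # the second chunk depends on 'load'
```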

The script write.py is called by the Makefile. Its purpose is to create a LaTeX file where the R input and output are included in the 'Sinput' and 'Soutput' environments. These environments are defined in 'document.rnw' based on the style used by Sweave. To be precise, only the input chunks marked with the tag 'Echo=TRUE' in the source file 'document.rnw' and the output from chunks with the option results='verbatim' are included. Similarly, only the R chunks marked with 'eval=TRUE' are evaluated.
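The selection rule described here can be sketched as follows. This is not the code of 'write.py': the function name and the way options are stored (a dict of strings, with option names spelled as in this post) are assumptions, and a chunk that omits an option is simply treated as not requesting it.

```python
def render_chunk(opts, code, output):
    """Return the LaTeX for one chunk, honouring its Echo/results options."""
    parts = []
    # Show the R input only when the chunk asks for it.
    if opts.get("Echo", "FALSE") == "TRUE":
        parts.append("\\begin{Sinput}\n" + code + "\n\\end{Sinput}")
    # Show the R output only for results='verbatim' (quotes are optional here).
    if opts.get("results", "").strip("'\"") == "verbatim" and output:
        parts.append("\\begin{Soutput}\n" + output + "\n\\end{Soutput}")
    return "\n".join(parts)


# Hypothetical chunk with both input and output shown.
latex = render_chunk({"Echo": "TRUE", "results": "'verbatim'"}, "mean(x)", "[1] 0.1")
```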

The process is therefore as follows: run 'mk.py' and then 'make Makefile'. According to the variables 'Rprefix' and 'figPrefix' defined in 'mk.py' and 'write.py', two folders named 'Rchunks' and 'figures' must first be created in the same directory where 'document.rnw' is stored.
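If creating the two folders by hand is inconvenient, a small helper along these lines could do it. The function 'ensure_dirs' is hypothetical and not part of the scripts in this post; the folder names are the ones mentioned above.

```python
from pathlib import Path


def ensure_dirs(base, names=("Rchunks", "figures")):
    """Create the folders the scripts expect next to the source file."""
    created = []
    for name in names:
        path = Path(base) / name
        # exist_ok=True makes the call safe to repeat before every build.
        path.mkdir(parents=True, exist_ok=True)
        created.append(path)
    return created
```

Calling 'ensure_dirs(".")' from the directory that contains 'document.rnw' before running 'mk.py' is enough.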

The necessary files to try an example can be found here: document.rnw | mk.py | write.py | Makefile.

I have recently realized that the R function 'all.vars' could be used for our purpose of keeping the Makefile up to date. This function returns the names that appear in an R expression and, hence, it may simplify the scripts proposed here; it may also make it easier to extend the scripts to particular situations not considered here. This calls for another post.
