Automatic Makefile for R and LaTeX

javi | May 7, 2025, 10:25 p.m.

Vignettes are useful supplementary documentation for R packages. A vignette often mixes ordinary text, typically formatted in LaTeX, with chunks of R code. When the vignette is built, the code chunks are run and their output is printed in the final document. The file document.rnw is a sample source file of a vignette. This and the other files used in this post are also available at the end of the post.

document.rnw

\documentclass[a4paper,11pt]{article}
\usepackage{graphicx}
\usepackage{fancyvrb} %DefineVerbatimEnvironment

\usepackage[utf8]{inputenc}
\usepackage[english]{babel}
\selectlanguage{english}

\renewcommand{\baselinestretch}{1.3}

\DefineVerbatimEnvironment{Sinput}{Verbatim}{
  fontshape = sl,
  xleftmargin = 0.7cm,
}
\DefineVerbatimEnvironment{Soutput}{Verbatim}{
  fontshape = sl,
  xleftmargin = 0.7cm,
}

\title{Sample vignette document}
\date{\today}

\begin{document}

\maketitle

%<<echo=FALSE, results='hide', eval = FALSE>>=
\begin{Sinput}
#if (file.exists(".RData")) {
#  load(".RData")
#} else
#  save.image(file = ".RData")
abc <- 123
\end{Sinput}
%\verbatiminput{another.Rout}

\begin{equation}
\label{eq1}
x = y + z
\end{equation}

%<<>>=
\begin{Sinput}
 x <-log(AirPassengers)
  fit1 <-     StructTS(x, type = "BSM") # comment
x<-exp(x)
print(x)
  \end{Sinput}

Reference to equation \ref{eq1}.

%<<label ='fig', echo=FALSE, results='hide', fig=TRUE>>=
\begin{Sinput}
## comment
postscript(file = file.path("figures", "fig.eps"),
horizontal = FALSE, paper="special", width=5, height=5)
plot(tsSmooth(fit1))
invisible(dev.off())
  \end{Sinput}

\begin{figure}[ht]
\caption{\label{fig:g1} Caption}
\centering
\noindent
\includegraphics[width=5in,height=5in]{./figures/fig.pdf}
\end{figure}

%<<echo=FALSE, results='hide', label = 'another', eval=TRUE>>=
\begin{Sinput}
y <- Nile
fit2 <- StructTS(  y, type = "level")
z <- c(1, 2, y)
xy <- 2
xy <- c(1, xy)
\end{Sinput} 

New reference to equation \ref{eq1}.

\end{document}

Sweave is a useful tool for generating a vignette. Given a source file similar to 'document.rnw' shown above, Sweave generates a LaTeX file in which the input and the output of the R chunks are inserted within appropriate LaTeX environments. It is also possible to hide the input or output of some of the R chunks. This file can then be processed as usual to get the final PDF, for example with 'pdflatex'.

As with any other document, writing a vignette requires generating several versions of the document to check that everything works as expected and to see how it is shaping up. Running all the scripts whenever we want a preview of the document is not efficient. Running everything from scratch just to see whether some tweaks had the desired result can be very frustrating. Some parts of the document may not have been modified, so it is worth avoiding running them again and again.

So we would like to run only those R chunks that have been modified since the last time the document was generated. We must also be aware that some chunks usually depend on other chunks. Sometimes a code chunk has not been modified, but changes made in another chunk affect the input it uses and, hence, it should be evaluated again as well. This is the typical scenario for which 'make' is perfectly suited. 'Make' originated in software development as a tool for building and compiling programs, but it can also be used to generate vignettes or any other kind of dynamic report. The following Makefile processes the input file 'document.rnw' in a more efficient way.

Makefile

DOCNAME = document
Rdir = ./Rchunks
figDir = ./figures
RFLAGS = --slave --no-timing --no-readline --restore --save
Rout = $(addprefix $(Rdir)/, load.Rout fig.Rout )
Figs = $(addprefix $(figDir)/, fig.pdf)

%.Rout : %.R
	R CMD BATCH $(RFLAGS) $< $@

%.pdf : %.eps
	epstopdf $<

all : R latex

R : $(Rout)
	@echo "done make R"

$(Rdir)/load.Rout : $(addprefix $(Rdir)/, load.R)
$(Rdir)/fig.Rout : $(addprefix $(Rdir)/, fig.R load.R)

latex : $(DOCNAME).rnw $(Rout) $(Figs)
	@python write.py
	@pdflatex $(DOCNAME).tex
	@while (grep "Rerun to get cross-references right" $(DOCNAME).log); do pdflatex $(DOCNAME).tex; done 
	@echo "done make latex"

.PHONY : clean

clean :
	@rm -f .RData $(Rout) $(DOCNAME).tex $(DOCNAME).pdf $(DOCNAME).log $(figDir)/*.eps  $(figDir)/*.pdf $(DOCNAME).aux rnw.aux
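
The decision that 'make' takes for each rule in the Makefile above can be sketched in a few lines of Python. This is only an illustration of the idea, not how 'make' is implemented; the function name 'needs_rebuild' is made up for this example. A target is rebuilt if it does not exist or if any of its prerequisites has a more recent modification time.

```python
import os


def needs_rebuild(target, prerequisites):
    """Return True if 'target' is missing or older than any prerequisite.

    Hypothetical helper that mimics the timestamp comparison 'make'
    performs for a rule such as '$(Rdir)/fig.Rout : fig.R load.R'.
    """
    if not os.path.exists(target):
        return True
    target_mtime = os.path.getmtime(target)
    # One newer prerequisite is enough to make the target stale.
    return any(os.path.getmtime(p) > target_mtime for p in prerequisites)
```

For instance, 'needs_rebuild("Rchunks/fig.Rout", ["Rchunks/fig.R", "Rchunks/load.R"])' would be true right after editing 'load.R', which is why the chunk 'fig' is evaluated again even though its own code did not change.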

Although the bulk of a Makefile can be reused for any document of this type, it requires some maintenance. In particular, the rules stating which files depend on each other must be updated whenever a code chunk is added, removed or modified. In practice this is not convenient, and updating the Makefile may cause more hassle than running everything from scratch. Besides, it is easy to overlook something and make a mistake. A few lines of Python can alleviate the burden of keeping the Makefile up to date. The scripts described below automatically create the Makefile shown above.

The first script, mk.py, copies each chunk that is to be evaluated (option eval=TRUE, which is also the default when the option is omitted) to a file in the 'Rprefix' directory. If no label is provided, the file is given a name of the form 'chunk__i', where i is the position of the chunk in the document. An auxiliary file 'rnw.aux' is created summarizing the options of all the chunks (to be used in 'write.py'). The variable 'varsOrigin' tracks the file where each variable was created, and the dependency rules are determined from it. At the end, the Makefile is created.

mk.py

import os
import re

TRUE = True
FALSE = False

# user defined variables

DOCNAME = 'document'
Rprefix = './Rchunks'
figPrefix = './figures'
deleteComments = True

# copy each chunk with option eval=TRUE to a file in the 'Rprefix' directory
# and create 'rnw.aux' with the options of all the chunks
# create variable 'varsOrigin' tracking the file where a variable was created

fname = "%s.rnw" % DOCNAME
f = open(fname, 'r')
lines = f.readlines()
f.close()

n = len(lines)
c1 = [i for i in range(n) if re.search(r'^\s*\%<<(.*)>>=.*', lines[i])]
c2 = [i for i in range(n) if re.search(r'\\end\{Sinput\}', lines[i])]

frnwaux = open('rnw.aux', 'w')

chunks = []
labels = []
figures = []
varsOrigin = {}

for i in range(len(c1)):
  # get chunk
  chunk = lines[(c1[i]+2):c2[i]]
  chunkOpts = re.sub(r'^\s*%<<(.*)>>=.*\n$', '\\1', lines[c1[i]])  
  chunkOpts = eval("dict(%s)" % chunkOpts)
  # get options and write line to auxiliary file rnw.aux
  label = chunkOpts['label'] if 'label' in chunkOpts else 'chunk__%s' % (i + 1)
  chunkEcho = chunkOpts.get('echo', True)
  chunkResults = chunkOpts.get('results', 'verbatim')
  chunkFig = chunkOpts.get('fig', False)
  frnwaux.write("%s %s %s\n" % (label, chunkEcho, chunkResults))
  # work with those chunks with 'eval=TRUE'  
  if chunkOpts.get('eval', True) == True:
    if deleteComments:
      chunk = [re.sub(r'#.*', '', e) for e in chunk]
    chunks.append(chunk)
    if chunkFig == TRUE:
      figures.append('%s.pdf' % label)
    label = '%s.R' % label
    labels.append(label)
    # copy chunk to file
    f = open(os.path.join(Rprefix, label), 'w')
    f.writelines(chunk)
    f.close()
    # get user defined object names
    dvars = [re.sub(r'^\s*(.*)<-.*\n{0,1}$', '\\1', e) for e in chunk if re.search('<-', e)]
    dvars = [re.sub(r'\s$', '', e) for e in dvars]
    # unique names
    dvars = list(set(dvars))
    if len(dvars) > 0:
      varsOrigin.update(dict(zip(dvars, [labels[-1]] * len(dvars))))

frnwaux.close()

# search dependencies
# omit the variables created in the current chunk
# since a code chunk file already depends on itself 

allvars = varsOrigin.keys()
dep = [[e] for e in labels]
for i, chunk in enumerate(chunks):
  chunk = ''.join(chunk)
  for v in allvars:
    if varsOrigin[v] != labels[i]:
      #if re.search(r'(\(\s*|=\s*|,\s*)%s\b' % v, chunk):
      if re.search(r'\b%s\b' % v, chunk):      
        dep[i].append(varsOrigin[v])

# write the Makefile

Rout = ['%sout ' % e for e in labels]
f = open('Makefile', 'w')
f.write('DOCNAME = %s\n' % DOCNAME)
f.write('Rdir = %s\n' % Rprefix)
f.write('figDir = %s\n' % figPrefix)
f.write('RFLAGS = --slave --no-timing --no-readline --restore --save\n')
f.write('Rout = $(addprefix $(Rdir)/, %s)\n' % ''.join(Rout))
f.write('Figs = $(addprefix $(figDir)/, %s)\n\n' % ''.join(figures))
f.write('%.Rout : %.R\n')
f.write('\tR CMD BATCH $(RFLAGS) $< $@\n\n')
f.write('%.pdf : %.eps\n')
f.write('\tepstopdf $<\n')
f.write('\nall : R latex\n\n')
f.write('R : $(Rout)\n')
f.write('\t@echo \"done make R\"\n\n')
for i in range(len(labels)):
  f.write('$(Rdir)/%s: $(addprefix $(Rdir)/, %s)\n' % (Rout[i], ' '.join(dep[i])))
f.write('\nlatex : $(DOCNAME).rnw $(Rout) $(Figs)\n')
f.write('\t@python write.py\n')
f.write('\t@pdflatex $(DOCNAME).tex\n')
f.write('\t@while (grep "Rerun to get cross-references right" $(DOCNAME).log); do ')
f.write('pdflatex $(DOCNAME).tex; ')
f.write('done \n')
f.write('\t@echo \"done make latex\"\n\n')
f.write('.PHONY : clean\n\n')
f.write('clean :\n\t@rm -f .RData $(Rout) $(DOCNAME).tex $(DOCNAME).pdf $(DOCNAME).log ')
f.write('$(figDir)/*.eps  $(figDir)/*.pdf $(DOCNAME).aux rnw.aux\n')
f.close()

print('Done, file Makefile has been created.')
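
The dependency search in 'mk.py' can be easier to follow in isolation. The following is a simplified sketch of the same idea, not a replacement for the script: the function name 'find_dependencies' and its input format (a list of file name and code pairs) are assumptions made for this example. It records the file where each variable is assigned with '<-' and then makes a chunk depend on every other file whose variables it mentions.

```python
import re


def find_dependencies(chunks):
    """Map each chunk file to the list of files it depends on.

    'chunks' is a list of (filename, code) pairs in document order.
    Hypothetical helper mirroring the approach used in mk.py.
    """
    # Record the file where each variable is assigned; a later
    # assignment in another chunk overrides an earlier one.
    origin = {}
    for name, code in chunks:
        for line in code.splitlines():
            m = re.match(r'^\s*([A-Za-z][A-Za-z0-9._]*)\s*<-', line)
            if m:
                origin[m.group(1)] = name
    deps = {}
    for name, code in chunks:
        deps[name] = [name]  # a chunk always depends on its own file
        for var, source in sorted(origin.items()):
            if source == name or source in deps[name]:
                continue
            # Whole-word match of a variable created in another chunk.
            if re.search(r'\b%s\b' % re.escape(var), code):
                deps[name].append(source)
    return deps


chunks = [
    ('load.R', 'x <- log(AirPassengers)\nfit1 <- StructTS(x, type = "BSM")\n'),
    ('fig.R', 'plot(tsSmooth(fit1))\n'),
]
print(find_dependencies(chunks))
```

With these two chunks, 'fig.R' depends on 'load.R' because it uses 'fit1', which matches the rule for 'fig.Rout' in the Makefile shown earlier. As in 'mk.py', the match is purely textual, so a variable name that appears inside a string or as an argument name would also be counted as a dependency.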

The script write.py is called by the Makefile. Its purpose is to create a LaTeX file in which the R input and output are included in the 'Sinput' and 'Soutput' environments. These environments are defined in 'document.rnw' following the style used by Sweave. To be precise, only the input chunks with the option echo=TRUE in the source file 'document.rnw' and the output from chunks with the option results='verbatim' are included; both are the defaults assumed by 'mk.py' when the option is omitted. Similarly, only the R chunks with eval=TRUE are evaluated.

write.py

import re
import os

TRUE = True
FALSE = False

# same as in 'mk.py'
DOCNAME = 'document'
Rprefix = './Rchunks'

f = open('rnw.aux', 'r')
line = f.readlines()
f.close()

rnwaux = [e[:-1].split(' ') for e in line]

fname = "%s.rnw" % DOCNAME
f = open(fname, 'r')
lines = f.readlines()
f.close()

id1 = [i  for i in range(len(lines)) if re.search(r'\\begin\{Sinput\}', lines[i])]
id1 = id1 + [len(lines) + 1]
id2 = [i  for i in range(len(lines)) if re.search(r'\\end\{Sinput\}', lines[i])]

output = [''.join(lines[:(id1[0]-1)])]
for i in range(len(id2)):
  if rnwaux[i][1] == 'True':
    output.append(''.join(lines[id1[i]:(id2[i]+1)]))
  Rout = '%s/%s.Rout' % (Rprefix, rnwaux[i][0])
  if os.path.lexists(Rout) and os.path.getsize(Rout) > 0 and rnwaux[i][2] == 'verbatim':
    output.append('\\begin{Soutput}\n')
    f = open(Rout, 'r')
    output.append(''.join(f.readlines()))
    f.close()
    output.append('\\end{Soutput}\n') 
  output.append(''.join(lines[(id2[i]+1):id1[i+1]]))
fname = "%s.tex" % DOCNAME  
ftex = open(fname, 'w')
ftex.write(''.join(output)) 
ftex.close()
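
As a small illustration of how 'write.py' interprets 'rnw.aux', each line of that file has the form 'label echo results', as written by 'mk.py'. The helper below, whose name 'parse_aux_line' is made up for this example, shows the decision taken for one chunk. In the actual script the output is included only if, in addition, the corresponding '.Rout' file exists and is not empty.

```python
def parse_aux_line(line):
    """Split one line of rnw.aux into the decisions taken for a chunk.

    Hypothetical helper; the line format 'label echo results' is the
    one written by mk.py.
    """
    label, echo, results = line.strip().split(' ')
    return {
        'label': label,
        'show_input': echo == 'True',          # echo=TRUE
        'show_output': results == 'verbatim',  # results='verbatim'
    }


print(parse_aux_line('fig False hide\n'))
```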

The process is therefore as follows: run 'mk.py' to create the Makefile and then run 'make'. According to the variables 'Rprefix' and 'figPrefix' defined in 'mk.py' and 'write.py', two folders named 'Rchunks' and 'figures' must first be created in the directory where 'document.rnw' is stored.
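
The two folders can be created by hand, or with a few lines of Python run before 'mk.py'. This is only a convenience sketch; the function name 'ensure_dirs' is an assumption for this example, and the folder names must match the values of 'Rprefix' and 'figPrefix'.

```python
import os


def ensure_dirs(base, names=('Rchunks', 'figures')):
    """Create the folders expected by mk.py under 'base' if missing.

    Hypothetical helper; returns the list of folders it created.
    """
    created = []
    for name in names:
        path = os.path.join(base, name)
        if not os.path.isdir(path):
            os.makedirs(path)
            created.append(path)
    return created


# Create the folders next to document.rnw (here, the current directory):
# ensure_dirs('.')
```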

The necessary files to try an example can be found here: document.rnw | mk.py | write.py | Makefile.

After posting this content I realized that the R function 'all.vars' could be used to keep the Makefile up to date. This function returns the names in an R expression and, hence, it may simplify the scripts proposed here; it may also make it easier to extend the scripts to particular situations not considered here. This calls for another post.

About

This blog is part of jalobe's website.

jalobe.com
