javi | May 7, 2025, 10:25 p.m.
Vignettes are useful supplemental documentation for R packages. A vignette typically mixes ordinary text, usually formatted in LaTeX, with chunks of R code. When the vignette is built, the code chunks are run and their output is printed in the final document. The file document.rnw below is a sample source file for a vignette. This and the other files used in this post are also available at the end of the post.
\documentclass[a4paper,11pt]{article}
\usepackage{graphicx}
\usepackage{fancyvrb} %DefineVerbatimEnvironment
\usepackage[utf8]{inputenc}
\usepackage[english]{babel}
\selectlanguage{english}
\renewcommand{\baselinestretch}{1.3}
\DefineVerbatimEnvironment{Sinput}{Verbatim}{
fontshape = sl,
xleftmargin = 0.7cm,
}
\DefineVerbatimEnvironment{Soutput}{Verbatim}{
fontshape = sl,
xleftmargin = 0.7cm,
}
\title{Sample vignette document}
\date{\today}
\begin{document}
\maketitle
%<<echo=FALSE, results='hide', eval = FALSE>>=
\begin{Sinput}
#if (file.exists(".RData")) {
# load(".RData")
#} else
# save.image(file = ".RData")
abc <- 123
\end{Sinput}
%\verbatiminput{another.Rout}
\begin{equation}
\label{eq1}
x = y + z
\end{equation}
%<<>>=
\begin{Sinput}
x <-log(AirPassengers)
fit1 <- StructTS(x, type = "BSM") # comment
x<-exp(x)
print(x)
\end{Sinput}
Reference to equation \ref{eq1}.
%<<label ='fig', echo=FALSE, results='hide', fig=TRUE>>=
\begin{Sinput}
## comment
postscript(file = file.path("figures", "fig.eps"),
horizontal = FALSE, paper="special", width=5, height=5)
plot(tsSmooth(fit1))
invisible(dev.off())
\end{Sinput}
\begin{figure}[ht]
\caption{\label{fig:g1} Caption}
\centering
\noindent
\includegraphics[width=5in,height=5in]{./figures/fig.pdf}
\end{figure}
%<<echo=FALSE, results='hide', label = 'another', eval=TRUE>>=
\begin{Sinput}
y <- Nile
fit2 <- StructTS( y, type = "level")
z <- c(1, 2, y)
xy <- 2
xy <- c(1, xy)
\end{Sinput}
New reference to equation \ref{eq1}.
\end{document}
Sweave is a useful tool for generating a vignette. Given a source file similar to 'document.rnw' shown above, Sweave generates a LaTeX file in which the input and the output of the R chunks are inserted within appropriate LaTeX environments. It is also possible to hide the input or the output of some of the R chunks. This file can then be processed as usual to get the final PDF, for example with 'pdflatex'.
As with any other document, writing a vignette requires generating several versions of the document to check that everything works as expected and to see how it is shaping up. Running all the scripts whenever we want a preview of the document is not efficient, and running everything from scratch just to see whether a small tweak had the desired result can be frustrating. Some parts of the document will not have been modified, so it is worth avoiding running them again and again.
So we would like to run only those R chunks that have been modified since the last time the document was generated. We must also be aware that some chunks usually depend on other chunks. A code chunk may not have been modified itself, but changes made in another chunk may affect the input it uses and, hence, it should be evaluated again as well. This is the typical scenario for which 'make' is perfectly suited. 'Make' originated in software development as a tool to facilitate the process of building and compiling programs, but it can also be used to generate vignettes or any other kind of dynamic report. The following Makefile processes the input file 'document.rnw' in a more efficient way.
DOCNAME = document
Rdir = ./Rchunks
figDir = ./figures
RFLAGS = --slave --no-timing --no-readline --restore --save
Rout = $(addprefix $(Rdir)/, load.Rout fig.Rout )
Figs = $(addprefix $(figDir)/, fig.pdf)

%.Rout : %.R
	R CMD BATCH $(RFLAGS) $< $@

%.pdf : %.eps
	epstopdf $<

all : R latex

R : $(Rout)
	@echo "done make R"

$(Rdir)/load.Rout : $(addprefix $(Rdir)/, load.R)
$(Rdir)/fig.Rout : $(addprefix $(Rdir)/, fig.R load.R)

latex : $(DOCNAME).rnw $(Rout) $(Figs)
	@python write.py
	@pdflatex $(DOCNAME).tex
	@while (grep "Rerun to get cross-references right" $(DOCNAME).log); do pdflatex $(DOCNAME).tex; done
	@echo "done make latex"

.PHONY : clean

clean :
	@rm -f .RData $(Rout) $(DOCNAME).tex $(DOCNAME).pdf $(DOCNAME).log $(figDir)/*.eps $(figDir)/*.pdf $(DOCNAME).aux rnw.aux
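
To make the idea behind 'make' concrete, the following Python sketch mimics the test that 'make' applies to each rule: a target is rebuilt when it does not exist or when any of its prerequisites has a more recent modification time. The function name 'needs_rebuild' is made up for this illustration; it is not part of 'make' or of the scripts in this post.

```python
import os


def needs_rebuild(target, prerequisites):
    """Return True if 'target' is missing or older than any prerequisite.

    Hypothetical helper that imitates the timestamp comparison done by make.
    """
    if not os.path.exists(target):
        return True
    target_time = os.path.getmtime(target)
    # the target is stale as soon as one prerequisite is newer than it
    return any(os.path.getmtime(p) > target_time for p in prerequisites)
```

For the rule of 'fig.Rout' above, the call would be needs_rebuild('Rchunks/fig.Rout', ['Rchunks/fig.R', 'Rchunks/load.R']), so editing either chunk file triggers a new run of 'fig.R' only.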
Although the bulk of a Makefile can be reused for any document of this type, it requires some maintenance. In particular, the rules stating which files depend on each other must be updated whenever a code chunk is added, removed or modified. In practice this is not convenient, and updating the Makefile may cause more hassle than running everything from scratch. Besides, it is easy to overlook something and make a mistake. A few lines of Python can alleviate the burden of keeping the Makefile up to date. The scripts described below automatically create the Makefile shown above.
The first script, mk.py, copies each chunk that is to be evaluated (eval=TRUE, the default) to a file in the 'Rprefix' directory. If a chunk has no label, the file is named 'chunk__i', where i is the position of the chunk in the document. An auxiliary file 'rnw.aux' is created summarizing the options of all the chunks (it is used later by 'write.py'). The variable 'varsOrigin' records the file in which each variable was created, and the dependency rules are determined from it. Finally, the Makefile is written.
import os
import re

# R logical constants, so that chunk options such as 'echo=FALSE'
# can be evaluated below as Python keyword arguments
TRUE = True
FALSE = False

# user defined variables
DOCNAME = 'document'
Rprefix = './Rchunks'
figPrefix = './figures'
deleteComments = True

# copy each chunk with option eval=TRUE to a file in the 'Rprefix' directory
# and create 'rnw.aux' with the options of all the chunks
# create variable 'varsOrigin' tracking the file where a variable was created
fname = "%s.rnw" % DOCNAME
f = open(fname, 'r')
lines = f.readlines()
f.close()
n = len(lines)
c1 = [i for i in range(n) if re.search(r'^\s*\%<<(.*)>>=.*', lines[i])]
c2 = [i for i in range(n) if re.search(r'\\end\{Sinput\}', lines[i])]
frnwaux = open('rnw.aux', 'w')
chunks = []
labels = []
figures = []
varsOrigin = {}
for i in range(len(c1)):
    # get chunk
    chunk = lines[(c1[i]+2):c2[i]]
    chunkOpts = re.sub(r'^\s*%<<(.*)>>=.*\n$', '\\1', lines[c1[i]])
    chunkOpts = eval("dict(%s)" % chunkOpts)
    # get options and write line to auxiliary file rnw.aux
    label = chunkOpts['label'] if 'label' in chunkOpts else 'chunk__%s' % (i + 1)
    chunkEcho = chunkOpts.get('echo', True)
    chunkResults = chunkOpts.get('results', 'verbatim')
    chunkFig = chunkOpts.get('fig', False)
    frnwaux.write("%s %s %s\n" % (label, chunkEcho, chunkResults))
    # work with those chunks with 'eval=TRUE'
    if chunkOpts.get('eval', True) == True:
        if deleteComments:
            chunk = [re.sub(r'#.*', '', e) for e in chunk]
        chunks.append(chunk)
        if chunkFig == TRUE:
            figures.append('%s.pdf' % label)
        label = '%s.R' % label
        labels.append(label)
        # copy chunk to file
        f = open(os.path.join(Rprefix, label), 'w')
        f.writelines(chunk)
        f.close()
        # get user defined object names
        dvars = [re.sub(r'^\s*(.*)<-.*\n{0,1}$', '\\1', e) for e in chunk if re.search('<-', e)]
        dvars = [re.sub(r'\s$', '', e) for e in dvars]
        # unique names
        dvars = list(set(dvars))
        if len(dvars) > 0:
            varsOrigin.update(dict(zip(dvars, [labels[-1]] * len(dvars))))
frnwaux.close()

# search dependencies
# omit the variables created in the current chunk
# since a code chunk file already depends on itself
allvars = varsOrigin.keys()
dep = [[e] for e in labels]
for i, chunk in enumerate(chunks):
    chunk = ''.join(chunk)
    for v in allvars:
        if varsOrigin[v] != labels[i]:
            #if re.search(r'(\(\s*|=\s*|,\s*)%s\b' % v, chunk):
            if re.search(r'\b%s\b' % v, chunk):
                dep[i].append(varsOrigin[v])

# write the Makefile
Rout = ['%sout ' % e for e in labels]
f = open('Makefile', 'w')
f.write('DOCNAME = %s\n' % DOCNAME)
f.write('Rdir = %s\n' % Rprefix)
f.write('figDir = %s\n' % figPrefix)
f.write('RFLAGS = --slave --no-timing --no-readline --restore --save\n')
f.write('Rout = $(addprefix $(Rdir)/, %s)\n' % ''.join(Rout))
# separate the figure names with spaces in case there is more than one
f.write('Figs = $(addprefix $(figDir)/, %s)\n\n' % ' '.join(figures))
f.write('%.Rout : %.R\n')
f.write('\tR CMD BATCH $(RFLAGS) $< $@\n\n')
f.write('%.pdf : %.eps\n')
f.write('\tepstopdf $<\n')
f.write('\nall : R latex\n\n')
f.write('R : $(Rout)\n')
f.write('\t@echo \"done make R\"\n\n')
for i in range(len(labels)):
    f.write('$(Rdir)/%s: $(addprefix $(Rdir)/, %s)\n' % (Rout[i], ' '.join(dep[i])))
f.write('\nlatex : $(DOCNAME).rnw $(Rout) $(Figs)\n')
f.write('\t@python write.py\n')
f.write('\t@pdflatex $(DOCNAME).tex\n')
f.write('\t@while (grep "Rerun to get cross-references right" $(DOCNAME).log); do ')
f.write('pdflatex $(DOCNAME).tex; ')
f.write('done \n')
f.write('\t@echo \"done make latex\"\n\n')
f.write('.PHONY : clean\n\n')
f.write('clean :\n\t@rm -f .RData $(Rout) $(DOCNAME).tex $(DOCNAME).pdf $(DOCNAME).log ')
f.write('$(figDir)/*.eps $(figDir)/*.pdf $(DOCNAME).aux rnw.aux\n')
f.close()
print('Done, file Makefile has been created.')
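
The dependency search is the core of the script, so here is a minimal, self-contained sketch of the same idea applied to chunks held in memory. It is an illustration only: the function name 'find_dependencies' and the dictionary input are hypothetical, and the regular expressions are somewhat stricter than those used in 'mk.py' (variable names are escaped and a dot is treated as part of an R name).

```python
import re


def find_dependencies(chunks):
    """Map each chunk label to the labels of the chunks it depends on.

    'chunks' is a dict {label: R source code}; this input format is an
    assumption made for the sketch, mk.py reads the chunks from the file.
    """
    # record the chunk where each variable is assigned with '<-'
    origin = {}
    for label, code in chunks.items():
        for line in code.splitlines():
            match = re.match(r'\s*([A-Za-z.][\w.]*)\s*<-', line)
            if match:
                origin[match.group(1)] = label
    # a chunk depends on every other chunk that created a variable it uses
    deps = {}
    for label, code in chunks.items():
        found = set()
        for var, source in origin.items():
            pattern = r'(?<![\w.])%s(?![\w.])' % re.escape(var)
            if source != label and re.search(pattern, code):
                found.add(source)
        deps[label] = sorted(found)
    return deps


example = {
    'load': 'x <- log(AirPassengers)\nfit1 <- StructTS(x, type = "BSM")\n',
    'fig': 'plot(tsSmooth(fit1))\n',
}
print(find_dependencies(example))
```

Here the chunk 'fig' uses 'fit1', which is created in 'load', so 'fig' is reported as depending on 'load'. This is the same relation that appears in the rule for 'fig.Rout' in the Makefile.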
The script write.py is called by the Makefile. Its purpose is to create a LaTeX file where the R input and output are included in the 'Sinput' and 'Soutput' environments. These environments are defined in 'document.rnw' following the style used by Sweave. To be precise, only the input of chunks marked with 'echo=TRUE' in the source file 'document.rnw' and the output of chunks with the option results='verbatim' are included. Similarly, only the R chunks marked with 'eval=TRUE' are evaluated.
import re
import os

TRUE = True
FALSE = False

# same as in 'mk.py'
DOCNAME = 'document'
Rprefix = './Rchunks'

f = open('rnw.aux', 'r')
line = f.readlines()
f.close()
rnwaux = [e[:-1].split(' ') for e in line]

fname = "%s.rnw" % DOCNAME
f = open(fname, 'r')
lines = f.readlines()
f.close()

id1 = [i for i in range(len(lines)) if re.search(r'\\begin\{Sinput\}', lines[i])]
id1 = id1 + [len(lines) + 1]
id2 = [i for i in range(len(lines)) if re.search(r'\\end\{Sinput\}', lines[i])]

output = [''.join(lines[:(id1[0]-1)])]
for i in range(len(id2)):
    if rnwaux[i][1] == 'True':
        output.append(''.join(lines[id1[i]:(id2[i]+1)]))
    Rout = '%s/%s.Rout' % (Rprefix, rnwaux[i][0])
    if os.path.lexists(Rout) and os.path.getsize(Rout) > 0 and rnwaux[i][2] == 'verbatim':
        output.append('\\begin{Soutput}\n')
        f = open(Rout, 'r')
        output.append(''.join(f.readlines()))
        f.close()
        output.append('\\end{Soutput}\n')
    output.append(''.join(lines[(id2[i]+1):id1[i+1]]))

fname = "%s.tex" % DOCNAME
ftex = open(fname, 'w')
ftex.write(''.join(output))
ftex.close()
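
The rule that decides what ends up in the LaTeX file can be summarized for a single chunk as in the following sketch. The function 'render_chunk' and its arguments are hypothetical names used only for this illustration; 'write.py' works directly on the lines of 'document.rnw' and on the '.Rout' files.

```python
def render_chunk(source, r_output, echo=True, results='verbatim'):
    """Return the LaTeX text for one chunk.

    'source' is the R code of the chunk and 'r_output' is the text printed
    by R for that chunk (both are assumptions made for this sketch).
    """
    parts = []
    # the input is shown only when echo=TRUE
    if echo:
        parts.append('\\begin{Sinput}\n%s\\end{Sinput}\n' % source)
    # the output is shown only when it is not empty and results='verbatim'
    if results == 'verbatim' and r_output:
        parts.append('\\begin{Soutput}\n%s\\end{Soutput}\n' % r_output)
    return ''.join(parts)
```

For instance, a chunk with echo=FALSE and results='hide' contributes nothing to the final document, although it may still be evaluated and create objects used by other chunks.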
The process is therefore as follows: run 'mk.py' to create the Makefile and then run 'make'. According to the variables 'Rprefix' and 'figPrefix' defined in 'mk.py' and 'write.py', two folders named 'Rchunks' and 'figures' must first be created in the directory where 'document.rnw' is stored.
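
The two folders can be created by hand or, as a small convenience, with a few lines of Python. The helper below is only a sketch and is not part of the files of this post; the name 'ensure_directories' is made up, and the default folder names are taken from the values of 'Rprefix' and 'figPrefix' shown above.

```python
import os


def ensure_directories(base='.', names=('Rchunks', 'figures')):
    """Create the working folders under 'base' if they do not exist yet.

    Returns the list of paths that were actually created.
    """
    created = []
    for name in names:
        path = os.path.join(base, name)
        if not os.path.isdir(path):
            os.makedirs(path)
            created.append(path)
    return created
```

Calling ensure_directories() once in the directory of 'document.rnw', before running 'mk.py', is enough; calling it again does nothing because the folders already exist.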
The necessary files to try an example can be found here: document.rnw | mk.py | write.py | Makefile.
After posting this content I realized that the R function 'all.vars' could be used for our purpose of keeping the Makefile up to date. This function returns the names that appear in an R expression and, hence, it may simplify the scripts proposed here; it may also make it easier to extend the scripts to particular situations not considered here. This calls for another post.
This blog is part of jalobe's website.
jalobe.com