The R3port package is developed to easily create pdf or html reports including tables, listings and plots created in R. The reporting idea was initiated within the context of clinical research, where listings for pharmacokinetics and safety are generated. The package was set up such that it is generic enough to be used in other fields too.
The main options to create reports in R are knitr
or
rmarkdown
. These packages are perfect for writing small to
medium sized documents. Especially when you want to add code, plots or
tables to help tell the story. There are however times when you’re
working in an R script and just want to see how a table or plot look
like in a browser or pdf. Also when someone else writes the report or it
has relatively little R results, knitr or rmarkdown might not be the
best option There are simple options like png()
and
pdf()
or more advanced within packages like
xtable
. With packages like tables
,
greport
, rapport
or startgazer
you can also directly create (full) reports including R results. But I
believe there is a gap between lower level flexibility and the more
specialized packages Within this vignette I will try to explain my
preferred workflow (as simple as possible) and how I implemented it in
the package. To demonstrate the usage of the package the tips dataset
from the reshape2 package was used as this dataset has a nice
combination of numerical and discrete values:
library(R3port)
library(reshape2)
data(tips)
tips$day <- factor(tips$day,levels=c("Thur","Fri","Sat","Sun" ),ordered = TRUE)
head(tips)
## total_bill tip sex smoker day time size
## 1 16.99 1.01 Female No Sun Dinner 2
## 2 10.34 1.66 Male No Sun Dinner 3
## 3 21.01 3.50 Male No Sun Dinner 3
## 4 23.68 3.31 Male No Sun Dinner 2
## 5 24.59 3.61 Female No Sun Dinner 4
## 6 25.29 4.71 Male No Sun Dinner 4
The package includes a few simple calculation functions. Say we want to know basic statistics for the total bill per gender, day of the week and time of day. the following can be done to achieve this:
## sex day time statistic value
## 1 Female Thur Dinner N 1
## 2 Female Thur Lunch N 31
## 3 Female Fri Dinner N 5
## 4 Female Fri Lunch N 4
## 5 Female Sat Dinner N 28
## 6 Female Sun Dinner N 18
## 7 Male Thur Lunch N 30
## 8 Male Fri Dinner N 7
## 9 Male Fri Lunch N 3
## 10 Male Sat Dinner N 59
## 11 Male Sun Dinner N 58
## 12 Total Thur Dinner N 1
## 13 Total Thur Lunch N 61
## 14 Total Fri Dinner N 12
## 15 Total Fri Lunch N 7
## 16 Total Sat Dinner N 87
## 17 Total Sun Dinner N 76
## 18 Female Thur Dinner Mean 18.78
## 19 Female Thur Lunch Mean 16.65
## 20 Female Fri Dinner Mean 14.31
The function above calculates a standard set of descriptive statistics frequently used in reporting clinical results. Also it places the results in a long format data frame (for easy processing in the other functions). There are some options for calculation of totals and number of digits in output but mainly it uses plyr to do the work. For this reason a predefined set of statisitcs was chosen as it is quite easy to calculate other statistics just using plyr.
Using a similar function, the frequencies within a data frame can be calculated. Let’s say we want to know the frequencies of each gender, day of the week and time of day within the data frame. Something like the following can be used:
## sex day time dnm Freq Perc FreqPerc
## 1 Female Thur Dinner 244 1 0.41 1~~(0.41)
## 2 Female Thur Lunch 244 31 12.70 31~(12.70)
## 3 Female Fri Dinner 244 5 2.05 5~~(2.05)
## 4 Female Fri Lunch 244 4 1.64 4~~(1.64)
## 5 Female Sat Dinner 244 28 11.48 28~(11.48)
## 6 Female Sun Dinner 244 18 7.38 18~(7.38)
In general the base of the function is the default table
within R. Although percentages are also given. Options are available for
the denominator, calculation of totals and taking duplicates into
account. This was mainly inspired by clinical safety analyses.
The package has two sets of functions to create documents in either
tex (and pdf) or html format. The doc functions ltx_doc
and
html_doc
are quite generic in the sense that any kind of
character vector can be included. This means simple text can be added to
a document or the output of functions that can create TeX/html coding.
The following example demonstrates how it works
## \documentclass{article}
## \usepackage[ landscape,a4paper,top=1in, bottom=1.2in, left=0.75in, right=0.75in]{geometry}
## \usepackage{graphicx,longtable,amsfonts,booktabs,fancyhdr,lastpage,hyperref,bookmark,caption,listings,float}
## \usepackage[figuresleft]{rotating}
## \setlength\parindent{0pt}
## \setlength{\LTleft}{0pt}
## \setlength{\LTcapwidth}{8in}
## \pagestyle{fancy}
## \renewcommand{\headrulewidth}{0.4pt}
## \renewcommand{\footrulewidth}{0.4pt}
## \cfoot{}\rfoot{}\lhead{\today}\rhead{\thepage\ of \pageref{LastPage}}
## \makeatletter \renewcommand*\l@table{\@dottedtocline{1}{1.5em}{6em}} \makeatother
## \makeatletter \renewcommand*\l@figure{\@dottedtocline{1}{1.5em}{6em}} \makeatother
## \begin{document}
## This is just some \emph{text}
## \end{document}
## <!DOCTYPE html>
## <html>
## <title> report </title>
## <head>
## <link rel='stylesheet' href='style.css' type='text/css'>
## </head>
## <body>
## This is just some <em>text</em>
## </body>
At first sight it might seem that there is little added value in the
above function, however it’s also possible to output the results of
xtable
or similar packages. Combine this with an output
statement and you can directly see the compiled pdf (or tex if you like)
which is opened by default. The same is goes for the html but without
compilation. The next example demonstrates this. The resulting html and
pdf files for this and all subsequent examples are available in the
links below the code block. Also in some examples the html version is
not displayed but works almost the same as the latex version.
library(xtable)
xtbl <- means(tips,variable="total_bill",by=c("day"))
xtbl_ltx <- print(xtable(xtbl),print.results=FALSE)
ltx_doc(xtbl_ltx,out="out1.tex")
xtbl_html <- print(xtable(xtbl),print.results=FALSE,type="html", html.table.attributes = "class=table")
html_doc(xtbl_html,out="out1.html",
css="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.6/css/bootstrap.min.css")
With the list functions you can directly output a data frame without using xtable. The function is intended to display a data frame a bit more attractive. Within the following example the header is be adapted and the variables are grouped and ordered. Furthermore a title, footnote and column width is adapted
lst1 <- means(tips,variable="tip",by=c("sex","day"),pack=2)
lst2 <- means(tips,variable="total_bill",by=c("sex","day"),pack=2)
lst <- merge(lst1,lst2,by=c("sex","day","statistic"))
names(lst)[4:5] <- c("tip","Total Bill")
ltx_list(lst,vargroup=c("","","","income","income"),group = 2, title="a listing",
fill="-",footnote="a footnote",mancol="p{3cm}llll",out="out2.tex")
An html document can be created quite similar, with the exception of the column width option
The most comprehensive functions are ltx_table
and
html_table
. With these functions it is possible to reshape
the data before presenting the results. This can make the results easier
to read and offers the possibility to tweak the table header. Because
the data is transposed to a wide format it is important to keep the
results “unique”. Read more about aggregating in
reshape2::dcast
which is used in this function.The
following example demonstrates a common usage of the function:
ltx_table(tbl1,x=c("sex","statistic"),y=c("day","time"),var="value",
title="total bill statistics",xabove=TRUE,out="out3.tex")
html_table(tbl1,x=c("sex","statistic"),y=c("day","time"),var="value",
title="total bill statistics",xabove=TRUE,out="out3.html")
The xabove option makes it possible to group the data and to save space. the header has two levels as two variables are selected to cast the results. Although you can add many levels this way, it is advised to use no more than three for the sake of readability. Another examples is presented where labels are added to the data frame and where some common options are adapted
attr(tbl2$sex,"label") <- "Gender"
attr(tbl2$day,"label") <- "Day of the week"
ltx_table(tbl2,x=c("sex","day"),y=c("time"),var="FreqPerc",yhead = TRUE,size="\\normalsize",
title="customer frequency",xabove=TRUE,out="out4.tex",group=1,mancol="lllll",
template=paste0(system.file(package = "R3port"), "/listing.tex"))
With a bit of fantasy you can make combinations like this for instance:
library(plyr)
tbl3 <- freq(tips,c("sex","day","time"),total=c("day","time"),spacechar = "~")
tbl3$statistic <- factor("N (perc)")
tbl3$value <- tbl3$FreqPerc
tbl3 <- rbind.fill(tbl1,tbl3)
ltx_table(tbl3,x=c("sex","day"),y=c("time","statistic"),var="value",convchar = TRUE,
title="total bill statistics",xabove=TRUE,out="out5.tex",
mancol=paste(rep("l",16),collapse=""),
template=paste0(system.file(package = "R3port"), "/listing.tex"))
The graphical output system is basically the same as for the tables and listings. The only difference is that the figures are also saved as png or pdf files. Then they are referenced in the tex or html file, and if selected compiled. Using grid graphics is the easiest because you can just place the plot in an object and pass it to the functions. The way base plots can be used is by placing it in a function. Some examples are given below:
# Used lwidth to ensure correct width in beamer presentations
library(ggplot2)
pl <- ggplot(tips,aes(x=day,y=total_bill)) + geom_boxplot()
ltx_plot(pl,out="out6.tex",titlepr = "plot A",
title="example for basic plotting",lwidth="0.9\\linewidth")
html_plot(pl,out="out6.html",titlepr = "plot A",
title="example for basic plotting")
# you can output a list of plots e.g:
pl <- lapply(unique(tips$sex),function(x){
ggplot(tips[tips$sex==x,],aes(time,total_bill)) + geom_boxplot() +
facet_wrap(~day) + ggtitle(x)
})
ltx_plot(pl,out="out7.tex",titlepr = "plot B",
title="example for plotting lists",lwidth="0.9\\linewidth")
html_plot(pl,out="out7.html",titlepr = "plot B",
title="example for plotting lists")
# base plots can be used but placed in function:
bpl <- function() plot(tips$day,tips$total_bill)
ltx_plot(bpl(),out="out8.tex",titlepr = "plot C",
title="example for base plot",lwidth="0.9\\linewidth")
html_plot(bpl(),out="out8.html",titlepr = "plot C",
title="example for base plot")
The last thing to after creating the tables listings and plots is to
combine it in an overall document. Here the last two functions will come
in ltx_combine
and html_combine
. To have a bit
of background on how these functions work we should first look at all
the output that is created. For each table and listing, by default 3
files are generated. For latex output, the tex file, compiled pdf and a
raw tex file. For html only a html and raw html file. In case of
plotting also a figures folder is created with teh plots. The raw files
have only the text/coding for the table or listing. The combine
functions can pick these files up and place it in a combined document.
The combine argument can be used to select files, for instance if you
want to combine a selected set of output.
You probably noticed that not all output generated has the same
layout. This is controlled by the templates, I didn’t explain the
template argument before because I think it deserves a section of it’s
own. The template system uses whisker
to create output in
different styes. A couple of templates are included within the package
to start with, however users can generate their own templates using this
system. The whisker package is set up to provide a template and a list
with variables to render (called rendlist in this package). Provide a
template and a rendlist to create a custom styled output. The included
templates are demonstrated below with the combine functions (but will
work the same for the list, table and plot functions). But if you want
to experiment read further for some details.
tmpl <- list.files(system.file(package="R3port"),pattern="\\.tex$",full.names=TRUE)
lapply(1:length(tmpl),function(x){
ppt <- ifelse(grepl("beamer.tex",tmpl[x]),TRUE,FALSE)
ltx_combine(out=paste0("combt",x,".tex"),template=tmpl[x],presentation=ppt)
})
comb1.pdf
comb2.pdf
comb3.pdf
comb4.pdf
comb5.pdf
The way the package handles the rendlist is that certain information
is always added to this list within a function. This is off course
necessary to be able to include the table,list or plot to be created
within the final document. Practically this means that the
rrres
tag is always included in the template to place in
the R results to be included. Other special tags include
rrtoc
, css
and orientation
. The
safest way is to not include these tags in rendlist or change this
within the template (or at least be aware of the effect when omitting
tags or placing them elsewhere in the template)