Reproducible reports for decision making

By Alexander Matrunich

The problem and the opportunity

Data flows in organizations today are constant and unstoppable. A report based on a slice of data taken at a specific moment can be outdated half an hour later. The R infrastructure lets us create documents in which tables, plots and other calculated elements are updated automatically every time the document is regenerated.

This technology is called RMarkdown.

What exactly do you get by employing RMarkdown

An analyst can prepare a report by mixing body text with chunks of source code in R (or another statistical language). When the final document is produced, each code chunk is replaced by the result of its evaluation: plotting instructions are replaced by graphics, tabulating commands by tables, and so on.
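To make the mix of text and code concrete, here is a minimal sketch of what such a source file might look like. The file name, title, chunk labels and the use of the built-in mtcars data set are assumptions chosen purely for illustration. The script writes the file and prints it; the chunk delimiters are built with strrep() only to keep this listing free of nested code fences.

```r
# Build a minimal R Markdown source file line by line.
# File name, title and chunk labels are hypothetical, chosen for illustration.
fence <- strrep("`", 3)  # chunk delimiter: three backticks

rmd_lines <- c(
  "---",
  'title: "Fuel economy summary"',
  "output: html_document",
  "---",
  "",
  "## Overview",
  "",
  # Inline code: replaced by its value when the document is generated.
  "The data set contains `r nrow(mtcars)` cars.",
  "",
  # A code chunk: replaced by the table it produces.
  paste0(fence, "{r mpg-table, echo=FALSE}"),
  "knitr::kable(head(mtcars[, c('mpg', 'cyl', 'wt')]))",
  fence,
  "",
  # A second chunk: replaced by the plot it draws.
  paste0(fence, "{r mpg-plot, echo=FALSE}"),
  "plot(mtcars$wt, mtcars$mpg, xlab = 'Weight', ylab = 'Miles per gallon')",
  fence
)

rmd_file <- file.path(tempdir(), "report.Rmd")
writeLines(rmd_lines, rmd_file)

# Show the source exactly as the analyst would see it in an editor.
cat(readLines(rmd_file), sep = "\n")
```

Everything outside the delimited chunks is ordinary text; only the chunks and the inline expression are evaluated when the report is generated.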

For dissemination and publication, an RMarkdown file can be converted into a broad range of formats: Word, PDF, HTML web pages, etc. That is right: the analyst works with a single source document, and that one document can generate output in any of several formats.
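As a rough sketch of the single-source idea, the script below renders one source file into two formats with render() from the rmarkdown package. The file name and its contents are made up for the example, and it assumes that rmarkdown and Pandoc are installed. PDF output is left out because it additionally requires LaTeX.

```r
library(rmarkdown)

# A tiny, hypothetical source document written to a temporary folder.
rmd_file <- file.path(tempdir(), "formats-demo.Rmd")
writeLines(c(
  "---",
  'title: "Format demo"',
  "---",
  "",
  "The mean of the numbers 1 to 10 is `r mean(1:10)`."
), rmd_file)

# Output formats to produce from the same source.
# "pdf_document" could be added once a LaTeX distribution is installed.
formats <- c("html_document", "word_document")

# render() returns the path of the file it created.
outputs <- vapply(
  formats,
  function(fmt) render(rmd_file, output_format = fmt, quiet = TRUE),
  character(1)
)

print(basename(outputs))
```

The source file is never edited between runs; only the requested output format changes.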

What do you need to generate reproducible reports in RMarkdown

Skills

The Markdown format is very simple. It consists of a handful of conventions for marking paragraphs, headings of different levels, numbered and unordered lists, links, etc. If you are familiar with LaTeX, Markdown is much simpler. It is hard to compare Markdown with preparing documents in MS Word, because the two take completely different approaches. You are unlikely to spend more than 10 minutes learning how to use Markdown.

Learning R is another story!

Software

All software for producing and working with RMarkdown documents is free. Basically, you need R and the RStudio IDE. RStudio will provide you with all required R packages and dependencies. If you want to generate documents in PDF format, you will also need to install a LaTeX distribution. That is more involved, but LaTeX is free software as well.
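If you want to confirm that the pieces are in place, a quick check along the following lines may help. It is only a sketch: it assumes that pdflatex is the LaTeX engine you would use, and other engines would need a different name in the Sys.which() call.

```r
# Is the rmarkdown package installed? (It brings in knitr as a dependency.)
has_rmarkdown <- requireNamespace("rmarkdown", quietly = TRUE)

# Can R find Pandoc? RStudio ships with its own copy.
has_pandoc <- has_rmarkdown && rmarkdown::pandoc_available()

# Is a LaTeX engine on the search path? Only needed for PDF output.
# Assumption: pdflatex is the engine of interest.
has_latex <- nzchar(unname(Sys.which("pdflatex")))

status <- c(rmarkdown = has_rmarkdown, pandoc = has_pandoc, latex = has_latex)
print(status)

if (!has_rmarkdown) {
  message('The rmarkdown package is missing; try install.packages("rmarkdown").')
}
```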

How does RMarkdown work

RMarkdown thus consists of two components: text in Markdown format and chunks of R commands. The main work is done by the R package knitr. It scans the document, finds the R commands, executes them, and creates a new document in which the commands are replaced by their results. At this point we have a clean Markdown document. Pandoc, a separate conversion program, is then run (the rmarkdown package takes care of calling it) to convert the Markdown document into a file of the desired format.
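Normally a single command or a button in RStudio runs both stages, but they can also be run by hand, which makes the division of labour visible. The sketch below assumes the knitr and rmarkdown packages are installed; the file names and the chunk content are hypothetical.

```r
library(knitr)

work_dir  <- tempdir()
rmd_file  <- file.path(work_dir, "pipeline.Rmd")   # source: text + R chunks
md_file   <- file.path(work_dir, "pipeline.md")    # intermediate: plain Markdown
html_file <- file.path(work_dir, "pipeline.html")  # final output

# Write a tiny source document with one code chunk.
fence <- strrep("`", 3)  # chunk delimiter: three backticks
writeLines(c(
  "# Pipeline demo",
  "",
  "Summary of the speed column in the built-in cars data set:",
  "",
  paste0(fence, "{r}"),
  "summary(cars$speed)",
  fence
), rmd_file)

# Stage 1: knitr executes the chunk and writes a Markdown file
# in which the chunk is accompanied by its printed result.
knit(rmd_file, output = md_file, quiet = TRUE)
cat(readLines(md_file), sep = "\n")

# Stage 2: Pandoc converts the Markdown file into the target format.
# Without further options the result is an HTML fragment, not a full page.
if (rmarkdown::pandoc_available()) {
  rmarkdown::pandoc_convert(md_file, to = "html", output = html_file)
}
```

Looking at the intermediate pipeline.md file is a handy way to see what knitr produced before Pandoc touched it.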