# Fold-change bar plots with “0” on y axis

I see it more and more frequently: bar plots which are supposed to illustrate the regulation of a gene in terms of “fold change”, which include a “0” on the y axis.

It is subtle, but it irks me a lot. Also, the last time I tried to argue about it with my experimentalist colleagues, I heard that “everybody does it like this” and that I am nit-picking.

What is the fold change? Suppose that you have before and after measurements, $a_0$ and $a_1$. Now, the fold change is

$F=\frac{a_1}{a_0}$

Could you replace $a_0$ by $a_1$ and vice versa? Yes, you could define it as $\frac{a_0}{a_1}$, right? That would be the fold decrease (how many times smaller) rather than the fold increase (how many times larger).

OK, so what does that mean if the fold change is equal to 0?

First, think what it means that the fold change is equal to 0.5. That means that $a_1$ is half of $a_0$, or that $a_0$ is twice $a_1$.

What about 0.1? That means that $a_1$ is ten times smaller than $a_0$.

0.01? Hundred times.

0.001? Thousand times.

You see where this is going. As the fold change approaches zero, the ratio $\frac{a_0}{a_1}$ approaches infinity; you could say (incorrectly) that when the fold change is equal to zero, $a_1$ is infinitely smaller than $a_0$.

Of course, this is outside of regular statistics. In other words, a fold change of 0 is meaningless. If you measured $a_1$ and it was zero, you cannot meaningfully compute the fold change: the reciprocal ratio $\frac{a_0}{a_1}$ is undefined. Putting a zero on the y axis is therefore as meaningful as putting “infinity”.

For that and other reasons, in many applications one calculates the log-fold change rather than fold change:

$\log_2 FC = \log_2\frac{a_1}{a_0} = \log_2{a_1} - \log_2{a_0}$

That makes the measure nice and symmetric around 0. If $a_1$ is twice as high as $a_0$, then $\log_2 FC=1$. If it is half of $a_0$, then $\log_2 FC=-1$. Also, it follows that $a_0$ and $a_1$ cannot be equal to 0, because you cannot take the logarithm of zero.
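A minimal sketch of this calculation for the command line, assuming awk is available; the function name `log2fc` and the numbers are made up for illustration:

```shell
# log2fc A0 A1: print log2(A1/A0) with three decimals.
# Refuses non-positive input, because the logarithm is undefined there.
log2fc() {
  awk -v a0="$1" -v a1="$2" 'BEGIN {
    if (a0 <= 0 || a1 <= 0) { print "NA"; exit 1 }
    printf "%.3f\n", log(a1 / a0) / log(2)
  }'
}

log2fc 10 20                              # doubling
log2fc 20 10                              # halving
log2fc 10 0 || echo "undefined for zero"  # a zero measurement
```

Doubling and halving come out as 1.000 and -1.000, mirror images of each other, while a zero measurement is rejected rather than silently turned into a number.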

Moreover, in most applications, logFC is (more or less) normally distributed. Fold change not only isn’t, it cannot even be: it is bounded below by zero. That means that not only is putting a zero on the y axis meaningless; calculating parametric statistics such as the mean and standard deviation of fold change is equally misleading. You simply shouldn’t do that.
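A made-up pair of replicates (illustrative numbers, not data) shows the problem. In one replicate the gene doubles, $F_1 = 2$; in the other it halves, $F_2 = 0.5$. The two changes are mirror images of each other, so a sensible summary should report no net change. The arithmetic mean of the fold changes does not:

$\frac{F_1 + F_2}{2} = \frac{2 + 0.5}{2} = 1.25$

The mean of the log-fold changes does:

$\frac{\log_2 F_1 + \log_2 F_2}{2} = \frac{1 + (-1)}{2} = 0, \qquad 2^0 = 1$

The first number suggests a 25% up-regulation that is not there; the second, back-transformed, gives a fold change of 1.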

But people nonetheless do, and they are happy with that. That is why we cannot have nice things.

# Testing variance before ANOVA

“To make the preliminary test on variances [before running a t-test or ANOVA] is rather like putting to sea in a rowing boat to find out whether conditions are sufficiently calm for an ocean liner to leave port!”

• George Box, Biometrika 1953;40:318–35.

# More on reveal.js and pandoc

One of the problems I had with reveal.js was the interactive PDF export mode: not only do you need google-chrome for that, there is also no easy way of automating the task.

It turns out that decktape.js is a good command-line solution. The only drawback is that it actually creates screenshots from a browser, so that the slides do not contain any text: they are just a bunch of screenshots! This makes the PDF huge and not searchable. Moreover, you really want the script to wait between the screenshots (by default one second, which makes the whole process slow), otherwise it creates screenshots of the transition, and the result does not look good.

On the up side, it looks exactly like the presentation.

There were two issues with installing it on Ubuntu 14.04, though. First, it was necessary to install the libjpeg62 package, and second, it was necessary to install the gcc 4.9 compiler, which I did using the toolchain PPA:

sudo add-apt-repository ppa:ubuntu-toolchain-r/test
sudo apt-get update
sudo apt-get install gcc-4.9 g++-4.9


Everything else went smoothly.
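If you want to check up front which pieces are still missing, something along these lines might help. It is only a sketch: the function name `missing_tools` is made up, and the command names in the example call are my assumption of what the setup needs.

```shell
# missing_tools NAME...: print every NAME that is not found on the PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Assumed command names; adjust to whatever your setup actually needs.
missing_tools gcc-4.9 g++-4.9 phantomjs
```

An empty output means that all the listed commands were found.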

Then I put phantomjs into ~/bin/, the decktape/ directory into ~/.local/share/, and wrote a little bash script to be able to call it easily from anywhere:

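    #!/bin/bash

    PHANTOMJS=~/bin/phantomjs
    DECKTAPE=~/.local/share/decktape/decktape.js

    # first argument: input file; second (optional) argument: output PDF
    FILE=$1; shift
    PDF=$1; shift

    # without arguments, print the usage and the decktape help
    if [ -z "${FILE}" ] ; then
      cat <<EOF
    Usage: ${0##*/} <input file> [output file [options]]

    decktape options:
    EOF
      $PHANTOMJS $DECKTAPE -h
      exit 0
    fi

    # default output name: input file name with the extension replaced by .pdf
    if [ -z "$PDF" ] ; then PDF=${FILE%.*}.pdf ; fi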

    $PHANTOMJS $DECKTAPE "$@" "$FILE" "$PDF"

# Two bar plots

What is the difference between the two bar plots below? I am sitting at a conference, and this type of plot is relatively frequent in the presentations. Complete with a log scale.

The answer is, of course, that there is no difference between these two: the data are exactly the same, the only thing that differs is the vertical scale.

These two plots explain why you should never, ever use a bar plot to represent log-scaled data: the position at which the y axis starts is completely arbitrary, yet it greatly influences our perception of which plot shows a larger difference. (See also “Kick the bar chart habit”)

# R-devel in parallel to regular R installation

Unfortunately, you need both: R-devel (development version of R) if you want to submit your packages to CRAN, and regular R for your research (you don’t want the unstable release for that).

Fortunately, installing R-devel in parallel is less trouble than one might think.

Say we want to install R-devel into a directory called ~/R-devel/, and we will download the sources to ~/src/. We will first set up two environment variables to hold these two directories:

    export RSOURCES=~/src
    export RDEVEL=~/R-devel

Then we get the sources with SVN. On Ubuntu, you need the subversion package for that:

    mkdir -p $RSOURCES
    cd $RSOURCES
    svn co https://svn.r-project.org/R/trunk R-devel
    R-devel/tools/rsync-recommended

Then, we compile R-devel. R might complain about missing developer packages with header files; in that case the necessary package name must be guessed and the package installed (e.g. libcurl4-openssl-dev on Ubuntu when configure complains about missing curl):

    mkdir -p $RDEVEL
    cd $RDEVEL
    $RSOURCES/R-devel/configure && make -j
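Before going on, it may be worth checking that the build actually produced an R executable. A minimal sketch, assuming the build directory is `$RDEVEL` as above (with `~/R-devel` as a fallback); the function name `built_r_ok` is made up:

```shell
# built_r_ok DIR: succeed if DIR/bin/R exists and is executable.
built_r_ok() {
  [ -x "$1/bin/R" ]
}

# RDEVEL is the build directory from the text; fall back to ~/R-devel.
if built_r_ok "${RDEVEL:-$HOME/R-devel}"; then
  echo "R-devel build found"
else
  echo "no R executable yet; check the configure/make output"
fi
```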


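That's it. Now we just need to set up a script to launch the development version of R (in the script, `$RDEVEL` stands for the actual path, unless you export the variable in your shell profile):

    #!/bin/bash
    export PATH="$RDEVEL/bin/:$PATH"
    export R_LIBS=$RDEVEL/library
    R "$@"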


You need to save the script in an executable file somewhere in your $PATH, e.g. ~/bin might be a good idea. Here are commands that make this script automatically in ~/bin/Rdev:

    cat <<EOF >~/bin/Rdev
    #!/bin/bash
    export R_LIBS=$RDEVEL/library
    export PATH="$RDEVEL/bin/:\$PATH"
    R "\$@"
    EOF
    chmod a+x ~/bin/Rdev

One last thing remaining is to populate the library with the packages that R-devel needs to run and check packages, in my case

    c("knitr", "devtools", "ellipse", "Rcpp", "extrafont", "RColorBrewer", "beeswarm", "testthat", "XML", "rmarkdown", "roxygen2" )

and others (I keep expanding this list while checking my packages). Also, the Bioconductor packages limma and org.Hs.eg.db, which I need for a package I am building.

Now I can check my packages with

    Rdev CMD build xyz
    Rdev CMD check xyz_xyz.tar.gz

# Presentations in (R)markdown

There are many ways to turn a markdown or Rmarkdown document into a presentation. Way too many, and none of them is perfect.

I made my first presentation with knitr / Rmarkdown for the tmod package. After trying various options in knitr, I decided on an approach in which the Rmarkdown document is oblivious of the presentation system and the job of turning it into a presentation is taken up by pandoc. There were several bumps and problems, so here is a step-by-step guide.

# 1. Input file

Let’s start with an example Rmd. In the following, I assume it has been saved under “test.Rmd”. (The `plot1` chunk sits between the usual knitr triple-backtick fences.)

    ---
    title: "Example presentation"
    author: January Weiner
    date: "`r Sys.Date()`"
    ---

    # First part

    ## Slide 1

    Code:

    {r plot1}
    plot(1:10, 1:10)

    ## Slide 2

    Some maths: $\sum_{i=1}^{N}$

    # Second part

    ## Slide 3

    ... contents ...

# 2. From Rmarkdown to markdown

I use knitr only to create a markdown file.

    Rscript -e 'knitr::knit("test.Rmd")'

This produces the file test.md. With that, knitr’s job is finished; we will not need it anymore.

# 3. Download reveal.js

I decided on reveal.js. It was easy to work with and adapt to my needs, it has elegant default themes, a low footprint and keyboard shortcuts. And it has the “2D” layout, meaning that sections (level one headers) are arranged horizontally, while slides within one section are arranged vertically.
Pressing “Esc” in a presentation shows the slide overview.

Anyway, download reveal.js and unpack it in the same directory as test.md.

# Making the presentation

Use pandoc to create the reveal.js presentation. Note that this is not the final command line; in the following points I will discuss the problems which will influence the final version.

    pandoc -s -S -t revealjs --mathjax -o test.html test.md

# 4. MathJax

On slide 2, we have a bit of maths. The maths is written in a LaTeX-like notation, and there are many ways to turn it into an elegant mathematical equation in the final presentation. I have tried many options with pandoc, and found that only MathJax works properly and without major hassle. This is why I used the option --mathjax on the previous command line.

However, if you run the above command line, you will notice that on “Slide 2” the maths doesn’t work, despite the ‘--mathjax’ option. It would work, though, if we put the file on a server. The reason is that pandoc puts the URL to MathJax in the form ‘src=”//cdn.mathjax…”‘. Such a protocol-relative URL inherits the scheme with which the page was opened. If we opened it from a server, using http or https, this would work. If we open the file directly in a browser, it resolves to “file://cdn.mathjax…”, which is obviously not on our file system.

We have two options.

## 4.1 External MathJax

Use the command line

    pandoc -s -S -t revealjs --mathjax="http://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML" -o test.html test.md

This works unless we have no Internet access, for example because we show our presentation at another institute, where our laptop cannot connect to the Internet; then we are screwed.
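Instead of passing the full URL to pandoc, one could also patch the protocol-relative reference in the generated HTML after the fact. This is only a sketch, not something pandoc offers itself: the function name `fix_protocol_relative` is made up, it assumes the host also serves https, and it rewrites every `src="//…"` attribute in the file, not just the MathJax one.

```shell
# fix_protocol_relative FILE: rewrite src="//host/..." to src="https://host/..."
# in place, keeping a backup copy in FILE.bak.
fix_protocol_relative() {
  sed -i.bak 's|src="//|src="https://|g' "$1"
}

# Usage (after running pandoc):
#   fix_protocol_relative test.html
```

Like the explicit URL, this still needs an Internet connection when the presentation is shown.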
## 4.2 Local MathJax

Alternatively, you can download the whole MathJax:

    wget https://github.com/mathjax/MathJax/archive/v2.5-latest.zip
    unzip v2.5-latest.zip
    mv MathJax-2.5-latest/ MathJax

and specify the local installation with the following command line:

    pandoc -s -S -t revealjs --mathjax="MathJax/MathJax.js?config=TeX-AMS-MML_HTMLorMML" -o test.html test.md

This works, but our presentation suddenly takes up over 170 megabytes. Which sucks.

# 5. 2D layout and section headers

I mentioned previously that reveal.js allows a neat 2D layout, in which slides from one section are arranged vertically, and sections are put next to each other. However, sections with only a title and no contents might be a bit boring, so let us modify the .md file, changing the second section as follows:

    # Second part

    This is the second part, even more interesting.

    ## Slide 3

    ... contents ...

You run pandoc again, and… Huh, where has the 2D layout gone? Why are all slides next to each other? Why are all slides from one section on one single slide?

Pandoc automatically guesses which header level denotes the boundaries between slides. It defines “slide level” as “the highest level followed immediately by non-header contents”. After our modification, the top level header (starting with a single #) became the level at which slides are separated.

OK, so maybe we try specifying the slide level manually?

    pandoc -s -S -t revealjs --slide-level 2 --mathjax="MathJax/MathJax.js?config=TeX-AMS-MML_HTMLorMML" -o test.html test.md

OK, this works, but… the content under the first level header (“This is the second part…”) is gone! This is because “Headers above the slide level in the hierarchy create “title slides,” which just contain the section title and help to break the slide show into sections.”

Turns out that there is no way we can have both: 2D with slides divided neatly into sections, and section slides which contain more than just a title. Not if we use pandoc, that is.

# 6. Modifying the layout

## 6.1 reveal.js theme

This is the easiest part: pick one of the existing reveal.js themes (I omit the mathjax option for simplicity’s sake, do remember to put it back in):

    pandoc -s -S -t revealjs -o test.html test.md -V theme=blood

Note that the themes listed on the reveal.js website start with a capital letter, but you must specify a lowercase letter in the above command line.

## 6.2 Fine tuning the theme

I did not like the sans-serif, capitalized and decorated fonts of the blood theme (shadows on titles, I beg you). Ugly. However, if you know a little CSS (and you’d better learn it!), you can easily adapt it to your needs. Look up the file reveal.js/css/theme/blood.css for hints and create your own CSS file (let us call it test.css) in the same directory as test.md. In the file below, I reset all the ugly decorations and set two fonts for headers and body, respectively: Garamond for headers, and Quattrocento Sans for body, using the Google Fonts service:

    @import url('http://fonts.googleapis.com/css?family=EB+Garamond');
    @import url('http://fonts.googleapis.com/css?family=Quattrocento+Sans');

    .reveal {
      font-size: 32px;
      font-family: 'Quattrocento Sans', sans-serif;
    }

    .reveal h1, .reveal h2, .reveal h3, .reveal h4, .reveal h5, .reveal h6 {
      font-family: 'EB Garamond', serif;
      font-weight: normal;
      text-transform: none;
      text-shadow: none;
    }

    .reveal h1 { font-size: 2em; }
    .reveal h2 { font-size: 1.7em; }
    .reveal h3 { font-size: 1.4em; }
    .reveal h4 { font-size: 1em; }

Also, as you might notice, I prefer smaller fonts here. We integrate our test.css file with the following option:

    pandoc -s -S -t revealjs -o test.html test.md -V theme=blood --css test.css

## 6.3 Adding a logo

You can add a logo (or whatever other background for your slides) by modifying the CSS file test.css.
If logo.png is the name of your logo, adding this to your CSS will put it on all your slides in the top left corner:

    body {
      background-image: url(logo.png);
      background-repeat: no-repeat;
      background-position: 20px 20px;
    }

## 6.4 Better syntax highlighting

Pandoc’s syntax highlighting doesn’t look good on a dark background. You can add the following to the “test.css” file to reproduce the Solarized theme.

    .reveal pre code { color: #839496; background-color: #2B2B2B; } /* use #FDF6E3 for light background */
    .sourceCode .kw { color: #268BD2; }
    .sourceCode .dt { color: #268BD2; }
    .sourceCode .dv, .sourceCode .bn, .sourceCode .fl { color: #D33682; }
    .sourceCode .ch { color: #DC322F; }
    .sourceCode .st { color: #2AA198; }
    .sourceCode .co { color: #93A1A1; }
    .sourceCode .ot { color: #A57800; }
    .sourceCode .al { color: #CB4B16; font-weight: bold; }
    .sourceCode .fu { color: #268BD2; }
    .sourceCode .re { }
    .sourceCode .er { color: #D30102; font-weight: bold; }

# 7. Creating a PDF of your presentation

Of course you need a PDF for printing and as a backup. There are two ways of producing a PDF from reveal.js. Each one is imperfect.

## 7.1 Creating PDF using pandoc

Since the test.md file is generic markup, we can turn it into a simple PDF

    pandoc -s -S -o test.pdf test.md

Or even a beamer presentation:

    pandoc -s -S -t beamer -o test.pdf test.md

Unfortunately, this is not as nice as our presentation, and it completely ignores whatever we have put in the CSS.

## 7.2 Using the reveal.js printing facility and Google Chrome

The second way is interactive only (you cannot create the PDF with a command line). Open the file in Google Chrome and add ?print-pdf to the file URL, such that the end of the URL reads test.html?print-pdf. The output looks garbled: the slides overlap. Don’t worry, it’s OK. Open the print dialog (press Ctrl-P), and you will see that now the output is correct. You can save it as PDF or send it to a printer.

# 8. The final command line

    pandoc -s -S -t revealjs --mathjax="http://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML" -V theme=blood --css test.css -o test.html test.md

# Kneat tricks

So I have finally switched to knitr for doing my vignettes. The result is satisfactory, but the process was not entirely painless.

• The command to run instead of “R CMD Sweave foo.Rnw” is

    Rscript -e 'rmarkdown::render("foo.rmd")'

• I think that the concept of writing a package whose main purpose is to generate documentation in literate programming without providing mandatory documentation (such as a list of options) within the package itself, referring instead to online resources, is beautifully subversive.

• Knitr in the current R version requires pandoc X.Y.Z, while Ubuntu has X.Y.(Z-1). It was necessary to download the deb package from the pandoc site and install it manually.

• To use knitr in vignettes, you need to add `VignetteBuilder: knitr` to your DESCRIPTION file.

• I was confused at first as to what to do with the old vignette header (the lines that start with “%\Vignette…”). The markdown header is different. Turns out you have to include these lines in the markdown header (Kill me, but I have no idea why there is a “>” behind “vignette:” or “|” behind “abstract:”. Knitr produces neat results, but it is one of the most confusing packages I have ever encountered.):

    ---
    title: "FOO: the fooly of foology"
    author: "January Weiner"
    date: "`r Sys.Date()`"
    output:
      pdf_document:
    vignette: >
      %\VignetteIndexEntry{Foo}
      %\VignetteKeyword{foo}
      %\VignetteKeyword{foology}
      %\VignetteEngine{knitr::rmarkdown}
      %\SweaveUTF8
      \usepackage[utf8]{inputenc}
    abstract: |
      Foo foo foo foo. Foo foo, foo foo foo, foo.
    toc: yes
    bibliography: bibliography.bib
    ---

• `<<>>=` becomes `{r label, fig.width=5, fig.height=5}`. Also, any character argument to options must be in quotes.

• I have no idea why `fig.width=5` works, but `opts_chunk$set(fig.width=5)` doesn’t, and at this point I don’t care to ask.

• I had a nightmarish forensic experience trying to figure out why my figures don’t get updated, where the cache is, and some other things. Turns out that if you give knitr a symbolic link to an rmd file, it will change to the directory where the original file is. Which is not the same behavior as in the case of Sweave.

• It turns out that some options are valid for HTML, but not PDF, and vice versa, and you don’t get a warning. Also, it’s not mentioned in the documentation. Why? Because f— you, that’s why. For example, I spent half an hour trying to change the theme of a PDF vignette, after which it turned out that the theme option is not valid for PDFs. There was a table somewhere showing which options can be used when, but I lost the link and can’t find it in the documentation.

• I haven’t found out how to change the font size when generating a pdf_document (my favorite). Update: I have found out that it is not possible.

• Also, no idea how to prevent small code chunks from being broken across pages, which really, really should not happen.

• At first I specified the vignette engine to be knitr::knitr, but apparently this produces only a (botched) HTML vignette (botched: no title, no author, no references). To generate a neat, honest-to-Knuth PDF via pandoc and LaTeX, one should use knitr::rmarkdown, although that is not documented anywhere.

%\VignetteEngine{knitr::rmarkdown}
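As for the symbolic link surprise mentioned above: to see in which directory knitr will actually end up (and hence where to look for the figures and the cache), one can resolve the link by hand. A small sketch, assuming GNU `readlink -f` as found on Ubuntu; the function name `real_dir` is made up:

```shell
# real_dir FILE: print the directory of the file that FILE ultimately points to.
# Relies on readlink -f (GNU coreutils).
real_dir() {
  dirname "$(readlink -f "$1")"
}

# Usage:
#   real_dir foo.rmd
```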