Rotating movies with RGL

Learned another thing today: it is very simple to create animated GIFs in the rgl package with the build-in functions spin3d, play3d and movie3d.

library( pca3d )
data(metabo)
pca <- prcomp( metabo[,-1], scale.= TRUE )
pca3d( pca, group= metabo[,1] )
rot <- spin3d( axis= c( 0, 1, 0 ) )
movie3d( rot, duration= 12 )

Here is the result:

movie

Check credit card numbers using the Luhn algorithm

You can find a nice infographic explaining how to understand the credit card number here.

Credit card numbers include, as the last digit, a simple checksum calculated using the Luhn algorithm. In essence, we add up the digits of the card number in a certain way, and if the resulting sum divided by 10 is not an integer (i.e., the reminder of dividing by 10 is zero), then the number is invalid (the reverse is not true; a number divisible by 10 can still be invalid).

This can be easily computed in R, although I suspect that there might be an easier way:

checkLuhn <- function( ccnum ) {

  # helper function
  sumdigits <- function( x ) 
    sapply( x, function( xx ) 
          sum( as.numeric( unlist( strsplit( as.character( xx ), "" ))))
          )
  # split the digits
  ccnum <- as.character( ccnum )
  v <- as.numeric( unlist( strsplit( ccnum, "" ) ) )
  v <- rev( v )

  # indices of every second digit
  i2 <- seq( 2, length( v ), by= 2 )
  v[i2] <- v[i2] * 2
  v[ v > 9 ] <- sumdigits( v[ v > 9 ] )

  if( sum( v ) %% 10 == 0 ) return( TRUE )
  else return( FALSE )

}


Sample size / power calculations for Kaplan-Meier survival curves

The problem is simple: we have two groups of animals, treated and controls. Around 20% of the untreated animals will die during the course of the experiment, and we would like to be able to detect effect such that instead of 20%, 80% of animals will die in the treated group, with power 0.8 and alpha=0.05. Group sizes are equal and no other parameters are given.

What is the necessary group size?

I used the `ssizeCT.default` function from the `powerSurvEpi` R package. Based on the explanation in the package manual, this calculates (in my simple case) the required sample size in a group as follows:

n = \frac{m}{p_E + p_C}

where p_E and p_C are, respectively, probabilities of failure in the E(xpermiental) and C(ontrol) groups. I assume that in my case I should use 0.8 and 0.2, respectively, so n=m. The formulas here are simplified in comparison with the manual page of ssizeCT.default, simply because the group sizes are identical.

m is calculated as

m=\big(\frac{RR+1}{RR-1}\big)^2(z_{1-\alpha/2}+z_{1-\beta})^2

RR is the minimal effect size that we would like to be able to observe with power 0.8 and at alpha 0.05. That means, if the real effect size is RR or greater, we have 80% chance of getting a p-value smaller than 0.05 if the group sizes are equal to m. To calculate RR, I first calculate \theta, the hazard ratio, and for this I use the same approximate, expected mortality rates (20% and 80%):

\theta = \log(\frac{\log(0.8)}{\log(0.2)}) = -1.98

Since RR=exp(\theta)=0.139; thus m=18.3. This seems reasonable (based on previous experience).

Copy-paste and R interactive sessions

I wanted to have a way of easily copying R commands from my interactive sessions. I often work interactively, while creating a pipeline or lab-book like readme in a separate file.

Of course, you can just copy anything that is on the screen and paste it. However, I want to have my readme file to be directly executable, so generally I want only to copy the commands. The solution was to write a small function, called xclip, which copies an arbitrary number of lines from history into the X window system primary clipboard. Naturally, it is not compatible with Windows or MacOS. However, I can copy now a fragment of recent history with one command and one mouse click (since the primary X window system clipboard runs without shortcuts).

The function actually is only a wrapper around the little xclip command utility, available for Linux.

xclip <- function (n = 5) 
{
    t <- tempfile()
    savehistory(file = t)
    system(paste0("tail -n ", n, " ", t, "|xclip"))
}
/ Tagged

riverplot

Prompted by this cross-validated discussion, I have created the riverplot package. Here is a minimal gallery of the graphics produced by the package:

example

Here is an example which recreates the famous Minard plot:

minard

So, how to do these figures:

First, you need to create a specific riverplot object that can be directly plotted. (Use riverplot.example to generate an example object). Here, I show how to recreate the Minard plot using the provided data. makeRiver, the function that will create the object necessary for plotting, will use data frames to input information about nodes and edges, but we must use specific naming of the columns:

library( riverplot )
data( minard )
nodes <- minard$nodes
edges <- minard$edges
colnames( nodes ) <- c( "ID", "x", "y" )
colnames( edges ) <- c( "N1", "N2", "Value", "direction" )

Now we can add some style information to the “edge” columns, to mimick the orignal Minard plot:

# color the edges by troop movement direction
edges$col <- c( "#e5cbaa", "black" )[ factor( edges$direction ) ]
# color edges by their color rather than by gradient between the nodes
edges$edgecol <- "col"

# generate the riverplot object
river <- makeRiver( nodes, edges )

The makeRiver function reads any columns that match the style information (like colors of the nodes) and uses it to create the river object. The river object is just a simple list, you can easily view and manipulate it — or create it with your own functions. The point about makeRiver is to make sure that the data is consistent.

Once you have created a riverplot object with one of the above methods (or manually), you can plot it either with plot(x) or riverplot(x). I enforce the use of lines, and I also tell the plotting function to use a particular style. The default edges look curvy.

style <- list( edgestyle= "straight", nodestyle= "invisible" )
# plot the generated object
plot( river, lty= 1, default.style= style )
# Add cities
with( minard$cities, points( Longitude, Latitude, pch= 19 ) )
with( minard$cities, text( Longitude, Latitude, Name, adj= c( 0, 0 ) ) )

myfuncs 1.2

Thanks to the roxygen2, I quickly created manuals for many little functions that I put all together in the “myfuncs” package — a collection of bits and pieces that have nowhere else to go.

You can download it from here.

roxygen2: documenting R functions

From what I gathered on the roxygen2 package, it was perfect for me: documentation in Rd format generated automatically from the comments in the source files.

However, it took me a while to figure out how, exactly, does that work. The roxygen manual is rudimentary and technical; there is no vignette for roxygen2 and the vignette for roxygen is not applicable. I was lost and confused — so much for literate programming!

Still, once I understood how to proceed, it is truly a huge improvement in package creation. WRONG: not package creation; that was my first mistake. Roxygen does not help to maintain the package structure, and although it does update some of the package files apart from the manual pages, notably the NAMESPACE file, in general it is just for the sole purpose of maintaining the man pages. Which is a lot.

Creating the package

So, I had this source code file with several functions, and I wanted to build a package around it. First, I generated the package with the standard R package.skeleton function. I changed into the package directory, deleted the “Read-me-and-delete-me” file and modified the DESCRIPTION file (roxygen will not do it).

Second: Roxygen looks up the directory packagename/R and searches for source files which are there. So, if you want to create a man page for something, you need to have a corresponding source file. Pretty straightforward for functions that are already in the R directory. However, to create a man page for the package itself (i.e., the name-package.Rd file), one needs to create a dummy file called “name-package.R” in the R directory of the package. This file only contains the documentation in this commented, roxygen-specific format, and a single line of code containing a NULL. (that was actually one of these things which made me get stuck: in the roxygen vignette, an example is given with NA instead of NULL, which does not work, at least not in the recommended roxygen2. Also, command line examples don’t work).

To generate the manuals, start R in the parent directory of the package and run roxygenize( "packagedir" ). Presto, the manuals are there.

Using the roxygen tags to document the functions

Of course, you still need to edit the source files (including name-package.R) using the roxygen style formatting. This, fortunately, is straightforward. A block of roxygen code precedes the function to document; each line starts with #' (comment char + single quote). Several tags are necessary (notably the @export tag that indicates that the function should be exported in NAMESPACE, which I tend to forget), but this was simple to figure out.

The most important parts of the document — title, short description and details — can be included at the start of the roxygen block without any additional tags:

#' Hello world
#' 
#' The most boring program in the world
#'
#' This is so annoyingly boring that I don't even
#' know why I write it.
hello.world <- function( s="stuff it" ) 
    print( sprintf( "Go and %s, world", s ) )

So far, so good. However, this will not generate the manual: we need to mark the function for exporting. Also, we can add more tags: describes what the parameters do, who is the author, add an example.

#' Hello world
#' 
#' The most boring program in the world
#'
#' This is so annoyingly boring that I don't even
#' know why I write it.
#' @param s What the world should do and how
#' @export
#' @author God <bog@@niebo.org>
#' @examples
#' hello.world( "bugger yourself" )
hello.world <- function( s="stuff it" ) 
    print( sprintf( "Go and %s, world", s ) )

Note that @ has a special meaning for roxygen, and so it must be escaped (with another @, to be inconsistent, but practical).

Minimal example

To test this minimal example above, at least one has to do the following:

  • Create a valid DESCRIPTION file
  • Create the R directory (mkdir R)
  • Save the above code in a file in the R directory
  • Start R (in the directory where the DESCRIPTION and R reside)
  • Enter library( roxygen2 ) ; roxygenize( "." )

Roxygen will now produce a man directory in the current directory, and there, save a file called “hello.world.Rd”. This is the contents of the file:

\name{hello.world}
\alias{hello.world}
\title{Hello world}
\usage{
hello.world(s = "stuff it")
}
\arguments{
  \item{s}{What the world should do and how}
}
\description{
The most boring program in the world
}
\details{
This is so annoyingly boring that I don't even know why I
write it.
}
\examples{
hello.world( "bugger yourself" )
}
\author{
God ˂bog@niebo.org˃
}

(the < / &rt; look funny in the code above, but this is just a wordpress deficiency, the code is correct)

Links

Hadley Wickham, one of the authors of roxygen, published Advanced R Programming, an online book that includes, among other things, a tutorial for Roxygen; incomplete at best, but useful. Unfortunately, this is not easy to find and I still think that to merit a “literate programming bagde” one should document the roxygen package much better than it is documented at present.