Using caret

This is a caret call I use frequently. Here x is the training data and y is the response:

library(caret)  # provides trainControl() and train()
library(doMC)   # parallel backend
registerDoMC(cores=6)

tc <- trainControl(method="repeatedcv", number=10, repeats=1, 
  returnData=TRUE, savePredictions="all", verboseIter=TRUE, classProbs=TRUE)
mod <- train(x=x, y=y, trControl=tc, method="rf",
  tuneGrid=data.frame(mtry=500))
  • library(doMC) and registerDoMC register a parallel backend, so caret can run the resampling on more than one processor
  • repeatedcv: if more than one repeat of k-fold cross-validation is requested, the repeats= parameter should be modified. repeatedcv must be used instead of cv, because repeats= only applies to repeatedcv
  • savePredictions: keeps the hold-out predictions (in mod$pred), so we can evaluate them on our own
  • verboseIter: to see the progress
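  • classProbs: to report class probabilities, so we can use them to calculate ROC post factum. This requires y to be a factor whose levels are valid R variable names
  • tuneGrid: if not specified, caret will tune the model parameters over a default grid of candidate values. Normally, we don’t want that; a one-row data frame fixes the parameters (here mtry=500, which should not exceed the number of columns in x)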
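
As an illustration of why savePredictions and classProbs are useful, here is a minimal sketch of calculating the area under the ROC curve from the saved hold-out predictions. The data set is synthetic and the names n, auc_rank and the class labels "pos"/"neg" are made up for this example; it assumes the caret and randomForest packages are installed. The AUC is computed with the rank (Mann-Whitney) formula rather than with a dedicated ROC package.

```r
library(caret)

set.seed(1)
# Synthetic two-class data (hypothetical; stands in for x and y above)
n <- 200
x <- data.frame(a = rnorm(n), b = rnorm(n), c = rnorm(n))
y <- factor(ifelse(x$a + rnorm(n, sd = 0.5) > 0, "pos", "neg"))

# One repeat of 5-fold CV, keeping hold-out predictions and class probabilities
tc <- trainControl(method = "repeatedcv", number = 5, repeats = 1,
  savePredictions = "all", classProbs = TRUE)
mod <- train(x = x, y = y, trControl = tc, method = "rf",
  tuneGrid = data.frame(mtry = 2))

# mod$pred has one row per held-out sample: obs, pred, one probability
# column per class level ("neg", "pos"), plus rowIndex and Resample
pr <- mod$pred

# AUC via the rank (Mann-Whitney) formula; ties get average ranks
auc_rank <- function(score, is_pos) {
  r <- rank(score)
  n_pos <- sum(is_pos)
  n_neg <- sum(!is_pos)
  (sum(r[is_pos]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

# Probability of class "pos" against the observed class
auc <- auc_rank(pr$pos, pr$obs == "pos")
print(auc)
```

With a single repeat and a one-row tuneGrid, every sample appears exactly once in mod$pred. With several repeats or several candidate parameter values, the rows would first need to be subset or grouped by Resample and by the tuning columns.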