Teaching Statistics: A Bag of Tricks

This volume takes a positive spin on the field of statistics. Students often see statistics as difficult and boring; the authors of this book set out to dispel that view. Teaching Statistics: A Bag of Tricks brings together a complete set of examples, demonstrations, and projects that will not only increase class participation but also help to eliminate negative feelings toward statistics.

How Students Learn Statistics

Research in the areas of psychology, statistical education, and mathematics education is reviewed
and the results applied to the teaching of college-level statistics courses. The argument is made that
statistics educators need to determine what it is they really want students to learn, to modify their
teaching according to suggestions from the research literature, and to use assessment to determine if
their teaching is effective and if students are developing statistical understanding and competence.

devtools for R

devtools: Tools to make developing R code easier

They do, and it is better to read about them before you need them.

ADDENDUM: 2014-09-06T09:08:01

devtools: Tools to make developing R code easier

Collection of package development tools

That is a bit too terse. The intro to the README follows:

The aim of devtools is to make your life as a package developer easier by providing R functions that simplify many common tasks. R packages are actually really simple, and with the right tools it should be easier to use the package structure than not. Package development in R can feel intimidating, but devtools does everything it can to make it as welcoming as possible. devtools comes with a small guarantee: if because of a bug in devtools a member of R-core gets angry with you, I will send you a handwritten apology note. Just forward me the email and your address, and I’ll get a card in the mail.

Excellent.

Readme. Manual. Github.

At the very least, know that this package exists, as you will be installing it
if you want to use tidyr.

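# Install the released version from CRAN
install.packages("devtools")
# Optionally, install the development version from GitHub
# (the repository is r-lib/devtools today; it was hadley/devtools in 2014)
devtools::install_github("r-lib/devtools")
library(devtools)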

plyr and dplyr for R

plyr is a set of tools for a common set of problems: you need to split up a big data structure into homogeneous pieces, apply a function to each piece and then combine all the results back together.
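
The split-apply-combine pattern described above can be sketched in base R alone. This is only an illustration of the idea, not plyr's implementation; the data frame `df` and its columns `g` and `x` are made up for the example.

```r
# Toy data: a grouping column and a numeric column (both hypothetical).
df <- data.frame(g = c("a", "b", "a", "b"), x = c(1, 2, 3, 4),
                 stringsAsFactors = FALSE)

# Split: one data frame per level of g.
pieces <- split(df, df$g)

# Apply: summarise each piece independently.
results <- lapply(pieces, function(p) {
  data.frame(g = p$g[1], mean_x = mean(p$x), stringsAsFactors = FALSE)
})

# Combine: stack the per-group results back into one data frame.
combined <- do.call(rbind, results)
print(combined)

# With plyr installed, the same three steps collapse into a single call:
# plyr::ddply(df, "g", plyr::summarise, mean_x = mean(x))
```

The plyr call in the final comment is left commented out so the sketch runs without any packages installed.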

dplyr is the next iteration of plyr, focussed on tools for working with data frames (hence the d in the name). It has three main goals:

  • Identify the most important data manipulation tools needed for data analysis
    and make them easy to use from R.
  • Provide blazing fast performance for in-memory data by writing key pieces in
    C++.
  • Use the same interface to work with data no matter where it’s stored, whether
    in a data frame, a data table or database.

These two are among the mainstream data manipulation tools outside of
base R.
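
As a rough illustration of dplyr's first goal, here is a small pipeline on the built-in mtcars data. It assumes dplyr is installed; the thresholds and column choices are arbitrary and only there to show the verbs.

```r
library(dplyr)

# Filter rows, group them, summarise each group, then sort the result.
cars_by_cyl <- mtcars %>%
  filter(hp > 100) %>%
  group_by(cyl) %>%
  summarise(n = n(), mean_mpg = mean(mpg)) %>%
  arrange(desc(mean_mpg))

print(cars_by_cyl)
```

Each verb takes a data frame and returns a data frame, which is what lets the steps chain together.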

data.table for R

data.table is a nice option for retaining the familiarity of a data frame while
opening the door to pass-by-reference semantics and a more SQL-like query
language. The documentation is wonderful too, providing all levels of detail,
from the 10-minute introduction to the full API reference.

The 10-minute introduction reveals things that you would probably enjoy in
your own analytical workflow, whether you obtain them with data.table or
elsewhere. It also reveals that whatever solution you choose requires sincere
and focused mastery in order to use its power without making major mistakes.
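
A minimal sketch of the two ideas mentioned above, the SQL-like query form and pass-by-reference updates, assuming data.table is installed. The table and column names are invented for the example. It also shows one of the mistakes that is easy to make: plain assignment does not copy a data.table.

```r
library(data.table)

DT <- data.table(g = c("a", "a", "b"), x = 1:3)

# DT[i, j, by] reads roughly like WHERE, SELECT, GROUP BY.
totals <- DT[x > 0, .(total = sum(x)), by = g]

alias    <- DT        # NOT a copy: both names refer to the same table
snapshot <- copy(DT)  # an independent copy

# := adds the column in place, without copying the table.
DT[, y := x * 2L]

"y" %in% names(alias)     # TRUE: the alias sees the new column
"y" %in% names(snapshot)  # FALSE: the explicit copy does not
```

Calling `copy()` whenever an unmodified version of the table is still needed is the usual way to avoid surprises here.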

Package management in R

Once you start using different versions of packages for different projects,
you will probably want a package-management solution for R. Or perhaps
you are coming from another language that had a package-management solution
that you were already happy with. Curious about the offerings, I searched
and found two options, rbundler and Packrat. Having read the documentation for
both, I find them more than adequate; they provide everything that one would
expect.

Package   Github  CRAN  PDF
rbundler  here    here  here
Packrat   here    NA    NA

My current approach is to install all packages into my user directory so as not
to spoil the global package cache. Most likely that approach won’t scale for
larger projects, so when the need arises I will migrate to one of these
solutions.
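
For reference, the user-directory approach can be sketched as follows. The helper `use_library` is a hypothetical name introduced here, not part of R or of either package above; `R_LIBS_USER` is the environment variable R uses for the per-user library location.

```r
# Hypothetical helper: create `lib` if needed and put it first on the
# library search path, so that installs and loads prefer it.
use_library <- function(lib) {
  dir.create(lib, recursive = TRUE, showWarnings = FALSE)
  .libPaths(c(lib, .libPaths()))
  invisible(.libPaths()[1])
}

# Use the per-user library when R reports one.
user_lib <- Sys.getenv("R_LIBS_USER")
if (nzchar(user_lib)) use_library(user_lib)

# install.packages() installs into the first element of .libPaths()
# by default, so this would now go to the user library:
# install.packages("stringr")
```

The same helper could point at a per-project directory instead, which is a small step toward what rbundler and Packrat automate.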

If you’ve never used something like this before, then you will be well served to
first become comfortable and proficient managing packages yourself before
automating it.

Whatever your approach, it is a real treat to know that both solutions are
available for when you embrace reproducible research.

Ultra-lightweight-reproducibility for R: which version of R you used

Here is another post from the realm of ultra-lightweight-reproducibility for R:

If you are going to get serious about locking down your system, then only let it
run on the version of R that you personally used to obtain your results!

It takes very, very little effort:

stopifnot(getRversion() == "3.1.1")

99% of the time, using a newer version won’t matter, but make it crystal clear
both to yourself and your collaborators how you obtained your results, at least
when it comes to which version of R you used.
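
If a bare stopifnot failure feels too cryptic for collaborators, the same check can be wrapped so that it explains itself. This is only a sketch; `require_r_version` is a hypothetical helper name, and the `actual` argument exists mainly so the function can be tried out without changing R versions.

```r
# Hypothetical helper: stop with a readable message unless the running
# R version matches the one the results were produced with.
require_r_version <- function(expected, actual = getRversion()) {
  if (actual != expected) {
    stop(sprintf("Results were produced with R %s, but this is R %s.",
                 expected, as.character(actual)),
         call. = FALSE)
  }
  invisible(TRUE)
}

# At the top of an analysis script one would write, for example:
# require_r_version("3.1.1")
```

The comparison relies on getRversion() returning a version object, which compares correctly against a string such as "3.1.1".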

Clean and easy string manipulation with stringr for R

Strings are not glamorous, high-profile components of R, but they do play a big role in many data cleaning and preparation tasks. R provides a solid set of string operations, but because they have grown organically over time, they can be inconsistent and a little hard to learn. Additionally, they lag behind the string operations in other programming languages, so that some things that are easy to do in languages like Ruby or Python are rather hard to do in R. The stringr package aims to remedy these problems by providing a clean, modern interface to common string operations.
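
To give a flavor of that interface, here is a small cleaning example, assuming stringr is installed. The input vector is invented. Every function starts with `str_`, takes the string as its first argument, and passes missing values through as NA.

```r
library(stringr)

raw <- c("  Alice SMITH ", "bob  jones", NA)

clean <- str_trim(raw)                        # drop leading/trailing spaces
clean <- str_replace_all(clean, "\\s+", " ")  # collapse runs of whitespace

has_jones <- str_detect(clean, "jones")       # logical vector, NA stays NA
lengths   <- str_length(clean)                # character counts, NA stays NA

print(clean)
```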

See also http://cran.r-project.org/web/packages/stringr/index.html.