
Package management in R

Once you start using different versions of packages for different projects,
you will probably want a package-management solution for R. Or perhaps
you are coming from another language that had a package-management solution
you were already happy with. Curious about the offerings, I searched
and found two options, rbundler and Packrat. Judging by the documentation for both,
they seem more than adequate, providing everything one would expect.

Package Github CRAN PDF
rbundler here here here
Packrat here NA NA

My current approach is to install all packages into my user directory so as not
to spoil the global package cache. Most likely that approach won’t scale for
larger projects, so when the need arises I will migrate to one of these.
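
As a rough illustration of keeping packages out of the global library, here is a minimal sketch that puts a separate directory first on R's library search path. The directory name is hypothetical and a temporary directory stands in for a real user or project directory; the install call is left commented out because it needs network access.

```r
# Hypothetical library directory; a temp dir stands in for a real user directory.
project_lib <- file.path(tempdir(), "project-library")
dir.create(project_lib, recursive = TRUE, showWarnings = FALSE)

# Put it first on the search path so installs and loads prefer it
# over the global library.
.libPaths(c(project_lib, .libPaths()))

# Packages would then be installed into it explicitly, for example:
# install.packages("stringr", lib = project_lib)

# Show which library R will now look in first.
print(.libPaths()[1])
```

The same idea applied per project is roughly what a package manager automates, along with recording which versions were installed.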

If you’ve never used something like this before, then you will be well-served to
first become comfortable and masterful managing packages yourself before automating it.

Whatever your approach, it is a real treat to know that both solutions are
available for when you embrace reproducible research.

A progress indicator for code blocks in org-mode

A progress indicator for code blocks in org-mode courtesy
of John Kitchin:

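;; give us some hint we are running
(defadvice org-babel-execute-src-block (around progress nil activate)
  ;; highlight the block background while the code runs
  (set-face-attribute
   'org-block-background nil :background "LightSteelBlue")
  (message "Running your code block")
  ;; run the original org-babel-execute-src-block
  ad-do-it
  ;; restore the background once execution has finished
  (set-face-attribute 'org-block-background nil :background "gray")
  (message "Done with code block"))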

Ultra-lightweight-reproducibility for R: which version of R that you used

Here is another post from the realm of ultra-lightweight-reproducibility for R:

If you are going to get serious about locking down your system then only let it run
on the version of R that you personally used to obtain your results!

It takes very, very little effort:

# R.version fields are character strings, so compare against strings
stopifnot(R.version$major == "3", R.version$minor == "1.1")

99% of the time, using a newer version won’t matter, but make it crystal clear
both to yourself and your collaborators how you obtained your results, at least
when it comes to which version of R you used.
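
The same check can also be written against `getRversion()`, which returns the running version as a single comparable value. This is only a sketch: the helper name `r_version_matches` is made up for illustration, and 3.1.1 is simply the version named above.

```r
# Version the results were produced with (3.1.1 is the version named in the post).
required_version <- "3.1.1"

# Hypothetical helper: TRUE only when the running R matches exactly.
# `current` defaults to the running R and can be overridden for testing.
r_version_matches <- function(required, current = getRversion()) {
  as.package_version(current) == as.package_version(required)
}

# At the top of an analysis script one might then write:
# stopifnot(r_version_matches(required_version))
```

Comparing version objects rather than separate major and minor fields avoids surprises with versions such as 3.1.0, where a numeric comparison would drop the trailing zero.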

Clean and easy string manipulation with stringr for R

Strings are not glamorous, high-profile components of R, but they do play a big role in many data cleaning and preparation tasks. R provides a solid set of string operations, but because they have grown organically over time, they can be inconsistent and a little hard to learn. Additionally, they lag behind the string operations in other programming languages, so that some things that are easy to do in languages like Ruby or Python are rather hard to do in R. The stringr package aims to remedy these problems by providing a clean, modern interface to common string operations.
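
To give a flavour of that interface, here is a small sketch of a typical cleaning step. The helper `clean_names` is hypothetical; it assumes stringr is installed and uses namespace-qualified calls, so no `library()` call is needed.

```r
# Hypothetical helper: normalise messy column labels into snake_case-like names.
# Assumes the stringr package is installed.
clean_names <- function(labels) {
  trimmed <- stringr::str_trim(labels)          # drop leading/trailing whitespace
  lowered <- stringr::str_to_lower(trimmed)     # lower-case everything
  stringr::str_replace_all(lowered, "\\s+", "_") # collapse whitespace runs to "_"
}
```

For example, `clean_names("  Sepal Length ")` should give `"sepal_length"`. Every function starts with `str_` and takes the string as its first argument, which is much of what makes the package easy to learn.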


MASS: Support Functions and Datasets for Venables and Ripley’s MASS

Functions and datasets to support Venables and Ripley, ‘Modern Applied Statistics with S’ (4th edition, 2002).

testthat for R

testthat is an amazingly easy library for unit testing in R that has enough options to account for the majority of your usage scenarios. It is so simple that I just had to copy and paste my same comments about assertthat!
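
A minimal sketch of what a test looks like, assuming testthat is installed. The function under test and the wrapper `run_conversion_tests` are made up for illustration; in a real project the `test_that()` calls would normally sit in files under `tests/testthat/`.

```r
# Hypothetical function under test.
celsius_to_fahrenheit <- function(celsius) celsius * 9 / 5 + 32

# Hypothetical wrapper so the sketch loads even before testthat is attached.
run_conversion_tests <- function() {
  testthat::test_that("known temperatures convert correctly", {
    testthat::expect_equal(celsius_to_fahrenheit(0), 32)
    testthat::expect_equal(celsius_to_fahrenheit(100), 212)
  })
  testthat::test_that("non-numeric input is rejected", {
    # Arithmetic on a character value raises an error in R.
    testthat::expect_error(celsius_to_fahrenheit("hot"))
  })
}
```

Calling `run_conversion_tests()` runs both tests; each expectation reads almost like a sentence, which is most of the appeal.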


assertthat for R

assertthat is an amazingly easy library for design-by-contract in R that has enough options to account for the majority of your usage scenarios.
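
As a rough sketch of the design-by-contract style, the function below states its preconditions before doing any work. `safe_ratio` is a hypothetical example and the code assumes assertthat is installed.

```r
# Hypothetical function with explicit preconditions.
# Assumes the assertthat package is installed.
safe_ratio <- function(numerator, denominator) {
  # Both arguments must be single numeric values.
  assertthat::assert_that(assertthat::is.number(numerator))
  assertthat::assert_that(assertthat::is.number(denominator))
  # Division by zero is ruled out up front.
  assertthat::assert_that(denominator != 0)
  numerator / denominator
}
```

A call such as `safe_ratio(1, 0)` stops with an error naming the failed condition instead of silently returning `Inf`.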


How Yoga Makes You A Fill-In-The-Blank Person

After doing some practice you will surely think that yoga is making you a fill in the blank person. If you don’t think it, then you will read it. If you don’t read it then someone will tell you it. That is OK.

Just know that it isn’t making you into anything!

Rather, it is helping you return to that which you already are.

Matt Dowle’s “data.table” talk at useR 2014

Matt Dowle’s “data.table” talk touches upon so many revealing and educational points in only 20 short minutes. It is a must-watch if you want to choose between the various data query options in R.


sqldf brokers your dataframe from R into a SQLite database, executes your SQL query, and brokers the data back as a new dataframe. Very nice. The documentation and literature reveal that it also works with H2, MySQL, and PostgreSQL, and does a whole lot more than the one-liner claims!