
Clean and easy string manipulation with stringr for R

Strings are not glamorous, high-profile components of R, but they do play a big role in many data cleaning and preparation tasks. R provides a solid set of string operations, but because they have grown organically over time, they can be inconsistent and a little hard to learn. Additionally, they lag behind the string operations in other programming languages, so that some things that are easy to do in languages like Ruby or Python are rather hard to do in R. The stringr package aims to remedy these problems by providing a clean, modern interface to common string operations.

See also

MASS: Support Functions and Datasets for Venables and Ripley’s MASS for R

Functions and datasets to support Venables and Ripley, ‘Modern Applied Statistics with S’ (4th edition, 2002).

testthat for R

testthat is an amazingly easy library to use for unit testing in R, with enough options to account for the majority of your usage scenarios. It is so simple that I just had to copy and paste my comments about assertthat!

See also

assertthat for R

assertthat is an amazingly easy library to use for design-by-contract in R, with enough options to account for the majority of your usage scenarios.

See also

How Yoga Makes You A Fill-In-The-Blank Person

After doing some practice you will surely think that yoga is making you a fill-in-the-blank person. If you don’t think it, then you will read it. If you don’t read it, then someone will tell you. That is OK.

Just know that it isn’t making you into anything!

Rather, it is helping you return to that which you already are.

Matt Dowle’s “data.table” talk at useR 2014

Matt Dowle’s “data.table” talk touches upon so many revealing and educational points in only 20 short minutes. It is a must-watch if you are trying to choose between the various data query options in R.

sqldf for R

sqldf brokers your dataframe from R into a SQLite database, executes your SQL query, and brokers the data back as a new dataframe. Very nice. The documentation and literature reveal that it works with H2, MySQL, and PostgreSQL, and additionally does a whole lot more than the one-liner claims!
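The round trip described above (copy the data into SQLite, run the query, read the result back) can be sketched in a few lines. This is only an analogy written in Python with the standard-library sqlite3 module, not sqldf's actual implementation or API; the sample data and the `query_rows` helper are hypothetical names made up for illustration.

```python
import sqlite3

# Hypothetical stand-in for a data frame: column names plus a list of rows.
columns = ["name", "score"]
rows = [("ann", 90), ("bob", 72), ("cy", 85)]


def query_rows(columns, rows, sql, table="df"):
    """Copy rows into a throwaway in-memory SQLite table, run sql,
    and return the result as (column_names, rows)."""
    conn = sqlite3.connect(":memory:")
    try:
        # Create an untyped table whose columns mirror the input.
        col_defs = ", ".join(f'"{c}"' for c in columns)
        conn.execute(f'CREATE TABLE "{table}" ({col_defs})')

        # Load the data with parameter placeholders, one per column.
        placeholders = ", ".join("?" for _ in columns)
        conn.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', rows)

        # Run the caller's query and read everything back out.
        cur = conn.execute(sql)
        out_columns = [d[0] for d in cur.description]
        return out_columns, cur.fetchall()
    finally:
        # The database only lives for the duration of the call.
        conn.close()


result_columns, result_rows = query_rows(
    columns,
    rows,
    "SELECT name, score FROM df WHERE score > 80 ORDER BY score DESC",
)
print(result_columns, result_rows)
```

The temporary database is discarded once the result has been read, so the caller only ever sees plain data going in and plain data coming out, which is the appeal of the approach.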

Who ever got fired for compiling code?

Q. How do you fix a memory corruption in S-PLUS?
A. Use R.

The IBM AS/400 is an application platform consisting of custom hardware, an operating system, and a database. It is quite an interesting system, giving you literally everything that you need to develop a custom computing environment for a company immediately “out of the box”.

While working to deploy a pretty typical Java-based system with DB2 and WebSphere backing it up, my co-worker and I ran into a bug with the JVM. The web-application layer used a byte-code engineering library, and that library exposed a bug in the JVM itself. First-level support explained that it was our bug, so we provided a stand-alone example in straight Java, and that got us to second-level support.

They explained that they would look into it and that we should watch for a release note in the next version of OS/400. This was getting kind of silly, so our department wanted to cash in one of the immediate-support “get out of jail free” cards that you get when you spend hundreds of thousands of dollars per quarter on IBM support. That got us to third-level support.

We had researched the issue, tracked down the exact situation where the bug occurred, and demonstrated a reproducible example. Upon reviewing all of this, they expertly agreed that there was indeed a bug in their JVM, that they would have it fixed for the next release six months from now, and thanks for letting them know.

One of the value-adds of purchasing something from IBM is that every C-level executive knows that you can’t get fired for buying IBM. Something that you apparently can’t have done either is computing Java! :P

CRANberries for R

CRANberries aggregates information about new, updated, and removed packages from the CRAN network for R.


DATA-COPE seeks to provide information for research analysts accessing and using administrative data. A key aspect of this mission is resource sharing and curation of quality information that is recommended by our members.