Milwaukee Data Science: Introductions

The purpose of this Meetup is to provide a safe, pleasant, and convenient venue (physical and virtual) to facilitate the growth and nurturing of the Data Science community in Southeastern Wisconsin.

Since Data Science touches nearly every human endeavor, this group serves to facilitate dialogue and discussion between all realms of mastery. Strategic advisers, managers, investors, health-care administrators, and quantitative analysts will feel equally at home here, as will mathematicians, statisticians, ecologists, biologists, and social scientists. All realms of mastery are invited and welcome to join.

Given the limitless application of technology here, all members of the Information Technology field are welcome to join, in roles ranging from the technical (developers, architects, system and DevOps administrators) to project managers and business analysts.

Experts and neophytes with interest in particular languages, environments, frameworks, and technologies have a home here. Be it R, Python, Java, or Octave, there is a place for everyone to learn and share. Our doors are equally open to practitioners applying specific technological offerings for every industry and platform.

With a laid-back approach that is open to all ideas, this group will reflect the contributions and participation of its members, in whatever form time and resources permit.

How to debug within R

In my own words, this is how to debug within R:

  • If you want to print a stack trace for the most recent uncaught error
    • then use traceback
    • you may or may not have the source
  • If you want to set a breakpoint at a specific location in the code
    • then use browser
    • you must have the source
    • it may be conditional
  • If you want to set a breakpoint on a function at its entry point
    • then use debug
    • you may or may not have the source
    • delegates work to browser
  • If you want to install a global error handler that will immediately start debugging
    • then use recover
    • you may or may not have the source
  • If you want to add a watch statement to a function
    • then use trace
    • you may or may not have the source
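
The tools in the list above can be sketched in a few lines. This is a minimal illustration rather than a recommended workflow: scale_values and its arguments are hypothetical names invented for the example, and the two tools that are mainly useful in an interactive session (recover and traceback) are left commented out so that the script also runs non-interactively.

```r
# Hypothetical helper used only for illustration; not from the original post.
scale_values <- function(x, multiplier) {
  # browser() pauses only when `expr` is TRUE, which makes the breakpoint
  # conditional. With clean input this call is a no-op.
  browser(expr = anyNA(x))
  x * multiplier
}

# trace(): inject a "watch" expression at function entry without editing
# the function's source.
trace("scale_values",
      tracer = quote(message("multiplier = ", multiplier)),
      print = FALSE)
result <- scale_values(1:3, 2)  # the tracer emits a message before the body runs
untrace("scale_values")

# debug(): flag the function so that its next call starts in the browser.
debug(scale_values)
flagged <- isdebugged(scale_values)
undebug(scale_values)

# The remaining two tools are mainly useful interactively, so they are left
# commented out here:
# options(error = recover)  # pick a frame to inspect whenever an error occurs
# traceback()               # print the call stack of the last uncaught error
```

Calling untrace and undebug afterwards restores the function, so the instrumentation does not leak into later calls.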

rbenchmark: Benchmarking routine for R

rbenchmark is inspired by the Perl module Benchmark, and is intended to facilitate benchmarking of arbitrary R code. The library consists of just one function, benchmark, which is a simple wrapper around system.time. Given a specification of the benchmarking process (counts of replications, evaluation environment) and an arbitrary number of expressions, benchmark evaluates each of the expressions in the specified environment, replicating the evaluation as many times as specified, and returning the results conveniently wrapped into a data frame.

This is an absolute must-have for every situation where you want to do some
simple benchmarking.
For some reason, when I read the documentation it sounds harder to use than it
really is.
Give it a try and you will be rewarded greatly.
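
To show how little is involved, here is a small sketch that compares a hand-written loop against the built-in sum. The workload, the loop_sum helper, and the choice of 100 replications are assumptions made for the example. The fallback branch is also an addition, included only so the script still runs when rbenchmark is not installed; it times the same work directly with system.time, the function that benchmark wraps.

```r
# Hypothetical workload: summing a vector with a loop versus the built-in sum().
x <- runif(1e4)
loop_sum <- function(values) {
  total <- 0
  for (value in values) total <- total + value
  total
}

if (requireNamespace("rbenchmark", quietly = TRUE)) {
  # Each named expression becomes one row of the resulting data frame.
  timings <- rbenchmark::benchmark(
    loop = loop_sum(x),
    vectorized = sum(x),
    replications = 100,
    columns = c("test", "replications", "elapsed", "relative")
  )
} else {
  # Fallback when rbenchmark is not installed: time the same replications
  # directly with system.time().
  elapsed_for <- function(f) {
    system.time(for (i in seq_len(100)) f(x))[["elapsed"]]
  }
  timings <- data.frame(
    test = c("loop", "vectorized"),
    replications = 100,
    elapsed = c(elapsed_for(loop_sum), elapsed_for(sum))
  )
}
print(timings)
```

The columns argument trims the output to the fields that usually matter for a quick comparison; leaving it out returns the full set of timing columns.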

jsonlite: A smarter JSON encoder/decoder for R

This package is a fork of the RJSONIO package by Duncan Temple Lang. It builds on the same libjson c++ parser, but implements a smarter mapping between JSON data and R classes. This is particularly useful when working with JSON data from pipelines and web APIs. The vignettes describe the behavior in great detail. In addition to drop-in replacements for toJSON and fromJSON, the package contains functions to validate, prettify and minify JSON, and many unit tests to verify that all edge cases are encoded and decoded consistently for use with dynamic data in systems and applications.
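
A short round trip illustrates the functions named in the description. The sample data frame is hypothetical and chosen only to keep the example small; the sketch assumes jsonlite is installed.

```r
library(jsonlite)

# Hypothetical sample data, chosen only to keep the example small.
people <- data.frame(id = 1:2, name = c("Ada", "Grace"),
                     stringsAsFactors = FALSE)

json <- toJSON(people)        # data frame -> JSON array of records
round_trip <- fromJSON(json)  # JSON array of records -> data frame

is_valid <- validate(json)    # TRUE when the text parses as JSON
pretty <- prettify(json)      # indented, human-readable form
compact <- minify(pretty)     # whitespace stripped again

print(round_trip)
cat(pretty)
```

The data frame maps to an array of records and back again, which is the kind of mapping between JSON data and R classes that the description refers to.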

eval-in-repl: Consistent ESS-like eval interface for various REPLs

This package does what ESS does for R for various REPLs, including ielm.
The Emacs Speaks Statistics (ESS) package has a nice function called ess-eval-region-or-line-and-step, which is assigned to C-RET. This function visibly sends a line or a selected region to the corresponding shell (R, Julia, Stata, etc.). It also starts up a shell if there is none.
This package implements a similar workflow for various read-eval-print loops (REPLs).

Via ess-help.

Advanced R

For experienced programmers, Advanced R is an excellent resource for learning
about specific aspects of R in a comprehensive manner, complementing the
individual (and wonderful) documentation entries inside of R.

Getting through the book requires prior experience with R, so it may not be
the best place to start. However, an experienced programmer will have
questions that only this book addresses directly and easily. As such, using it
as needed for reference is a fine approach.