Resources for Learning R
Most of us enter analytics through our experience with Microsoft Excel. At some point, though, projects and organizations require more. The typical breaking points are:
- We need to separate the data from the analysis, with storage, for a single source of truth.
- We need to be able to automate the analysis.
- We need the analysis to be reproducible.
- We should not pay a third party obscene amounts of money for something as basic as arithmetic. The budget is better allocated towards innovation and staff education.
Several solutions on the market address these needs to varying degrees. Some are suites of software tools, like SAS or MATLAB. One of the most widely used tool stacks for analysts is built on the R programming language.
Why R?
In a nutshell,
- R is free.
- R is popular.
- R is versatile.
- R is powerful.
- R is well supported.
The mothership for all things R is the R project web site. From there you can download R for your platform, discover add-on packages, browse documentation and source code, and find other resources. The second stop for most is downloading the free RStudio Integrated Development Environment (IDE).
Many books are available for new users. The R project site itself contains an extensive bibliography related to R. My short list includes:
RStudio Education: Beginners offers approachable video instruction with cloud exercises to get you started quickly.
See the University of British Columbia’s stat545.com for more content on project organization, data preparation, and communication. These have a profound effect on the quality and credibility of an analysis.
For an interactive tutorial on R, in R, see {swirl}.
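As a minimal sketch of getting started with swirl, assuming an internet connection and a working R installation, the package can be installed from CRAN and launched from the console. The CRAN mirror URL below is one common choice, not a requirement.

```r
# Install swirl from CRAN if it is not already available.
# The mirror URL is an assumption; any CRAN mirror should work.
if (!requireNamespace("swirl", quietly = TRUE)) {
  install.packages("swirl", repos = "https://cloud.r-project.org")
}

# Start the tutorial only in an interactive session,
# since swirl prompts for input at the console.
if (interactive()) {
  library(swirl)
  swirl()
}
```

Once started, swirl asks for a name and then offers a menu of courses to work through at your own pace.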
The R Cookbook, 2nd Edition, by J.D. Long and Paul Teetor, is a wonderful deskside reference. It is a cookbook, not a bible. Example code and data are posted at the R Cookbook GitHub.
R for Data Science, by Hadley Wickham and Garrett Grolemund (O’Reilly), is a solid introduction to the Tidyverse packages, especially for using them in data analysis and statistics. It is also available free online at R for Data Science.
RStudio Education: Intermediate introduces more advanced data visualization tools and other specialized packages.
My complete list of useful free courses, books, tutorials, and expert blogs
Regardless of platform, every analyst should take a deep dive into the craft of data visualization.
For great examples, How Charts Lie and Factfulness each speak to how numbers are persuasive, especially when presented as charts, because we associate them with science and reason. Our World in Data is an open access project with a goal of making knowledge on big global problems accessible and understandable. Charts can enlighten and enable conversations, allowing us to peek through the complexity of large amounts of data. Good charts make us smarter.
Kieran Healy teaches Data Visualization at Duke University and makes his Practical Introduction in R text available online. The book is a hands-on introduction to the principles and practice of looking at and presenting data using R and ggplot.
Anyone doing serious graphics work in R will want R Graphics by Paul Murrell (Chapman & Hall/CRC). The R Graphics Cookbook, 2nd ed., by Winston Chang is indispensable for creating graphics. And the book ggplot2: Elegant Graphics for Data Analysis by Hadley Wickham (Springer) is the definitive reference for the graphics package ggplot2.
htmlwidgets for R is a showcase of more advanced, interactive visuals that can be built in R and published to end users. Shiny is an R package that is emerging as the go-to platform for publishing web applications straight from R.
The use of git will be important for team collaboration. Look for Jenny Bryan’s web book Happy Git and GitHub for the useR and Corey Schafer’s YouTube series at Git Tutorials.
R in a Nutshell, by Joseph Adler (O’Reilly), is another quick tutorial and reference you’ll keep by your side. Also consider Hands-On Programming with R by Garrett Grolemund (O’Reilly). Hadley Wickham’s Advanced R Programming is available either as a printed book or free online and is a great deeper dive into advanced R topics. Efficient R Programming, by Colin Gillespie and Robin Lovelace (O’Reilly), is another good guide to the deeper concepts of R programming.
RStudio Education: Expert covers building your own packages, the Keras deep learning interface, and R Markdown extensions.