Some things R can do you might not be aware of

Jeff Leek
2013-12-30

There is a lot of noise around the “R versus Contender X” debate for data science. The two main competitors I hear about right now are Python and Julia. I’m not going to weigh in on the debates because I go by the motto: “Why not just use something that works?”

R offers a lot of benefits if you are interested in statistical or predictive modeling. It is basically unrivaled in terms of the breadth of packages for applied statistics. But I think sometimes it isn’t obvious that R can handle some tasks that you used to have to do with other languages. This misconception is particularly common among people who regularly code in a different language and are moving to R. So I thought I’d point out a few cool things that R can do. Please add to the list in the comments if I’ve missed things R can do that people don’t expect.

  1. R can do regular expressions/text processing: Check out stringr, tm, and a large number of other natural language processing packages.
  2. R can get data out of a database: Check out RMySQL, rmongodb, rhdf5, ROracle, MonetDB.R (via Anthony D.).
  3. R can process nasty data: Check out plyr, reshape2, and Hmisc.
  4. R can process images: EBImage is a good general purpose tool, but there are also packages for various file types like jpeg.
  5. R can handle different data formats: XML and RJSONIO handle two common types, but you can also read from Excel files with xlsx or handle pretty much every common data storage type (you’ll have to search “R + data type” to find the package).
  6. R can interact with APIs: Check out RCurl and httr for general purpose software, or you could try some specific examples like twitteR. You can create an API from R code using yhat.
  7. R can build apps/interactive graphics: Some pretty cool things have already been built with shiny, and rCharts interfaces with a ton of interactive graphics packages.
  8. R can create dynamic documents: Try out [knitr](http://yihui.name/knitr/) or slidify.
  9. R can play with Hadoop: Check out the RHadoop wiki.
  10. R can create interactive teaching modules: You can do it in the console with swirl or on the web with Datamind.
  11. R interfaces very nicely with C if you need to be hardcore (also maybe? interfaces with Python): Rcpp, enough said. Also read the tutorial. I haven’t tried the rPython library, but it looks like a great idea.
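
To make the first item on the list a bit more concrete, here is a minimal sketch of regular-expression text processing. It deliberately uses only base R functions (`grepl`, `sub`, `regmatches`, `gregexpr`) rather than stringr or tm, so it runs without installing anything. The input strings, the pattern, and the variable names are all made up for illustration.

```r
# Hypothetical input: a few messy log-style strings invented for this example.
lines <- c("user=alice  score=42", "user=bob score=7", "malformed line")

# One pattern with two capture groups: the user name and the score.
pattern <- "^user=([a-z]+) +score=([0-9]+)$"

# grepl() flags which lines match the pattern at all.
ok <- grepl(pattern, lines)

# sub() with back-references pulls each captured piece out of the matching lines.
users  <- sub(pattern, "\\1", lines[ok])
scores <- as.integer(sub(pattern, "\\2", lines[ok]))

# regmatches() + gregexpr() extracts every run of digits from each string,
# returning a list with one character vector per input line.
digits <- regmatches(lines, gregexpr("[0-9]+", lines))

# Collect the parsed fields into a data frame for further analysis.
parsed <- data.frame(user = users, score = scores, stringsAsFactors = FALSE)
print(parsed)
```

The same steps can be written with stringr, which wraps this kind of work in a more consistent set of function names; base R is used here only to keep the sketch dependency-free.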