31 Jan

paste0 is statistical computing's most influential contribution of the 21st century


The day I discovered paste0 I literally cried. No more paste(bla,bla, sep=""). While looking through code written by a student who did not know about paste0, I started pondering how many person-hours it has saved humanity. Typing sep="" takes about 1 second. We R users use paste about 100 times a day, and there are about 1,000,000 R users in the world. That's 100,000,000 seconds, or over 3 person-years, a day! Next up: read.table0 (who doesn't want as.is to be TRUE?).
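As a quick check of the claims above, here is a minimal sketch in R. It confirms that paste0 matches paste with sep="" on one example and redoes the back-of-the-envelope arithmetic. The variable names are made up for illustration, and the inputs (1 second, 100 uses a day, 1,000,000 users) are the post's own rough guesses.

```r
# paste0(...) gives the same result as paste(..., sep = "")
same <- identical(paste("foo", "bar", sep = ""), paste0("foo", "bar"))

# Rough estimate from the post: 1 second saved per call,
# 100 calls per user per day, 1,000,000 users (all guesses).
seconds_saved_per_day <- 1 * 100 * 1e6

# Convert seconds to years, using a 365-day year.
person_years_per_day <- seconds_saved_per_day / (60 * 60 * 24 * 365)

print(same)
print(person_years_per_day)
```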

  • http://fellgernon.tumblr.com/ Leonardo Collado Torres

    nice!

  • Matthew Maenner

    I had a similar paste0 moment. I'd also like table0 (that defaults to useNA="always").
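A table0 along these lines could be sketched as a one-line wrapper. This is only an illustration: table0 is not a base R function, and this version assumes the caller never passes useNA themselves.

```r
# Hypothetical wrapper: table() that always reports the NA count.
# Assumes the caller does not also supply useNA.
table0 <- function(...) table(..., useNA = "always")

x <- c("a", "b", NA, "a")
tab <- table0(x)
print(tab)
```

Plain table(x) silently drops the missing value here, which is the behavior the wrapper is meant to avoid.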

  • Student

    Nice, but available only in R >= 2.15.0

  • http://yihui.name/ Yihui Xie

    R probably should just allow the addition of character strings like many other languages do, e.g. 'foo' + 'bar' (= 'foobar'). I'll be crying then.

    • Thomas Lumley

      It's been seriously suggested on at least three occasions.

      What torpedoes it in the end is lack of associativity (does "2+2="+2+2 come out as "2+2=22", as you'd expect, or as "2+2=4"?). That was also Paul Graham's argument against putting this in his Arc language.

      That puts it in a different category from as.is=TRUE, which is there purely for the benefit of the hysterical raisins --- until strings became hashed references, the overhead of as.is=TRUE in read.table() was truly appalling for large data sets. Now, not so much.

      • http://www.nottamuntown.com Adam Bradley

        PHP's workaround for that is to use a different operator for string concatenation than for addition: 2+2 returns 4, but 2 . 2 returns "22" (the spaces matter, since 2.2 on its own is just a decimal number).
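The associativity problem in this thread can be made concrete with a small sketch. The function plus below is hypothetical: it stands in for an overloaded "+" that concatenates when either argument is a string and adds otherwise. It is not how R's + behaves.

```r
# Hypothetical overloaded "+": concatenate if either side is a string,
# otherwise do ordinary addition.
plus <- function(a, b) {
  if (is.character(a) || is.character(b)) paste0(a, b) else a + b
}

# "2+2=" + 2 + 2 grouped from the left: (("2+2=" + 2) + 2)
left <- plus(plus("2+2=", 2), 2)

# The same expression grouped from the right: "2+2=" + (2 + 2)
right <- plus("2+2=", plus(2, 2))

print(left)   # "2+2=22"
print(right)  # "2+2=4"
```

The two groupings disagree, so the result of a chain of such operations depends on where the parentheses fall. Ordinary addition and pure string concatenation each avoid this on their own.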

  • http://profiles.google.com/andrew.e.gelman Andrew Gelman

    Rafael:

    You are so right. For years I've been planning to write that function but in any given example it seemed easier to just type sep="". The other thing that always hung me up was, what to call the new function. paste0 is just a perfect, perfect name.

    Now we just need a name for plot with mar=c(3,3,2,0), mgp=c(1.5,.5,0), tck=-.02, and for hist with more bars.

    • Rafael Irizarry

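      I call it mypar. Here is my .Rprofile:
      library(RColorBrewer)
      mypar <- function(a=1,b=1,brewer.n=8,brewer.name="Dark2",...){
        # tighter margins and axis-label spacing
        par(mar=c(2.5,2.5,1.6,1.1),mgp=c(1.5,.5,0))
        # a-by-b grid of panels; extra arguments are passed on to par()
        par(mfrow=c(a,b),...)
        # use a ColorBrewer palette as the default colors
        palette(brewer.pal(brewer.n,brewer.name))
      }
      Amazingly similar to yours!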

      • http://profiles.google.com/andrew.e.gelman Andrew Gelman

        I used to put things in my .Rprofile but it was always a pain to have to port this over every time I loaded in a new version of R. So now I just put my favorite functions in the "arm" package.

  • http://twitter.com/SQLCoffey @SQLCoffey

    What is it you 'R' users really do in any given day? How about a day in the life of an 'R' user?

    • http://myindigolives.wordpress.com/ Ellie K

      Good idea!
      I'd like to make a request. SimplyStatistics, may we have "A Day in the Life of an R User" as a future post, please? I'll settle for a day in your life. I didn't know that you had moved here to WordPress from Tumblr. I am your long-time follower DataAnxiety there.
      P.S. For SQLCoffey: Are you the remarkable Neil Coffey of JavaMex? I've run across him on English StackExchange before, so I figured it was a possibility here, too.

  • Roger Peng

    When I saw 'paste0' in the NEWS file, I shed a tear.

  • Nat

    As a newbie to R, I'm baffled by you guys shedding tears of joy about the addition of paste0. Couldn't you just have created your own paste function as follows or am I missing something? myPaste <- function(...) { paste(...,sep="") }

    • anoldman

      Yes, and then remember to define this function every time you write some code... Or always remember to add it to Rprofile.site. Or always remember to explain it to your coworkers. No. This is not a good approach. This time you remember this, another time you have to remember another thing... What is in the standard is best.
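For reference, the wrapper proposed in this thread does behave like paste0 on simple inputs. The sketch below defines myPaste as written in the comment and compares the two on a couple of made-up examples.

```r
# The wrapper from the comment above.
myPaste <- function(...) { paste(..., sep = "") }

# Compare with the built-in paste0 on scalar and vector inputs.
scalar_same <- identical(myPaste("a", 1, "b"), paste0("a", 1, "b"))
vector_same <- identical(myPaste("x", 1:3), paste0("x", 1:3))

print(scalar_same)
print(vector_same)
```

So the objection in the reply is about convenience and shared conventions, not about whether the wrapper works.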

  • http://twitter.com/Malarky67 Stephen Henderson

    I'd really like a way to turn recycling off so that you can 'cbind' or 'rbind' big lists of variables into a single matrix or data.frame with NAs or "" on the end.

    Is this too much to ask?

    • http://twitter.com/Malarky67 Stephen Henderson

      P.S. Oh yeah, write.table(..., quote=FALSE)

      who puts bloody quotes in their text files?
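Both wishes in this thread can be approximated in user code. The sketch below is one possible approach, not an existing base R feature. cbind_pad is a hypothetical helper that pads shorter vectors with NA instead of recycling them, and the last lines show write.table with quote=FALSE writing unquoted text.

```r
# Hypothetical helper: pad every vector to the longest length with `fill`,
# then bind the padded vectors as columns (no recycling).
cbind_pad <- function(..., fill = NA) {
  cols <- list(...)
  n <- max(lengths(cols))
  padded <- lapply(cols, function(v) c(v, rep(fill, n - length(v))))
  do.call(cbind, padded)
}

m <- cbind_pad(a = 1:3, b = 1:2, c = 1)
print(m)

# write.table without quotes around strings; capture.output() collects the
# lines that would otherwise be printed to the console.
out <- capture.output(
  write.table(data.frame(x = "a b"), quote = FALSE, row.names = FALSE)
)
print(out)
```

Padding before binding keeps the short columns visibly incomplete, whereas plain cbind would repeat their values to fill the gap.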

  • Andrew

    Mac and Linux users can save lots of typing or aggravation by modifying .inputrc. For example, add ";g": " <- " to ~/.inputrc (just create it if it doesn't exist) to get the assignment operator <- by typing semicolon+g (mnemonic: gets) in the R shell. Much easier and faster than reaching for < and -, at least for me. Other useful ones are "zp": "()\C-b" (mnemonic: parentheses) and ";q": '""\C-b' (mnemonic: quotes), which give you parentheses or double quotes, respectively, and then move the cursor back one character (so you are inside the parentheses or quotes, ready to type).

  • Robert Balicki

    For the same reason, when I discovered currying (as a concept, I guess) I shed a tear. And then rewrote my startup files.

    For example, now I have meanNA, medNA, sumNA, etc. which all default to na.rm=TRUE, I have pst, which is like paste0 with collapse="", etc. And I have all sorts of ones like rc <- readClipboard, an <- as.numeric, etc. etc.

    It's not laziness, it's DRY
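The startup-file idea in this last comment can be sketched with a small partial-application helper. with_defaults is a hypothetical name, and this is just one way to build such shortcuts. It assumes the caller does not pass the fixed argument again.

```r
# Hypothetical helper: return a version of `f` with some arguments fixed.
# Assumes the caller does not supply the fixed arguments a second time.
with_defaults <- function(f, ...) {
  defaults <- list(...)
  function(...) do.call(f, c(list(...), defaults))
}

# Shortcuts in the spirit of the comment above.
meanNA <- with_defaults(mean, na.rm = TRUE)
medNA  <- with_defaults(median, na.rm = TRUE)
sumNA  <- with_defaults(sum, na.rm = TRUE)
pst    <- with_defaults(paste0, collapse = "")

x <- c(1, NA, 3)
print(meanNA(x))
print(medNA(x))
print(sumNA(x))
print(pst(c("a", "b", "c")))
```

Defining each shortcut by hand, for example meanNA <- function(x) mean(x, na.rm = TRUE), works just as well. The helper only saves repetition when there are many of them.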