23
Nov

An R function to analyze your Google Scholar Citations page


Google Scholar has now made Google Scholar Citations profiles available to anyone. You can read about these profiles and set one up for yourself here.

I asked John Muschelli and Andrew Jaffe to write me a function that would download my Google Scholar Citations data so I could play with it. Then they got all crazy on it and wrote a couple of really neat functions. All cool/interesting components of these functions are their ideas and any bugs were introduced by me when I was trying to fiddle with the code at the end.  

So how does it work? Here is the code. You can source the functions like so:

source("http://biostat.jhsph.edu/~jleek/code/googleCite.r")

This will install the following packages if you don’t have them: wordcloud, tm, sendmailR, RColorBrewer. Then you need to find the URL of a Google Scholar Citations page. Here is Rafa Irizarry’s:

http://scholar.google.com/citations?user=nFW-2Q8AAAAJ&hl=en

You can then call the googleCite function like this:

out = googleCite("http://scholar.google.com/citations?user=nFW-2Q8AAAAJ&hl=en", pdfname="rafa_wordcloud.pdf")

or search by name like this:

out = searchCite("Rafa Irizarry", pdfname="rafa_wordcloud.pdf")

The function will download all of Rafa’s citation data and put it in the matrix out. It will also make wordclouds of (a) the co-authors on his papers and (b) the titles of his papers and save them in the PDF file specified (there is an option to turn off plotting if you want). Here is what Rafa’s clouds look like:

[Image: word clouds of Rafa’s co-authors and paper titles]
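The clouds themselves are images, but the counting that sits behind a title cloud is easy to sketch. The snippet below is only an illustration in base R and is not taken from googleCite.r: the helper title_word_freq, its stop-word list, and the example titles are all made up, and the real function may clean and count words differently (it relies on the tm package).

```r
# Made-up paper titles for illustration; real titles would come from
# the downloaded citation table.
titles <- c("Statistical methods for genomic data",
            "Genomic data analysis with statistical models",
            "Methods for noisy data")

# Hypothetical helper (not part of googleCite.r): count how often each
# word appears across titles, ignoring case and a few common stop words.
title_word_freq <- function(titles,
                            stop_words = c("a", "an", "and", "for",
                                           "in", "of", "the", "with")) {
  # Lower-case, then split on anything that is not a letter
  words <- unlist(strsplit(tolower(titles), "[^a-z]+"))
  # Drop empty strings and stop words
  words <- words[nzchar(words) & !(words %in% stop_words)]
  # Tabulate and put the most frequent words first
  sort(table(words), decreasing = TRUE)
}

freq <- title_word_freq(titles)
print(freq)
```

A frequency table like this is roughly what a word cloud draws from; with the wordcloud package installed, something like wordcloud::wordcloud(names(freq), as.numeric(freq), min.freq = 1) should render it.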

We have also written a little function to calculate many of the popular citation indices. You can call it on the output like so:

gcSummary(out)
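To give a feel for what a citation index is, here is a minimal sketch of two common ones, the h-index and the i10-index, computed from a plain vector of citation counts. This is not the code inside gcSummary: the helper names h_index and i10_index and the citation counts are made up for illustration, and the post does not say which indices gcSummary reports or how the columns of out are named.

```r
# Hypothetical helpers (not part of googleCite.r).

# h-index: the largest h such that h papers have at least h citations each.
h_index <- function(citations) {
  citations <- sort(citations[!is.na(citations)], decreasing = TRUE)
  # With counts sorted from high to low, paper number r counts toward h
  # exactly when it has at least r citations.
  sum(citations >= seq_along(citations))
}

# i10-index: the number of papers with at least 10 citations.
i10_index <- function(citations) {
  sum(citations >= 10, na.rm = TRUE)
}

# Made-up citation counts, for illustration only
cites <- c(120, 45, 30, 12, 8, 5, 5, 2, 0)

h_index(cites)    # 5: five papers have at least 5 citations each
i10_index(cites)  # 4: four papers have at least 10 citations
```

To try this on real data you would pull the citation-count column out of out as a numeric vector first; which column that is depends on how googleCite lays out its result.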

When you download citation data, an email with the data table will also be sent to Simply Statistics so we can collect information on who is using the function and perform population-level analyses. 

If you liked this function you might also be interested in our R function to determine if you are a data scientist, or in some of the other stuff going on over at Simply Statistics.

Enjoy!

  • nik

    Is this set of functions still operational? When I try to use the function

    out <- googleCite("http://scholar.google.com/citations?user=nFW-2Q8AAAAJ&hl=en", pdfname="rafa_cloud.pdf")

    I get the following error message:

    Error in grepl(alldata$"First Author", pattern = author) :

    invalid 'pattern' argument

    Is it my bad, or is something wrong with the function?
    Running: Win 7 OS, R-studio 0.97.551, R x64 3.0.2

    • Edel Rodea

      I had the same problem, but reading the code over a cup of coffee made it possible to solve it. You need to replace pattern = author with pattern = "author" in the grepl calls that appear; the missing quotation marks are the problem.