# An R function to analyze your Google Scholar Citations page

23 Nov 2011

Google Scholar has now made Google Scholar Citations profiles available to anyone. You can read about these profiles and set one up for yourself here.

I asked John Muschelli and Andrew Jaffe to write me a function that would download my Google Scholar Citations data so I could play with it. Then they got all crazy on it and wrote a couple of really neat functions. All the cool and interesting components of these functions are their ideas, and any bugs were introduced by me when I was fiddling with the code at the end.

So how does it work? Here is the code. You can source the functions like so:

source("http://biostat.jhsph.edu/~jleek/code/googleCite.r")

This will install the following packages if you don't have them: wordcloud, tm, sendmailR, RColorBrewer. Then you need to find the URL of a Google Scholar Citations page. Here is Rafa Irizarry's:

http://scholar.google.com/citations?user=nFW-2Q8AAAAJ

You can then call the googleCite function like this:

out = googleCite("http://scholar.google.com/citations?user=nFW-2Q8AAAAJ", pdfname="rafa_wordcloud.pdf")

or search by name like this:

out = searchCite("Rafa Irizarry", pdfname="rafa_wordcloud.pdf")

The function will download all of Rafa's citation data and put it in the matrix out. It will also make wordclouds of (a) the co-authors on his papers and (b) the titles of his papers, and save them in the PDF file specified (there is an option to turn off plotting if you want). Here is what Rafa's clouds look like:

We have also written a little function to calculate many of the popular citation indices. You can call it on the output like so:

gcSummary(out)
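
If you are curious what a citation index actually computes, here is a minimal sketch of two well-known ones, the h-index and the g-index, in base R. This is not the gcSummary code, and it is an assumption that gcSummary reports these particular indices. The function names h_index and g_index and the toy citation counts are made up for illustration.

```r
# Toy citation counts, one entry per paper (made up for illustration,
# not real Google Scholar data).
cites <- c(10, 8, 5, 4, 3, 0)

# h-index: the largest h such that h papers have at least h citations each.
h_index <- function(cites) {
  sorted <- sort(cites, decreasing = TRUE)
  # The k-th most cited paper counts if it has at least k citations.
  sum(sorted >= seq_along(sorted))
}

# g-index: the largest g such that the g most cited papers together
# have at least g^2 citations (capped here at the number of papers).
g_index <- function(cites) {
  sorted <- sort(cites, decreasing = TRUE)
  sum(cumsum(sorted) >= seq_along(sorted)^2)
}

h_index(cites)  # 4
g_index(cites)  # 5
```

With real data you would pass in the per-paper citation counts from the downloaded table instead of the toy vector. Which column of out holds them depends on how googleCite lays out the matrix, so check the column names first.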

When you download citation data, an email with the data table will also be sent to Simply Statistics so we can collect information on who is using the function and perform population-level analyses.

If you liked this function you might also be interested in our R function to determine if you are a data scientist, or in some of the other stuff going on over at Simply Statistics.

Enjoy!