Web Scraping

An R function to analyze your Google Scholar Citations page

Google Scholar has now made Google Scholar Citations profiles available to anyone. You can read about these profiles and set one up for yourself here. I asked John Muschelli and Andrew Jaffe to write me a function that would download my Google Scholar Citations data so I could play with it. Then they got all crazy on it and wrote a couple of really neat functions. All the cool and interesting components of these functions are their ideas, and any bugs were introduced by me when I was fiddling with the code at the end.

Web-scraping

The internet is the greatest source of publicly available data. One of the key skills for obtaining data from the web is “web-scraping”, where you use a piece of software to run through a website and collect information. This technique can be used to collect data from databases or data that is scattered across a website. Here is a very cool little exercise in web-scraping that can be used as an example of the things that are possible.
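To make the idea concrete, here is a minimal sketch of the parsing half of a scraper, written in Python with only the standard library. It is an illustration and not the exercise linked above: the HTML snippet, the table layout, and the names `SAMPLE_HTML`, `CellCollector`, and `scrape_rows` are all hypothetical, and the page is an inline string rather than something downloaded from a live site.

```python
from html.parser import HTMLParser

# Hypothetical page content standing in for a downloaded web page.
SAMPLE_HTML = """
<html><body>
  <table>
    <tr><td class="title">Paper A</td><td class="cites">12</td></tr>
    <tr><td class="title">Paper B</td><td class="cites">7</td></tr>
  </table>
</body></html>
"""


class CellCollector(HTMLParser):
    """Collect the text of every <td> cell, grouped by table row."""

    def __init__(self):
        super().__init__()
        self.rows = []          # one list of cell strings per <tr>
        self._in_cell = False   # True while we are inside a <td>

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])
        elif tag == "td":
            if not self.rows:   # tolerate a <td> that appears outside a <tr>
                self.rows.append([])
            self._in_cell = True
            self.rows[-1].append("")

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_cell = False

    def handle_data(self, data):
        # Only keep text that appears inside a table cell.
        if self._in_cell:
            self.rows[-1][-1] += data


def scrape_rows(html):
    """Return (title, citation_count) pairs from the assumed two-column table."""
    parser = CellCollector()
    parser.feed(html)
    return [(title, int(cites)) for title, cites in parser.rows]


records = scrape_rows(SAMPLE_HTML)
for title, cites in records:
    print(title, cites)
```

In a real scraper the string would come from fetching a page (for example with `urllib.request.urlopen`), and the parsing rules would have to match that site's actual markup, which tends to change without notice.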