Simply Statistics A statistics blog by Rafa Irizarry, Roger Peng, and Jeff Leek

OracleWorld Claims and Sensations

Larry Ellison, the CEO of Oracle, like most technology CEOs, has a tendency toward the over-the-top sales pitch. But it’s fun to keep track of what these companies are up to just to see what they think the trends are. It seems clear that companies like IBM, Oracle, and HP, which focus substantially on the enterprise (or try to), think the future is data, data, data. One piece of evidence is the list of companies that they’ve acquired recently.

Ellison claims that they’ve developed a new computer that integrates hardware with software to produce an overall faster machine. Why do we need this kind of integration? Well, for data analysis, of course!

I was intrigued by this line from the article:

On Sunday Mr. Ellison mentioned a machine that he claimed would do data analysis 18 to 23 times faster than could be done on existing machines using Oracle databases. The machine would be able to compute both standard Oracle structured data as well as unstructured data like e-mails, he said.

It’s always a bit hard in these types of articles to figure out what they mean by “data analysis”, but even so, there’s an important idea here.

Alex Szalay talks about the need to “bring the computation to the data”. This comes from his experience working with ridiculous amounts of data from the Sloan Digital Sky Survey. There, the traditional model of pulling the data onto your computer, running some analyses, and then producing results just does not work. But the opposite is often reasonable. If the data are sitting in an Oracle/Microsoft/etc. database, you bring the analysis to the database and operate on the data there. This only works if the analysis program is smaller than the dataset, which presumably it usually is.
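To make the contrast concrete, here is a minimal sketch in Python. It uses the standard-library sqlite3 module as a stand-in for a real database server, and the table name `obs` and column `brightness` are made up for illustration; none of this comes from Szalay or Oracle. Because the database here lives in memory, nothing actually crosses a network, so the number of rows handed back to the client is used as a rough proxy for how much data would have to move.

```python
import sqlite3


def make_demo_db(n_rows=10_000):
    """Build a small in-memory database with a hypothetical 'obs' table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE obs (id INTEGER PRIMARY KEY, brightness REAL)")
    conn.executemany(
        "INSERT INTO obs (brightness) VALUES (?)",
        ((float(i % 100),) for i in range(n_rows)),
    )
    conn.commit()
    return conn


def mean_by_pulling(conn):
    """Traditional model: pull every row to the client, then compute there."""
    rows = conn.execute("SELECT brightness FROM obs").fetchall()
    mean = sum(r[0] for r in rows) / len(rows)
    return mean, len(rows)  # len(rows) = rows transferred to the client


def mean_in_database(conn):
    """Bring the computation to the data: the database computes the mean."""
    rows = conn.execute("SELECT AVG(brightness) FROM obs").fetchall()
    return rows[0][0], len(rows)  # only one row comes back


if __name__ == "__main__":
    conn = make_demo_db()
    print("pulled:     ", mean_by_pulling(conn))
    print("in database:", mean_in_database(conn))
```

Both functions give the same answer, but the first hands every row to the client while the second hands back a single number. With a dataset the size of a sky survey, that difference is the whole point.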

So if Oracle’s magic computer is real, it and others like it could be important as we start bringing more computations to the data.