Simply Statistics


Given the history of medicine, why are randomized trials not used for social policy?


Policy changes can have substantial societal effects. For example, clean water and hygiene policies have saved millions, if not billions, of lives. But the effects are not always positive. For example, Prohibition, or the "noble experiment", boosted organized crime, slowed economic growth, and increased deaths caused by tainted liquor. Good intentions do not guarantee desirable outcomes.

The medical establishment is well aware of the danger of basing decisions on the good intentions of doctors or biomedical researchers. For this reason, randomized controlled trials (RCTs) are the standard approach to determining if a new treatment is safe and effective. In these trials an objective assessment is achieved by assigning patients at random to a treatment or control group, and then comparing the outcomes in these two groups. Probability calculations are used to summarize the evidence in favor of or against the new treatment. Modern RCTs are considered one of the greatest medical advances of the 20th century.
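The logic of an RCT can be sketched in a few lines of code. Below is a minimal simulation in Python (the recovery rates, sample size, and random seed are made-up assumptions, not from any real trial): patients are assigned at random to treatment or control, and a two-proportion z-test gives the probability calculation that summarizes the evidence.

```python
import math
import random

random.seed(1)

# Hypothetical recovery probabilities -- made up for illustration only
P_CONTROL, P_TREATED = 0.50, 0.65
N = 2000  # total patients

# Randomly assign each patient to treatment or control, then record the outcome
treated, control = [], []
for _ in range(N):
    arm = treated if random.random() < 0.5 else control
    p_recover = P_TREATED if arm is treated else P_CONTROL
    arm.append(1 if random.random() < p_recover else 0)

p1 = sum(treated) / len(treated)
p0 = sum(control) / len(control)

# Two-proportion z-test: the probability calculation summarizing the evidence
p_pool = (sum(treated) + sum(control)) / N
se = math.sqrt(p_pool * (1 - p_pool) * (1 / len(treated) + 1 / len(control)))
z = (p1 - p0) / se
p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided p-value

print(f"treated: {p1:.2f} recovered, control: {p0:.2f} recovered, p-value: {p_value:.2g}")
```

Because assignment is random, nothing about the patients themselves can systematically differ between the two groups, which is exactly what makes the comparison objective.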

Despite their unprecedented success in medicine, RCTs have not been fully adopted outside of scientific fields. In this post, Ben Goldacre advocates for politicians to learn from scientists and base policy decisions on RCTs. He provides several examples in which results contradicted conventional wisdom. In this TED talk, Esther Duflo convincingly argues that RCTs should be used to determine what interventions are best at fighting poverty. Although some RCTs are being conducted, they are still rare and oftentimes ignored by policymakers. For example, despite at least two RCTs finding that universal pre-K programs are not effective, policymakers in New York are implementing a $400 million a year program. Supporters of this noble endeavor defend their decision by pointing to observational studies and "expert" opinion that support their preconceived views. Before the 1950s, indifference to RCTs was common among medical doctors as well, and the outcomes were at times devastating.

Today, when we compare conclusions from non-RCT studies to those from RCTs, we note the unintended strong effects that preconceived notions can have. The first chapter of this book provides a summary and some examples. One example comes from a review of 51 studies on the effectiveness of the portacaval shunt. Here is a table summarizing the conclusions of the 51 studies:

Design                          Marked Improvement   Moderate Improvement   None
No control                      24                   7                      1
Controls, but not randomized    10                   3                      2
Randomized                      0                    1                      3

Compare the first and last columns to appreciate the importance of randomization.
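To make the pattern concrete, the table can be tallied in a few lines (Python here, just for the arithmetic): the share of studies reporting marked improvement falls from 75% with no controls to 0% under randomization.

```python
# Conclusions of the 51 portacaval-shunt studies, from the table above:
# (marked improvement, moderate improvement, none)
studies = {
    "no control": (24, 7, 1),
    "controls, but not randomized": (10, 3, 2),
    "randomized": (0, 1, 3),
}

# Share of studies in each design class reporting marked improvement
shares = {}
for design, (marked, moderate, none) in studies.items():
    total = marked + moderate + none
    shares[design] = marked / total
    print(f"{design}: {marked}/{total} = {shares[design]:.0%} marked improvement")
```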

A particularly troubling example relates to the studies on Diethylstilbestrol (DES). DES is a drug that was used to prevent spontaneous abortions. Five out of five studies using historical controls found the drug to be effective, yet all three randomized trials found the opposite. Before the randomized trials convinced doctors to stop using this drug, it was given to thousands of women. This turned out to be a tragedy, as later studies showed DES has terrible side effects. Despite the doctors having the best intentions in mind, ignoring the randomized trials resulted in unintended consequences.

Well-meaning experts are regularly implementing policies without really testing their effects. Although randomized trials are not always possible, it seems that they are rarely even considered, particularly when the intentions are noble. Just as well-meaning turn-of-the-20th-century doctors, convinced that they were doing good, put their patients at risk by providing ineffective treatments, well-intentioned policies may end up hurting society.

Update: A reader pointed me to these preprints, which point out that the control group in one of the cited early-education RCTs included children who received care in a range of different settings, not just staying at home. This implies that the signal is attenuated if what we want to know is whether the program is effective for children who would otherwise stay at home. In this preprint they use statistical methodology (the principal stratification framework) to obtain separate estimates: the effect for children who would otherwise go to other center-based care and the effect for children who would otherwise stay at home. They find no effect for the former group but a significant effect for the latter. Note that in this analysis the effect being estimated is no longer based on groups assigned at random; instead, model assumptions are used to infer the two effects. To avoid dependence on these assumptions we would have to perform an RCT with better-defined controls. Also note that the RCT data facilitated the principal stratification analysis. I also want to restate what I've posted before: "I am not saying that observational studies are uninformative. If properly analyzed, observational data can be very valuable. For example, the data supporting smoking as a cause of lung cancer is all observational. Furthermore, there is an entire subfield within statistics (referred to as causal inference) that develops methodologies to deal with observational data. But unfortunately, observational data are commonly misinterpreted."
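The attenuation point is just a mixture calculation. Here is an illustrative sketch (the stratum proportions and effect sizes below are invented, not taken from the cited preprints): if the program only helps children who would otherwise stay at home, mixing in a stratum it doesn't help shrinks the overall estimate.

```python
# Hypothetical strata -- these numbers are illustrative, not from the preprints
p_home = 0.4         # fraction of controls who would otherwise stay at home
p_center = 0.6       # fraction who would otherwise attend other center-based care
effect_home = 0.5    # assumed true effect for the stay-at-home stratum
effect_center = 0.0  # assumed true effect for the center-care stratum

# The overall comparison mixes the two strata, so the stay-at-home
# signal is attenuated in the headline estimate
overall_effect = p_home * effect_home + p_center * effect_center
print(f"overall estimate: {overall_effect:.2f} vs. stay-at-home effect: {effect_home:.2f}")
```

Under these assumptions the headline estimate is less than half the stay-at-home effect, which is exactly why separating the strata matters.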


So you are getting crushed on the internet? The new normal for academics.


Roger and I were just talking about all the discussion around the Case and Deaton paper on death rates for middle-aged people. Andrew Gelman, among many others, discussed it; he noticed a potential bias in the analysis and did some re-analysis. Just yesterday an economist blogger wrote a piece about academics versus blogs and how many academics are taken by surprise when they see their paper discussed so rapidly on the internet. Much of the debate comes down to the speed, tone, and ferocity of internet discussion of academic work - along with the fact that sometimes it isn't fully fleshed out.

I have been seeing this play out not just in the case of this specific paper, but many other times when folks have been confronted with blogs or the quick publication process of f1000Research. I think it is pretty scary for folks who aren't used to "internet speed" to see this play out, and I thought it would be helpful to make a few points.

  1. Everyone is an internet scientist now. The internet has arrived as part of academia, and if you publish a paper that is of interest (or if you are a Nobel prize winner, or if you dispute a claim, etc.) you will see discussion of that paper within a day or two on the blogs. This is now a fact of life.
  2. The internet loves a fight. The internet responds best to personal/angry blog posts or blog posts about controversial topics like p-values, errors, and bias. Almost certainly if someone writes a blog post about your work or an f1000 paper it will be about an error/bias/correction or something personal.
  3. Takedowns are easier than new research and happen faster. It is much, much easier to critique a paper than to design an experiment, collect data, figure out what question to ask, ask it quantitatively, analyze the data, and write it up. This doesn't mean the critique won't be good/right; it just means it will happen much, much faster than it took you to publish the paper, because it is easier to do. All it takes is noticing one little bug in the code or one error in the regression model. So be prepared for speed in the response.

In light of these three things, you have a couple of options about how to react if you write an interesting paper and people are discussing it - which they will certainly do (point 1), in a way that will likely make you uncomfortable (point 2), and faster than you'd expect (point 3). The first thing to keep in mind is that the internet wants you to "fight back" and wants to declare a "winner". Reading about amicable disagreements doesn't build an audience. That is why there is reality TV. So there will be pressure for you to score points, be clever, be fast, and refute every point or be declared the loser. I have found from my own experience that this is what I feel like doing too. I think that resisting this urge is both (a) very, very hard and (b) the right thing to do. I find the best solution is to be proud of your work, but be humble, because no paper is perfect and that's OK. If you do the best you can, sensible people will acknowledge that.

I think there are three ways to respond to rapid internet criticism of your work.

  • Option 1: Respond on internet time. This means if you publish a big paper that you think might be controversial, you should block off a day or two to spend time on the internet responding. You should be ready to do new analysis quickly, be prepared to admit mistakes quickly if they exist, and be prepared to make it clear when there are none. You will need social media accounts, and you should probably have a blog so you can post longer-form responses. Github/Figshare accounts make it easier to quickly share quantitative/new analyses. Again, your goal is to avoid the personal and stick to facts, so I find that Twitter/Facebook are best for disseminating the longer-form responses you post on blogs/Github/Figshare. If you are going to go this route you should try to respond to as many of the major criticisms as possible, but usually they cluster into one or two specific comments, which you can address all at once.
  • Option 2: Respond in academic time. You might have spent a year writing a paper only to have people respond to it essentially instantaneously. Sometimes they will have good points, but they will rarely have carefully thought-out arguments given the internet-speed response (although remember point 3: good critiques can come faster than good papers). One approach is to collect all the feedback, ignore the pressure for an immediate response, and write a careful, scientific response which you can publish in a journal or in a fast outlet like f1000Research. I think this route can be the most scientific and productive if executed well. But it will be hard, because people will treat it like "you didn't have a good answer, so you didn't respond immediately". The internet wants a quick winner/loser, and that is terrible for science. Even if you choose this route, you should make sure you have a way of publicizing your well-thought-out response - through blogs, social media, etc. - once it is done.
  • Option 3: Do not respond. This is what a lot of people do, and I'm unsure whether it is OK or not. Clearly internet-facing commentary can have an impact on you, your work, and how it is perceived, for better or worse. So if you ignore it, you are ignoring those consequences. This may be OK, but depending on the severity of the criticism it may be hard to deal with, and it may mean that you have a lot of questions to answer later. Honestly, I think that as time goes on, if you write a big paper under a lot of scrutiny, Option 3 is going to go away.

All of this only applies if you write a paper that a ton of people care about/is controversial. Many technical papers won't have this issue and if you keep your claims small, this also probably won't apply. But I thought it was useful to try to work out how to act under this "new normal".


Prediction Markets for Science: What Problem Do They Solve?


I've recently seen a bunch of press on this paper, which describes an experiment with developing a prediction market for scientific results. From FiveThirtyEight:

Although replication is essential for verifying results, the current scientific culture does little to encourage it in most fields. That’s a problem because it means that misleading scientific results, like those from the “shades of gray” study, could be common in the scientific literature. Indeed, a 2005 study claimed that most published research findings are false.


The researchers began by selecting some studies slated for replication in the Reproducibility Project: Psychology — a project that aimed to reproduce 100 studies published in three high-profile psychology journals in 2008. They then recruited psychology researchers to take part in two prediction markets. These are the same types of markets that people use to bet on who’s going to be president. In this case, though, researchers were betting on whether a study would replicate or not.

There are all kinds of prediction markets these days--for politics, general ideas--so having one for scientific ideas is not too controversial. But I'm not sure I see exactly what problem is solved by having a prediction market for science. In the paper, they claim that the market-based bets were better predictors of replication than the general survey that was administered to the scientists. I'll admit that's an interesting result, but I'm not yet convinced.

First off, it's worth noting that this work comes out of the massive replication project conducted by the Center for Open Science, where I believe they have a fundamentally flawed definition of replication. So I'm not sure I can really agree with the idea of basing a prediction market on such a definition, but I'll let that go for now.

The purpose of most markets is some general notion of "price discovery". One popular market is the stock market and I think it's instructive to see how that works. Basically, people continuously bid on the shares of certain companies and markets keep track of all the bids/offers and the completed transactions. If you are interested in finding out what people are willing to pay for a share of Apple, Inc., then it's probably best to look at...what people are willing to pay. That's exactly what the stock market gives you. You only run into trouble when there's no liquidity, so no one shows up to bid/offer, but that would be a problem for any market.
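For concreteness, here is how one standard automated market maker does price discovery: Hanson's logarithmic market scoring rule (LMSR). This is a common mechanism for prediction markets, though I don't know that the cited paper used it; the sketch below just shows how prices stay between 0 and 1, sum to 1, and move toward whatever outcome traders buy.

```python
import math

def lmsr_prices(q, b=100.0):
    """Instantaneous outcome prices under the logarithmic market scoring rule.

    q: net shares outstanding for each outcome; b: liquidity parameter
    (larger b means prices move more slowly as people trade).
    """
    m = max(x / b for x in q)                 # subtract max for numerical stability
    exps = [math.exp(x / b - m) for x in q]
    total = sum(exps)
    return [e / total for e in exps]

# Two outcomes: "study replicates" vs. "study does not replicate"
q = [0.0, 0.0]
before = lmsr_prices(q)      # no trades yet: both outcomes priced at 0.5

q[0] += 50.0                 # traders buy 50 shares of "replicates"
after = lmsr_prices(q)       # the price of "replicates" rises above 0.5
print(before, [round(p, 3) for p in after])
```

The prices can be read as the market's aggregate probability for each outcome, which is the "price discovery" the rest of this post is questioning the value of.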

Now, suppose you're interested in finding out the "true fundamental value" of Apple, Inc. Some people think the stock market gives you that at every instant, while others think the stock market can behave irrationally for long periods of time. Perhaps in the very long run you get a sense of the fundamental value of a company, but that may not be useful information at that point.

What does the market for scientific hypotheses give you? Well, it would be one thing if granting agencies participated in the market. Then, we would never have to write grant applications. The granting agencies could then signal what they'd be willing to pay for different ideas. But that's not what we're talking about.

Here, we're trying to get at whether a given hypothesis is true or not. The only real way to get information about that is to conduct an experiment. How many people betting in the markets will have conducted an experiment? Likely the minority, given that the whole point is to save money by not having people conduct experiments investigating hypotheses that are likely false.

But if market participants aren't contributing real information about an hypothesis, what are they contributing? Well, they're contributing their opinion about an hypothesis. How is that related to science? I'm not sure. Of course, participants could be experts in the field (although not necessarily) and so their opinions will be informed by past results. And ultimately, it's consensus amongst scientists that determines, after repeated experiments, whether an hypothesis is true or not. But at the early stages of investigation, it's not clear how valuable people's opinions are.

In a way, this reminds me of a time a while back when the EPA was soliciting "expert opinion" about the health effects of outdoor air pollution, as if that were a reasonable substitute for collecting actual data on the topic. At least it cost less money--just the price of a conference call.

There's a version of this playing out in the health tech market right now. Companies like Theranos and 23andMe are selling health products that they claim are better than some current benchmark. In particular, Theranos claims its blood tests are accurate when only using a tiny sample of blood. Is this claim true or not? No one outside Theranos knows for sure, but we can look to the financial markets.

Theranos can point to the marketplace and show that people are willing to pay for its products. Indeed, the $9 billion valuation of the private company is another indicator that people...highly value the company. But ultimately, we still don't know if their blood tests are accurate because we don't have any data. If we were to go by the financial markets alone, we would necessarily conclude that their tests are good, because why else would anyone invest so much money in the company?

I think there may be a role to play for prediction markets in science, but I'm not sure discovering the truth about nature is one of them.


Biostatistics: It's not what you think it is


My department recently sent me on a recruitment trip for our graduate program. I had the opportunity to chat with undergrads interested in pursuing a career related to data analysis. I found that several did not know about the existence of Departments of Biostatistics and most of the rest thought Biostatistics was the study of clinical trials. We have posted on the need for better marketing for Statistics, but Biostatistics needs it even more. So this post is for students considering a career as applied statisticians or data scientists and who are considering PhD programs.

There are dozens of Biostatistics departments, and most run PhD programs. You may have never heard of them because they are usually in schools that undergrads don't regularly frequent: Public Health and Medicine. However, they are very active in research and in teaching graduate students. In fact, the 2014 US News & World Report ranking of Statistics Departments includes three Biostat departments in the top five spots. Although clinical trials are a popular area of interest in these departments, there are now many other areas of research. With so many fields of science shifting to data-intensive research, Biostatistics has adapted to work in these areas. Today pretty much any Biostat department will have people working on projects related to genetics, genomics, computational biology, electronic medical records, neuroscience, environmental sciences, epidemiology, health-risk analysis, and clinical decision making. Through collaborations, academic biostatisticians have early access to the cutting-edge datasets produced by public health scientists and biomedical researchers. Our research usually revolves around either developing statistical methods that are used by researchers working in these fields or working directly with a collaborator on data-driven discovery.

How is it different from Statistics? In the grand scheme of things, they are not very different. As implied by the name, Biostatisticians focus on data related to biology, while statisticians tend to be more general. However, the underlying theory and skills we learn are similar. In my view, the major difference is that Biostatisticians, in general, tend to be more interested in data and the subject matter, while in Statistics Departments more emphasis is given to mathematical theory.

What type of job can I get with a PhD in Biostatistics? A well-paying one. And you will have many options to choose from. Our graduates tend to go to academia, industry, or government. Also, the Bio in the name does not keep our graduates from landing non-bio-related jobs, such as in high tech. The reason for this is that the training our students receive and what they learn from research experiences can be widely applied to data analysis challenges.

How should I prepare if I want to apply to a PhD program? First you need to decide if you are going to like it. One way to do this is to participate in one of the many summer programs where you get a glimpse of what we do. My department runs one of these as well. However, as an undergrad I would mainly focus on courses. Undergraduate research experiences are a good way to get an idea of what it's like, but it is difficult to do real research unless you can set aside several hours a week for several consecutive months. This is difficult as an undergrad because you have to make sure to do well in your courses, prepare for the GRE, and get a solid mathematical and computing foundation in order to conduct research later. This is why these programs are usually in the summer.

If you decide to apply to a PhD program, I recommend you take advanced math courses such as Real Analysis and Matrix Algebra. If you plan to develop software for complex datasets, I recommend CS courses that cover algorithms and optimization. Note that programming skills are not the same thing as the theory taught in these CS courses. Programming skills in R will serve you well if you plan to analyze data, regardless of what academic route you follow. Python and a low-level language such as C++ are more powerful languages that many biostatisticians use these days.

I think the demand for well-trained researchers that can make sense of data will continue to be on the rise. If you want a fulfilling job where you analyze data for a living, you should consider a PhD in Biostatistics.



Not So Standard Deviations: Episode 4 - A Gajillion Time Series


Episode 4 of Not So Standard Deviations is hot off the audio editor. In this episode Hilary first explains to me what the heck DevOps is, and then we talk about the statistical challenges in detecting rare events in an enormous set of time series data. There's also some discussion of Ben and Jerry's and the t-test, so you'll want to hang on for that.




How I decide when to trust an R package


One thing that I've given a lot of thought to recently is the process that I use to decide whether I trust an R package or not. Kasper Hansen took a break from trolling me on Twitter to talk about how he trusts packages on Github less than packages that are on CRAN, and particularly Bioconductor. He made a couple of points that I think are very relevant. First, having a package on CRAN/Bioconductor raises trust in that package:

The primary reason is that Bioc/CRAN demonstrate something about the developer's willingness to do the boring but critically important parts of package development, like documentation, vignettes, minimum coding standards, and being sure that their code isn't just a rehash of something else. The other big point Kasper made was the difference between a repository - which is user-oriented and should provide certain guarantees - and Github - which is a developer platform that makes things easier/better for developers but doesn't have a user guarantee system in place.

This discussion got me thinking about when/how I depend on R packages and how I make that decision. The scenarios where I depend on R packages are:

  1. Quick and dirty analyses for myself
  2. Shareable data analyses that I hope are reproducible
  3. As dependencies of R packages I maintain

As you move from 1 to 3, it is more and more of a pain if the package I'm depending on breaks. If it is just something I was doing for fun, it's not that big of a deal. But if it means I have to rewrite/recheck/rerelease my R package, then that is a much bigger headache.

So my scale for how stringent I am about relying on packages varies by the type of activity, but what are the criteria I use to measure how trustworthy a package is? For me, the criteria are in this order:

  1. People prior 
  2. Forced competence
  3. Indirect data

I'll explain each criterion in a minute, but the main purpose of using these criteria is (a) to ensure that I'm using a package that works and (b) to ensure that if the package breaks I can trust it will be fixed, or at least that I can get some help from the developer.

People prior

The first thing I do when I look at a package I might depend on is look at who the developer is. If that person is someone I know has developed widely used, reliable software and who quickly responds to requests/feedback then I immediately trust the package. I have a list of people like Brian, or Hadley, or Jenny, or Rafa, who could post their package just as a link to their website and I would trust it. It turns out almost all of these folks end up putting their packages on CRAN/Bioconductor anyway. But even if they didn't I assume that the reason is either (a) the package is very new or (b) they have a really good reason for not distributing it through the normal channels.

Forced competence

For people who I don't know about or whose software I've never used, then I have very little confidence in the package a priori. This is because there are a ton of people developing R packages now with highly variable levels of commitment to making them work. So as a placeholder for all the variables I don't know about them, I use the repository they choose as a surrogate. My personal prior on the trustworthiness of a package from someone I don't know goes something like:

[Figure: my prior on the trustworthiness of a package from an unknown developer - highest for Bioconductor, then CRAN, then Github]

This prior is based on the idea of forced competence. In general, you have to do more to get a package approved on Bioconductor than on CRAN (for example you have to have a good vignette) and you have to do more to get a package on CRAN (pass R CMD CHECK and survive the review process) than to put it on Github.

This prior isn't perfect, but it does tell me something about how much the person cares about their package. If they go to the work of getting it on CRAN/Bioc, then at least they cared enough to document it. They are at least forced to be minimally competent - at least at the time of submission, and enough for the package to still pass checks.

Indirect data

After I've applied my priors, I typically look at the data. For Bioconductor I look at the badges: how often the package is downloaded, whether it passes the checks, and how well it is covered by tests. I'm already inclined to trust it a bit since it is on that platform, but I use the data to adjust my prior. For CRAN I might look at the download stats provided by RStudio. The interesting thing is that, as John Muschelli points out, Github actually has the most indirect data available for a package:

If I'm going to use a package that is on Github from a person who isn't on my prior list of people to trust, then I look at a few things. The number of stars/forks/watchers is a quick and dirty estimate of how widely used a package is. I also look very carefully at how many commits the person has submitted, both to the package in question and to all other packages, over the last couple of months. If the person isn't actively developing either the package or anything else on Github, that is a bad sign. I also look to see how quickly they have responded to issues/bug reports on the package in the past, if possible. One idea I haven't used, but that I think is a good one, is to submit an issue for a trivial change to the package and see if I get a response quickly. Finally, I look to see if they have some demonstration that their package works across platforms (say, with a travis badge). If the package is highly starred, frequently maintained, has all issues responded to, and passes checks on all platforms, then that data might overwhelm my prior and I'd go ahead and trust the package.
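As a toy version of this checklist, here is a small scoring function. The specific signals, caps, and equal weights are entirely my own invention (nothing in this post prescribes them); it just illustrates how the indirect data could be combined into a single rough trust number.

```python
def trust_score(repo):
    """Combine the Github signals discussed above into a rough 0-1 score.

    The signals, caps, and equal weights here are made up purely to
    illustrate the idea -- this is not a validated metric.
    repo: dict with 'stars', 'commits_last_90d',
          'issues_closed_frac' (0-1), and 'ci_passing' (bool).
    """
    score = min(repo["stars"], 500) / 500             # popularity, capped
    score += min(repo["commits_last_90d"], 30) / 30   # active development
    score += repo["issues_closed_frac"]               # responsiveness to issues
    score += 1.0 if repo["ci_passing"] else 0.0       # cross-platform checks
    return score / 4  # normalize to the 0-1 range

active = {"stars": 800, "commits_last_90d": 40, "issues_closed_frac": 0.9, "ci_passing": True}
stale = {"stars": 5, "commits_last_90d": 0, "issues_closed_frac": 0.1, "ci_passing": False}
print(trust_score(active), trust_score(stale))
```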


In general, one of the best things about the R ecosystem is being able to rely on other packages so that you don't have to write everything from scratch. But there is a hard balance to strike in keeping the dependency list small. Using the strategy I've outlined is one way I maintain this balance and worry less about whether my dependencies are trustworthy.


The Statistics Identity Crisis: Am I a Data Scientist?


The joint ASA/Simply Statistics webinar on the statistics identity crisis is now live!


Faculty/postdoc job opportunities in genomics across Johns Hopkins


It's pretty exciting to be in genomics at Hopkins right now with three new Bloomberg professors in genomics areas, a ton of stellar junior faculty, and a really fun group of students/postdocs. If you want to get in on the action here is a non-comprehensive list of great opportunities.

Faculty Jobs

Job: Multiple tenure track faculty positions in all areas including in genomics
Department:  Biostatistics
To apply:
Deadline: Review ongoing

Job: Tenure track position in data intensive biology
Department:  Biology
To apply
Deadline: Nov 1st and ongoing

Job: Tenure track positions in bioinformatics, with focus on proteomics or sequencing data analysis
Department:  Oncology Biostatistics
To apply
Deadline: Review ongoing


Postdoc Jobs

Job: Postdoc(s) in statistical methods/software development for RNA-seq
Employer:  Jeff Leek
To apply: email Jeff (
Deadline: Review ongoing

Job: Data scientist for integrative genomics in the human brain (MS/PhD)
Employer:  Andrew Jaffe
To apply: email Andrew (
Deadline: Review ongoing

Job: Research associate for genomic data processing and analysis (BA+)
Employer:  Andrew Jaffe
To apply: email Andrew (
Deadline: Review ongoing

Job: PhD developing scalable software and algorithms for analyzing sequencing data
Employer:  Ben Langmead
To apply:
Deadline: See site

Job: Postdoctoral researcher developing scalable software and algorithms for analyzing sequencing data
Employer:  Ben Langmead
To apply:  email Ben (
Deadline: Review ongoing

Job: Postdoctoral researcher developing algorithms for challenging problems in large-scale genomics: whole-genome assembly, RNA-seq analysis, and microbiome analysis
Employer:  Steven Salzberg
To apply:  email Steven (
Deadline: Review ongoing

Job: Research associate for genomic data processing and analysis (BA+) in cancer
Employer:  Luigi Marchionni (with Don Geman)
To apply:  email Luigi (
Deadline: Review ongoing

Job: Postdoctoral researcher developing algorithms for biomarkers development and precision medicine application in cancer
Employer:  Luigi Marchionni (with Don Geman)
To apply:  email Luigi (
Deadline: Review ongoing

Job: Postdoctoral researcher developing methods in machine learning, genomics, and regulatory variation
Employer:  Alexis Battle
To apply:  email Alexis (
Deadline: Review ongoing

Job: Postdoctoral fellow with interests in biomarker discovery for Alzheimer’s disease
Employer:  Madhav Thambisetty / Ingo Ruczinski
To apply:
Deadline: Review ongoing

Job: Postdoctoral positions for research in the interface of statistical genetics, precision medicine and big data
Employer:  Nilanjan Chatterjee
To apply:
Deadline: Review ongoing

Job: Postdoctoral research developing algorithms and software for time course pattern detection in genomics data
Employer:  Elana Fertig
To apply:  email Elana (
Deadline: Review ongoing

Job: Postdoctoral fellow to develop novel methods for large-scale DNA and RNA sequence analysis related to human and/or plant genetics, such as developing methods for discovering structural variations in cancer or for assembling and analyzing large complex plant genomes.
Employer:  Mike Schatz
To apply:  email Mike (
Deadline: Review ongoing


We are all always on the hunt for good Ph.D. students. At Hopkins students are admitted to specific departments. So if you find a faculty member you want to work with, you can apply to their department. Here are the application details for the various departments admitting students to work on genomics:





The statistics identity crisis: am I really a data scientist?







Tl;dr: We will host a Google Hangout of our popular JSM session on October 30th from 2-4 PM EST.


I organized a session at JSM 2015 called "The statistics identity crisis: am I really a data scientist?" The session turned out to be pretty popular:

but it turns out not everyone fit in the room:

Thankfully, Steve Pierson at the ASA had the awesome idea to re-run the session for people who couldn't be there. So we will be hosting a Google Hangout with the following talks:

'Am I a Data Scientist?': The Applied Statistics Student's Identity Crisis - Alyssa Frazee, Stripe
How Industry Views Data Science Education in Statistics Departments - Chris Volinsky, AT&T
Evaluating Data Science Contributions in Teaching and Research - Lance Waller, Emory University
Teach Data Science and They Will Come - Jennifer Bryan, The University of British Columbia

You can watch it on Youtube or Google Plus. Here is the link:

The session will be held October 30th (tomorrow!) from 2-4PM EST. You can watch it live and discuss the talks using the hashtag #JSM2015 or you can watch later as the video will remain on Youtube.


Discussion of the Theranos Controversy with Elizabeth Matsui


Theranos is a Silicon Valley diagnostic testing company that has been in the news recently. The story of Theranos has fascinated me because I think it represents a perfect collision of the tech startup culture and the health care culture and how combining them together can generate unique problems.

I talked with Elizabeth Matsui, a Professor of Pediatrics in the Division of Allergy and Immunology here at Johns Hopkins, to discuss Theranos, the realities of diagnostic testing, and the unique challenges that a health-tech startup faces with respect to doing good science and building products people want to buy.