Tag: bleg

05
Dec

Email is a to-do list made by other people - can someone make it more efficient?!

This is a follow-up to one of our most popular posts: getting email responses from busy people. This post had been in the drafts for a few weeks, then this morning I saw this quote in our Twitter feed:

Your email inbox is a to-do list created by other people (via)

This is 100% true of my work email. Because those emails are organized as conversations rather than as a prioritized to-do list, I end up missing really important things or getting to them too late. This is happening often enough that I feel like I'm starting to cause serious problems for people.

So I am begging someone with way better skills than me to produce software that replaces Gmail in the following ways. It is a to-do list that I can allow people to add tasks to. The software shows me the following types of messages.

  1. We have an appointment at x time on y date to discuss z. Next to this message is a checkbox. If I click “ok” it gets added to my calendar; if I click “no” then a message gets sent to the person who scheduled the meeting saying I’m unavailable.
  2. A multiple choice question where they input the categories of answer I can give. I just pick one and it sends them the response.
  3. A request to be added as a person who can assign me tasks with a yes/no answer.
  4. A longer request email - this has three entry fields: (1) what do you want, (2) when do you want it by? and (3) a yes/no checkbox asking if I’m willing to perform the task.  If I say yes, it gets added to my calendar with automated reminders.
  5. It should interface with all the systems that send me reminder emails to organize the reminders.
  6. You can assign quotas to people, where they can only submit a certain number of tasks per month.
  7. It allows you to re-assign tasks to other people so when I am not the right person to ask, I can quickly move the task on to the right person.
  8. It would collect data and generate automated reports for me about what kind of tasks I'm usually forgetting/being late on and what times of day I'm bad about responding so that I could improve my response times.

The software would automatically reorganize events/to-dos to reflect changing deadlines/priorities, etc. This piece of software would revolutionize my life. Any takers?
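
A few of the items above (the three-field request, approved senders, monthly quotas, and re-assignment) can be sketched as a small data model. This is only an illustrative sketch in Python: the names `Task`, `TaskInbox`, `submit`, `reassign`, and `todo` are hypothetical, and the choice to count quotas by submission month is an assumption, not part of the wish list.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Dict, List, Optional


@dataclass
class Task:
    sender: str
    what: str                        # (1) what do you want
    due: date                        # (2) when do you want it by
    submitted: date                  # assumed: used to count the monthly quota
    accepted: Optional[bool] = None  # (3) yes/no; None until I answer
    assignee: str = "me"


@dataclass
class TaskInbox:
    quotas: Dict[str, int]           # approved sender -> tasks allowed per month
    tasks: List[Task] = field(default_factory=list)

    def submit(self, task: Task) -> bool:
        """Accept a request only from an approved sender who is under quota."""
        limit = self.quotas.get(task.sender)
        if limit is None:
            return False  # sender was never approved to assign me tasks
        month = (task.submitted.year, task.submitted.month)
        used = sum(
            1
            for t in self.tasks
            if t.sender == task.sender
            and (t.submitted.year, t.submitted.month) == month
        )
        if used >= limit:
            return False  # monthly quota already used up
        self.tasks.append(task)
        return True

    def reassign(self, task: Task, new_assignee: str) -> None:
        """Move a task on to the right person."""
        task.assignee = new_assignee

    def todo(self) -> List[Task]:
        """Accepted tasks still assigned to me, earliest deadline first."""
        mine = [t for t in self.tasks if t.accepted and t.assignee == "me"]
        return sorted(mine, key=lambda t: t.due)
```

Sorting by deadline in `todo` is the simplest stand-in for "reorganize to reflect changing deadlines"; a real tool would presumably also weigh priorities.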

22
Oct

A statistical project bleg (urgent-ish)

We all know that politicians can play it a little fast and loose with the truth. This is particularly true in debates, where politicians have to think on their feet and respond to questions from the audience or from each other. 

Usually, we find out how truthful politicians are in the “post-game show”. The discussion of the veracity of the claims is usually based on independent fact checkers such as PolitiFact. Some of these fact checkers (PolitiFact in particular) live-tweet their reports on many of the issues discussed during the debate. This is possible because both candidates have a pretty fixed set of talking points they use, so it is near-real-time fact-checking.

What would be awesome is if someone could write an R script that would scrape the live data off of PolitiFact’s Twitter account and create a truthfulness meter that looks something like CNN’s instant reaction graph (see #7) for independent voters. The line would show the moving average of how honest each politician was being. How cool would it be to show the two candidates and how truthful they are being? If you did this, tell me it wouldn’t be a feature one of the major news networks would pick up…
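
The moving-average part of the meter can be sketched without the scraping step. The sketch below is in Python rather than R, and it assumes the rulings have already been collected as a list of Truth-O-Meter labels for one candidate. The numeric 0 to 5 scale and the function name `moving_truthfulness` are illustrative assumptions, not anything PolitiFact publishes.

```python
from collections import deque
from typing import Iterable, List

# Assumed scores for PolitiFact's Truth-O-Meter labels. The 0-5 scale is an
# illustrative choice made for this sketch.
RATING_SCORE = {
    "True": 5,
    "Mostly True": 4,
    "Half True": 3,
    "Mostly False": 2,
    "False": 1,
    "Pants on Fire": 0,
}


def moving_truthfulness(ratings: Iterable[str], window: int = 3) -> List[float]:
    """Moving average of the scores of the last `window` fact-checked claims."""
    recent = deque(maxlen=window)  # old rulings fall out automatically
    averages = []
    for rating in ratings:
        recent.append(RATING_SCORE[rating])
        averages.append(sum(recent) / len(recent))
    return averages
```

Calling this once per candidate and plotting the two resulting series against time would give the two lines of the meter.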

16
Sep

Sunday Data/Statistics Link Roundup (9/16/12)

  1. There has been a lot of talk about the Michael Lewis (of Moneyball fame) profile of Obama in Vanity Fair. One interesting quote I think deserves a lot more discussion is: “On top of all of this, after you have made your decision, you need to feign total certainty about it. People being led do not want to think probabilistically.” This is a key issue that is only going to get worse going forward. All of public policy is probabilistic - we are even moving to clinical trials to evaluate public policy.
  2. It’s sort of amazing to me that I hadn’t heard about this before, but a UC Davis professor was threatened for discussing the reasons PSA screening may be overused. This same issue keeps coming up over and over - screening healthy populations for rare diseases is often not effective (you need a ridiculously high specificity or a treatment with almost no side effects). What we need is John McGready to do a claymation public service video or something explaining the reasons screening might not be a good idea to the general public. 
  3. A bleg - I sometimes have a good week finding links myself and there are a few folks who regularly send links (Andrew J., Alex N., etc.). I’d love it if people would send me cool links when they see them with the email title, “Sunday Links” - I’m sure there is more cool stuff out there. 
  4. The ICSB has a competition to improve the coverage of computational biology on Wikipedia. Someone should write a surrogate variable analysis or robust multiarray average article. 
  5. I had not heard of the ASA’s Stattrak until this week; it looks like there are some useful resources there for early career statisticians. With the onset of fall, it is closing in on a new recruiting season. If you are a postdoc/student on the job market and you haven’t read Rafa’s post on soft vs. hard money, now is the time to start brushing up! Stay tuned for more job market posts this fall from Simply Statistics. 
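
The screening point in item 2 above (that a rare disease demands a very high specificity) follows from Bayes' rule, and a short calculation makes it concrete. The prevalence, sensitivity, and specificity values below are hypothetical numbers chosen for illustration, not figures for PSA or any real test.

```python
def positive_predictive_value(prevalence: float,
                              sensitivity: float,
                              specificity: float) -> float:
    """P(disease | positive test), computed with Bayes' rule."""
    true_positives = prevalence * sensitivity
    false_positives = (1.0 - prevalence) * (1.0 - specificity)
    return true_positives / (true_positives + false_positives)


# Hypothetical disease affecting 1 in 1,000 people, test sensitivity of 90%.
for spec in (0.90, 0.99, 0.999):
    ppv = positive_predictive_value(prevalence=0.001,
                                    sensitivity=0.9,
                                    specificity=spec)
    print(f"specificity={spec}: share of positives that are real = {ppv:.3f}")
```

Under these assumed numbers, even a test with 99.9% specificity leaves fewer than half of the positive results as true cases, which is why side effects of follow-up treatment matter so much.
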
13
May

Sunday data/statistics link roundup (5/13)

  1. Patenting statistical sampling? I’m pretty sure the Supreme Court that threw out the Mayo patent wouldn’t have much trouble tossing this patent either. The properties of sampling are a “law of nature”, right? via Leonid K.
  2. This video has me all fired up. It’s called 23 1/2 hours and talks about how the best preventative health measure is getting 30 minutes of exercise - just walking - every day. He shows how in some cases this beats doing much more high-tech interventions. My favorite part of this video is how he uses a ton of statistical/epidemiological terms like “effect sizes”, “meta-analysis”, “longitudinal study”, “attributable fractions”, but makes them understandable to a broad audience. This is a great example of “statistics for good”.
  3. A very nice collection of 2-minute tutorials in R. This is a great way to teach the concepts, most of which don’t need more than 2 minutes, and it covers a lot of ground. One thing that drives me crazy is when I go into Rafa’s office with a hairy computational problem and he says, “Oh you didn’t know about function x?”. Of course this only happens after I’ve wasted an hour re-inventing the wheel. If more people put up 2-minute tutorials on all the cool tricks they know, we’d all be better off.
  4. A plot using ggplot2, developed by this week’s interviewee Hadley Wickham, appears in the Atlantic! Via David S.
  5. I’m refusing to buy into Apple’s hegemony, so I’m still running OS 10.5. I’m having trouble getting GitHub up and running. Anyone have this same problem/know a solution? I know, I know, I’m way behind the times on this…
22
Apr

Sunday data/statistics link roundup (4/22)

  1. Now we know who is to blame for the pie chart. I had no idea it had been around, straining our ability to compare relative areas, since 1801. However, the same guy (William Playfair) apparently also invented the bar chart. So he wouldn’t be totally shunned by statisticians. (via Leonid K.)
  2. A nice article in the Guardian about the current group of scientists that are boycotting Elsevier. I have to agree with the quote that leads the article, “All professions are conspiracies against the laity.” On the other hand, I agree with Rafa that academics are partially to blame for buying into the closed access hegemony. I think more than a boycott of a single publisher is needed; we need a change in culture. (first link also via Leonid K)
  3. A blog post on how to add a transparent image layer to a plot. For some reason, I have wanted to do this several times over the last couple of weeks, so the serendipity of seeing it on R Bloggers merited a mention. 
  4. I agree the Earth Institute needs a better graphics advisor. (via Andrew G.)
  5. A great article on why multiple choice tests are used - they are an easy way to collect data on education. But that doesn’t mean they are the right data. This reminds me of the Tukey quote: “The data may not contain the answer. The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data”. It seems to me if you wanted to have a major positive impact on education right now, the best way would be to develop a new experimental design that collects the kind of data that really demonstrates mastery of reading/math/critical thinking. 
  6. Finally, a bit of a bleg…what is the best way to do the SVD of a huge (think 1e6 x 1e6), sparse matrix in R? Preferably without loading the whole thing into memory…
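
On the SVD question in item 6: one general idea is that iterative methods only need products of the matrix with a vector, so only the nonzero entries ever have to be stored. The sketch below shows that idea with plain power iteration for the single largest singular triplet. It is written in Python rather than R, the coordinate-dictionary storage and all function names are assumptions made for illustration, and it is not a tuned answer for a 1e6 x 1e6 problem.

```python
import math
import random
from typing import Dict, List, Tuple

Sparse = Dict[Tuple[int, int], float]  # (row, col) -> value, nonzeros only


def matvec(a: Sparse, x: List[float], n_rows: int) -> List[float]:
    """Compute A x touching only the stored nonzero entries."""
    y = [0.0] * n_rows
    for (i, j), value in a.items():
        y[i] += value * x[j]
    return y


def rmatvec(a: Sparse, y: List[float], n_cols: int) -> List[float]:
    """Compute A^T y touching only the stored nonzero entries."""
    x = [0.0] * n_cols
    for (i, j), value in a.items():
        x[j] += value * y[i]
    return x


def norm(x: List[float]) -> float:
    return math.sqrt(sum(v * v for v in x))


def top_singular_triplet(a: Sparse, n_rows: int, n_cols: int,
                         iters: int = 200, seed: int = 0):
    """Power iteration on A^T A: returns (sigma, u, v) for the largest sigma."""
    rng = random.Random(seed)
    v = [rng.gauss(0.0, 1.0) for _ in range(n_cols)]
    scale = norm(v)
    v = [x / scale for x in v]
    sigma, u = 0.0, [0.0] * n_rows
    for _ in range(iters):
        u = matvec(a, v, n_rows)
        sigma = norm(u)
        if sigma == 0.0:
            return 0.0, u, v  # v is in the null space (e.g. a zero matrix)
        u = [x / sigma for x in u]
        w = rmatvec(a, u, n_cols)
        scale = norm(w)
        v = [x / scale for x in w]
    return sigma, u, v
```

Lanczos-type methods build on the same matrix-vector products to recover several singular triplets at once and converge faster, so they are the more realistic choice at the scale described above.
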
23
Mar

This graph shows that President Obama's proposed budget treats the NIH even worse than G.W. Bush - Sign the petition to increase NIH funding!

The NIH provides financial support for a large percentage of biological and medical research in the United States. This funding supports a large number of US jobs, creates new knowledge, and improves healthcare for everyone. So I am signing this petition:


NIH funding is essential to our national research enterprise, to our local economies, to the retention and careers of talented and well-educated people, to the survival of our medical educational system, to our rapidly fading worldwide dominance in biomedical research, to job creation and preservation, to national economic viability, and to our national academic infrastructure.


The current administration is proposing a flat $30.7 billion FY 2013 NIH budget. The graph below (left) shows how small the NIH budget is in comparison to the Defense and Medicare budgets in absolute terms. The difference between the administration’s proposal and the petition’s proposal ($33 billion) is barely noticeable. 

The graph on the right shows how in 2003 growth in the NIH budget fell dramatically while Medicare and military spending kept growing. However, despite the decrease in rate, the NIH budget did continue to increase under Bush. If we follow Bush’s post-2003 rate (dashed line), the 2013 budget will be about what the petition asks for: $33 billion.  


If you agree that the relatively modest increase in the NIH budget is worth the incredibly valuable biological, medical, and economic benefits this funding will provide, please consider signing the petition before April 15.

09
Nov

Statisticians on Twitter...help me find more!

In honor of our blog finally dragging itself into the 21st century and jumping onto Twitter/Facebook, I have been compiling a list of statistical people on Twitter. I couldn’t figure out an easy way to find statisticians in one go (which could be because I don’t have Twitter skills). 

So here is my very informal list of statisticians I found in a half hour of searching. I know I missed a ton of people; let me know who I missed so I can update!

@leekgroup - Jeff Leek (What, you thought I’d list someone else first?)

@rdpeng - Roger Peng

@rafalab - Rafael Irizarry

@storeylab - John Storey

@bcaffo - Brian Caffo

@sherrirose - Sherri Rose

@raphg - Raphael Gottardo

@airoldilab - Edo Airoldi

@stat110 - Joe Blitzstein

@tylermccormick - Tyler McCormick

@statpumpkin - Chris Volinsky

@fivethirtyeight - Nate Silver

@flowingdata - Nathan Yau

@kinggary - Gary King

@StatModeling - Andrew Gelman

@AmstatNews - Amstat News

@hadleywickham - Hadley Wickham