On Thursday, FiveThirtyEight hosted its first chat on Facebook with our editor in chief, Nate Silver. We’ve pulled in some of the most interesting questions below (content has been lightly edited for clarity). You can find a link to the entire chat at the end of the post.
Liz Klinger: Do you expect the balance of power in Congress to shift as a result of the 2014 midterm elections? Which races do you see as the most decisive?
Nate Silver: Our forecast as of about a month ago had Republicans as being just slightly more likely than not to take over the Senate. (We’ll be launching the full-fledged version of our forecast model pretty soon.) I doubt that the Democrats’ position has gotten much better — Obama’s approval rating has dropped some since that time, for instance. There are any number of races that could shift the balance: Michigan, Iowa, Alaska, Arkansas, Louisiana and North Carolina among others.
Archie Shaw: Do you apply anecdotal information to your calculations, and if so, how much does it typically account for in your final numbers?
Nate Silver: In general, we seek to avoid anecdotal information. That doesn’t mean it’s ALWAYS worthless. But sometimes it’s better to say: “Outcome X has a 60% chance of happening based on our model, which is based on some relatively clear and simple assumptions — and here are some other things the model isn’t considering,” as opposed to lumping everything together, if that makes sense.
Clifton Hall: What is it about sports that makes them so much harder to forecast than politics?
Nate Silver: I don’t think sports are harder to forecast. In some ways they’re easier — to get a bit wonky, there’s less structural uncertainty about how to model sports as opposed to politics (we know basically the “right” way to predict the NCAA tournament, for instance, more so than for something like a presidential primary). But in some sports — and some political events — the favorite wins more often than in others. Novak Djokovic is going to win 99% of his quarterfinal matches or something, for instance. Soccer odds are generally not that definitive. There are similar cases in politics — the Iowa caucus is tough to forecast, for instance.
Bob Bjarke: Soccer culture is notoriously resistant to placing too much importance on statistics. Is that culture changing? And what is one stat or type of information you’d like to see gathered in the future?
Nate Silver: Hopefully we’ll see the resistance to statistical analysis decrease in soccer as the data gets better and we can start to measure things other than goals and bookings. There are a number of club teams in Europe — Arsenal, Man City I think — that already employ some really smart statistical analysts.
Robert James: How do you deal with the criticism you’ve been getting with SPI and its exposure during this World Cup? Does it drive you to improve the model or make you skeptical about heavy statistical analysis in soccer?
Nate Silver: The criticism has mostly been coming from professional trolls who didn’t like FiveThirtyEight to begin with and who are cherry-picking results. Brazil was a HUGE miss given the scoreline — but SPI has “called” 13 of 14 games correctly in the knockout round so far, by contrast. We’d love to improve SPI by accounting for a wider variety of play-by-play data (time of possession, shots on target, etc.), including more sophisticated handling of travel effects and home-field advantage, and other things. I’m not sure any of that would have helped much with the Brazil match, however.
Joel Sutherland: What field/topic is really difficult to apply data analysis to, but would benefit the most from such analysis? Perhaps something where good data could be gathered but hasn’t been for whatever reason.
Nate Silver: It’s a little esoteric — but one area is public transit. There was recently a big data dump of NYC taxi records, for instance, which could be fascinating to look at. I’ve also heard reports that some large Asian cities are thinking about crowdsourcing their transit data.
Dan Schroeder: 538 is your web site and you can cover what you like, but I personally have been disappointed by its signal to noise ratio. It seems that you’re far more interested in sports and burritos and other feature stories than in serious news. How about showing us a pie chart breaking down your articles published so far, by category?
Nate Silver: Thanks for reading, Dan. A lot of our beats are pretty cyclical. This has been a HUGE few months for sports and a slow time for politics. But we’re still publishing lots of great stuff about politics, economics and other issues. You can navigate to our Politics front page or follow our Politics RSS feed if you don’t want to see the sports stuff (and come to think of it, maybe we should set up a Twitter feed for our politics & economics stuff too). At the same time, there’s an element of our philosophy where we’d rather publish serious takes on unserious subjects (like burritos) as opposed to unserious takes on serious subjects (i.e., half-baked politics coverage when there isn’t the sort of news we’re good at covering).
Kunbi Adeyemo: There is this public impression that “numbers don’t lie,” but every day I see misleading charts with axes that don’t start at zero used to exaggerate trends. What are your thoughts on helping the public understand the limitations of data, no matter how “big” (or at least learn the right questions to ask when presented with a number/visualization)?
Nate Silver: Kunbi — thanks for the question. … One of my pet peeves is that people associate me with the idea that the world is super predictable using “big data”. In fact, there are a LOT of limitations and a lot of problems (that’s what my book is about). Some things like American elections are more predictable than people assume — but MANY other things are LESS predictable than people think. At the same time, one should always be comparing data-driven approaches against the alternatives. Maybe a statistical model isn’t great at predicting soccer — but is it better than the pundits?
Chris Rees: Do you think data journalists have the obligation to be transparent with respect to the parameters they input into their predictive models and the methods they use to derive their predictions? Why or why not?
Nate Silver: I think ALL journalists ought to value transparency. We don’t always publish our data and code — we’re getting better about that — but even when we don’t, we strive to make it very clear to the reader exactly what assumptions we’re making. The idea is to show the reader something rather than telling her something. (p.s. One of the best things about being detailed in describing your methodology is that it makes it much clearer when you’ve made a mistake.)
Neil Bhatiya: What is the probability you would have to fight either 1 horse-sized duck or 100 duck-sized horses in your lifetime?
Nate Silver: I doubt that this will come up, but if it does, I’d prefer the horse-sized duck. The problem with the 100 duck-sized horses is that, even if you have a 99% probability of killing each one, the odds of successfully killing ALL of them are only 37% (0.99^100). (p.s. That’s the last question for now. This was really fun and we’ll do it again sometime.)
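(Editor’s note: the arithmetic above is easy to verify yourself. A minimal sketch in Python, assuming each of the 100 fights is independent with the same 99% win probability:)

```python
# Probability of winning ALL 100 independent fights,
# where each fight is won with probability 0.99.
p_single = 0.99
n_fights = 100

p_all = p_single ** n_fights
print(f"{p_all:.2%}")  # roughly 37%
```

Independence is the key assumption here; if losing one fight made the next one harder (say, from injury), the overall probability would be even lower.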
