For any data modelling problem, who is the best person to analyse it? One of the things we’re learning from crowdsourcing is that Wally could be anywhere.
A bank will use data modelling to profile which of its customers are most likely to default. A medical researcher will analyse data and look for patterns. If he or she finds them, lives can be saved.
William Dampier – not the seventeenth-century explorer, but the twenty-first-century “scientific” explorer – had a rich dataset on HIV patients. After a decade of analysis, the best model in the scientific literature predicted the progression of viral load in those patients with 70% accuracy.
Then William hosted a global data prediction competition…
William asked “Where’s Wally?” and within a week and a half a model with 70.8% accuracy was produced. Three months later: 77% accuracy. The state of the art in the scientific literature went from 70% to 77%, a relative improvement of 10 percent, in three months!
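The “10 percent” figure is a relative gain, not a gain in percentage points, and the two are easy to confuse. A minimal sketch of the arithmetic, using only the 70% and 77% figures quoted here (the function name `accuracy_gain` is a hypothetical helper for illustration, not part of any competition tooling):

```python
def accuracy_gain(baseline: float, improved: float) -> tuple[float, float]:
    """Return (gain in percentage points, relative gain in percent)."""
    points = improved - baseline          # absolute difference between the two accuracies
    relative = points / baseline * 100    # the same difference as a share of the baseline
    return points, relative


# Figures from the article: 70% (best published model) and 77% (competition winner).
points, relative = accuracy_gain(70.0, 77.0)
print(f"{points:.1f} percentage points, {relative:.1f}% relative")
```

In other words, accuracy rose by 7 percentage points, which is 10 percent of the original 70%.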
It was the work of over 100 teams from 30 countries: PhD-level specialists from around the world, all experts in analysing data, all volunteering countless hours of their free time to perform the analysis. The winner’s prize? Only $500! A commercial competition to predict freeway commute times for the NSW Government, with a $10,000 prize, attracted over 50 entries in the first week alone.
Data prediction competitions are particularly effective because there are countless techniques that can be applied to any problem. Any single analyst or consultant will have the skills and resources to try only a few. Only by opening the problem to a wide audience, with different participants trying different techniques, can we reach the frontier of what’s possible.
Competitions flush out the best technique and the best analyst for your problem. The “freelance” community is well supplied with PhD-level specialists who crave real-world data to benchmark and refine their techniques, and who can use competitions to enhance their professional reputations.
Right now Wally’s waiting to find out what’s hiding in your data. Something that no one else has found ... yet.
Wally’s the best analyst in the world for your data. Trouble is, right now we don’t know where he is.
A data prediction competition can help you find out.
What could the world’s best analysts find in your data?
This article was written by Kaggle Chairman, Nicholas Gruen.