Machine-Learning Wizards Vie for Zillow’s $1 Million Prize
In 1714, the British Parliament passed the Longitude Act, which offered serious money to anyone who could devise a practical method to measure longitude at sea. While the determination of longitude might seem a trivial thing in today’s world of smartphones and GPS satellites, at the time it was an immense technical challenge. It took many years, but the strategy worked, leading to the development of the marine chronometer, a handheld mechanical marvel that undoubtedly saved the lives of countless sailors.
Prizes have, of course, since been used to spur innovation in many other spheres. “These were typically offered by governments,” says Josh Lerner of the Harvard Business School, who has studied the effectiveness of such prizes. But private companies are doing it, too. Lerner notes that the US $1 million Netflix Prize, awarded in 2009 to a team that devised an algorithm that could beat the company’s Cinematch recommender system, helped to revive the popularity of these prizes.
In particular, the Netflix Prize helped give rise to the current Zillow Prize competition, which challenges data scientists to come up with a computerized system that can beat Zillow’s current method for predicting home prices, something the company calls the Zestimate.
Stan Humphries, chief analytics officer for the Zillow Group, in Seattle, explains that the Zestimate was the very first product Zillow created when the company launched in 2006. It was critical to Zillow’s goal of becoming a premier information portal in the real estate market.
Early on, Humphries says, the Zestimate was frequently wide of the mark: In estimating what a house would sell for, the algorithm had a median error of 14 percent. He and his colleagues were able to reduce the error level to around 4 percent. But they wanted to do even better. So they decided to “invite the global data-science community” to participate—an invitation that came with the possibility that if you were smart and lucky you might win the $1 million Zillow Prize, which will be awarded early in 2019. “Ever since [the Netflix Prize], this has been a glimmer in my eye,” says Humphries.
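To make those percentages concrete, here is a minimal sketch of how a median error figure of this kind could be computed. It assumes the error is the absolute gap between estimate and sale price, taken as a share of the sale price. The article does not spell out Zillow’s exact definition, so the function name and the sample prices below are purely illustrative.

```python
from statistics import median


def median_abs_pct_error(estimates, sale_prices):
    """Median of |estimate - sale price| / sale price, as a percentage.

    Assumed definition for illustration; not confirmed as Zillow's formula.
    """
    if len(estimates) != len(sale_prices):
        raise ValueError("estimates and sale_prices must have the same length")
    errors = [abs(e - p) / p * 100 for e, p in zip(estimates, sale_prices)]
    return median(errors)


# Hypothetical data: five homes, with estimated and actual sale prices in dollars.
sale_prices = [200_000, 300_000, 400_000, 500_000, 250_000]
estimates = [210_000, 285_000, 400_000, 560_000, 240_000]

print(f"Median error: {median_abs_pct_error(estimates, sale_prices):.1f} percent")
```

Because the figure is a median, half of the estimates land closer to the sale price than the reported number and half land farther away, so one badly missed house does little to move it.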
Some 4,000 groups participated in the first round of Zillow’s home-valuation competition, which launched in 2017. One hundred teams moved on to compete in the second and final round, which is being judged now. The grading criterion is how accurately the contestants’ systems were able, in July 2018, to predict the actual sales prices of a large set of U.S. homes that were sold in September and October.
Participants in the first round could use only the input data that Zillow supplied, which consisted of the kinds of information you might find in a municipal property database or a typical real estate listing. But for the second round, contestants were allowed to pull in data from other sources as well.
At the top of the leaderboard in the first round was a team that calls itself Zensemble, made up of three data mavens from Australia, Israel, and the United States: Dmytro Poplavskiy, Jonathan Gradstein, and Russ Wolfinger.
Wolfinger is director of scientific discovery and genomics at SAS, a company in Cary, N.C., that develops software for statistical analysis. This isn’t Wolfinger’s first time competing in something like this. Indeed, he has been very active on Kaggle, which hosts a variety of machine-learning competitions. “I’m kind of a competitive person myself,” says Wolfinger. “I’ve always played sports.” He notes that such competitions skew sharply toward men with backgrounds in computer or data sciences. “Ironically, I don’t see a lot of people from statistics,” he says.
Is participating an expensive proposition, given the computing power needed to be competitive? Not really, in Wolfinger’s opinion. “A lot of guys on Kaggle build their own machine-learning rigs,” he says. He warns that you should expect your electricity bill to go up if you do a lot of number crunching with one of those rigs. But he also notes that what gives one team an edge over another is more about brain power than computing power.
It’s too early to know whether contestants will really devise something that improves on Zillow’s current scheme for home valuation, which is already pretty good. “We have a large team of AI professionals that work on this problem for us,” says Humphries. But he nevertheless thinks he will learn a lot from the many outsiders now vying for the Zillow Prize. That is, the prize isn’t just a publicity stunt. Zillow really expects to improve its Zestimate, he says. That goes a long way toward explaining why Zillow required that all 100 teams in the final round assign intellectual property rights for their software to the company.
Of course, as Yogi Berra reportedly said, “It ain’t over till it’s over.” One or more teams may soundly beat today’s Zestimate, earning the leader a cool million. Wolfinger certainly anticipates that this is going to happen. Then again, there’s a chance that Zillow’s AI professionals have done as well as anyone can at the moment. We’ll know for sure any day now.