The Trump administration’s controversial claim that its recent presidential inauguration drew “the largest audience to witness an inauguration, period,” has inadvertently highlighted the fact that counting crowds remains a painstaking and inexact science. But the rise of artificial intelligence could soon spare crowd scientists the task of manually counting heads.
An early glimpse of how artificial intelligence (AI) could help count crowds appeared in 2013. University of Central Florida researchers showed how computer software based on machine learning can swiftly provide automated headcount estimates for crowds numbering in the hundreds of thousands. Such AI tools still have room for improvement in terms of achieving accurate headcounts based on images. But the software needed just half an hour to accomplish what would have taken researchers a week to do manually.
“For accuracy [in counting large groups], we can get within plus or minus 30 percent error compared to the ground truth count obtained by human annotators, mainly undergraduates. But we’re not sure it’s better than professional counters,” says Mubarak Shah, computer science professor and director of the Center for Research in Computer Vision at the University of Central Florida. “But for efficiency, it’s impossible for humans to do this so quickly.” (Shah also notes that the computer software’s count is more “objective” as it does not contain human biases.)
Crowd counts for politically charged events such as protests or presidential inaugurations can sometimes spark controversy. Recent examples include Trump’s presidential inauguration and the next day’s Women’s March in Washington, D.C., which coincided with related women’s marches in many cities and towns across the globe. (In case you were wondering, the Trump administration’s claims do not hold up based on the available measures.) Similarly, the automated crowd counting software got its own start by looking at protests involving thousands of people calling for the Catalonia region to be independent of Spain.
Of course, crowd scientists don’t usually spend a week painstakingly counting every single head in photographs of huge crowds. Instead, they typically count the number of people in certain parts of images where they know the size of the area, and then extrapolate from there to come up with total crowd count estimates across a larger area.
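The arithmetic behind that kind of extrapolation is simple enough to sketch in a few lines of Python. This is only an illustration: the function name, the sample counts and the areas are made up for the example, and it assumes the sampled regions are the same size and representative of the whole crowd, which real crowds rarely are.

```python
def estimate_crowd(sample_counts, sample_area_m2, total_area_m2):
    """Extrapolate a total headcount from hand-counted sample regions.

    Hypothetical helper: assumes every sample region covers the same
    known area and that crowd density is roughly uniform.
    """
    if not sample_counts:
        raise ValueError("need at least one counted sample region")
    if sample_area_m2 <= 0 or total_area_m2 < 0:
        raise ValueError("areas must be positive")
    # Average density (people per square meter) over the sampled regions.
    density = sum(sample_counts) / (len(sample_counts) * sample_area_m2)
    # Scale that density up to the full area the crowd occupies.
    return density * total_area_m2


# Made-up numbers: three 20-square-meter regions counted by hand,
# extrapolated to a 10,000-square-meter space.
estimate = estimate_crowd([38, 42, 40], 20.0, 10_000.0)
print(round(estimate))
```

In practice the estimate is only as good as the choice of sample regions: a patch counted near the stage will overstate the density at the sparse edges of a crowd.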
Today’s computer software based on machine learning can count all the heads in a crowd very quickly, but computer vision technology introduces its own inaccuracies. To improve its accuracy, the University of Central Florida software subdivides a given crowd image into smaller patches for counting heads. The individual patch counts can then be averaged together, based on assumptions about crowd density, to smooth out some of their inaccuracies.
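To give a rough sense of what smoothing patch counts can look like, the Python sketch below blends each patch’s count with the average of its immediate neighbors, on the assumption that crowd density changes gradually from one patch to the next. It is not the University of Central Florida team’s method, whose details are not described here; the function name, the `weight` parameter and the sample grid are all hypothetical.

```python
def smooth_patch_counts(counts, weight=0.5):
    """Blend each patch count with the mean of its 4-connected neighbors.

    `counts` is a rectangular grid (list of rows) of raw per-patch
    headcounts. `weight` is a hypothetical tuning knob: 0 keeps the raw
    counts, 1 replaces each count with its neighbors' mean.
    """
    rows, cols = len(counts), len(counts[0])
    smoothed = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            # Collect the counts of the patches directly above, below,
            # left and right, skipping positions outside the image.
            neighbors = [
                counts[nr][nc]
                for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
                if 0 <= nr < rows and 0 <= nc < cols
            ]
            neighbor_mean = (
                sum(neighbors) / len(neighbors) if neighbors else counts[r][c]
            )
            smoothed[r][c] = (1 - weight) * counts[r][c] + weight * neighbor_mean
    return smoothed


# Made-up raw counts: the center patch looks like a miscount.
raw = [
    [10, 10, 10],
    [10, 40, 10],
    [10, 10, 10],
]
smoothed = smooth_patch_counts(raw)
total = sum(sum(row) for row in smoothed)
print(smoothed[1][1], total)
```

A simple blend like this pulls a suspicious outlier toward its surroundings, but it can also change the overall total, so a real system would need a more careful model of how density varies across the image.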
The sheer efficiency of this software has already proven useful enough for some real-world applications. Saudi Arabian officials have licensed the software to count the throngs of Hajj pilgrims who visit Islam’s holy sites at Mecca every year. Qatar is also funding Shah’s team to improve the software for counting crowds that might attend the 2022 World Cup in Doha.
Newer AI methods such as deep learning could soon boost the accuracy of computer vision. The University of Central Florida team has already switched over to deep learning AI that takes advantage of neural networks—software that automatically learns by filtering relevant data through many layers of processing. The researchers have not yet publicly released benchmarks comparing the new deep learning approach with their older software, but they have a research paper in the works. “We expect that deep learning will be much, much better,” Shah says.
But even deep learning AI will face the same challenges that human crowd scientists face today, Shah explains. For counting crowds, the ideal image would be taken from above by drone, aircraft or satellite. That was a special challenge for counting the crowds that attended Inauguration Day and the Women’s March because of D.C.’s airspace restrictions and a lack of suitable satellite imagery. Images taken from an oblique angle present greater crowd-counting challenges for computers because the software needs to account for perspective and scale (people closer to the camera appear larger in the image).
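The scale problem can be illustrated with the textbook pinhole camera model, in which an object’s size in the image shrinks in proportion to its distance from the camera. The Python sketch below is a simplified illustration under that idealized assumption, not a description of how Shah’s software handles perspective; the function name and the numbers are hypothetical.

```python
def apparent_size_px(focal_length_px, real_size_m, distance_m):
    """Approximate on-image size of an object under a pinhole camera model.

    Assumes an idealized camera with no lens distortion; the focal
    length is expressed in pixels.
    """
    if distance_m <= 0:
        raise ValueError("distance must be positive")
    return focal_length_px * real_size_m / distance_m


# Made-up numbers: a head roughly 0.25 meters tall, seen by a camera
# with a 1,000-pixel focal length.
near = apparent_size_px(1000, 0.25, 10)  # person 10 meters away
far = apparent_size_px(1000, 0.25, 50)   # person 50 meters away
print(near, far)
```

With these numbers, the head of someone five times farther away spans one-fifth as many pixels, which is why a counter tuned to the front of an obliquely photographed crowd can miss people at the back.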
Low-resolution images can also present a challenge because the computer software must identify relevant features based on fewer pixels per person. But by training on many different crowd images, deep learning AI can improve its overall accuracy in counting heads even in low-resolution images.
Perhaps the biggest challenge for the deep learning approach to automated crowd counting is the need for lots and lots of training data. Ideally, Shah’s team wants many different images of the same crowd events so they can train their deep learning software to recognize and count humans in a wide variety of circumstances.
But even training is not as simple as just feeding online images of crowd events into the software. To learn how to accurately identify humans in a crowd image, the AI needs images that already have accurate annotations showing the individual people within the crowd and the overall headcount. That means researchers still need to manually perform headcounts of certain images to provide training datasets for their software.
The University of Central Florida team plans to use online crowdsourcing services such as Amazon’s Mechanical Turk to help manually create such training datasets. If they succeed in training computer vision to become more accurate, automated crowd counting could become viable in many more scenarios ranging from shopping malls to concerts. And given a U.S. president who loves to talk about huge crowds, crowd counting is likely to remain politically relevant for the foreseeable future.