New AI Inferencing Records

Nvidia tops MLPerf records again, consortium adds benchmarks to measure mobile

27 Oct 2020

Nvidia's A100 GPU powered new AI inferencing records.

Photo: Nvidia

MLPerf, a consortium of AI experts and computing companies, has released a new set of machine learning records. The records were set on a series of benchmarks that measure the speed of inferencing: how quickly an already-trained neural network can accomplish its task with new data. For the first time, benchmarks for mobiles and tablets were contested. According to David Kanter, executive director of MLPerf’s parent organization, a downloadable app is in the works that will allow anyone to test the AI capabilities of their own smartphone or tablet.

MLPerf’s goal is to present a fair and straightforward way to compare AI systems. Twenty-three organizations—including Dell, Intel, and Nvidia—submitted a total of 1200 results, which were peer reviewed and subjected to random third-party audits. (Google was conspicuously absent this round.) As with the MLPerf records for training AIs released over the summer, Nvidia was the dominant force, besting what competition there was in all six categories for both datacenter and edge computing systems. Including submissions by partners like Cisco and Fujitsu, 1029 results, or 85 percent of the total for edge and data center categories, used Nvidia chips, according to the company.

“Nvidia outperforms by a wide range on every test,” says Paresh Kharaya, senior director of product management, accelerated computing at Nvidia. Nvidia’s A100 GPUs powered its wins in the datacenter categories, while its Xavier was behind the GPU-maker’s edge-computing victories. According to Kharaya, on one of the new MLPerf benchmarks, Deep Learning Recommendation Model (DLRM), a single DGX A100 system was the equivalent of 1000 CPU-based servers.

There were four new inferencing benchmarks introduced this year, adding to the two carried over from the previous round:

BERT, for Bi-directional Encoder Representation from Transformers, is a natural language processing AI contributed by Google. Given a question input, BERT predicts a suitable answer.
DLRM, for Deep Learning Recommendation Model is a recommender system that is trained to optimize click-through rates. It’s used to recommend items for online shopping and rank search results and social media content. Facebook was the major contributor of the DLRM code.
3D U-Net is used in medical imaging systems to tell which 3D voxel in an MRI scan are parts of a tumor and which are healthy tissue. It’s trained on a dataset of brain tumors.
RNN-T, for Recurrent Neural Network Transducer, is a speech recognition model. Given a sequence of speech input, it predicts the corresponding text.

In addition to those new metrics, MLPerf put together the first set of benchmarks for mobile devices, which were used to test smartphone and tablet platforms from MediaTek, Qualcomm, and Samsung as well as a notebook from Intel. The new benchmarks included:

MobileNetEdgeTPU, an image classification benchmark that is considered the most ubiquitous task in computer vision. It’s representative of how a photo app might be able pick out the faces of you or your friends.
SSD-MobileNetV2, for Single Shot multibox Detection with MobileNetv2, is trained to detect 80 different object categories in input frames with 300x300 resolution. It’s commonly used to identify and track people and objects in photography and live video.
DeepLabv3+ MobileNetV2: This is used to understand a scene for things like VR and navigation, and it plays a role in computational photography apps.
MobileBERT is a mobile-optimized variant of the larger natural language processing BERT model that is fine-tuned for question answering. Given a question input, the MobileBERT generates an answer.

The benchmarks were run on a purpose-built app that should be available to everyone within months, according to Kanter. “We want something people can put into their hands for newer phones,” he says.

The results released this week were dubbed version 0.7, as the consortium is still ramping up. Version 1.0 is likely to be complete in 2021.

Samuel K. Moore

Samuel K. Moore is the senior editor at IEEE Spectrum in charge of semiconductors coverage. An IEEE member, he has a bachelor's degree in biomedical engineering from Brown University and a master's degree in journalism from New York University.

New AI Inferencing Records

Nvidia tops MLPerf records again, consortium adds benchmarks to measure mobile

What Software Engineers Need to Know About AI Jobs

Jobs in AI increased slightly, but the boom shows signs of quieting

15 Graphs That Explain the State of AI in 2024

The AI Index tracks the generative AI boom, model costs, and responsible AI use

German EV Motor Could Break Supply-Chain Deadlock

Innovative design promises high performance without rare-earth magnets

E-Bikes Are Growing Up, Finding Jobs, Still Having Fun

New features are inspiring new applications, including cargo hauling

The Tiny Ultrabright Laser that Can Melt Steel

Photonic crystals are the key to the brightest semiconductor laser ever

Getting the Grid to Net Zero

Grid-forming inverters will take us to 100 percent renewable energy

Video Friday: Robot Dog Can’t Fall

Your weekly selection of awesome robot videos

AI Chip Trims Energy Budget Back by 99+ Percent

Photonic tech uses both diffraction and interference to boost performance

Pogo Stick Microcopter Bounces off Floors and Walls

Hopcopter is a more efficient way to (mostly) fly

Caltech’s SSPD-1 Is a New Idea for Space-Based Solar

Ali Hajimiri on boosting an energy-beaming system from the lab to orbit

Topics

Sections

More

For IEEE Members

For IEEE Members

IEEE Spectrum

Follow IEEE Spectrum

Support IEEE Spectrum

Follow IEEE Spectrum

Support IEEE Spectrum