[Image: Google CEO Sundar Pichai presenting the company’s latest AI chip, the TPU V4 (Tensor Processing Unit, version 4). Credit: Google]

Google CEO Sundar Pichai says the company’s latest AI chip, the TPU V4 (Tensor Processing Unit, version 4), delivers more than double the processing power of its predecessor. At Google’s I/O event this week, Pichai detailed the chip’s performance, stating that when combined into a 4096-processor “pod” (an interconnected, liquid-cooled cluster of servers), TPU V4s can crunch through a billion billion operations per second, or an exaflop, a long-standing computing milestone.
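For a sense of scale, here is a back-of-the-envelope sketch of what that pod-level figure implies per chip, using only the numbers above (the per-chip result is an inference from those figures, not a published spec):

```python
# Back-of-the-envelope: what an exaflop spread across a 4096-chip pod
# implies per TPU V4 chip (derived only from the figures in this article).
POD_OPS_PER_SECOND = 1e18   # one exaflop: a billion billion operations/s
CHIPS_PER_POD = 4096

per_chip_ops = POD_OPS_PER_SECOND / CHIPS_PER_POD
print(f"~{per_chip_ops / 1e12:.0f} teraflops per chip")  # ~244 teraflops
```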

“This is the fastest system we’ve ever deployed at Google and a historic milestone for us,” Pichai said. Top supercomputers have yet to reach an exaflop, but the TPU V4 pod isn’t really in their league, because it calculates with lower-precision numbers than scientific supercomputers use.
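To make “lower precision” concrete, here is a minimal sketch that emulates bfloat16, the 16-bit format commonly used on TPUs, by truncating a standard float to its top 16 bits (truncation rather than true round-to-nearest, for simplicity; the sample values are illustrative):

```python
import struct

def to_bfloat16(x: float) -> float:
    """Emulate bfloat16 by keeping only the top 16 bits of a float32:
    1 sign bit, 8 exponent bits, and just 7 mantissa bits."""
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    return struct.unpack(">f", struct.pack(">I", bits & 0xFFFF0000))[0]

# Scientific supercomputers typically compute with 64-bit floats
# (~16 decimal digits); bfloat16 keeps only 2-3 decimal digits.
print(to_bfloat16(3.14159265358979))  # 3.140625
print(to_bfloat16(1.001))             # 1.0 -- the 0.001 is lost entirely
```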

Such computing power is particularly needed for training the large neural networks behind the prediction systems and natural language processing integral to digital commerce today. Google previewed the TPU V4’s abilities in that area last year, when it benchmarked three systems in MLPerf’s v0.7 training benchmarks, released in July 2020.

The tests included three vision systems (ResNet-50, SSD, and Mask R-CNN), a natural language processor (BERT), two English-to-German translation networks (NMT and Transformer), a recommender system (DLRM), and a reinforcement learning network that plays Go (Minigo).

Results are reported for systems of 8, 64, and 256 TPU V4 accelerators, with performance measured as the number of minutes needed to complete training. (So lower is better.)

| Benchmark                          | 8 TPUs | 64 TPUs | 256 TPUs | Top-ranked commercial system    |
|------------------------------------|--------|---------|----------|---------------------------------|
| Image classification (ResNet)      | 30.7   | 4.5     | 1.82     | 0.76 (1840 Nvidia accelerators) |
| Object detection (SSD)             | 8.68   | 1.43    | 1.06     | 0.82 (1024 Nvidia accelerators) |
| Object detection (Mask R-CNN)      | 103.1  | 15.5    | 9.95     | 10.46 (256 Nvidia accelerators) |
| Translation (NMT)                  | 8.03   | 2.08    | 1.29     | 0.71 (1024 Nvidia accelerators) |
| Translation (Transformer)          | 9.01   | 1.63    | 0.78     | 0.62 (480 Nvidia accelerators)  |
| Natural language processing (BERT) | 45.57  | 5.73    | 1.82     | 0.81 (2048 Nvidia accelerators) |
| Recommendation (DLRM)              | 4.42   | 1.21    | —        | 3.33 (8 Nvidia accelerators)    |
| Reinforcement learning (Minigo)    | 150.9  | —       | —        | 17.07 (1792 Nvidia accelerators)|
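As a quick illustration of how to read the scaling in that table, this sketch computes the speedup and parallel efficiency for the ResNet row, using the times above:

```python
# ResNet training times from the table: minutes at each TPU V4 count.
minutes = {8: 30.7, 64: 4.5, 256: 1.82}
base_chips = 8
base_time = minutes[base_chips]

for chips, t in minutes.items():
    speedup = base_time / t        # how much faster than the 8-TPU run
    ideal = chips / base_chips     # perfect linear scaling
    print(f"{chips:>3} TPUs: {speedup:5.1f}x speedup, "
          f"{speedup / ideal:.0%} of ideal")
```

Going from 8 to 256 TPUs yields roughly a 17x speedup, about 53 percent of the ideal 32x, which is typical as communication overhead grows with cluster size.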

The company uses TPU infrastructure for its own AI and also rents it out to Google Cloud customers. Other big cloud companies, such as Amazon and Microsoft, have either deployed or are working on their own AI chips. Several other AI chip companies have launched with products aimed at customers needing data-center-based systems. Cerebras recently detailed its second-generation system for high-performance computing, powered by the world’s largest single chip, with 2.6 trillion transistors. Other companies, such as SambaNova and Graphcore, have reached valuations in the billions of dollars.


Why Functional Programming Should Be the Future of Software Development

It’s hard to learn, but your code will produce fewer nasty surprises

[Image: A plate of spaghetti made from code. Illustration: Shira Inbar]

You’d expect the longest and most costly phase in the lifecycle of a software product to be the initial development of the system, when all those great features are first imagined and then created. In fact, the hardest part comes later, during the maintenance phase. That’s when programmers pay the price for the shortcuts they took during development.

So why did they take shortcuts? Maybe they didn’t realize they were cutting corners. Only when their code was deployed and exercised by a lot of users did its hidden flaws come to light. And maybe the developers were rushed: time-to-market pressures almost guarantee that software will contain more bugs than it otherwise would.
