Specialized AI Chips Hold Both Promise and Peril for Developers

In the next few years, chipmaking giants and well-funded startups will race to gain market share


This is a guest post. The views expressed in this article are solely those of the author and do not represent positions of IEEE Spectrum or IEEE.

When it comes to the compute-intensive field of AI, hardware vendors are reviving the performance gains we enjoyed at the height of Moore's Law. The gains come from a new generation of specialized chips for AI applications like deep learning. But the fragmented microchip marketplace that's emerging will lead to some hard choices for developers.

The new era of chip specialization for AI began when graphics processing units (GPUs), which were originally developed for gaming, were deployed for applications like deep learning. The same architecture that made GPUs render realistic images also enabled them to crunch data much more efficiently than central processing units (CPUs). A big step forward happened in 2007 when Nvidia released CUDA, a toolkit for making GPUs programmable in a general-purpose way.

AI researchers need every advantage they can get when dealing with the unprecedented computational requirements of deep learning. GPU processing power has advanced rapidly, and chips originally designed to render images have become the workhorses powering world-changing AI research and development. Many of the linear algebra routines that are necessary to make Fortnite run at 120 frames per second are now powering the neural networks at the heart of cutting-edge applications of computer vision, automated speech recognition, and natural language processing.
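To make the overlap between graphics and deep learning concrete, here is a minimal sketch in plain Python. It is an illustration only, not how any game engine or deep learning framework is implemented: the helper `matmul`, the sample vertices, and the layer weights are all made-up values chosen for readability. The point is that one matrix-multiply routine serves both jobs.

```python
import math


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


# Graphics use: rotate two 2D points by 90 degrees counterclockwise.
# Each column of `vertices` is one point: (1, 0) and (2, 0).
theta = math.pi / 2
rotation = [[math.cos(theta), -math.sin(theta)],
            [math.sin(theta), math.cos(theta)]]
vertices = [[1.0, 2.0],
            [0.0, 0.0]]
rotated = matmul(rotation, vertices)

# Deep learning use: one dense layer, computed as weights times inputs,
# followed by a ReLU nonlinearity. The weights are hypothetical.
weights = [[0.5, -1.0],
           [2.0, 1.0]]
inputs = [[1.0],
          [3.0]]
pre_activations = matmul(weights, inputs)
activations = [[max(0.0, v) for v in row] for row in pre_activations]

print("rotated points (as columns):", rotated)
print("layer output:", activations)
```

On a GPU, the same kind of multiply runs across thousands of values in parallel, which is why hardware built for rendering carried over so well to neural networks.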

Now, the trend toward microchip specialization is turning into an arms race. Gartner projects that specialized chip sales for AI will double to around US $8 billion in 2019 and reach more than $34 billion by 2023. Nvidia's internal projections place the market for data center GPUs (which are almost solely used to power deep learning) at $50 billion in the same time frame. In the next five years, we'll see massive investments in custom silicon come to fruition from Amazon, ARM, Apple, IBM, Intel, Google, Microsoft, Nvidia, and Qualcomm. There is also a slew of startups in the mix. CrunchBase estimates that AI chip companies, including Cerebras, Graphcore, Groq, Mythic AI, SambaNova Systems, and Wave Computing, have collectively raised more than $1 billion.
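For a sense of the pace those Gartner figures imply, the short calculation below converts them into a compound annual growth rate. It is a back-of-the-envelope sketch: the function name `implied_cagr` is made up for this example, and because the 2023 figure is quoted as "more than $34 billion," the result should be read as a floor rather than a forecast.

```python
def implied_cagr(start_value, end_value, years):
    """Compound annual growth rate that turns start_value into end_value."""
    return (end_value / start_value) ** (1.0 / years) - 1.0


# Figures quoted above, in billions of US dollars.
sales_2019 = 8.0
sales_2023 = 34.0  # quoted as "more than", so this is a lower bound

growth = implied_cagr(sales_2019, sales_2023, 2023 - 2019)
print(f"Implied annual growth, 2019 to 2023: at least {growth:.1%}")
```

That works out to a little under 44 percent a year, sustained for four years.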

To be clear, specialized AI chips are both important and welcome, as they're catalysts for transforming cutting-edge AI research into real-world applications. However, the flood of new AI chips, each one faster and more specialized than the last, will also seem like a throwback to the rise of enterprise software. We can expect cut-throat sales deals and software specialization aimed at locking developers into working with just one vendor.

Imagine if, 15 years ago, the cloud services AWS, Azure, Box, Dropbox, and GCP had all come to market within 12 to 18 months of one another. Their mission would have been to lock in as many businesses as possible, because once you're on one platform, it's hard to switch to another. This type of end-user gold rush is about to happen in AI, with tens of billions of dollars, and priceless research, at stake.

Chipmakers won't be short on promises, and the benefits will be real. But it's important for AI developers to understand that new chips that require new architectures could make their products slower to reach the market, even if the chips themselves deliver faster performance. In most cases, AI models are not going to be portable between chips from different makers. Developers are well aware of the vendor lock-in risk posed by adopting higher-level cloud APIs, but in the past, the actual compute substrate has been standardized and homogeneous. This situation is going to change dramatically in the world of AI development.

It's quite likely that more than half of the chip industry's revenue will soon be driven by AI and deep learning applications. Just as software begets more software, AI begets more AI. We've seen it many times: Companies initially focus on one problem, but ultimately solve many. For example, major automakers are striving to bring autonomous cars to the road, and their cutting-edge work in deep learning and computer vision is already having a cascading effect; the research is leading to such offshoot projects as Ford's delivery robots.

As specialized AI chips come to market, the current chip giants and major cloud companies will probably strike exclusive deals or acquire top-performing startups. This trend will fragment the AI market rather than unify it. All that AI developers can do now is understand what's about to happen and plan how they'll weigh the benefits of a faster chip against the costs of building on new architectures.

