AI, particularly the huge neural networks that are meant to understand and interact with us humans, is not a natural fit for the computer architectures that have dominated for decades. A host of startups recognized this in time to develop chips and sometimes the computers they'd power. Among them, Palo Alto-based SambaNova Systems is a standout. This summer the startup passed US $1 billion in venture funding, valuing the company at $5 billion. It aims to tackle the largest neural networks that require the most data using a custom-built stack of technology that includes the software, computer system, and processor, selling its use as a service instead of a package. IEEE Spectrum spoke to SambaNova CEO Rodrigo Liang in October 2021.
Rodrigo Liang on…
- SambaNova's origin story
- What it takes to deliver huge AIs like GPT-3
- AI as a service
- Things you can do with massive amounts of data
IEEE Spectrum: What was the original idea behind SambaNova?
Rodrigo Liang: This is the biggest transition since the internet, and most of the work done on AI is done on legacy platforms, legacy [processor] architectures that have been around for 25 or 30 years. (These architectures are geared to favor the flow of instructions rather than the flow of data.) We thought, let's get back to first principles. We're going to flip the paradigm on its head and not worry as much about the instructions but worry about the data, make sure that the data is where it needs to be. Remember, today, you have very little control over how you move the data in a system. In legacy architectures, you can't control where the data is or which cache it's sitting on.
So we went back to first principles and said, "Let's just take a look at what AI actually wants, natively, not what other architectures cause AI to be." And what it wants is to actually create networks that are changing all the time. Neural nets have data paths that connect and reconnect as the algorithm changes.
We broke things down to a different set of sub-operators. Today, you have add, subtract, multiply, divide, load, and store as your typical operators. Here, you want operators that help with dataflow—things like map, reduce, and filter. These are things that are much more data focused than instruction focused.
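To make the contrast concrete, here is a minimal Python sketch of the same computation written two ways: first as an explicit loop of loads, compares, and arithmetic, then as a pipeline of filter, map, and reduce. It only illustrates the programming idea Liang describes. It is not SambaNova's software or hardware interface, and the function names and the `activations` data are hypothetical.

```python
from functools import reduce

# Hypothetical input: values flowing out of one stage of a model.
activations = [0.5, -1.2, 3.0, -0.1, 2.5]


def sum_scaled_positives_loop(values, scale):
    """Instruction-centric style: step through loads, compares, and arithmetic."""
    total = 0.0
    for v in values:            # load each value in turn
        if v > 0:               # compare
            total += v * scale  # multiply, add, store
    return total


def sum_scaled_positives_dataflow(values, scale):
    """Dataflow-centric style: describe how data moves through operators."""
    positives = filter(lambda v: v > 0, values)         # filter: keep positive values
    scaled = map(lambda v: v * scale, positives)        # map: scale each value
    return reduce(lambda acc, v: acc + v, scaled, 0.0)  # reduce: sum to one number


if __name__ == "__main__":
    print(sum_scaled_positives_loop(activations, 2.0))
    print(sum_scaled_positives_dataflow(activations, 2.0))
```

Both functions return the same result. The second version states only how data passes from one operator to the next, which is the kind of description a dataflow-oriented compiler or processor could, in principle, map onto hardware.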
Once you look at how these software programs want to be and how they want to flow, then the conclusion comes about what base units you need and the amount of software controllability you need to allow these networks to interconnect and flow most efficiently. Once you've got to that point, then you realize "we can actually implement that in a processor": a highly dense, highly efficient, highly performing piece of silicon with a single purpose of running AI efficiently. And that's what we built here with SambaNova.
Is this an example of hardware-software co-development, a term that I am hearing more and more?
Liang: 100 percent. The first step is you take the software, you break it down, just see natively what you want it to do. Then we build the hardware. And what the hardware allowed us to do is explore much bigger problems than we could imagine before. In the developers' lab, things are small, because we can't handle production-size data sets. But once we created the hardware, suddenly it opened up opportunities to really explore models like GPT-3, which people are running using thousands of GPUs and with hundreds of people managing that one model. That's really impractical. How many companies are going to be able to afford to hire hundreds of people just to manage one model and have thousands of GPUs interconnected to run one thing?
SambaNova Systems' Cardinal SN10 Reconfigurable Dataflow Unit (RDU) is the industry's next-generation processor. RDUs are designed to allow the data to flow through the processor in ways in which the model was intended to run, freely and without any bottlenecks. (Image: SambaNova)
So we asked, "How do we automate all of this?" Today, we deploy GPT-3 on a customer's behalf, and we operate the model for them. The hardware we're delivering as a software service. These customers are subscribing to it and paying us a monthly fee for that prediction.
So now we can ask, how well is the software operating? How well is the hardware operating? With each generation, you iterate, and you get better and better. That's opposed to traditional hardware design where once you build a microprocessor, you throw it over the fence, and then somebody does something with it, and maybe, eventually, you hear something about it. Maybe you don't.
Because we define it from the software, we build the hardware, we deploy the software, we make our money off these services, then the feedback loop is closed. We are using what we build, and if it's not working well, we'll know very quickly.
So you are spinning up new silicon that involves that feedback from the experience so far?
Liang: Yeah. We're constantly building hardware; we're constantly building software, with new software releases that do different things and are able to support new models that maybe people are just starting to hear about. We have strong ties to university research with Stanford, Cornell, and Purdue professors involved. We stay ahead and are able to look at what's coming, so our customers don't have to. They will trust that we can help them pick the right models that are coming down the pipeline.
Is this full-stack, hardware-and-software-as-a-service model of a computing company the future in this space?
Liang: We're the only ones doing it today and for a couple of different reasons. For one, in order to do these differentiated services, you really need a piece of silicon that's differentiated. You start with people that can produce a high-performance piece of silicon to do this type of computing; that requires a certain skill set. But then to have the skill set to build a software stack and then have the skill set to create models on behalf of our customers and then have the skill set to deploy on a customer's behalf, those are all things that are really hard to do; it's a lot of work.
For us, we've been able to do it because we're very focused on a certain set of workloads, a certain type of model, a certain type of use case that's most valuable to enterprises. We then focus on taking those to production. We're not trying to be everything to everybody. We've picked some lanes that we're really good at and really focus on AI for production.
For example, with natural language models, we're taking those for certain use cases and taking those to production. Image models, we're thinking about high resolution only. The world of AI is actually shockingly low res these days. [Today's computers] can't train high-res images; they have to downsample them. We're the only ones today that are able to do true resolution, original resolution, and train them as is.
It sounds like your company has to have a staff that can understand the complete stack of the technology from software down to the chip.
Liang: Yeah. That's one of the most differentiated advantages we have. Chip companies know how to do chips, but they don't understand the stack. AI companies know how to do AI, but they can't do silicon. And the compiler technology—think about... how few companies are actually writing languages. These technologies are hard for certain classes of people to really understand across the divide. We were able to assemble a team that can truly do it. If you want to do hardware-software co-design, you truly have to understand across the boundaries, because if you don't, then you're not getting the advantages of it.
The other thing that I think you are also touching on is the expertise in the customer's own house. If you go outside of the Fortune 50, most of them do not have an AI department with 200 data scientists that are A players. They might have five. If you think about the expertise gap between these larger companies and your Fortune 500 company, how are they going to compete in this next age of AI? They need people that come in and provide them a lot of the infrastructure so they don't have to build it themselves. And most of those companies don't want to be AI centers. They have a very healthy business selling whatever they're selling. They just need the capabilities that AI brings.
SambaNova Systems DataScale is an integrated software and hardware system optimized for dataflow from algorithms to silicon. SambaNova DataScale is the core infrastructure for organizations that want to quickly build and deploy next-generation AI technologies at scale. (Image: SambaNova)
We do that on their behalf. Because everything is automated, we can service our systems and our platforms more efficiently than anybody else can. Other service companies would have to staff up on somebody else's behalf. But that wouldn't be practical. To the extent that there is a shortage of semiconductors, there is also a shortage of AI experts. So if I were to hire just as many as my customer had to hire, I couldn't scale the business up. But because I can do it automatically and much more efficiently, they don't have to hire all those people, and neither do I.
What's the next milestone you are looking towards? What are you working on?
Liang: Well, we've raised over $1 billion in venture capital at a $5 billion valuation, but the company's fairly young. We're just approaching a four-year anniversary, and so we've got a lot of aspirations for ourselves as far as being able to help a much broader set of customers. Like I said, if you really see how many companies are truly putting AI in production, it's still a very small percentage. So we're very focused on getting customers into production with AI and getting our solutions out there for people. You're going to see us talk a lot about large data and large models. If you've got hairy problems with too much data and the models you need are too big, that's our wheelhouse. We're not doing little ones. Our place is when you have big, big enterprise models with tons of data; let us crunch on that for you. We're going to deploy larger and larger models, larger and larger solutions for people.
Tell me about a result that kind of took your breath away. What is one of the coolest things that you've seen your system do?
Liang: One of our partners, Argonne National Labs, they're doing this project mapping the universe. Can you imagine this? They're mapping the universe.
They've been doing a lot of work trying to map the universe [training an AI with] really high-resolution images they've taken over many, many years. Well, as you know, artifacts in the atmosphere can really cause a lot of problems. The accuracy is actually not very good. You have to downsample these images and stitch them together, and then you've got all the atmospheric noise.
There are scientists who are much smarter than I am to figure all that stuff out. But we came in, shipped the systems, plugged them in, and within 45 minutes, they were up and training. They mapped the whole thing without changing the image size and got a higher level of accuracy than what they had gotten for years before and in much, much less time.
We're really proud of that. It's the type of thing that you're confident that your technology can do, and then you see amazing customers do something you didn't expect and get this tremendous result.
Like I said, we're built for large. In e-commerce with all the uses and all of the products they've got, give me the entire data set; don't chop it up. Today, they have to chop it, because infrastructure doesn't allow it. In banking, all of the risks that you have across all your entities, well, let me see all the data. With all these different use cases, more data produces better results. We're convinced that if you have more data, it actually produces better results, and that's what we're built for.
Samuel K. Moore is the senior editor at IEEE Spectrum in charge of semiconductors coverage. An IEEE member, he has a bachelor's degree in biomedical engineering from Brown University and a master's degree in journalism from New York University.