Facebook’s deep-learning artificial intelligence systems have learned to recognize your friends in your photos, and Google’s AI has learned to anticipate what you’ll be searching for. But there’s no need to feel left out, even if your company’s computers haven’t learned much lately.
A growing number of tech giants and startups have begun offering machine learning as a cloud service. That means other companies and startups do not need to develop their own specialized hardware or software to apply deep learning—the high-powered version du jour of machine learning—to their specific business needs.
“Deep-learning algorithms dominate other machine-learning methods when data sets are large,” says Zachary Chase Lipton, a deep-learning researcher in the Artificial Intelligence Group at the University of California, San Diego, who has examined cloud AI services from companies such as Amazon and IBM. “Thus any company or application that has well-formed prediction problems—such as forecasting demand or translating between languages—could benefit from deep learning.”
With cloud-based deep learning, companies can simply select a cloud service and browse its online offerings of application programming interfaces for software tasks such as recognizing images of corgi dogs or automatically translating a restaurant menu. Some services will even tailor their machine-learning tools to the data and needs of individual companies.
According to Lipton, the rise of cloud services for machine learning hinges on at least two factors: first, a continued rise in the demand for machine learning as the technology has matured in its ability to solve a wide variety of problems with economic value; and second, the relative scarcity of machine-learning talent, which makes it tough for every company to build its own machine-learning team. Competition for talent has become even tougher with startups trying to compete with tech giants like Microsoft and IBM, which can afford to vacuum up the best and brightest.
Most commercial applications of machine learning rely on supervised learning. This involves algorithms that can observe correctly labeled examples and learn to perform certain tasks through imitation. Artificial neural networks are currently the most popular and successful algorithms for supervised machine learning on large data sets. They learn by passing information through an interconnected network of nodes (also known as neurons). The connections between these nodes each have adjustable weights that influence the flow of information through the network. Nodes are generally arranged in layers, but historically it was feasible to train networks with only one hidden layer of neurons in addition to the input and output layers.
Deep learning takes these methods to the next level by filtering the data through multiple layers of neurons, Lipton explains. At each layer, the network can learn successively more abstract representations of relationships between data points. With enough layers and enough nodes, deep neural networks can perform a host of functions.
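The layered flow of information described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the method of any company or library named in this article: the layer sizes, the sigmoid squashing function, the random weights, and every function name here are hypothetical choices made for the example.

```python
import math
import random


def make_layer(n_inputs, n_nodes, rng):
    """Create one layer: each node holds one weight per input plus a bias."""
    return [
        {
            "weights": [rng.uniform(-1.0, 1.0) for _ in range(n_inputs)],
            "bias": rng.uniform(-1.0, 1.0),
        }
        for _ in range(n_nodes)
    ]


def forward_layer(layer, inputs):
    """Each node sums its weighted inputs, then squashes the sum with a sigmoid."""
    outputs = []
    for node in layer:
        total = node["bias"] + sum(w * x for w, x in zip(node["weights"], inputs))
        outputs.append(1.0 / (1.0 + math.exp(-total)))
    return outputs


def forward(network, inputs):
    """Pass the inputs through every layer in turn; each layer feeds the next."""
    activations = inputs
    for layer in network:
        activations = forward_layer(layer, activations)
    return activations


rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical shape: 4 inputs, two hidden layers of 8 nodes each, 3 outputs.
layer_sizes = [4, 8, 8, 3]
network = [
    make_layer(n_in, n_out, rng)
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
]

outputs = forward(network, [0.5, -1.2, 3.0, 0.0])
print(outputs)
```

Adding entries to `layer_sizes` makes the network deeper without changing any other code, which is the sense in which depth is simply more layers of the same building block. The weights here are random and untrained, so the outputs carry no meaning yet.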
The challenge in building a neural network is training it for specific tasks. Training starts from a random setting of the weights, and examples from the data set are then presented to the neural network one after another. Each time, the neural network’s weights are tuned slightly to bring the network’s output closer to the correct output.
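That training loop can be sketched with a single neuron. The article does not name a specific algorithm, so the update rule used here (per-example gradient descent on a sigmoid neuron) is an assumption, as are the toy data set (the logical OR function), the learning rate, the number of passes, and all function names.

```python
import math
import random


def predict(weights, bias, inputs):
    """Weighted sum of the inputs squashed to the range (0, 1) by a sigmoid."""
    total = bias + sum(w * x for w, x in zip(weights, inputs))
    return 1.0 / (1.0 + math.exp(-total))


def train(examples, epochs=2000, learning_rate=0.5, seed=0):
    """Tune one neuron's weights by presenting labeled examples one at a time."""
    rng = random.Random(seed)
    n_inputs = len(examples[0][0])

    # Start from a random setting of the weights.
    weights = [rng.uniform(-0.5, 0.5) for _ in range(n_inputs)]
    bias = rng.uniform(-0.5, 0.5)

    for _ in range(epochs):
        for inputs, target in examples:
            output = predict(weights, bias, inputs)
            # For a sigmoid output with cross-entropy loss, the gradient with
            # respect to the weighted sum is simply (output - target).
            error = output - target
            # Nudge each weight slightly in the direction that shrinks the error.
            weights = [
                w - learning_rate * error * x for w, x in zip(weights, inputs)
            ]
            bias -= learning_rate * error

    return weights, bias


# Hypothetical labeled data: the logical OR of two binary inputs.
examples = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 1)]

weights, bias = train(examples)
predictions = [round(predict(weights, bias, x)) for x, _ in examples]
print(predictions)
```

Real deep networks apply the same idea to many layers at once, propagating the error backward through the network, but the pattern of small, repeated corrections is the same.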
The Ivy League of Deep Learning
| Amazon | Amazon Machine Learning | DSSTNE (Deep Scalable Sparse Tensor Network Engine); library for building deep-learning models |
| None | Tools for deep-learning models released through the open-source Torch library |
| Google Cloud Machine Learning | TensorFlow; library for developing deep-learning models and more general machine-learning models |
| Dark Blue Labs, DeepMind, DNNresearch, Moodstocks, |
| Optimization platform for general machine-learning models on the open-source Apache Spark library |
| Toolkit; library for building deep-learning models |
A number of startups seem primarily interested in demonstrating their deep-learning research in order to draw the attention of larger companies that might acquire them, Lipton notes. Salesforce.com acquired MetaMind, for example, and Twitter acquired Whetlab. Some of these acquisitions were made to swell the ranks of the tech giants’ own deep-learning teams.
But some startups have focused their efforts on applying deep learning to very niche industry needs. For example, San Francisco–based Enlitic is using deep learning to help physicians spot signs of certain diseases or health conditions in medical images taken by X-ray or MRI machines. And Atomwise, in Mountain View, Calif., has applied deep learning to the discovery of new pharmaceuticals.
Other startups aim to build platforms for broader categories that apply to many industries. Seattle-based Dato has created a deep-learning tool kit used by developers at companies such as Cisco and PayPal. Clarifai, a startup in New York City, currently provides tools for automatically filtering and tagging images and video segments; its deep-learning technology is beneficial to end users as diverse as travel-photography sites, real estate agents, and online-comment moderators looking to filter out pornography.
“When you think of everything from your email inbox to the advertisements you see with search results to image tagging, every possible aspect of these products is already benefiting from machine learning or will very soon,” says Matthew Zeiler, Clarifai’s founder and CEO.
The challenge for deep-learning startups will be finding their niche in a crowded space. They would be wise to avoid directly challenging the tech giants in providing machine-learning services, says Naveen Rao, cofounder and CEO of Nervana Systems. His company has bet on building an optimized deep-learning platform that can provide results in record time when running neural networks on standard GPU hardware. Nervana has also been developing its own specialized chips that could give deep learning an additional performance boost.
Ersatz Labs, a startup in Pacifica, Calif., found out just how perilous it can be to take on the industry’s Goliaths. Ersatz suspended development of its cloud machine-learning ambitions last year after falling short on fundraising. Dave Sullivan, the company’s CEO, points to the difficulties of selling services or products that enable people to do deep learning when many of those people work for large tech companies that prefer building their own tools and hiring internally.
The truth of that statement was borne out at the 2016 Google I/O conference, where Google announced that it had, for more than a year, been using its own custom-built microchips, named Tensor Processing Units, to boost its machine-learning applications. Such specialized chips provide the hardware to run the Google Cloud Machine Learning platform, which launched in 2016.
“Everyone’s waiting for that killer app in deep learning right now, and nobody has figured it out yet,” Sullivan says. “It’s not platforms, though.”
This article appears in the August 2016 print issue as “For Sale: Deep Learning.”
Jeremy Hsu has been working as a science and technology journalist in New York City since 2008. He has written on subjects as diverse as supercomputing and wearable electronics for IEEE Spectrum. When he’s not trying to wrap his head around the latest quantum computing news for Spectrum, he also contributes to a variety of publications such as Scientific American, Discover, Popular Science, and others. He is a graduate of New York University’s Science, Health & Environmental Reporting Program.