Subspace Rebuilt the Internet for Real-Time Applications
The company uses "Internet weather" mapping, automatic rerouting, and dedicated fiber to speed up traffic
The Internet was designed to move data worldwide, and to do so in spite of disruptions from natural disasters, nuclear attacks, or other catastrophes. At first the goal was just to increase the volume of data moving over networks. But with the rising importance of real-time applications like videoconferencing and online gaming, what now matters most is reducing latency—the time it takes to move data across the network.
By forcing vast numbers of people to work and socialize remotely, the ongoing COVID-19 pandemic has greatly increased the demand for time-sensitive applications. The challenges begin at one end of the network, where the data's sender is located, and continue along the route all the way to the user waiting to receive the data at the other end. When you pass data in real time along multiple routes among a number of separate points, delays and disruptions often ensue. This explains the dropped calls and interruptions in conference calls.
One way to minimize such delays is by cutting a path through the Internet, one that takes into account the traffic conditions up ahead. My company, Subspace, has built such a network using custom hardware and a proprietary fiber-optic backbone. And we've shown that using it doesn't have to be complicated: Users need do nothing more than log on to a Web portal. Put together, these pieces give Subspace a "weather map" for the Internet that can spot choppy or stormy parts of the network and work around them for better, faster real-time data movement.
The online transformation occasioned by the current pandemic can be seen in a single statistic. In December 2019 the videoconferencing company Zoom had 10 million daily participants, and by April of the following year it had 300 million. Most of those new recruits to the real-time Internet were taken by surprise by problems that have been plaguing online gamers for decades.
Subspace was founded in early 2018. When we started, we anticipated that Internet performance for real-time applications wasn't optimal, but it turned out to be far worse than we had imagined. More than 20 percent of Internet-connected devices experienced performance issues at any given time, and 80 percent had major disruptions several times a day.
We initially focused on multiplayer games, where a player's experience depends on real-time network performance and every millisecond counts. In the second half of 2019, we deployed our network and technology for one of the largest game developers in the world, resulting in an order-of-magnitude increase in engagement and doubling the number of players with a competitive connection.
Internet performance directly affects online gaming in two ways: First, you must download the game, a one-time request for a large amount of data, which is something that today's Internet supports well. Second, playing the game requires small transfers of data to synchronize a player's actions with the larger state of the game, which is something the Internet does not support nearly as well.
Gamers' problems have to do with latency, variations in latency called jitter, and disruptions in receiving data called packet loss. For instance, high-latency connections limit the speed of "matchmaking," or the process of connecting players to one another, by restricting the pool of players who can join quickly. Slower matchmaking in turn can cause frustrated players to quit before a game starts, leaving a still smaller matchmaking pool, which further limits options for the remaining players and creates a vicious cycle.
In 2020, when COVID-19 pushed the world to videoconferencing and distance learning, these performance issues suddenly began to affect many more people. For example, people who worked on IT help desks began working remotely, and managers had to scramble to find ways for those workers to answer calls in a clear and reliable way. That's far harder to do from a person's home than from a central office that's on a robust fiber-optic cable line. On top of that, call volume at contact centers is also at an all-time high. Zendesk, a customer-service software provider, found that support tickets increased by 30 percent during the period of February 2020 to February 2021, compared with the previous year. The company also estimates that call volume will stabilize at about 20 percent higher than the prepandemic average.
The shifts in online usage created by the pandemic are also strengthening the case to further democratize the Internet, the idea that there must be a universal, consistent standard of use for everyone, regardless of who or where they are. This is not an unqualified good, because email has very different requirements from those of an online game or a videoconference.
In the 1990s, Internet access was expanded from the world of the military and certain educational organizations to a truly universal system. Then, content delivery networks (CDNs) like Akamai and Cloudflare democratized data caching by putting commonly requested data, such as images and videos, into data centers and servers closer to the "last mile" to the ultimate users. Finally, Amazon, Microsoft, and others built cloud-computing data centers that put artificial intelligence, video editing, and other computationally intensive projects closer to last-mile users.
But there's still one final stage of democratization that hasn't happened—the democratization of the paths through which data is routed. The Internet connects hundreds of millions of nodes, but the actual performance of the paths connecting these nodes varies wildly, even in major cities. Connections between nodes are designed around delivering as much data as possible, rather than delivering data consistently or with minimal delay.
To use the analogy of a highway: Imagine you're in the middle of a road trip from Los Angeles to Chicago, and a prolonged blizzard is raging in the Rocky Mountains. While driving through Denver would typically be the most direct (and quickest) route, the blizzard will slow you down at best, or at worst result in an accident. Instead, it might make more sense to detour through Dallas. In doing so, you would be responding to the actual current conditions of the route, rather than relying on what its capabilities should be.
Democratized network elements wouldn't necessarily choose the best route based on the lowest cost or highest capacity. Instead, as Google Maps, Waze, and other navigation and route-planning apps do for drivers, a fully democratized Internet would route data along the pathway with the best performance and stability. In other words, the route with the most throughput or the fewest hops would not be automatically prioritized.
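The difference between picking a route by hop count and picking it by measured performance can be sketched in a few lines of Python. This is an illustration only, not Subspace's routing software: the `best_path` function is a textbook Dijkstra search, and the city names and latency figures are hypothetical values chosen to echo the road-trip analogy.

```python
import heapq

def best_path(graph, src, dst, weight):
    """Dijkstra's shortest-path search.

    graph:  {node: {neighbor: measured_latency_ms}} (hypothetical measurements)
    weight: function turning a link's latency into the cost of using that link
    Returns (total_cost, [nodes on the path]) or None if dst is unreachable.
    """
    queue = [(0, src, [src])]
    seen = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == dst:
            return cost, path
        if node in seen:
            continue
        seen.add(node)
        for nxt, latency_ms in graph.get(node, {}).items():
            if nxt not in seen:
                heapq.heappush(queue, (cost + weight(latency_ms), nxt, path + [nxt]))
    return None

# Made-up link latencies in milliseconds; the Denver link is "snowed in."
graph = {
    "LA": {"Denver": 25, "Dallas": 30},
    "Denver": {"Chicago": 180},
    "Dallas": {"StLouis": 15},
    "StLouis": {"Chicago": 8},
}

# Every link costs 1: the route with the fewest hops wins.
by_hops = best_path(graph, "LA", "Chicago", lambda ms: 1)
# Every link costs its measured latency: the fastest route wins.
by_latency = best_path(graph, "LA", "Chicago", lambda ms: ms)

print(by_hops)     # (2, ['LA', 'Denver', 'Chicago'])
print(by_latency)  # (53, ['LA', 'Dallas', 'StLouis', 'Chicago'])
```

The hop-count search sends traffic through the congested Denver link at a total of 205 ms, while the latency-aware search accepts one extra hop and arrives in 53 ms.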
Many people got a crash course in remote work and videoconference meetings during the pandemic. If you're one such individual, you've almost certainly been stuck in at least one stuttering or lagging call. Video calls and other real-time applications demonstrate the ways in which the Internet's current infrastructure is ill-equipped to handle them.
Latency is simple to explain: The little bundles of data called packets take longer than expected to get from sender to receiver. Latency is unavoidable—signals take time to travel over any distance—but additional latency is undesirable. One common source of unwanted latency is a router along a signal's path that's clogged with too many data packets trying to get to different destinations at the same time.
Jitter is to latency as acceleration is to speed. Just as acceleration indicates a change in speed, jitter is a change in latency from one transmission to the next in a sequence of signal transmissions. When data packets take varying amounts of time to get from sender to receiver in a video call, it can cause the other person's video to begin a cycle of stuttering, freezing, and suddenly speeding up momentarily.
Sometimes, data packets can just vanish. They may be routed to the wrong destination, or, in a wireless transmission, they may be blocked by an unexpected obstruction. Senders try to anticipate a certain amount of packet loss by sending redundant data packets, but if the rate of loss is too great, a video can freeze, or worse, cut out entirely.
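The three quantities described above can be computed from a simple list of per-packet measurements. The sketch below is a minimal illustration under stated assumptions: the `summarize` function and its input format are hypothetical, and jitter is taken here as the mean absolute difference between consecutive delivered latencies, which is only one of several conventions (real-time protocols such as RTP use a smoothed estimator instead).

```python
from statistics import mean

def summarize(samples):
    """Summarize a sequence of packet measurements.

    samples: list of one-way latencies in milliseconds, with None standing
             in for a packet that never arrived (an assumed input format).
    Returns (average latency in ms, jitter in ms, fraction of packets lost).
    """
    if not samples:
        raise ValueError("need at least one sample")
    delivered = [s for s in samples if s is not None]
    loss_rate = (len(samples) - len(delivered)) / len(samples)
    if not delivered:
        return None, None, loss_rate
    avg_latency = mean(delivered)
    # Jitter: how much latency changes from one delivered packet to the next.
    changes = [abs(b - a) for a, b in zip(delivered, delivered[1:])]
    jitter = mean(changes) if changes else 0.0
    return avg_latency, jitter, loss_rate

# Five packets sent, one lost in transit.
avg_latency, jitter, loss_rate = summarize([20, 22, None, 20, 26])
print(avg_latency, round(jitter, 2), loss_rate)
```

Two connections with the same average latency can feel very different in a call: the one with the lower jitter and loss figures is the one that will stutter less.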
The traditional emphasis on pushing more data through the network ignores all the things that cause latency, such as instability, geographic distance, or circuitous paths. This is why you can have a Wi-Fi connection of 100 megabits per second and still have a choppy Zoom call. When that happens, the network elements connecting you to the others in your call aren't delivering consistent performance.
Internet routing often takes circuitous paths—following national borders, mountain ranges, and more—just as driving cross-country often requires several highways. Even worse, ISP and carrier networks don't know what exists beyond themselves, and as they pass packets to one another, they often backtrack. The last mile in particular—akin to pulling off the interstate and onto local roads—is thorny, as traffic changes hands between carriers based on cost, politics, and ownership. It's this indirect routing, networks' lack of awareness of the entire Internet, and last-mile inconsistency that make delivering data with minimal delay extremely difficult.
A better solution is to reroute data to the path with the best performance at the moment. This may sound simple enough in theory, but it can be complicated to implement for a few reasons.
For one, the emergence of Netflix and other video-streaming platforms over the past 20 years has tended to impede real-time applications. Because such platforms prioritize putting often-requested data closer to network edges, these networks have become less conducive to latency-sensitive video calls and online games. At the same time, while ISPs have advertised—and provided—faster upload and download speeds over time, established network infrastructures have only become more entrenched. It's a perfect case of the adage "If all you have is a hammer, everything looks like a nail."
A more significant problem is that ISPs and CDNs have no practical control over data after it's been routed through their networks. Just because you pay a particular ISP for service doesn't mean that every request you make stays confined to the parts of the network they control. In fact, more often than not, requests don't.
One operator might route data along an optimal path in its own network, and transfer the data to another operator's network, with no idea that the second operator's network is currently clogged. What operators need is an eye in the sky to coordinate around potential and emerging delays that they themselves might not be aware of. That's one aspect of what Subspace does.
In essence, Subspace has created its own real-time mapping of Internet traffic and conditions, similar to the way Waze maps traffic on roads and highways. And like Waze, which uses the information it gathers to reroute people based on the current traffic conditions, Subspace can do the same with Internet traffic, seeing beyond any one portion controlled by a particular operator.
Subspace uses custom global routers and routing systems, as well as dedicated fiber mesh networks, to provide alternative pathways for routes that, for one reason or another, tend to suffer from latency more than most. This hardware has been installed inside more than 100 data-center facilities worldwide. An IT administrator can easily arrange to route outgoing traffic through the Subspace network and thus get that traffic to its destination sooner than the traditional public domain name system (DNS) could manage.
Subspace uses custom software to direct the traffic around any roadblocks that may lie between it and its target destination. In real time, the software measures latency (in milliseconds), jitter (the variation in that latency), and packet loss (gauged by the number of data packets successfully delivered within a time interval) on all possible paths. Whenever there is an unusual or unexpected latency spike, which we like to call "Internet weather," the software automatically reroutes traffic across the entire network as needed.
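The general idea of spotting a latency spike and steering around it can be sketched as follows. This is not Subspace's software, whose internals the article does not describe: the `PathMonitor` class, its window size, and the rule that a reading more than twice the recent median counts as "stormy" are all assumptions made for illustration, and the sketch watches latency only, leaving out jitter and packet loss.

```python
from collections import deque
from statistics import median

class PathMonitor:
    """Tracks recent latency per path and flags sudden spikes (hypothetical)."""

    def __init__(self, window=20, spike_factor=2.0):
        self.window = window              # how many recent samples to keep
        self.spike_factor = spike_factor  # assumed threshold for a "storm"
        self.history = {}                 # path name -> recent latencies (ms)

    def record(self, path, latency_ms):
        self.history.setdefault(path, deque(maxlen=self.window)).append(latency_ms)

    def is_stormy(self, path):
        samples = self.history.get(path)
        if not samples or len(samples) < 2:
            return False  # not enough data to call it a spike
        *past, latest = samples
        return latest > self.spike_factor * median(past)

    def choose(self, current):
        """Stay on the current path unless it is stormy and a calm one exists."""
        if not self.is_stormy(current):
            return current
        calm = [p for p in self.history if p != current and not self.is_stormy(p)]
        if not calm:
            return current
        return min(calm, key=lambda p: self.history[p][-1])

monitor = PathMonitor()
for ms in (20, 21, 19, 20):
    monitor.record("path-A", ms)
for ms in (35, 36, 34):
    monitor.record("path-B", ms)

before = monitor.choose("path-A")  # path-A is calm, so traffic stays put
monitor.record("path-A", 95)       # sudden spike on path-A
after = monitor.choose("path-A")   # traffic moves to the calmer path-B
print(before, after)
```

Comparing each new reading against the path's own recent median, rather than against a fixed number, lets a normally slow but steady path stay in use while a normally fast path that suddenly degrades is avoided.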
Enterprises have tried to avoid bad Internet weather by building private networks using technologies such as SD-WAN (software-defined wide area networking) and MPLS (multiprotocol label switching). However, these methods work only when an entire workforce is reporting to a handful of centralized offices. If large numbers of employees are working from home, each home has to be treated as a branch office, making the logistics too complex and costly.
Besides random bad weather, there are some traffic problems on the public Internet that arise as side effects of certain security measures. Take the act of vandalism known as a distributed denial-of-service (DDoS) attack, in which malicious actors flood servers with packets in order to overload the systems. It's a common scourge of multiplayer games. To thwart such attacks, the industry standard "DDoS scrubbing" technique attempts to separate malicious traffic from "safe" traffic. However, getting traffic to a scrubbing center often means routing it through hairpin twists and turns, detours that can add upwards of 100 milliseconds in latency.
Subspace instead protects against DDoS attacks by acting as a traffic filter itself, without changing the path that packets take or in any way adding latency. In the last two years, we estimate that Subspace has already prevented hundreds of DDoS attacks on multiplayer games.
The tricks that helped the Internet grow in its early decades are no longer delivering the expected bang for their buck, as people now demand more from networks than just bandwidth. Just pushing large volumes of data through the network can no longer sustain innovation.
The Internet instead needs stable, direct, speed-of-light communication, delivered by a dedicated network. Until now, we've been limited to working with large companies to address the particular network needs they might have. However, we've recently made our network available to any application developer in an effort to give any Internet application more network performance.
With this new, improved Internet, people won't suffer through choppy Zoom calls. Surgeons performing telemedicine won't be cut off in mid-suture. And the metaverse, which merges physical, augmented, and virtual realities, will at last become possible.
This article appears in the November 2021 print issue as "The Internet's Coming Sunny Days."