System Ingests AT&T Network Logs to Reveal Root Cause of Errors

Behind the easy connectivity that much of the world enjoys, commercial networks are hard at work establishing connections, authenticating users, and verifying services. When an error occurs, it can be hard for providers to pinpoint the root cause because an error message may be generated in a different spot within a network than the place where the actual error happened.

To home in on the source of such errors, researchers have analyzed error logs related to millions of messages exchanged through AT&T’s network. The group’s aim was to learn about latent events in particular. Latency errors may cause delays in call propagation and transmission, disconnection issues, and network bottlenecks. Each error event can produce a sequence of messages whose type and frequency can vary based on the latency between the various network elements, network load, and other events.

“We have come up with a set of algorithms that can group the raw error data into events described by important keywords,” says Siddhartha Satpathi, a PhD candidate in electrical engineering at the University of Illinois at Urbana-Champaign. “We are not identifying the cause of the events, we are simply separating the messages into groups, where each group consists of messages generated by a single event. Additionally, we identify the key messages which are associated with each event.” A network operator can then use these groupings to identify the root cause.

In a real network, Satpathi explains, errors that come from different geographical locations could be related to one another, and sometimes one physical error leads to thousands of error messages. He uses the example of Alice from Illinois, who is visiting California and making a phone call to Bob in New York. Before connecting the call, the base station close to Alice in California needs to verify her credentials, which are held at her home station in Illinois.

Once that’s done, the call is routed through the network from California to New York. If a router breaks down somewhere along that network, it would result in error reports from all the connected networks and locations (California, New York, and Illinois). This group of error messages in the error log is what the researchers called an “event.”

That’s where the new algorithm comes in. The size of the error logs makes it impossible for a human engineer to go through the messages and figure out which ones were caused by the same event.

“Our algorithm groups these messages into [a] few important events,” says Satpathi. “It also outputs some frequently occurring messages in these discovered events. This grouping of messages make[s] the message log human interpretable, and can help an engineer decipher the root cause of the error.” The group recently published its work on network message logs in the journal IEEE/ACM Transactions on Networking.

The dataset that Satpathi’s team considered comprised 97 million messages, of 39,330 types, sent over 15 days. These included syslog texts and alarms. Syslog texts are raw-text messages sent to a logging server by software associated with specific network elements (say, a server, relay, or base station); each includes a timestamp and message text describing the error. Alarms indicate specific fault conditions in a network element. The researchers then applied to this data a two-stage algorithm called Change-point Detection–Latent Dirichlet Allocation (CD-LDA), which uses the existing LDA algorithm as a subroutine.
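To give a feel for what a change-point stage might do, here is a minimal Python sketch. It is a toy under stated assumptions, not the researchers' published method: the log entries, the message-type names, and the helper functions are all hypothetical, and the scoring rule (total variation distance between the mix of message types before and after a candidate split) is one simple choice among many. The idea is only that a sharp shift in which message types appear suggests the boundary between two events.

```python
from collections import Counter


def type_distribution(messages):
    """Empirical distribution of message types in a list of (timestamp, type) pairs."""
    counts = Counter(msg_type for _, msg_type in messages)
    total = sum(counts.values())
    return {msg_type: count / total for msg_type, count in counts.items()}


def total_variation(p, q):
    """Total variation distance between two discrete distributions given as dicts."""
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)


def best_change_point(messages, min_size=2):
    """Return (index, score) for the split that best separates the message-type mix.

    Toy single change-point search: try every split that leaves at least
    min_size messages on each side, and keep the split with the largest
    total variation distance between the two sides.
    """
    best_index, best_score = None, 0.0
    for i in range(min_size, len(messages) - min_size + 1):
        score = total_variation(
            type_distribution(messages[:i]),
            type_distribution(messages[i:]),
        )
        if score > best_score:
            best_index, best_score = i, score
    return best_index, best_score


# Hypothetical time-ordered log: (timestamp, message type). Invented for illustration.
log = [
    (0, "AUTH_TIMEOUT"), (1, "AUTH_TIMEOUT"), (2, "AUTH_RETRY"), (3, "AUTH_TIMEOUT"),
    (4, "LINK_DOWN"), (5, "ROUTE_FLAP"), (6, "LINK_DOWN"), (7, "ROUTE_FLAP"),
]

index, score = best_change_point(log)
episodes = [log[:index], log[index:]]  # two candidate episodes for a later grouping stage
```

In this invented log, the split lands between the authentication messages and the link messages. In a two-stage design like the one described, each resulting episode could then be treated as a "document" of message types and handed to an off-the-shelf LDA implementation, which would group episodes into events and surface the message types most associated with each one.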

The six hours that it took to run LDA on this dataset could be reduced, Satpathi says, by using faster versions of the LDA algorithm. This makes the study “very scalable,” he adds, for detecting errors on a commercial network.
