How to Turn the Lights Back on After a Blackout

Restarting the grid after a total failure is trickier than it may appear

​A grid operator works in a control room.
Jacob Hannah/The New York Times/Redux

Restoring power quickly after a major blackout can mean the difference between life and death, but cold starting an entire electrical grid is a complex and delicate process. A hybrid computer model from Sandia National Laboratories that combines optimization, physical simulations, and cognitive models of grid operators promises to come up with a fast and reliable plan to get the lights back on.

While power outages are always disruptive, they typically impact only smaller portions of the overall grid. A complete loss of power over the entire network is much more serious, and requires operators to effectively jump-start the grid with so-called “black start” generators. This involves a complicated balancing act to avoid mismatches between energy generation and consumption, as different sections of the grid are gradually brought back online. Get it wrong and the grid can collapse again.

Such events are thankfully rare, says Kevin Stamber, who led the project at Sandia, but there have been some close calls in recent years. When Texas’s power system experienced major disruptions during winter storms in February 2021, operators were just minutes away from a complete grid failure that could have taken months to resolve, he says. With climate change increasing the frequency of extreme weather events, and the threat of cyberattacks on critical power infrastructure growing, the danger is only likely to increase.

That prompted Stamber’s team to develop a new method for creating black start plans that can better cope with the often unpredictable behavior of real-world power systems. That’s no easy feat, though. “It’s a very, very delicate process on a very large system,” Stamber says. “They [black starts] are complex, challenging, difficult to solve and require a substantial computational commitment to be able to get to a solution.”

The gold-standard approach treats a black start as an optimization problem, aimed at working out the best order in which to restore different grid components, such as generators, substations, and power lines. Existing techniques, however, tend to assume that the operator has full visibility of and control over the grid, which often isn’t the case.

What happens when the power grid fails?

In a blackout, key components may be damaged, says Stamber, and operators may not have a full picture of what is available to them. And while utilities are likely to have a rough idea what the load on different parts of the grid should be, there are no guarantees. “You wind up having to basically feel around in the dark to make sure, ‘Does reality match up with what all of my data tells me?’ ” says Stamber.

To deal with that uncertainty, the team paired a cutting-edge optimization approach created by researchers at Lawrence Livermore National Laboratory and the University of California, Berkeley, with additional modules designed to simulate how the grid could react to the restoration plan and how the operators implementing it would behave.

The optimization model’s goal is to restore power as quickly as possible, while ensuring that the load on the grid is stable and doesn’t cause another failure. It produces a restoration schedule outlining the order in which to power up different generators and when to connect different portions of the grid. The schedule is also checked against a physical model of power flow to make sure each step is feasible.
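To give a rough feel for what a restoration schedule looks like, here is a minimal Python sketch. It is not the Sandia or Lawrence Livermore model: it replaces the real optimization with a simple greedy heuristic, and every name and number in it (`Generator`, `Load`, `greedy_restoration`, `reserve_fraction`, the megawatt figures) is a hypothetical stand-in chosen for illustration. The only idea it borrows from the article is that generators and loads come back in an order that keeps generation ahead of consumption.

```python
from dataclasses import dataclass


@dataclass
class Generator:
    name: str
    capacity_mw: float   # output available once the unit is running
    cranking_mw: float   # power needed to start it (0 for a black start unit)


@dataclass
class Load:
    name: str
    demand_mw: float


def greedy_restoration(generators, loads, reserve_fraction=0.1):
    """Toy heuristic: return (schedule, unstarted generators, unserved loads).

    Each pass first starts every generator whose cranking power fits in the
    current headroom, then connects loads (smallest first) as long as served
    demand stays below capacity minus a reserve margin.
    """
    online_capacity = 0.0
    served = 0.0
    schedule = []
    pending_gens = list(generators)
    pending_loads = sorted(loads, key=lambda l: l.demand_mw)

    progress = True
    while progress:
        progress = False
        for gen in sorted(pending_gens, key=lambda g: g.cranking_mw):
            headroom = online_capacity - served
            if gen.cranking_mw <= headroom:
                online_capacity += gen.capacity_mw
                pending_gens.remove(gen)
                schedule.append(("start", gen.name))
                progress = True
        for load in list(pending_loads):
            limit = online_capacity * (1 - reserve_fraction)
            if served + load.demand_mw <= limit:
                served += load.demand_mw
                pending_loads.remove(load)
                schedule.append(("connect", load.name))
                progress = True
    return schedule, pending_gens, pending_loads


# Hypothetical three-generator, three-load system.
gens = [
    Generator("coal", 500, 40),
    Generator("hydro", 50, 0),   # black start unit: needs no outside power
    Generator("gas", 200, 10),
]
loads = [Load("city", 400), Load("hospital", 30), Load("town", 150)]

schedule, unstarted, unserved = greedy_restoration(gens, loads)
for action, name in schedule:
    print(action, name)
```

A real planner would also have to respect line limits, voltage, and generator ramp rates, which is where the power flow check described above comes in; the greedy ordering here only tracks a single megawatt balance.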

This restoration plan is then fed to a cognitive model of a grid operator built using the ACT-R framework, which makes it possible to simulate human decision making. The model was built by encoding expert knowledge about how to carry out key steps involved in grid restoration, and is able to read the restoration plan and use a simulated console to implement it.

However, the console is also hooked up to a dynamic physics-based simulation of the grid, which is designed to mimic how the network responds to the operators’ actions, sometimes in hard-to-predict or challenging ways. The cognitive model is presented with information on the grid’s response through the simulated console, and if steps from the restoration plan cause any stability issues, it can take corrective action before moving on to the next step.
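The execute, observe, correct loop can be illustrated with a very small Python sketch. This is not ACT-R and not a dynamic grid simulation: the "grid" is just a running total of demand, the "operator" applies one hard-coded rule, and the names `execute_plan`, `actual_demand_mw`, and `capacity_mw` are assumptions made up for this example. The scenario assumes that one load turns out to draw more than the plan expected.

```python
def execute_plan(plan, actual_demand_mw, capacity_mw):
    """Step through a list of load names, correcting when reality disagrees.

    plan             -- load names in the order the plan says to connect them
    actual_demand_mw -- what each load really draws once connected
    capacity_mw      -- generation currently available

    Returns the loads left connected and a log of (name, outcome) pairs.
    """
    served = {}
    log = []
    for name in plan:
        # Carry out the planned step.
        served[name] = actual_demand_mw[name]
        # Observe the grid's response, as the operator would on a console.
        total = sum(served.values())
        if total > capacity_mw:
            # Corrective action: drop the load that was just picked up
            # before moving on to the next step.
            del served[name]
            log.append((name, "shed"))
        else:
            log.append((name, "ok"))
    return served, log


# Hypothetical case: the factory draws more than the 200 MW system can spare.
plan = ["hospital", "town", "factory"]
actual = {"hospital": 30, "town": 150, "factory": 40}

served, log = execute_plan(plan, actual, capacity_mw=200)
print(log)
```

In the approach the article describes, the corrective step comes from encoded expert knowledge rather than a single fixed rule, and the grid's response comes from a physics-based simulation rather than a simple sum.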

By simulating how an operator might deploy a restoration plan and react to the grid’s behavior, Stamber hopes to create plans much more tolerant of unexpected behavior. He envisages something along the lines of a Choose Your Own Adventure book, but for grid operators. “There are certain points along the way where things don’t go the way you’re expecting, and you wind up in a different portion of the book,” he says.
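One way to picture such a branching plan is as a lookup table in which each step names the step to take next, depending on what the operator observes. The Python sketch below is purely illustrative: the step names, the two outcomes `"ok"` and `"problem"`, and the `follow_plan` helper are all hypothetical and do not come from the Sandia project.

```python
# Each step maps an observed outcome to the next step (None ends the plan).
PLAN = {
    "start_hydro":      {"ok": "energize_line_a", "problem": "start_diesel"},
    "start_diesel":     {"ok": "energize_line_a", "problem": None},
    "energize_line_a":  {"ok": "pick_up_hospital", "problem": "energize_line_b"},
    "energize_line_b":  {"ok": "pick_up_hospital", "problem": None},
    "pick_up_hospital": {"ok": None, "problem": None},
}


def follow_plan(plan, start, observe):
    """Walk the plan from `start`, branching on what `observe(step)` reports.

    `observe` must return "ok" or "problem" for each step it is given.
    Returns the list of steps actually taken.
    """
    path = []
    step = start
    while step is not None:
        path.append(step)
        step = plan[step][observe(step)]
    return path


# Hypothetical scenario: line A turns out to be damaged.
damaged = {"energize_line_a"}
path = follow_plan(
    PLAN,
    "start_hydro",
    lambda step: "problem" if step in damaged else "ok",
)
print(path)
```

Here the damaged line sends the operator to a different "page" of the plan, the alternate route over line B, before the hospital load is picked up.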

The idea of incorporating the cognitive behavior of the operator is an interesting one, says Saifur Rahman, a professor of electrical engineering at Virginia Tech and 2023 IEEE President and CEO. But he points out that in a real-world system control center there are multiple operators with different perspectives interacting with each other. Also, so far, the team has tested the approach only on the IEEE Reliability Test System (RTS-96), which is small compared to a real-life power system, Rahman notes. “In order to be credible in a real situation I would have liked to see it tested on a 20,000- or 30,000-node system,” he says.

Part of the reason the team didn’t, says Stamber, is that they simply ran out of time and budget for the project. But he also thinks their approach does need some work to make it more tractable on larger systems, perhaps by breaking the optimization problem up into smaller subproblems that are less computationally intensive. Either way, the team is now looking for potential utility partners that can work with them on applying their techniques to more realistic problems.

The Conversation (3)
Mauno Aho 17 Feb, 2023

Fingrid (Finland) ran a trial in Lapland in September 2014. Power was shut down in the Rovaniemi town area, and the aim was to boot up the grid using small hydropower stations to supply larger ones with seed power. After a 45-minute outage the test had to be suspended and the isolated part of the grid reconnected to the national grid. Still, there were some local power outages hours afterward. Had everything gone according to plan, this local blackout would have lasted some 15 minutes, and then areas would have been gradually connected to the isolated grid.

David Morton 16 Feb, 2023

During the Great Northeast blackout of November 1965, which affected most of the northeastern US states and part of Canada, engineers discovered that they needed electricity to start the machines that generated electricity. "Without any power, the boilers had to be manually lit. One engineer, Andrew Corry, recalled that 'in Boston at the time of the blackout, we lost everything. We used wood --wood!-- to start a [steam boiler] furnace. That's how we started and got back. Every other utility, I'm sure, has similar stories.'" --from Power: A Survey History of Electric Power Technology Since 1945 (IEEE, 2000)

Anjan Saha 09 Feb, 2023

After a total blackout due to grid failure, the time taken to normalize grid power is quite long, even hours. Synchronizing the generating stations with the grid using synchrophasor relays and bringing them back to grid frequency (60 Hz/50 Hz) requires communication between grid operators, generating stations, and industrial and domestic power suppliers.

Loads are connected to the power grid in a phased manner after a blackout so that the grid frequency stabilizes and large generators do not oscillate and fall out of synchronism. A GPS clock is also required to check the timing of synchronization. The operator has to proceed part by part.