LIPA report details PSEG LI's failures in Isaias storm response

Mark Harrington for Newsday

November 17, 2020

Originally published in Newsday on November 17, 2020.

For weeks before Tropical Storm Isaias landed on Aug. 4, PSEG Long Island officials knew of significant computer-system problems that ultimately crippled their response to the storm, according to a scathing new LIPA report obtained by Newsday.

But a combination of "mismanagement" and a lack of autonomy from their New Jersey parent stifled their ability to fix them, even to this day, according to the report, which finds the problems so urgent that it recommends LIPA consider terminating PSEG’s contract if it can’t be reformed.

The report, by a LIPA task force investigating PSEG’s response to the storm that caused more than 646,000 outages, puts the blame for the widely criticized response on "mismanagement" by PSEG, charging the company "did not adequately prepare" for the storm, didn’t properly stress-test computer and telecom systems, and didn’t have manual backup plans in place.

The report also provides new indications that the region is not free of the problems that left more than 500,000 customers without power and in the dark about restoration times over a sweltering summer week.

According to the report, "many" of the root causes of PSEG’s failed response remain uncorrected. Recent tests of PSEG’s plan to fix the high-volume telephone system that failed during the storm show it "continue[s] to fail under only moderate testing loads." And the outage management computer system also "continues to fail when subjected to appropriate stress testing," LIPA’s report says, citing "deficiencies" in PSEG management of projects and outside vendors.

In an interview, LIPA chief Tom Falcone blamed PSEG’s "absentee management in New Jersey" and a "lack of accountability to Long Island" as core problems.

Among the emails scoured by investigators on the task force was one from the PSEG Long Island operations supervisor, who found the outage management system, after a recent upgrade, was "NOT even managing on a day to day basis, and we are definitely NOT prepared for [a] weather event."

The email went up the chain of command, ultimately landing with PSEG Long Island chief operating officer Dan Eichhorn. Because of the management protocols of PSEG’s New Jersey parent, Eichhorn was unable to undo the recent upgrade of the system, the report said.

"Unfortunately, PSEG Long Island’s most senior officer could not simply act on his own judgment and order the rollback, despite this high-risk situation with a [computer] system that was ‘not prepared for [a] weather event,’" Falcone wrote in an introduction to the report.

PSEG Long Island operates the LIPA-owned Long Island service territory under a long-term contract that has already paid the New Jersey company $467 million over seven years.

The state Department of Public Service is also investigating PSEG’s storm response, and in a letter Monday to the LIPA board it identified more than 70 potential violations of PSEG’s emergency response plan related to the storm.

Chauvin, of PSEG, said the company was "committed to continuing to work with LIPA and with the New York State Department of Public Service to provide information and improve our future performance for the people of Long Island."

The LIPA task force report details a management system at PSEG that leaves many key Long Island functions, including information technology, reporting back to New Jersey, where PSEG is based. That system can offer cost reductions and give Long Island access to higher levels of expertise.

But in the case of Isaias, it left a broken computer system largely out of order weeks before the storm threatened the region. By the time Isaias was preparing to wallop Long Island, there was a bottleneck of more than two hours of customer calls and texts reporting outages, LIPA’s report found. That bottleneck would only compound as the afternoon wore on, with hundreds of thousands of customer outage calls and texts, leading the entire system to fail by the afternoon of Aug. 4.

Falcone, in his remarks, noted that LIPA "relied too much" on PSEG’s claims that it was "meeting contractual obligations," including for stress-testing the computer system, rather than verifying the claims itself. "It is now clear that whatever stress testing PSEG Long Island performed, it did not test end-to-end functionality" under "realistic severe storm conditions." LIPA said it will now independently verify and validate those stress tests.

LIPA also found that PSEG was "not transparent" about computer system problems, including those detailed in the email exchange, which LIPA didn’t learn about until October, Falcone said. LIPA also missed signals of the "declining quality" of system problems, he acknowledged, blaming the management structure between PSEG’s Long Island and New Jersey operations.

State Sen. Todd Kaminsky (D-Long Beach) said he was shocked to find that most of the computer and telecom issues remain and said he was worried PSEG isn't prepared to respond. "The report makes clear that we’re not better off than we were the day after Isaias, which is quite concerning," Kaminsky said.

Falcone said the computer system "hits a certain wall of performance" when outage calls accumulate past 100,000 to 200,000 "and then it just stops."

LIPA trustee Matthew Cordaro said he was pleased to see LIPA’s task force "take a hard stance" in dealing with PSEG’s deficiencies. But he said LIPA should have cracked down sooner. "I’ve seen this coming the last several years," he said. "PSEG would make very bland reports to the board that really didn’t mean much and skirted all the issues that relate to their storm response."

Falcone said LIPA has options if it ultimately resorts to terminating PSEG’s contract, including finding a new utility contractor, hiring multiple contractors to handle specific utility functions that are now centralized by PSEG, or even turning the Long Island utility into a fully public municipal utility.

The report found that more than a million calls and texts were lost or left unanswered in the aftermath of the storm.

It also found that PSEG didn’t come clean about prestorm problems with the outage management system, even after it crashed during Isaias "and LIPA launched its investigation."

PSEG was "not transparent about what it knew until questioned about the reports the system was failing before the storm," the report says, calling PSEG’s own analysis of its Isaias response "at best incomplete" while "resolutely ignoring management deficiencies while attempting to shift the blame to vendors."

"The root cause here was bad management," Falcone said. "We need a cure for bad management."

LIPA's findings

Mismanagement was root cause. PSEG Long Island did not adequately prepare for weather events before Isaias. They did not prepare IT systems for stresses and surge, nor did they have business continuity plans in place in the event an IT system failed. Crashed IT systems with no manual backup plan caused customer power outages to last longer than necessary.

Voice communications failed outright. Communications failed due to faulty systems architecture, inadequate capacity and inherent system errors that were undiscovered due to lack of testing. More than a million calls and texts were lost or unanswered. Data from customers on the extent of outages was lost.

Outage management system was failing before Isaias hit. PSEG Long Island unwisely implemented a new software version of the OMS in June at the beginning of the 2020 Atlantic hurricane season and did not adequately test the new system. PSEG Long Island IT staff already knew in July, before Isaias, that the OMS was not working during “blue-sky” conditions and did not fix it or revert to a prior version before the storm.

Interconnected systems dragged each other down. There was no way to isolate those systems from each other. The Long Island customer communication and storm recovery systems need new architecture.

Faulty estimated times of restoration misled the public. The failure caused PSEG Long Island to issue overly optimistic estimated times of restoration, causing confusion, hardship and a lack of trust, and the problem persisted.

Without the outage management system, PSEG LI inefficiently managed the recovery. There was inefficient management of field resources during Isaias, increasing downtime. Bad data drove inefficient field decisions. PSEG Long Island was unable to effectively switch to decentralized management of field resources.

Drills, training are inadequate. PSEG LI does not have a well-worked-out Emergency Response Plan.

90 days later, many system defects remain uncorrected. PSEG’s lack of strong internal IT technical and management competency has resulted in several false starts and overreliance on vendor solutions.

PSEG lacks transparency. Before Isaias, PSEG IT managers and PSEG Long Island management knew that the OMS was failing but took inadequate corrective action. They did not inform LIPA of this high-risk situation. Even after the OMS crashed during Isaias and LIPA launched its investigation, PSEG was not transparent about what it knew until questioned about reports the system was failing before the storm.