How one IT update gone wrong has sparked chaos around the world
A global IT outage has caused problems at airports and many other organizations.
If the world needed a wake-up call on how fragile its IT systems are, it just got one: A small update gone wrong appears to be the culprit behind a global meltdown.
Airlines, banks, supermarkets, and healthcare providers from Japan and Australia to the UK, Switzerland, and Singapore saw computer services go down Thursday night, disrupting operations at the heart of the modern economy.
On Friday, some JPMorgan employees found they could not log on to the bank’s systems, Bloomberg reported. Malfunctions have hit Norway’s central bank, too.
In Japan, cash-register issues at McDonald’s branches have forced the closure of almost one-third of the fast-food chain’s stores there. Computers at the Australian retailer Woolworths, among many others, have experienced the “blue screen of death.”
Meanwhile, airports and airlines on multiple continents had to delay or ground flights, and healthcare systems in Britain were down.
The outages appeared to emerge after Microsoft reported problems with its online services, linked to an issue at the cybersecurity giant CrowdStrike.
CrowdStrike CEO George Kurtz said in an interview with CNBC that he wanted to “personally apologize to every organization, every group, and every person who’s been impacted by this.”
He said on X the issue was caused by a “defect found in a single content update for Windows,” with a fix now deployed.
“There’s a single file that drives some additional logic on how we look for bad actors,” Kurtz told CNBC. “This logic was pushed out and caused an issue only in the Microsoft environment, specific to this bug that we had.”
Kurtz said that, depending on an organization’s computer brand and network, some systems could be rebooted to apply the fix automatically, while others might require more manual steps.
“We’re looking at ways to automate those sort of fixes so that it isn’t as manual,” he said.
CrowdStrike, headquartered in Austin, is a giant of the online-security industry. It was added to the S&P 500 in June and has made itself a vital vendor of cybersecurity software to some of the world’s most powerful companies, governments, and institutions.
The company, founded in 2011, saw its reputation soar as it found itself playing an integral role in handling some of the most high-profile cybersecurity cases of the past decade, such as the Democratic National Committee hack in 2016 and the Sony Pictures breach in 2014.
That meant CrowdStrike was vital in protecting online operations for a huge number of organizations. But if the past 24 hours have proved anything, it’s that widespread reliance on a single company has the potential to cause serious problems.
“The severity of the problem boils down to how long it lasts,” Dan Coatsworth, an investment analyst at AJ Bell, said. “A few hours’ disruption is unhelpful but not a catastrophe. Prolonged disruption is another matter, potentially causing damage to companies and economies.”
A Microsoft spokesperson said it was “aware of an issue affecting Windows devices due to an update from a third-party software platform” and anticipated that a “resolution is forthcoming.”
Bug check
According to Microsoft’s Azure status page, the issue affecting systems lies with CrowdStrike’s Falcon “agent.” The Falcon agent is meant to act as a sensor that detects and blocks attacks on IT systems, and that tracks and records threats as they happen to give companies the quickest possible insight into looming risks.
Microsoft’s status page said systems running the agent “may encounter a bug check” and “get stuck in a restarting state.” A bug check is Windows’ term for a stop error, the crash behind the “blue screen of death.” In other words, the software meant to protect a machine appears to be crashing it, and the machine can then fail again each time it restarts.
Omer Grossman, the global chief information officer at the online-security firm CyberArk, said CrowdStrike’s product in question “runs with high privileges” across the devices and systems it’s meant to protect in different company networks, meaning a malfunction can be brutal.
Grossman added that it was easy for a malfunction to occur. Causes could include human error — if, say, there was a “developer who downloaded an update without sufficient quality control,” he said — or “the complex and intriguing scenario of a deep cyberattack.”
CrowdStrike said in a statement that the outage was not caused by a “security incident or cyberattack.”
It’s far from clear how long the cleanup will take. Brody Nisbet, who runs CrowdStrike’s threat-hunting operations, summed up the situation in three words on X: “It’s a mess.”