Summary: A seemingly minor software update to the Falcon sensor program by cybersecurity company CrowdStrike led to a global IT outage, causing disruption in various sectors. The issue was traced back to a coding error that caused millions of Windows computers to crash.
The Cause of the Outage
A significant global IT outage occurred on Friday, and it was traced back to a single software update. This update was part of the Falcon sensor program managed by the US cybersecurity company, CrowdStrike. A coding error in this update caused millions of Windows computers around the world to experience the dreaded “Blue Screen of Death.”
Immediate Fallout
The impact of this small update was enormous. Airports were thrown into chaos, supermarket check-outs malfunctioned, and journalists found themselves unable to use essential tools to report on the incident. Although CrowdStrike managed to fix the issue within hours, some users are still struggling with device problems.
Behind the Update
CrowdStrike, a Texas-based cybersecurity company, specializes in protection against ransomware and malware and in internet security solutions, primarily for businesses and large organizations. On Friday, July 19th, at 4:09 am UTC (2:09 pm AEST), they released a sensor configuration update targeting Windows systems. This update, like many others, was part of their Falcon program, a cloud-based cybersecurity solution offering automated protection against malware, antivirus support, and incident response capabilities.
These updates are routine and happen multiple times a day. However, this particular update included a “logic error,” a coding mistake that caused the program to malfunction.
The Logic Error
The update was intended to target malicious system communication tools commonly used in cyber attacks. Instead of improving the program, however, it contained a “logic error” that led to operating system crashes on Windows systems. Mac and Linux users were unaffected.
Ajay Unni, the chief executive of StickmanCyber, explained that the update was designed to be a patch, meant to enhance the program’s performance. Instead, it caused millions of Windows PCs to display the “Blue Screen of Death,” with many devices entering a reboot loop. Anyone running version 7.11 or above of the Falcon sensor for Windows was potentially affected.
Fixing the Problem
Mr. Unni said that the problematic file, known as Channel File 291, needed to be deleted. This could be done remotely on systems that were online, but systems that were offline or could not be reached remotely required manual deletion.
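The deletion step can be pictured with a short Python sketch. It is only an illustration, not CrowdStrike's tooling: the function name `remove_channel_file_291` is hypothetical, and the file-name pattern and default folder are assumptions based on CrowdStrike's publicly reported remediation guidance rather than anything stated in this article.

```python
from __future__ import annotations

from pathlib import Path

# Assumption: Channel File 291 was reported to match this name pattern.
CHANNEL_FILE_PATTERN = "C-00000291*.sys"

# Assumption: the reported default location of Falcon channel files on Windows.
DEFAULT_DIR = Path(r"C:\Windows\System32\drivers\CrowdStrike")


def remove_channel_file_291(directory: Path = DEFAULT_DIR,
                            dry_run: bool = True) -> list[Path]:
    """Find files matching the Channel File 291 pattern in `directory`.

    With dry_run=True (the default) nothing is deleted; the matches are
    only returned so they can be reviewed first.
    """
    matches = sorted(directory.glob(CHANNEL_FILE_PATTERN))
    if not dry_run:
        for path in matches:
            path.unlink()  # delete only the matching channel file(s)
    return matches
```

Calling `remove_channel_file_291()` with no arguments only lists the matching files; passing `dry_run=False` deletes them. The dry-run default is a deliberate choice for a sketch that removes files from a system driver folder.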
Resolution Efforts
CrowdStrike responded quickly, acknowledging the issue within an hour and stating that they were working on a fix. By 5:27 am UTC (3:27 pm AEST) on Friday, they had pushed out an update to replace the flawed configuration files. By then, however, many users in Australia had already been hit at around 3 pm, and they struggled to get their devices operational for several hours.
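The two timestamps, 4:09 am UTC for the release and 5:27 am UTC for the fix, bound the period in which the faulty file was being distributed. The Python sketch below converts them to AEST and checks whether a given moment falls inside that window. It rests on two assumptions: that the year is 2024 (the article gives only "Friday, July 19th"), and that a machine was exposed only if it was online between those two times. The function name `in_exposure_window` is hypothetical.

```python
from datetime import datetime, timedelta, timezone

# AEST is a fixed UTC+10 offset (no daylight saving in July).
AEST = timezone(timedelta(hours=10), "AEST")

# Times from the article; the year 2024 is an assumption.
RELEASED = datetime(2024, 7, 19, 4, 9, tzinfo=timezone.utc)
FIXED = datetime(2024, 7, 19, 5, 27, tzinfo=timezone.utc)


def in_exposure_window(online_at: datetime) -> bool:
    """True if a timezone-aware moment falls between release and fix."""
    return RELEASED <= online_at < FIXED


if __name__ == "__main__":
    print(RELEASED.astimezone(AEST).strftime("%I:%M %p %Z"))  # 02:09 PM AEST
    print(FIXED.astimezone(AEST).strftime("%I:%M %p %Z"))     # 03:27 PM AEST
    print(FIXED - RELEASED)                                   # 1:18:00
    # A machine online at 3 pm AEST falls inside the window.
    print(in_exposure_window(datetime(2024, 7, 19, 15, 0, tzinfo=AEST)))  # True
```

The window is 78 minutes long, which is consistent with users in Australia reporting crashes at around 3 pm local time, shortly before the replacement files went out.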
The company said that the outage was not the result of a cyber attack and that it was conducting a “root cause analysis” to understand the problem better. Mark Jones, a senior partner and cyber expert at Tesserent, noted that while the rollback of the configuration update seemed effective, it would take hours to deploy it across entire systems, including servers and multiple desktops.
Jones also mentioned that depending on the environment, there might be lingering issues stemming from this outage.
The incident highlights how a seemingly minor software update can have significant global ramifications, affecting millions of devices and causing widespread disruption.