The story of how CrowdStrike released an update on a Friday and brought down thousands, tens of thousands, or maybe even hundreds of thousands of computers around the world.
On Friday, CrowdStrike issued an update that resulted in a significant disruption affecting a substantial number of computers worldwide. The precise number of impacted devices remains uncertain, with estimates ranging from several thousand to potentially hundreds of thousands.
Ever heard the unspoken rule: “Never release on Friday”? We
have, but CrowdStrike hasn’t. They released a tiny driver on an ordinary Friday
morning, which became the cause of a huge outage all over the world.
A faulty update to CrowdStrike's EDR (Endpoint
Detection and Response) software has caused Windows devices globally to
experience the Blue Screen of Death (BSOD), impacting corporate users. This
issue has disrupted airport information systems in various countries such as
the US, Spain, Germany, and the Netherlands.
Who else was affected by CrowdStrike’s Friday release and
how to roll back bricked computers — all in this post…
What happened
The chain of events started in the early hours of
Friday when corporate users globally began experiencing issues with Windows.
Initially, the problem was attributed to a malfunction in Microsoft Azure, but
later on, CrowdStrike confirmed that the csagent.sys or C-00000291*.sys driver
for its CrowdStrike EDR was the actual root cause. It was this driver that led
to an influx of comical office photos featuring the infamous blue screens.
Blue screen of death on all computers = a day off for airport linemen
If we wanted to list everyone affected by this outage, such a list sure wouldn’t fit into this post – or dozens of them. So instead we’ll briefly cover the main victims of CrowdStrike’s negligence. Airline companies, airports, and people who want to either go home or go off on a long-awaited vacation were the most affected:
- London’s Heathrow Airport, like many others, announced flight delays due to a technology glitch;
- Scandinavian Airlines posted a notice on its website saying, “Some customers may experience difficulties with their bookings due to an IT issue affecting several countries. SAS is fully operational but delays are expected”;
- In New Zealand, banking, communications and transportation systems are experiencing problems.
Various medical centers, chain stores, the New York subway,
the largest bank in South Africa and many other organizations that make lives
more comfortable and convenient on a daily basis were affected. The fullest
list of those affected by the outage we can find is here — and it’s growing by
the minute.
How to fix it
At this stage, it’s hard to estimate how long
it’ll take to fully restore the affected computers around the world. Things are
complicated by the fact that users need to manually reboot their computers in
Safe Mode. And in large corporations, this is usually impossible to do on your
own without the help of a system administrator.
Nevertheless, here are the instructions for how to get rid
of the blue screen of death caused by the CrowdStrike driver update:
- Boot your computer in Safe Mode;
- Go to C:\Windows\System32\drivers\CrowdStrike;
- Locate and delete the csagent.sys or C-00000291*.sys file;
- Restart your computer in normal mode.
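For administrators who have to repeat these steps on many machines, the file-deletion step can be sketched as a small script. This is only an illustration under assumptions: it presumes Python is available in the recovery environment (often it is not), the function name remove_faulty_files and the dry_run flag are ours, and the default pattern covers only the C-00000291*.sys name mentioned above.

```python
from pathlib import Path

# Directory and file pattern taken from the manual steps above.
DRIVER_DIR = Path(r"C:\Windows\System32\drivers\CrowdStrike")
PATTERN = "C-00000291*.sys"


def remove_faulty_files(driver_dir: Path = DRIVER_DIR,
                        pattern: str = PATTERN,
                        dry_run: bool = True) -> list[Path]:
    """Return files in driver_dir matching pattern; delete them unless dry_run.

    Hypothetical helper for illustration, not an official remediation tool.
    """
    if not driver_dir.is_dir():
        return []
    matches = sorted(p for p in driver_dir.glob(pattern) if p.is_file())
    if not dry_run:
        for path in matches:
            path.unlink()
    return matches
```

Calling remove_faulty_files() with the default dry_run=True only lists the matching files, so the result can be checked before anything is deleted with dry_run=False.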
And while your sysadmins are doing this, you could use a
hack that’s come out of India today: employees of one of the country’s airports
have started filling out boarding passes… manually.
How the failure could have been avoided
Avoiding this situation should have been easy. Firstly, the
update should not have been rolled out on a Friday. This is a rule that has
been common knowledge in the industry for ages: if a problem arises, there is
not enough time to fix it before the weekend, so system administrators at all
affected companies must work throughout the weekend to address the issues.
It’s important to be as responsible as possible about the
quality of released updates, and we treat this as a priority. Kaspersky
initiated a program in 2009 to avoid widespread failures among our customers,
and successfully completed a SOC 2 audit, validating the security of our
internal procedures. Over the past 15 years, each update has undergone
thorough performance testing across different configurations and operating
system versions. This approach enables us to proactively identify and address
potential issues.
The principle of granular releases should be adhered to.
Updates should be distributed gradually, rather than being deployed to all
customers simultaneously. This approach enables us to respond promptly and halt
an update if necessary. In the event that our users encounter an issue, we log
it and prioritize its resolution at all levels of the organization.
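One common way to implement gradual distribution is to assign each device to a stable bucket and widen the share of buckets that receive the update over time. The sketch below is a generic illustration of that idea, not a description of any vendor's actual release pipeline; the function names, the device and update identifiers, and the 0..99 bucket scheme are all assumptions.

```python
import hashlib


def rollout_bucket(device_id: str, update_id: str) -> int:
    """Map a device to a stable bucket in 0..99 for a given update."""
    digest = hashlib.sha256(f"{update_id}:{device_id}".encode()).digest()
    return int.from_bytes(digest[:4], "big") % 100


def should_receive(device_id: str, update_id: str, percent: int) -> bool:
    """True if the device falls inside the first `percent` buckets.

    Raising `percent` only adds devices; setting it to 0 halts the rollout.
    """
    return rollout_bucket(device_id, update_id) < percent
```

Because the bucket depends only on the two identifiers, a device that received the update at 5% still qualifies at 25%, and dropping the percentage to 0 stops further distribution without any per-device bookkeeping.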
To effectively address cybersecurity incidents, it is
crucial to not only rectify the apparent damage but also identify the
underlying root cause to prevent future recurrences. Before implementing
software updates on the company’s production infrastructure, it is essential to thoroughly test them on the test infrastructure for operability and potential
errors. Furthermore, changes should be implemented gradually, with continuous
monitoring to promptly detect any failures.
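The "continuous monitoring" step can be pictured as a simple gate that looks at failure reports from the current stage before the change goes any further. The following is a minimal sketch under assumed inputs: the StageReport fields, the 1% failure threshold, and the minimum sample size of 100 devices are illustrative values, not figures from any real process.

```python
from dataclasses import dataclass


@dataclass
class StageReport:
    devices_updated: int  # devices that have installed the change so far
    devices_failed: int   # of those, devices reporting a crash or failure


def next_action(report: StageReport,
                max_failure_rate: float = 0.01,
                min_sample: int = 100) -> str:
    """Return "wait", "halt" or "proceed" for the current rollout stage."""
    # Too few devices reporting (or none at all): not enough data to decide.
    if report.devices_updated == 0 or report.devices_updated < min_sample:
        return "wait"
    failure_rate = report.devices_failed / report.devices_updated
    return "halt" if failure_rate > max_failure_rate else "proceed"
```

With these assumed defaults, a stage where 50 of 1,000 updated devices fail is halted, while one with 5 failures out of 1,000 is allowed to proceed to the next stage.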
Incident response should rely on a comprehensive strategy
that involves implementing security measures from a reputable vendor with
stringent internal standards for security, quality, and service availability.
The Kaspersky Next suite of solutions can serve as the foundation for this initiative. By leveraging these tools, your organization can enhance its information security system and ensure business continuity. Whether you choose to enhance protection incrementally or all at once, safeguard your infrastructure today to mitigate the impact of future global disruptions on your customers.
Kaspersky is here to assist you in making a decision:
switch to Kaspersky and gain access to two years of Kaspersky Next EDR Optimum for the cost of one. Enjoy top-notch, dependable cybersecurity
protection!