CrowdStrike outage explained: What caused it and what’s next (2024)

What might be considered the largest IT outage in history was triggered by a botched software update from security vendor CrowdStrike, affecting millions of Windows systems around the world. Insurers estimate the outage will cost U.S. Fortune 500 companies $5.4 billion.

The outage occurred July 19, 2024, with millions of Windows systems failing and showing the infamous blue screen of death (BSOD).

CrowdStrike -- the company at the core of the outage -- is an endpoint security vendor whose primary technology is the Falcon platform, which helps protect systems against potential threats in a bid to minimize cybersecurity risks.

In many respects, the outage was a real manifestation of fears that computing users had at the end of the last century with the Y2K bug. With Y2K, the fear was that a bug in software systems would trigger widespread technology failures. While the CrowdStrike failure was not Y2K, it was a software issue that did, in fact, trigger massive disruption on a scale that has not been seen before.

What caused the outage?

The CrowdStrike Falcon platform is widely used by organizations of all sizes across many industries. It is the pervasiveness of CrowdStrike's technology and its integration into so many mission-critical operations and industries that amplified the effect.

The outage was not caused by a flaw in Microsoft Windows itself, but by a flaw in CrowdStrike Falcon.

Falcon hooks into the Microsoft Windows OS as a Windows kernel process. The process has high privileges, giving Falcon the ability to monitor operations in real time across the OS. A logic flaw in Falcon sensor version 7.11 and above caused the sensor to crash. Because Falcon is tightly integrated into the Windows kernel, the sensor crash brought down Windows itself and produced a BSOD.

The flaw in CrowdStrike Falcon was inside a sensor configuration update. The sensor is regularly updated -- sometimes multiple times daily -- to provide users with mitigation and threat protection.

The flawed update was contained in one of the files that CrowdStrike refers to as "channel files," which provide configuration updates for behavioral protections. Channel file 291 is an update that was supposed to improve how Falcon evaluates named pipe execution on Microsoft Windows. Named pipes are a common mechanism for interprocess communication on Microsoft Windows.

With channel file 291, CrowdStrike inadvertently introduced a logic error that caused the Falcon sensor to crash and, with it, the Windows systems into which it was integrated.

The flaw isn't in all versions of channel file 291. The problematic version is channel file 291 (C-00000291*.sys) with timestamp 2024-07-19 0409 UTC. Channel file 291 timestamped 2024-07-19 0527 UTC or later does not have the logic flaw. By that time, CrowdStrike had noticed its error and reverted the change. But, for many of its users, that reversion came too late as they had already updated, leading to BSOD and inoperable systems.
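The timestamp rule above can be written as a small check. The following Python sketch is illustrative only and is not a CrowdStrike tool: the function name `classify_channel_file`, its return labels and the file name suffix used in the usage note are assumptions made for this example. It treats a channel file 291 timestamped 0409 UTC as flawed and one timestamped 0527 UTC or later as fixed. Any other timestamp is reported as unknown.

```python
from datetime import datetime, timezone
from fnmatch import fnmatch

# Timestamps quoted in the article, expressed in UTC.
FLAWED_TS = datetime(2024, 7, 19, 4, 9, tzinfo=timezone.utc)
FIXED_TS = datetime(2024, 7, 19, 5, 27, tzinfo=timezone.utc)


def classify_channel_file(name: str, timestamp: datetime) -> str:
    """Classify a file by name and timezone-aware timestamp.

    Returns one of the hypothetical labels "not-291", "flawed",
    "fixed" or "unknown".
    """
    # Compare in lowercase so the match does not depend on the OS.
    if not fnmatch(name.lower(), "c-00000291*.sys"):
        return "not-291"
    # Normalize to UTC and drop seconds: the article gives minute precision.
    ts = timestamp.astimezone(timezone.utc).replace(second=0, microsecond=0)
    if ts == FLAWED_TS:
        return "flawed"
    if ts >= FIXED_TS:
        return "fixed"
    return "unknown"
```

For example, calling `classify_channel_file("C-00000291-00000000-00000032.sys", FLAWED_TS)` returns `"flawed"`. The file name suffix here is made up for illustration.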

What services were affected?

Microsoft estimated that approximately 8.5 million Windows devices were directly affected by the CrowdStrike logic error flaw. That's less than 1% of Microsoft's global Windows install base.

But, despite the small percentage of the overall Windows install base, the systems affected were those running critical operations. Services affected include the following.

Airlines and airports

The outage led to significant delays and the cancellation of more than 10,000 flights around the world. In the United States, affected airlines included Delta, United and American Airlines. These airlines were forced to cancel hundreds of flights until systems were restored. Globally, multiple airlines and airports were affected, including KLM, Porter Airlines, Toronto Pearson International Airport, Zurich Airport and Amsterdam Schiphol Airport.

Public transit

Public transit in multiple cities was affected, including Chicago, Cincinnati, Minneapolis, New York City and Washington, D.C.

Healthcare

Hospitals and healthcare clinics around the world faced significant disruptions in appointment systems, leading to delays and cancellations. Some states also reported 911 emergency services being affected, including Alaska, Indiana and New Hampshire.

Financial services

Online banking systems and financial institutions around the world were affected by the outage. Multiple payment platforms were directly affected, and there were individuals who did not get their paychecks when expected.

Media and broadcasting

Multiple media and broadcast outlets around the world, including British broadcaster Sky News, were taken off the air by the outage.

Why Apple and Linux were not affected

CrowdStrike's software doesn't just run on Microsoft Windows; it also runs on Apple's macOS and the Linux OS.

But the July outage only affected Microsoft Windows. The root cause of the outage was a faulty sensor configuration update that specifically affected Windows systems. The channel file 291 update was never issued to macOS or Linux systems as the update deals with named pipe execution that only occurs on the Microsoft Windows OS.

The way that the Falcon sensor integrates as a Windows kernel process is also not the same in macOS or Linux. Those OSes have different integration points to limit potential risk.

However, there was a reported incident in June from Linux vendor Red Hat, where the Falcon sensor -- running as an eBPF program in Linux -- triggered a kernel panic. In Linux, a kernel panic is a type of crash, though typically not as dramatic as BSOD. That issue was resolved without Red Hat reporting any major incidents.

How long will it take businesses to recover from this outage?

CrowdStrike identified the issue and deployed a fix in 79 minutes. The recovery process for businesses, however, was complex and time-consuming. Once the problematic update was installed, the underlying Windows OS would trigger a BSOD, leaving the system unable to start through the normal boot process.

IT administrators had to manually boot affected systems into Safe Mode or the Windows Recovery Environment to delete the problematic channel file 291 and restore normal operations. That process is labor-intensive, especially for organizations with many affected devices. In some cases, the process also required physical access to each machine, adding further time and effort to the process.
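The core of that manual fix is "find the files matching channel file 291 and delete them." The Python sketch below illustrates only that logic under stated assumptions: the function name `remove_channel_file_291` and the `sensor_dir` parameter are hypothetical, the directory must be supplied by the caller, and Python would not normally be available in Safe Mode or the Windows Recovery Environment, where administrators performed this step by hand. It defaults to a dry run so nothing is deleted unless explicitly requested.

```python
from pathlib import Path
from typing import List

# File name pattern quoted in the article for channel file 291.
PATTERN = "C-00000291*.sys"


def remove_channel_file_291(sensor_dir: Path, dry_run: bool = True) -> List[Path]:
    """Return the matching files in sensor_dir; delete them unless dry_run."""
    matches = sorted(p for p in sensor_dir.glob(PATTERN) if p.is_file())
    if not dry_run:
        for path in matches:
            path.unlink()  # remove the matching channel file
    return matches
```

Running it first with the default `dry_run=True` lists what would be removed, which is a cheap safeguard when the same step has to be repeated across many machines.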

Some businesses were able to apply the fix within a few days. However, the process was not straightforward for all, particularly those with extensive IT infrastructure and encrypted drives. The use of the Microsoft Windows BitLocker encryption technology by some organizations made it significantly more time-consuming to recover as BitLocker recovery keys were required.

Estimates suggest that some organizations could take months to fully recover all affected systems from the outage.

Hackers take advantage of outage

While the outage was not due to a cyberattack, threat actors have taken advantage of the incident.

According to a blog post from CrowdStrike, the security vendor has received reports of the following malicious activity:

  • Phishing emails sent to customers posing as CrowdStrike support.
  • Fake phone calls impersonating CrowdStrike staff.
  • Selling scripts claiming to automate recovery from the botched update.
  • Posing as independent researchers saying the outage was due to a cyberattack and offering remediation insights.

The Cybersecurity and Infrastructure Security Agency (CISA) urges individuals and organizations to only follow instructions from legitimate sources and avoid opening suspicious emails and links.

How can businesses be better prepared for tech outages?

The CrowdStrike Windows outage highlighted the vulnerabilities of modern society's heavy reliance on technology. While system backups and automated processes are essential, having manual procedures in place can significantly enhance business continuity during tech outages.

Businesses can take several steps to be better prepared for tech outages, including the following.

Test all updates before deploying to production

It has been a best practice for years to allow automated updates to ensure systems are always up to date. However, the CrowdStrike issue laid bare the underlying risk with that approach. For mission-critical systems, testing updates before deployment or having some form of staging environment before pushing updates to production might help to mitigate some risk.
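One way to picture a staging approach is a rollout that moves through groups of systems in order and stops at the first sign of trouble. The Python sketch below is a minimal illustration, not a description of how CrowdStrike or any specific product delivers updates. The function `staged_rollout`, the ring names and the `deploy` and `is_healthy` callbacks are all hypothetical placeholders for whatever tooling an organization actually uses.

```python
from typing import Callable, List, Sequence


def staged_rollout(
    rings: Sequence[str],
    deploy: Callable[[str], None],
    is_healthy: Callable[[str], bool],
) -> List[str]:
    """Deploy to each ring in order; halt when a ring fails its health check.

    Returns the rings that were updated and passed the check.
    """
    completed: List[str] = []
    for ring in rings:
        deploy(ring)
        if not is_healthy(ring):
            break  # stop before the update reaches later rings
        completed.append(ring)
    return completed
```

With rings ordered as a test lab, a pilot group and then production, a failed health check in the pilot group means production never receives the update.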

Develop and document manual workarounds

Manual workarounds ensure critical business processes can continue even when technology fails. This approach was common before the digital age and, in the event of outage, can serve as a fallback. Documenting and practicing manual procedures can help mitigate the effect of outages, ensuring businesses can still operate and serve their customers, even during an outage.

Perform disaster recovery and business continuity planning

Outages happen for any number of different reasons. Having extensive disaster recovery and business continuity practices and plans in place is critical. Part of that effort should include the use of redundant systems and infrastructure to minimize downtime and ensure critical functions can switch to backup systems when needed.
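The idea of switching critical functions to backup systems can be sketched as a priority-ordered health check. This is a simplified illustration under assumptions: the function `pick_active_system`, the system names and the `is_healthy` callback are hypothetical, and a real failover design also has to handle data replication and switchover testing, which this sketch leaves out.

```python
from typing import Callable, Optional, Sequence


def pick_active_system(
    systems: Sequence[str],
    is_healthy: Callable[[str], bool],
) -> Optional[str]:
    """Return the first healthy system in priority order, or None if all are down."""
    for name in systems:
        if is_healthy(name):
            return name
    return None
```

If the list is `["primary", "backup-a", "backup-b"]` and only the primary is down, the function returns `"backup-a"`. A result of `None` signals that documented manual procedures are the remaining fallback.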

Sean Michael Kerner is an IT consultant, technology enthusiast and tinkerer. He has pulled Token Ring, configured NetWare and been known to compile his own Linux kernel. He consults with industry and media organizations on technology issues.
