The 2024 CrowdStrike outage, often referred to as “Y2K24,” was a significant IT incident that occurred on July 19, 2024. The outage was triggered by a faulty update to CrowdStrike’s Falcon Sensor security software, which caused widespread issues on Microsoft Windows computers running the software.
What Happened?
CrowdStrike, a prominent cybersecurity company, released a content configuration update for its Falcon Sensor software. The update triggered a logic error that crashed approximately 8.5 million Windows systems worldwide. The error was traced back to a faulty channel file (Channel File 291) that was not properly validated before being deployed.
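To illustrate what validating a configuration file before deployment can look like, here is a minimal Python sketch. It is not CrowdStrike's file format or process: the record layout, the `validate_records` helper, and the expected field count are all hypothetical, chosen only to show a deployment gate that rejects malformed input.

```python
def validate_records(records, expected_fields):
    """Check a hypothetical update made of records, each a list of string fields.

    Returns a list of human-readable problems; an empty list means the
    update passed this check.
    """
    problems = []
    for index, fields in enumerate(records):
        if len(fields) != expected_fields:
            problems.append(
                f"record {index}: expected {expected_fields} fields, got {len(fields)}"
            )
        elif any(field == "" for field in fields):
            problems.append(f"record {index}: contains an empty field")
    return problems


def safe_to_deploy(records, expected_fields):
    """Gate deployment on validation: only updates with no problems go out."""
    return not validate_records(records, expected_fields)


if __name__ == "__main__":
    update = [["rule-1", "match", "block"], ["rule-2", "match"]]  # second record is short
    for problem in validate_records(update, expected_fields=3):
        print(problem)
    print("deploy" if safe_to_deploy(update, 3) else "hold back")
```

A check like this is deliberately simple. Its value comes from running automatically on every update, so a malformed file is stopped before it reaches any customer system.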
For more information on the outage, read “Navigating the Global Tech Outage: Essential Steps to Recovery.”
Immediate Impact
The outage had a massive and immediate impact across various sectors:
- Air Transport: Airports and airlines experienced significant disruptions. Baggage handling systems failed, leading to delays and cancellations.
- Finance: Banks and financial institutions faced operational challenges, affecting transactions and online banking services.
- Healthcare: Hospitals and clinics experienced system failures, impacting patient care and administrative functions.
- Retail: Point-of-sale systems in retail stores went down, causing interruptions in sales and inventory management.
- Government Services: Various governmental services, including emergency services and public websites, were disrupted.
The financial damage from the outage was estimated to be at least $10 billion. The incident also highlighted vulnerabilities in IT infrastructure and the risks associated with centralized software updates.
Response, Recovery, and Lessons Learned
CrowdStrike identified the error and released a fix within hours, then worked with many of its customers to get their systems back online. It also built an automated recovery path for any computer that stayed online long enough to receive the fix before the next blue screen of death (BSOD) and automatic reboot. However, many affected systems required manual intervention to restore functionality, leading to prolonged outages in some cases. Manually recovering workstations was relatively easy, if time-consuming, while recovering cloud servers with encrypted drives was more challenging. Microsoft also provided a recovery tool to help affected users repair their systems.
The reality for many companies was that their IT staff was running from system to system for several days until all systems were back online. The 2024 CrowdStrike outage underscored how vulnerable companies are to forces outside their control. It also emphasized the need for robust disaster recovery plans and the ability to quickly respond to and mitigate the effects of such incidents.
Are there Alternatives?
There are several alternatives to CrowdStrike for endpoint detection and response (EDR) and cybersecurity solutions. Here are some of the top options:
- SentinelOne Singularity Complete: Known for its robust threat detection and response capabilities, SentinelOne is often praised for its ease of deployment and comprehensive protection.
- Microsoft Defender for Endpoint: This is a great choice for organizations that are already using Microsoft products. It integrates seamlessly with other Microsoft services and offers strong security features. Existing Microsoft 365 customers should check whether Defender for Endpoint is included in their current subscriptions.
- Palo Alto Networks Cortex XDR: Ideal for those looking to transition into extended detection and response (XDR), Cortex XDR provides advanced threat detection and response across multiple environments.
- Bitdefender GravityZone: Known for proactive endpoint protection, Bitdefender offers a range of security features and is suitable for businesses of all sizes.
- Sophos Intercept X: This solution combines deep learning malware detection with exploit prevention, making it a strong competitor in the EDR space.
- Huntress: Huntress is a managed cybersecurity platform offering 24/7 detection and response for endpoints and identities, tailored for SMBs and IT providers.
- Carbon Black: Now part of Broadcom (which acquired VMware in 2023), Carbon Black offers comprehensive endpoint protection and is known for its strong threat-hunting capabilities.
- ThreatLocker: ThreatLocker EDR is a policy-based Endpoint Detection and Response solution that monitors for unusual events or Indicators of Compromise (IoCs), sending alerts and taking automated actions if anomalies are detected to provide Zero Trust Endpoint Protection.
- Cylance: Utilizing AI and machine learning, Cylance provides predictive threat detection and prevention.
- ESET PROTECT: Offers advanced threat defense and full disk encryption, making it a versatile option for endpoint protection.
- Zscaler: Focuses on secure internet access and threat protection, making it a good alternative for cloud-based security.
- Okta: While primarily an identity management solution, Okta also offers security features that can complement endpoint protection.
This is of course not a complete list of all options, but rather some of the more common choices by businesses. Each of these alternatives has its own strengths and may be better suited to different organizational needs.
What are Companies Doing Next?
The CrowdStrike outage in July 2024 had a significant impact on businesses worldwide. It might not be a surprise that many companies are re-evaluating their cybersecurity vendor lineup. Here are some key changes companies are implementing:
- Enhanced Testing Protocols: Companies are now conducting more rigorous testing of software updates before deployment. This includes extensive regression testing and compatibility checks to prevent similar issues.
- Staggered Rollouts: Instead of deploying updates to all systems simultaneously, many organizations are adopting staggered rollouts. This approach helps identify and address potential issues on a smaller scale before they affect the entire network.
- Backup and Recovery Plans: Businesses are strengthening their backup and disaster recovery plans. This includes regular backups and ensuring that recovery processes are quick and efficient to minimize downtime.
- Increased Monitoring: Continuous monitoring of systems has become a priority. Companies are investing in advanced monitoring tools to detect anomalies and potential issues early.
- Vendor Management: Organizations are placing greater emphasis on vendor management, ensuring that third-party providers have robust security measures and reliable update processes.
- Employee Training: There is a renewed focus on training employees to recognize and respond to IT issues promptly. This includes regular drills and updated protocols for handling outages.
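The staggered rollout and monitoring ideas in the list above can be combined into a simple gate. The Python sketch below is a minimal illustration under stated assumptions: the `Ring` type, the `deploy` and `is_healthy` callbacks, and the 2% failure threshold are hypothetical placeholders, not any vendor's actual tooling.

```python
from dataclasses import dataclass


@dataclass
class Ring:
    """A named group of hosts that receives an update at the same time."""
    name: str
    hosts: list[str]


def staged_rollout(rings, deploy, is_healthy, max_failure_rate=0.02):
    """Deploy ring by ring and stop as soon as a ring looks unhealthy.

    `deploy(host)` and `is_healthy(host)` are caller-supplied callbacks
    (hypothetical here). Returns the names of the rings that completed
    within the failure threshold.
    """
    completed = []
    for ring in rings:
        for host in ring.hosts:
            deploy(host)
        failures = sum(1 for host in ring.hosts if not is_healthy(host))
        failure_rate = failures / len(ring.hosts) if ring.hosts else 0.0
        if failure_rate > max_failure_rate:
            # Halt here so later, larger rings never receive the update.
            break
        completed.append(ring.name)
    return completed


if __name__ == "__main__":
    rings = [
        Ring("canary", ["it-01", "it-02"]),
        Ring("early", ["pos-01", "pos-02"]),
        Ring("broad", ["srv-01", "srv-02", "srv-03"]),
    ]
    crashed = {"pos-02"}  # simulated bad outcome in the second ring
    deployed = []
    done = staged_rollout(rings, deployed.append, lambda h: h not in crashed)
    print("completed rings:", done)
    print("hosts touched:", deployed)
```

In this simulated run the rollout stops after the second ring, so the largest group of hosts never receives the faulty update. In practice the health check would need a waiting period after each ring, since some failures only appear after a reboot.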
These steps aim to enhance resilience and reduce the risk of similar incidents in the future. How has your company been affected by the outage?