Responding to the CrowdStrike outage

What can IT service leaders learn; considerations for the future

July 19, 2024

For continuing updates and action items, visit our CrowdStrike incident response page.

CrowdStrike, a US-based cybersecurity vendor providing endpoint detection and response (EDR) services, experienced a significant technical issue during the early hours of July 19, 2024. This incident, reportedly due to an update made to CrowdStrike’s Falcon antivirus software, primarily affected Windows PCs, causing widespread downtime on these systems, while Mac and Linux hosts remained unaffected. The outage has affected organizations ranging from hospitals to commercial airlines and even emergency call centers around the globe.

CrowdStrike is one of the most popular EDR solutions, currently installed at over 29,000 organizations. EDR is a popular category of cybersecurity solutions because it operates directly on laptops, desktops and servers to identify and block common cyber-attacks, such as malware and ransomware.

As organizations look to respond to this event, we recommend that they consider the following areas:

1. Start with securely restoring your operations.

2. Understand your organization’s exposure to similar threats and take steps to reduce it.

3. Build resiliency to account for and endure future cybersecurity/IT outages.

Securely restoring operations

We recognize there are significant pressures to bring back ‘business as usual’ as quickly as possible. However, it is important that organizations incorporate the following to avoid generating new operational impacts or opening the door to future cyber-attacks.

Go to the source of truth for fixes. The ‘latest’ news from social media is often untested or specific to individual environments. In addition, cyber threat actors will generate malicious websites indicating quick fixes that result in malware deployment. You should always go to a vendor site or other reputable source, such as your managed security provider, for fix information.
Reinforce your data protections following the fix. Procedures to restore Windows devices require the distribution of BitLocker recovery keys, which unlock encrypted disk-based storage. These keys should be centrally managed and securely stored. Distributing keys to users reduces this control’s effectiveness, so organizations should plan to resecure these keys following system restoration.
Monitor for phishing emails. As with any major issue, bad actors are looking to capitalize and have begun reaching out to organizations while masquerading as CrowdStrike support. If you are a CrowdStrike customer, contact their support directly for assistance. Cybersecurity vendors rarely reach out proactively and almost always respond only to support tickets.

Understanding and reducing your exposure

This issue highlights a systemic change: digital systems have evolved into the central nervous systems of business operations and require careful governance like any other enterprise risk. During your post-incident debrief, we recommend evaluating the following areas to understand your exposure and drive further research and investments to reduce risk from future events:

Inventory and evaluate the risks associated with vendors that receive ‘implicit’ trust in your environment. The impact of the CrowdStrike outage is far-reaching because of the advanced level of access the product had, not just to its own software, but to the underlying Windows environment. This is a more common operating model in today’s ‘as-a-service’ world. Identifying which vendors receive this access, the level of access, and its purpose is critical to understanding your exposure to similar events and to accounting for potential business impacts.
- Heavily regulated industries such as financial services should anticipate specific questions regarding these vendors as well as third-party risk management practices (see below) in the upcoming assessment cycles.
Assess your technology stack diversification. This involves reviewing how dependent you are on any single provider whose outage would adversely affect your operations (think vendor lock-in and single point of failure). Alternatives exist in the marketplace that could cover your business objectives and aid in effective risk management and contingency planning. For example, consider the recent impact of the CDK Global outage, which affected nearly 30,000 car dealerships, or the Change Healthcare event that impeded revenue cycle processes across the healthcare industry.
Review your third-party risk management practices. As IT becomes more specialized and critical to businesses, organizations are often turning to third parties for support. Organizations should consistently evaluate their third-party providers, and even those providers’ vendors (fourth- and fifth-party), regardless of their market share. Ensuring these providers consistently align with your internal policies and external regulatory expectations is fundamental. "Trust, but verify" is table stakes.
Build your understanding of system identities. When we think of system access, our first thought goes to people. However, system access is often granted to other IT elements to operate program interactions (non-human identities). Similar to our review and cataloging of human users, organizations should work to inventory and understand non-human identities and the roles they play in software updates.
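As a starting point for the inventory described above, the following is a minimal sketch, not a definitive method. It assumes a hand-maintained list of vendor software, and every name in it (the `VendorAccess` fields, the `ACCESS_WEIGHT` tiers, the example vendors and the scoring formula) is hypothetical and chosen only for illustration. It ranks entries so that deeply privileged, widely deployed, automatically updated software surfaces first in a review.

```python
from dataclasses import dataclass

# Hypothetical privilege tiers; a higher number means deeper access to the host.
ACCESS_WEIGHT = {"user": 1, "admin": 2, "kernel": 3}


@dataclass(frozen=True)
class VendorAccess:
    vendor: str         # illustrative name, not a real product
    access_level: str   # one of the keys in ACCESS_WEIGHT
    purpose: str        # why the access was granted
    auto_updates: bool  # vendor can push changes without your approval
    hosts: int          # number of endpoints the software runs on


def exposure_score(entry: VendorAccess) -> int:
    """Toy score: privilege weight times host count, doubled for unattended updates."""
    score = ACCESS_WEIGHT[entry.access_level] * entry.hosts
    return score * 2 if entry.auto_updates else score


def rank_by_exposure(inventory):
    """Return the inventory sorted from highest to lowest exposure score."""
    return sorted(inventory, key=exposure_score, reverse=True)


# Made-up sample data for illustration only.
inventory = [
    VendorAccess("ExampleEDR", "kernel", "endpoint detection", True, 1200),
    VendorAccess("ExampleBackup", "admin", "nightly backups", False, 300),
    VendorAccess("ExampleChat", "user", "messaging", True, 1500),
]

for entry in rank_by_exposure(inventory):
    print(f"{entry.vendor:15} {entry.access_level:7} score={exposure_score(entry)}")
```

The weights are arbitrary; the useful part is recording access level, purpose and update behavior in one place so the same list can feed third-party risk reviews and the non-human identity inventory.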

Maturing your resiliency

In today’s increasingly interconnected world, organizations can work to address risks from these types of events but those risks will never be removed from our digital society. Organizations can increase their operational resilience to these events by developing or maturing a business continuity program. We recommend that organizations consider the following areas to build and enhance their operational resilience.

Develop and test your business continuity program. Develop and update a business continuity plan that documents critical business functions and identifies downtime and manual procedures to sustain critical business operations during these events, even in a limited capacity. Maintaining these critical business functions (operations) requires continuity strategies and activities for alternate staffing and vendor redundancies.
Consider ‘enterprise-as-a-system’ thinking. This approach links business functions to underlying IT to add operational context to IT risks. It focuses on using risk management principles to build an in-depth understanding of complex interconnections between systems and how each influences enterprise risk.
Mature configuration management and vulnerability program execution. These common cybersecurity approaches require significant operational discipline to deploy and to sustain your cyber and IT hygiene. Responding quickly to risks while avoiding inadvertently exposing the organization to new ones is a core trait of an effective cyber program.
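The ‘enterprise-as-a-system’ idea of linking business functions to underlying IT can be sketched as a simple dependency map. This is an illustrative sketch only: the function names, system names and vendor names are invented, and it assumes that an outage at any vendor listed for a system takes that system down.

```python
from collections import defaultdict

# Hypothetical mapping: business function -> IT systems it depends on.
FUNCTION_SYSTEMS = {
    "patient intake": ["scheduling app", "front-desk PCs"],
    "billing": ["billing platform", "front-desk PCs"],
    "payroll": ["payroll service"],
}

# Hypothetical mapping: IT system -> vendors it depends on.
# Assumption: an outage at ANY listed vendor disrupts the system.
SYSTEM_VENDORS = {
    "scheduling app": ["VendorA"],
    "front-desk PCs": ["VendorB", "VendorC"],
    "billing platform": ["VendorA"],
    "payroll service": ["VendorD"],
}


def functions_by_vendor(function_systems, system_vendors):
    """Return, for each vendor, the sorted business functions its outage would disrupt."""
    impact = defaultdict(set)
    for function, systems in function_systems.items():
        for system in systems:
            for vendor in system_vendors.get(system, []):
                impact[vendor].add(function)
    return {vendor: sorted(functions) for vendor, functions in impact.items()}


impact = functions_by_vendor(FUNCTION_SYSTEMS, SYSTEM_VENDORS)

# List vendors with the widest business impact first.
for vendor, functions in sorted(impact.items(), key=lambda item: (-len(item[1]), item[0])):
    print(f"{vendor}: {', '.join(functions)}")
```

Vendors that touch many critical functions are natural candidates for downtime procedures and redundancy planning in the business continuity plan.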

What to expect in the future?

While this situation and its impacts are still unfolding, it raises important questions and considerations for organizations to monitor:

Focus on how much trust we place in our technology vendors.
Reviews and tighter controls around how deeply we let vendors into our environment.
Potential legal ramifications when a trusted provider causes an outage, including implications for parties who recommend specific technology products.

How can RSM help?

We are operating a command center to respond rapidly to the changing landscape of this outage. Reach out to us at recovery@rsmus.com for support. In addition, our team members across the U.S. and Canada can be hands-on to help you respond to and recover from this incident, as well as strategize for the future, in the following ways:

System recovery services to build your strategy for restoration as well as provide the ‘arms and legs’ to deploy this in the field.
Managed cybersecurity monitoring and endpoint detection and response services, which utilize the EDR solution SentinelOne.
Business continuity and crisis management services to identify critical business functions, develop downtime procedures and implement effective communication protocols for managing disruptive events.
Cybersecurity strategy and assessment services to evaluate your risk and the maturity of current-state controls, and to define a plan for future maturity.
IT and cybersecurity design and implementation services to get hands-on experience in your environment to improve and operate your cybersecurity controls and to identify, respond, and triage potential incidents in the future.
Managed cloud and IT infrastructure services to manage and support your IT operations and availability.

RSM contributors

Tauseef Ghazi

Partner

Robert Snodgrass

Principal, Risk Consulting
Daniel Gabriel

Principal, Risk Consulting



RSM US LLP is a limited liability partnership and the U.S. member firm of RSM International, a global network of independent assurance, tax and consulting firms. The member firms of RSM International collaborate to provide services to global clients, but are separate and distinct legal entities that cannot obligate each other. Each member firm is responsible only for its own acts and omissions, and not those of any other party. Visit rsmus.com/about for more information regarding RSM US LLP and RSM International.
