By Dan Johnson, Global Director, Compliance and Continuity, Ensono
Your company may have a business continuity and disaster recovery plan on file. But are you ready to put it into action?
Threats to IT infrastructure are a nonstop headline item and an ever-present enterprise concern. But the July 2024 CrowdStrike outage was nonetheless shocking.
The massive number of systems affected¹, the instantaneous impact on some of the most basic functions of everyday life, the staggering costs (Fortune 500 companies are facing an estimated $5.4 billion in losses, with insured losses likely to account for just a fraction of that amount²) and the fact that it wasn’t the work of the “bad guys” gave this event a different feel.
It has also underscored the critical importance of having an up-to-date and thoroughly tested business continuity and disaster recovery (BCDR) plan ready to activate.
If you work at a large company with complex IT, you probably already have a BCDR plan. This plan describes how a company will maintain operations in the event of a cyberattack, natural disaster, pandemic outbreak or other emergency, as well as how it will recover afterward. Most plans cover topics like:
How employees will continue to work during an emergency – including VPNs, remote collaboration tools and communication protocols
How data and IT systems will be backed up and restored – including cloud backups, offsite storage and disaster recovery sites
How vital supply chains will continue without interruption – including alternate suppliers and communication and logistics resiliency
How business will continue during an emergency – including critical business functions and resources required to support them
How key vendors and stakeholders will be contacted – including contact information for key business personnel, vendors and leaders
How the plan will be maintained – including cadence of plan reviews and exercise history
However, your plan is not worth the paper it’s printed on if you don’t test it. The companies that encounter the most surprises during a disaster are the ones that don’t test their BCDR plans, or whose tests don’t come as close to reality as possible. And surprises during a disaster can have negative consequences, especially for revenue.
Testing your plans regularly and comprehensively will make your teams more aware of the processes needed to recover adequately from a disaster event. You’ll also uncover gaps in your recovery processes. That’s a good thing: It’s better to encounter issues during testing than discover them during an actual disaster. Preparation is key to any successful BCDR program. Failing to prepare is preparing to fail.
Remember: Your plan is only as good as your vendors
Working with third-party providers is a fundamental, necessary and in most ways positive aspect of enterprise business operations. But as the CrowdStrike event showed, even the most reputable and experienced entities can make consequential mistakes. When your infrastructure is integrated with someone else’s, their errors, miscalculations and weaknesses become yours.
But this incident also demonstrated that while vendors and partners introduce risk, they are also the ones you will rely on for help when the worst does happen. CrowdStrike deployed the fix for the issue to its customers within 79 minutes. Here at Ensono, we were able to get our clients’ critical services up and running within 24 hours of the outage.³
The bottom-line lesson is to partner wisely and transparently. Ask each of your vendors if they have a BCDR plan and what the results were from their last exercise. Ask to see their exercise summary document to understand the scope and results of their most recent test. If they aren’t fully prepared, then neither are you. Just as importantly, ask them for detailed accounts of how they have responded in real-life disaster scenarios. A true ally will have stopped at nothing to get their clients back to business-as-usual as quickly, completely and safely as possible.
1. David Weston, “Helping our customers through the CrowdStrike outage,” microsoft.com, July 2024.
2. Matthew Lerner, “Insured losses from CrowdStrike outage could reach $1.8B: Parametrix,” businessinsurance.com, July 2024.
3. Thomas Moor and Jack Middleton, “Why QA Matters in Light of the CrowdStrike Incident,” Ensono.com.