The key word in the phrase ‘preparing for the worst’ is ‘preparing.’ Here’s how to put your company ahead by thinking ahead.
In the late 1990s we prepared for Y2K. Then the tragedies of 9/11 forced us to redefine disaster and reassess corporate disaster recovery and continuity of operations plans. Then the large-scale blackouts of 2003 caught most businesses unprepared once again. So, what have we learned?
Every organization, large and small, needs a continuity of operations plan (COOP). It’s a seemingly obvious lesson, but it’s still worth stating. A surprising number of organizations, both large and small, have no plan in place, or have plans that are limited, incomplete, or untested. If your organization has no plan, start with a simple assessment: evaluate a scenario in which you lose access to your facility indefinitely. Does this put you out of business? If it does and you don’t have a continuity of operations plan in place, get started on one immediately.
"Possible" is a moving target: reassess your scenarios
With record power outages this year spread across two continents and broad geographic areas, our definition of what is possible has expanded once again. Organizations need to realistically factor the ever-changing definition of ‘possible’ into their COOP planning scenarios.
COOPs do not necessarily have to be grandiose or expensive. Larger businesses typically require more formal planning and preparation, but smaller organizations might only need to take regular backups of critical systems, store the data offsite, and give some forethought to where and how business could resume in the event of a partial or total loss. Maybe that includes figuring out where to rent a tent and secure some cell phones on short notice. On a larger scale, it might mean contracting with external organizations to manage disaster recovery stores of equipment. Any planning is usually better than no planning, and often humble efforts uncover major vulnerabilities that can then be properly addressed.
The basic COOP
An initial COOP effort should follow three basic steps: initiation and assessment, planning, and risk mitigation.
Initiation and assessment
The first part of the initiation and assessment step is writing a mission statement that clearly defines the mission and goals of the COOP effort. Goals typically include: ensuring the safety of employees and any other people potentially on site; protecting assets, data, equipment, records, etc.; resuming operations within some specified time period; minimizing losses; sponsoring mitigation efforts; providing a mechanism to monitor, alert, declare an emergency, and manage the event as well as the recovery; and documenting a clear chain of command that provides for continued leadership.
The next step is defining the requirements, scope, and assumptions, which include the scenarios addressed by the plan as well as those not addressed. You will then have to assess your vulnerabilities. That is, inventory your organization from a point-of-failure perspective, including everything from people and paper clips to critical business systems and facilities/physical plant. Finally, you should assess the business impact of an outage. The cost and probability of an outage are often used as justification for risk mitigation activities as well as funding for more robust planning efforts.
Planning
Several things should be defined in your planning process: monitoring mechanisms for all critical components and systems; alerting systems and procedures that detail what happens when a monitoring mechanism indicates a failure or outage; the emergency declaration process and the emergency management team; the event management procedures that specify how business is resumed after an event is declared; and the post-event recovery process, which details how normal operations are resumed after a disaster or service interruption.
Risk mitigation
Based on the vulnerabilities uncovered and the business impact of an associated outage, COOP efforts typically include spin-off projects that address clear risks to the organization. The value proposition for these projects is fairly straightforward, with the cost of the mitigation project weighed against the probability and cost of an outage.
These calculations should include all costs. IT departments often neglect to include the cost of lost productivity when, with a few simple and realistic assumptions, that cost can at least be estimated to an order of magnitude.
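As a rough illustration of that weighing, the short Python sketch below multiplies the probability of an outage by its total cost, lost productivity included, and compares the result with the yearly cost of a mitigation project. Every name and figure in it (the blackout scenario, the 5 percent annual probability, the staff costs, the 80 percent risk reduction) is a made-up assumption for the example, not data from any real organization, and the simple expected-loss model is only one of several ways to frame the comparison.

```python
from dataclasses import dataclass


@dataclass
class OutageScenario:
    """One outage scenario. All figures are illustrative assumptions."""
    name: str
    annual_probability: float  # chance the outage occurs in a given year (0 to 1)
    outage_hours: float        # expected duration of the outage
    direct_cost: float         # repairs, replacement equipment, lost sales
    staff_affected: int        # people unable to work during the outage
    hourly_staff_cost: float   # loaded cost per person per hour

    def lost_productivity(self) -> float:
        # The cost that is often left out of the calculation.
        return self.staff_affected * self.hourly_staff_cost * self.outage_hours

    def expected_annual_loss(self) -> float:
        # Probability of the outage times its total cost.
        total_cost = self.direct_cost + self.lost_productivity()
        return self.annual_probability * total_cost


def mitigation_pays_off(scenario: OutageScenario,
                        annual_mitigation_cost: float,
                        risk_reduction: float) -> bool:
    """Return True if the avoided loss exceeds the yearly mitigation cost.

    risk_reduction is the assumed fraction (0 to 1) of the expected
    loss that the mitigation project removes.
    """
    avoided_loss = scenario.expected_annual_loss() * risk_reduction
    return avoided_loss > annual_mitigation_cost


# Hypothetical example: a regional blackout idling 50 people for a day.
blackout = OutageScenario(
    name="regional blackout",
    annual_probability=0.05,
    outage_hours=24,
    direct_cost=100_000,
    staff_affected=50,
    hourly_staff_cost=40,
)

print(f"Lost productivity: ${blackout.lost_productivity():,.0f}")
print(f"Expected annual loss: ${blackout.expected_annual_loss():,.0f}")
print("Backup power at $5,000/yr pays off:",
      mitigation_pays_off(blackout, 5_000, risk_reduction=0.8))
```

In this made-up case, lost productivity adds nearly half again to the direct cost of the outage, which is why leaving it out can make a worthwhile mitigation project look unjustified.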
Also consider that requirements for having a continuity of operations plan are becoming more commonplace in relevant insurance and financial transactions. You don’t want to scramble for a plan when your bank asks you for one as part of the process to secure a line of credit.
The plan is never done. When the initial planning and risk mitigation effort is concluded, the plan should fall into a continuous process improvement cycle in which scenarios are periodically reevaluated (probability, severity, and frequency) and plans are tested and improved. Formal training and change management go hand in hand with testing. If the plan is not managed as an ongoing concern, periodically reassessed, and updated, it will become increasingly invalid over time.
Successful continuity of operations planning requires more than just managerial support for an individual effort; it must become part of the organizational culture and ultimately inseparable from daily operations and every aspect of the business.
We have gone from the early 1990s–when we spoke of disaster recovery and return to operations times measured in days–through the internet boom, Y2K and 9/11. We’ve arrived at the expectation of zero downtime and immediate recovery, a tall order under the best of circumstances. There are numerous organizations, technologies and an ever-growing body of professional expertise available to support your COOP effort. Make use of them.
Thomas Di Santi is with Delta Corporate Services, an information systems consulting company based in Parsippany, N.J.