Recently my colleague Christian Bertrand covered the role and importance of planning in ensuring resilience and uptime in his blog post “Why You Need a Power Failure Strategy.” In this post I want to build on that subject to talk about the maximization of uptime for mission critical systems in the event of an extended power outage.
Throughout online banking, railways, aircraft operating and control, electric power generation and transmission, and computer and telecoms networks we find different types of mission critical, business critical, and safety critical systems.
Although these terms are often used interchangeably, they do not mean quite the same thing. For example, if a mission critical system fails, some goal-directed activity may fail with it, whereas the failure of a safety critical system may result in environmental damage, injury, or even fatalities.
For the purpose of this blog, I’m going to use the generalized term mission critical to cover all three concepts: mission, business and safety critical. In this context, I want to say straightaway that the loss of a mission critical system will almost certainly result in severe financial or reputational damage, and at worst it may carry a risk of death and destruction. So we are talking about a very serious subject.
There are three essential steps which must be taken to ensure the maximization of uptime (and minimization of business risk), and the first of these is to understand which applications are truly mission critical.
Step 1 – ASSESSMENT
While initially this might seem completely obvious, in practice it does require a careful analysis of systems to determine which really need to be kept running during a prolonged loss of utility power. Believe me when I say that I have seen an air terminal completely shut down because an emergency exit did not fail safe during a power cut, so it pays to be granular.
Take the example of a municipal hospital: at minimum you would need to protect emergency lighting, power to life support systems and the ICU, and perhaps even CCTV and security for the entire duration of a blackout. On the other hand, the operating theater (assuming it was not in use when the outage started) could be shut down and surgery postponed until conditions permitted.
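To make the assessment concrete, here is a minimal Python sketch of the kind of inventory it produces. The `Load` class, the function names, and every wattage figure are hypothetical placeholders for illustration only; real values would come from a site survey.

```python
from dataclasses import dataclass


@dataclass
class Load:
    name: str
    watts: float                   # placeholder figure, not measured data
    must_run_through_outage: bool  # outcome of the assessment


def protected_load_watts(loads):
    """Total power that must stay up for the whole outage."""
    return sum(l.watts for l in loads if l.must_run_through_outage)


def sheddable(loads):
    """Names of loads that can be shut down in a controlled way."""
    return [l.name for l in loads if not l.must_run_through_outage]


# Illustrative hospital inventory; all numbers are assumptions.
hospital_loads = [
    Load("emergency lighting", 5_000, True),
    Load("life support and ICU", 40_000, True),
    Load("CCTV and security", 3_000, True),
    Load("operating theater (not in use)", 25_000, False),
]

print(protected_load_watts(hospital_loads))
print(sheddable(hospital_loads))
```

Keeping the protected list separate from the sheddable one is the point of being granular: only the first total has to be carried for the full duration of the blackout.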
Step 2 – REGULATORY COMPLIANCE
Many sectors have regulated requirements for mission critical systems. In the maritime industries, for example, regulations such as UR-E10 and IEC 60945 are there to ensure safe vessels and clean seas. The power requirement is especially acute at sea, as emergencies inevitably put life at risk. Therefore, there are stringent rules concerning secure power for shipboard lighting, communications equipment, and propulsion, navigation and transmission systems. (You can read more about this subject in the blog post “The Strength of Global Standards and Classification Societies for Reliable Power in Marine Applications.”)
Step 3 – RUNTIME REQUIREMENT
The third step is to understand how much runtime is actually required, and determining this will again require some careful calculation. Unlike a computer system, which may need only minutes of runtime to shut down properly during an outage, delicate equipment and integrated processes may need substantially more time to shut down gracefully. For example, equipment at a semiconductor manufacturing plant does not tolerate any break in the production process, and a power outage can result in both costly equipment damage and potentially a total loss of work in process.
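As a rough illustration of that calculation, the Python sketch below estimates battery runtime from stored energy and load, then checks it against the time a process needs to shut down. It is a first-order approximation only: the function names, the default usable fraction and inverter efficiency, and the safety margin are all assumptions, and it ignores battery aging, temperature, and discharge-rate effects that a real sizing study would include.

```python
def estimated_runtime_minutes(battery_wh, load_w,
                              usable_fraction=0.8,
                              inverter_efficiency=0.9):
    """First-order runtime estimate: usable energy divided by load.

    usable_fraction and inverter_efficiency are assumed defaults,
    not vendor figures.
    """
    if load_w <= 0:
        raise ValueError("load_w must be positive")
    usable_wh = battery_wh * usable_fraction * inverter_efficiency
    return usable_wh / load_w * 60


def covers_shutdown(battery_wh, load_w, shutdown_minutes, margin=1.25):
    """True if the estimate exceeds the shutdown time plus a margin."""
    runtime = estimated_runtime_minutes(battery_wh, load_w)
    return runtime >= shutdown_minutes * margin


# Hypothetical example: 10 kWh of battery feeding a 4 kW load.
runtime = estimated_runtime_minutes(10_000, 4_000)
print(round(runtime), "minutes")
print(covers_shutdown(10_000, 4_000, shutdown_minutes=60))
```

The margin is there because the required shutdown time is itself an estimate; a process that needs an hour on paper should not be paired with exactly an hour of runtime.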
Maximum Uptime is a philosophy which begins in planning and remains a continuous process through every step of design, construction, commissioning, operations, failure analysis and recommissioning. More help and guidance is available in our free White Paper – Maximizing Uptime in Mission Critical Facilities.