Power: How Much Reliability is Enough?
A close look at the costs of downtime can identify the elements of the power infrastructure an organization needs to protect critical operations.
By James Warren and James Kim
The importance of reliable power is often underestimated. When a loss of power triggers a series of events that lead to financial losses or damage to the brand, many companies wonder, “How did this ever happen?” Unfortunately, it is only upon failure that most companies understand the importance of high power reliability and begin to engineer systems to deal with their once hidden vulnerabilities.
To analyze the need for power reliability, it is important to establish expectations. Seven questions can help.
- How much does a facility outage cost per minute, per hour, per day, per month? At what point will a company not recover?
The answers to these questions often provide justification for capital expenditures. By itself, the magnitude of the loss from a power outage is often not the determining factor; most executives take notice when the loss is measured against annual profits. The Sarbanes-Oxley Act has also introduced a twist. In essence, the law says a corporation cannot knowingly ignore issues that may have an impact on stock value. Some companies address critical facility deficiencies solely to comply with Sarbanes-Oxley requirements.
- Could the corporate brand be damaged if delivery of products or services is perceived as being unreliable or the company is seen as unresponsive?
Damage to the corporate brand can be a long-term effect of failure. For example, reports of large-scale failures may lead customers or investors to look elsewhere. In certain cases, a failure to provide service or access to information could have ramifications for years. It is hard to put a price on a tarnished brand.
- What part of the day, week, month, or year is the facility required to operate? What parts of the facility are critical to the organization?
If the facility is critical only during certain periods of the day or week, as with trading activities, maintenance can be scheduled outside those periods without affecting the system. Expensive electrical bypasses, known as wraps, may then be unnecessary. Wraps often take the form of electrical circuits around uninterruptible power supply (UPS) systems or critical distribution equipment. Identifying the critical applications, processes or infrastructure helps reduce costs by providing redundancy only where it is needed.
- Can the facility be brought down for maintenance?
Understanding which components of a facility are critical is fundamental to establishing a cost-effective solution. It does little good to protect one aspect of a facility while weaker links remain elsewhere. This understanding also helps establish the maintenance program, which in turn affects the electrical topology. Maintaining equipment concurrently, while the facility stays in operation, is much more expensive and involves more risk than using regular maintenance windows. As with anything man-made, failure is inevitable. Without maintenance, failure arrives sooner rather than later. Maintenance is therefore crucial for long-term reliability.
- How serious is an unplanned outage within the system? Does the system need to be fault-tolerant?
Fault tolerance is the ability to survive a single failure without affecting the critical load. Once fault tolerance is identified as a requirement, substantial costs are involved in providing the electrical infrastructure necessary to mitigate failures. A system-plus-system architecture (two active paths of power) is typically employed, which is a significant investment in the security of both power and information.
- Once a failure has occurred, how quickly does the problem need to be remedied?
The answer to this question varies by facility. In some cases, the organization may only need enough time for a controlled shutdown, after which the system can be repaired. These facilities may only require UPS/battery technology. In other cases, however, repairs may have to be completed while the facility continues full operation. This often requires access to generators for longer utility outages.
- What constitutes a failure?
The IEEE Gold Book, which supplies the base data for reliability calculations, does not provide the industry with a common definition; a committee is currently addressing this issue. For now, most would agree that a failure occurs when the facility can no longer provide the service that it is expected to deliver. Common failures include broken communication or fiber links, loss of one or more critical panelboards, or loss of power to critical cooling. Once failure modes are identified, a design can be crafted to lessen the likelihood that each failure will occur.
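The first question above, how much an outage costs and how that loss compares with annual profit, comes down to simple arithmetic. The following is a minimal Python sketch of that comparison. The function names and all dollar figures are hypothetical and chosen only for illustration; they do not come from any particular facility.

```python
def outage_loss(cost_per_minute: float, outage_minutes: float) -> float:
    """Direct loss from one outage, in the same currency as cost_per_minute."""
    return cost_per_minute * outage_minutes


def loss_share_of_profit(loss: float, annual_profit: float) -> float:
    """Outage loss expressed as a fraction of annual profit."""
    if annual_profit <= 0:
        raise ValueError("annual_profit must be positive")
    return loss / annual_profit


# Hypothetical figures, for illustration only:
# $5,000 per minute of downtime, a four-hour outage, $20 million annual profit.
loss = outage_loss(cost_per_minute=5_000.0, outage_minutes=240.0)
share = loss_share_of_profit(loss, annual_profit=20_000_000.0)
print(f"Loss: ${loss:,.0f} ({share:.1%} of annual profit)")
```

Expressing the loss as a share of profit, rather than as a raw dollar amount, is what makes the figure comparable across facilities of different sizes.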
Options for power reliability
Once the expectations and the need for reliable power are understood, plans for specific electrical infrastructure upgrades or installations can be considered. There may be several ways to meet a facility's power reliability requirements; however, each will most likely involve equipment designed to perform a specific function. Some of the available design options and applications are described below.
- Engine generator. An engine generator can be used as an alternate electric power source in case of a local utility outage. Unlike a utility source, an on-site backup generator can be a fairly “soft” source of power. Considerations should include not only how much of a load a generator can serve in kilowatts, but also what types of loads.
Choosing an appropriate engine generator is not as simple as looking at a facility’s load in kilowatts and supplying an engine with a nameplate rating to match that load. In general, when a power source is sized to match the load closely, any sudden change in the operating load characteristics will cause the power source to react. An example would be a large motor cycling on and off, causing the generator voltage to dip or spike momentarily. Such cycling may cause a sensitive load downstream to treat the generator as an undesirable source of power, and in some cases the load may be permanently damaged.
Incompatibilities can also arise between a UPS system and a generator. Because a UPS system draws power from its source in a non-linear or non-continuous fashion, the engine generator must be specified and designed to accommodate the specific UPS load characteristics.
- Uninterruptible power supply system. If the majority of the facility equipment necessary for continued business operation is computer-based and momentary outages are not acceptable, protection by an engine generator will be inadequate. Typical computer-based equipment cannot survive a power disruption lasting more than 16 to 20 milliseconds. A typical generator will take anywhere from 10 to 30 seconds to start up and re-energize the loads.
If only a controlled shutdown of computers is required, UPS-only protection is sufficient. However, if continued business operation is required, an engine generator in combination with a UPS system is needed. The UPS system in this case functions as a “bridging” device between the utility and the generator, so that an outage or any utility power variation that computers cannot tolerate is transparent to the loads.
- Automatic static transfer switch. This type of equipment is typically used in conjunction with a pair of upstream power conditioning units, such as UPS systems, providing power source redundancy to facility loads. Two sources of UPS power would be available to the load. If one UPS system fails or needs to be taken out of service for maintenance, the facility loads would continue to be protected by the second UPS system.
- Dual power path topology. Usually reserved for a Tier 4 facility where the system needs to be concurrently maintainable and fault tolerant, this type of design is the most expensive. It requires two equal systems to be in place, with one system completely independent of the other. Any equipment failure or maintenance outage event can be completely isolated, so that it is transparent to the loads. A complete system failure in one system would have no impact on the other or the loads.
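The choice between generator-only, UPS-only and combined protection described above can be sketched as a small decision rule. This is an illustrative Python sketch, not a design procedure: the function names are hypothetical, and the rule only compares how long the load can ride through a disruption with how long the generator takes to start.

```python
def generator_alone_is_enough(ride_through_ms: float,
                              generator_start_s: float) -> bool:
    """True if the load can ride through the gap while the generator starts."""
    return generator_start_s * 1000.0 <= ride_through_ms


def recommend_protection(ride_through_ms: float,
                         generator_start_s: float,
                         must_keep_running: bool) -> str:
    """Simplified rule of thumb based on the options described in the text."""
    if generator_alone_is_enough(ride_through_ms, generator_start_s):
        return "generator only"
    if must_keep_running:
        # The UPS bridges the gap between utility loss and generator power.
        return "UPS plus generator"
    # Only a controlled shutdown is needed, so battery time is sufficient.
    return "UPS only"


# Figures from the text: computer equipment rides through about 20 ms,
# and a generator needs at least about 10 s to start.
print(recommend_protection(20.0, 10.0, must_keep_running=True))
```

With the article's figures, the generator start time exceeds the ride-through time by several hundred times, which is why computer-based loads end up with a UPS in either case.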
Risks associated with each design option
While there are benefits to each design option, there are also risks. A generator, UPS system or automatic static transfer switch is designed to improve power reliability, but these systems require maintenance and are prone to fail without it. The facility executive should have a full understanding of the failure consequences of each system before a design strategy is chosen. For example, if a facility with a single UPS system or a single engine generator experiences a failure in that UPS system or generator, facility loads would be left unprotected and exposed to possible power disturbances. The time required to fix the problem and restore the failed system may leave the loads vulnerable for days, weeks or even months.
In a design involving an automatic static transfer switch, a second UPS system can be added to provide redundancy in case of a UPS failure. However, the automatic static transfer switch does not provide complete protection against all failure scenarios. If the switch itself fails, the critical loads will be lost even if both UPS systems are available.
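The single-point-of-failure risk of the automatic static transfer switch can be illustrated with basic series and parallel availability arithmetic. The sketch below is a simplification that assumes component failures are independent, and the availability figures are hypothetical placeholders, not data from the IEEE Gold Book or any manufacturer.

```python
def parallel(*availabilities: float) -> float:
    """Availability of redundant components where any one can carry the load."""
    unavailable = 1.0
    for a in availabilities:
        unavailable *= (1.0 - a)
    return 1.0 - unavailable


def series(*availabilities: float) -> float:
    """Availability of a chain where every component must work."""
    result = 1.0
    for a in availabilities:
        result *= a
    return result


# Hypothetical availabilities, for illustration only.
ups = 0.999    # a single UPS system
sts = 0.9999   # the automatic static transfer switch

redundant_ups = parallel(ups, ups)   # two UPS systems, either one suffices
system = series(redundant_ups, sts)  # both UPS sources feed through the switch

print(f"Single UPS:        {ups:.6f}")
print(f"Two UPS, parallel: {redundant_ups:.6f}")
print(f"Two UPS + switch:  {system:.6f}")
```

Under these assumptions the combined system is better than a single UPS, but it can never be more available than the switch that every path passes through.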
Losing Power Can Have Serious Consequences
Increased reliance on computers to perform critical functions means many companies face significant consequences if power is lost. Consider these examples:
Hospitals:
- Centralized data storage facilities lose critical power, preventing doctors from retrieving patient information.
- A search for donor organ parameters is prevented by a power loss.
Broadcasting:
- A sporting event loses power in its booth and can’t transmit, resulting in lost advertising revenues.
Government:
- The state agency responsible for providing state police with license plate and driver information loses power, so police officers cannot conduct background searches before approaching stopped cars.
Retail:
- Data on merchandise sold during the day is not transmitted back to the supply house, leaving shelves empty during peak buying periods.
- A manufacturer cannot retrieve data on how its products are selling in certain markets, so it cannot adapt its advertising campaign accordingly.
Insurance:
- A single facility is responsible for the entire online business unit and suffers a critical power loss because of a natural disaster.
- All the policy records are lost, negatively affecting policyholders’ ability to file claims.
Manufacturing:
- A small data center in the United States is responsible for providing data to five intelligent manufacturing factories worldwide. A power failure interrupts manufacturing, and the company incurs penalties from its customers. Deadlines for processing orders are missed, and the company incurs further penalties.
Defining Critical Facilities
The Uptime Institute has developed a tier system for defining facility types. The tier system doesn’t attempt to define rigidly every aspect of power topology, but rather addresses the larger picture — maintenance and fault tolerance.
It is easy to understand why scrutiny is placed on systems that need to be concurrently maintainable and fault-tolerant. Those two factors drive the cost per kW. When maintenance is needed, reliability levels drop. To bring reliability during maintenance back to acceptable levels, additional infrastructure is included to bolster the system.
| Tier | Critical period: <8 hours a day | Critical period: 7x24 | Critical period: 7x24xForever | Maintenance: none | Maintenance: requires shutdown | Maintenance: concurrent | Fault tolerant | Electrical cost per kilowatt of critical power |
|---|---|---|---|---|---|---|---|---|
| Tier 1 | | | | X | X | | | $50-$250 |
| Tier 2 | X | | | | X | | | $250-$1,000 |
| Tier 3 | X | X | X | | | X | | $2,000-$3,000 |
| Tier 4 | X | X | X | | | X | X | $2,500-$5,300 |
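The maintenance and fault-tolerance columns of the tier table can be expressed as a small lookup. The following Python sketch encodes the table's values and returns the lowest tier that satisfies a facility's requirements. The `Tier` class and `lowest_tier` function are hypothetical names introduced here for illustration; they are not part of the Uptime Institute's definitions.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Tier:
    name: str
    concurrent_maintenance: bool      # maintainable without a shutdown
    fault_tolerant: bool
    cost_per_kw: Tuple[int, int]      # (low, high) in dollars, from the table


# Values transcribed from the tier table, ordered from least to most costly.
TIERS = [
    Tier("Tier 1", False, False, (50, 250)),
    Tier("Tier 2", False, False, (250, 1000)),
    Tier("Tier 3", True, False, (2000, 3000)),
    Tier("Tier 4", True, True, (2500, 5300)),
]


def lowest_tier(need_concurrent: bool, need_fault_tolerant: bool) -> Optional[Tier]:
    """Return the first (cheapest) tier that meets both requirements."""
    for tier in TIERS:
        if need_concurrent and not tier.concurrent_maintenance:
            continue
        if need_fault_tolerant and not tier.fault_tolerant:
            continue
        return tier
    return None


choice = lowest_tier(need_concurrent=True, need_fault_tolerant=False)
print(choice.name, choice.cost_per_kw)
```

The jump in cost range between the tiers that require a shutdown and those that allow concurrent maintenance reflects the point made above: those two requirements drive the cost per kilowatt.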
Budgeting for Reliability
| Design option | Amount of time needed to run after failure | Cost per kW |
|---|---|---|
| Uninterruptible Power Supply (UPS) | 10-15 minutes typical | $100 - $250 |
| Generator | 8-24 hours typical | $200 - $350 |
| Automatic Static Transfer Switch | Continuous | $100 - $150 |
| Dual Power Path Topology | Continuous | $2,500 - $5,000 |
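The per-kilowatt figures in the budgeting table can be turned into a rough budget range by multiplying by the critical load. The sketch below is illustrative only: it assumes that the costs of combined options simply add and that every option is sized for the same critical load, and the 500 kW load is a hypothetical example.

```python
from typing import Iterable, Tuple

# (low, high) cost per kW in dollars, transcribed from the budgeting table.
COST_PER_KW = {
    "UPS": (100, 250),
    "Generator": (200, 350),
    "Automatic static transfer switch": (100, 150),
    "Dual power path topology": (2500, 5000),
}


def budget_range(options: Iterable[str], critical_load_kw: float) -> Tuple[float, float]:
    """Rough low and high budget for the chosen options at a given load."""
    chosen = list(options)
    low = sum(COST_PER_KW[name][0] for name in chosen) * critical_load_kw
    high = sum(COST_PER_KW[name][1] for name in chosen) * critical_load_kw
    return low, high


# Hypothetical 500 kW critical load protected by a UPS and a generator.
low, high = budget_range(["UPS", "Generator"], critical_load_kw=500)
print(f"Estimated budget: ${low:,.0f} to ${high:,.0f}")
```

A range like this is only a starting point for discussion; actual costs depend on the facility and the design.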
James Warren, PE, is the vice president of engineering, Americas, for EYP Mission Critical Facilities. His experience includes design of generator plants, redundant UPS systems, grounding systems, DC power plants, and emergency life safety systems for both new and renovated mission-critical facilities.
James Kim, PE, is a senior electrical engineer with EYP Mission Critical Facilities, and has more than seven years of mission critical engineering experience. His experience includes UPS systems, diesel generators, medium voltage switchgear and power distribution.