While focused on protecting networks from security attacks and connectivity failures, administrators often inadvertently miss the subtle, ever-present danger of environmental threats. These threats include temperature, humidity, water leaks, intrusion, human error, vibration, and power outages.
Environmental Threats and Their Costs
The most common environmental threats to server rooms are temperature, humidity, water leaks, human error, intrusion, vibration, and power outage. Many of these threats, such as temperature and humidity, are related, which complicates environment monitoring and heightens the need for an automated, sophisticated system.
Temperature
Temperature is the main environmental threat to computer hardware. The generally accepted, ideal temperature is between 68 and 74 degrees Fahrenheit (20 to 24 degrees Celsius).
Excessive heat degrades network performance and causes downtime. As the temperature increases, a heat sink's fan works harder to cool the central processing unit (CPU). Continuous overworking causes the fan to fail, which leads to the machine overheating. A machine shuts down when it reaches an unsafe temperature in order to prevent permanent damage. An administrator must then be located, day or night, go to the machine, and reboot it after it has cooled. Consequently, services hosted by a down machine are unavailable until it is restarted, which can take minutes or hours. If the server hosts critical services (e.g., e-commerce, user validation, email) that are not distributed to backup servers, revenues can be lost, users cannot log in, and communications are interrupted. If the shutdown is not done properly, data can be lost.
Excessive heat and rapid temperature changes also damage equipment. Rapid temperature increases can increase humidity, while rapid drops can cause water in humid air to condense on equipment. Together, heat and moisture accelerate the breakdown of materials used in microchips, motherboards, and hard drives, a process called premature aging. In the worst cases, a machine does not shut down when the temperature exceeds safe levels, and circuits are damaged. Ultimately, heat-damaged equipment must be replaced, increasing the cost of network maintenance.
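The condensation risk described above can be estimated from air temperature and relative humidity. The sketch below uses the Magnus approximation for dew point, which is not part of the original text. The constants and the function names (dew_point_c, condensation_risk) are illustrative assumptions, and a real deployment would rely on calibrated sensors rather than this estimate.

```python
import math

# Magnus approximation constants (an assumption; other published
# constant pairs exist and give slightly different results).
MAGNUS_A = 17.62
MAGNUS_B = 243.12  # degrees Celsius


def dew_point_c(air_temp_c: float, rel_humidity_pct: float) -> float:
    """Estimate the dew point in Celsius from air temperature and relative humidity."""
    gamma = math.log(rel_humidity_pct / 100.0) + MAGNUS_A * air_temp_c / (MAGNUS_B + air_temp_c)
    return MAGNUS_B * gamma / (MAGNUS_A - gamma)


def condensation_risk(surface_temp_c: float, air_temp_c: float, rel_humidity_pct: float) -> bool:
    """Return True when a surface is at or below the estimated dew point of the room air."""
    return surface_temp_c <= dew_point_c(air_temp_c, rel_humidity_pct)
```

For example, condensation_risk(10.0, 22.0, 50.0) asks whether a 10 degree Celsius surface would collect moisture in a room at 22 degrees Celsius and 50% relative humidity.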
Controlling temperature is becoming more important and more difficult because of changes in equipment design and greater use of network services. New equipment runs hotter because it runs faster and does more work. Also, more circuits are placed closer and closer together, trapping heat in a smaller space. Smaller equipment also means that more equipment can be placed in the same space, usually packed tighter together. The increase in density of equipment causes a rise in the amount of heat dissipating in a rack cabinet. Increased network usage also increases heat, so as usage levels change during the day, so does the temperature and the need for cooling. For networks that operate near capacity 24 hours a day, every day of the year, there is little, if any, time for machines to cool down.
Humidity
When the temperature is between 68 and 74 degrees Fahrenheit (20 to 24 degrees Celsius), the relative humidity (i.e., the amount of water vapor in the air relative to the maximum the air can hold at that temperature) should be between 40% and 50%.
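As a minimal sketch, the recommended ranges quoted above (68 to 74 degrees Fahrenheit, 40% to 50% relative humidity) can be encoded as a simple check. The function names and the message strings are hypothetical, chosen only for illustration.

```python
# Recommended ranges taken from the text above.
TEMP_RANGE_F = (68.0, 74.0)
HUMIDITY_RANGE_PCT = (40.0, 50.0)


def fahrenheit_to_celsius(temp_f):
    """Convert degrees Fahrenheit to degrees Celsius."""
    return (temp_f - 32.0) * 5.0 / 9.0


def check_reading(temp_f, rel_humidity_pct):
    """Return a list of out-of-range conditions; an empty list means the reading is in range."""
    problems = []
    if temp_f < TEMP_RANGE_F[0]:
        problems.append("temperature below range")
    elif temp_f > TEMP_RANGE_F[1]:
        problems.append("temperature above range")
    if rel_humidity_pct < HUMIDITY_RANGE_PCT[0]:
        problems.append("humidity below range")
    elif rel_humidity_pct > HUMIDITY_RANGE_PCT[1]:
        problems.append("humidity above range")
    return problems
```

Checking both values together reflects the point made earlier that temperature and humidity are related and should not be monitored in isolation.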
A high humidity level can produce the following problems in the server room:
A persistent low humidity level can produce the following problems:
Water Leaks
Proper planning moves equipment away from water pipes that might burst, basements that might flood, or roofs that might leak. However, there are other water leaks that are more difficult to recognize and detect. Blocked ventilation systems can cause condensation if warm, moist air is not removed quickly. If vents are located above or behind machines, condensation can form small puddles that no one sees. Standalone air conditioners are especially vulnerable to water leaks if condensation is not properly removed. Even small amounts of water near air intakes raise humidity levels and fill servers with moisture.
In addition, water from small pipe leaks can travel for long distances behind walls and continue for a long time before anyone notices it. Server rooms with raised floors are particularly vulnerable. All of the cables and wires for an entire network are concealed beneath floor panels. While this approach keeps cords safe from being accidentally unplugged, it makes monitoring their physical status difficult. Cables may be soaking in water for a long period before anyone notices. This situation breaks down insulation, and the loss of insulation causes signal leakage and performance degradation.
Human Error
Administrators/personnel can unknowingly create environment problems in server rooms by:
Similarly, cleaning crews sometimes close doors that should be left open for ventilation, thus increasing the temperature and reducing airflow.
Intrusion
Intruders, such as disgruntled employees and industrial spies, often strike at the most critical yet vulnerable points: the physical devices that store and control access to data. The small and delicate nature of modern computing equipment makes it easy to damage or steal; hard drives are compact enough to carry out in a briefcase, backpack, coat pocket, or purse.
Less sinister, but just as potentially harmful, are animal intrusions. Rodents, insects, birds and even larger animals have found their way into highly sensitive areas to wreak havoc upon equipment. Tiny contaminants, such as fur, dust, and dander, can cause component failure. Mice and rats chew through cable.
Vibration
Too much movement loosens connections within the server housing, unseating boards and chips.
Vibration can also damage a hard disk, which rotates at extremely high speeds. Being bumped or moved can cause the platter, where the information is stored, and the head, which reads the information, to come into physical contact, causing scratches that permanently harm the disk drive.
Generally, vibration comes from mundane sources: being too close to halls or walkways, or being moved or bumped. Good space planning can keep shocks to a minimum, but IT staff should still monitor the situation. Some vibrations, such as those generated by a failing air conditioner, actually serve as warnings. Most machines vibrate more as performance worsens, so tracking fluctuations in equipment vibration becomes an important means of predicting failures.
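One way to track fluctuations in vibration is to compare each new reading against a rolling baseline. The sketch below is an assumption about how such tracking might be done, not a description of any particular product. The class name, the window size, and the 1.5x ratio are hypothetical values that would need tuning for real equipment.

```python
from collections import deque
from statistics import mean


class VibrationTrend:
    """Flag readings that rise well above the recent average amplitude."""

    def __init__(self, window=5, ratio=1.5):
        self.history = deque(maxlen=window)  # most recent amplitudes
        self.ratio = ratio                   # multiple of the baseline that counts as a warning

    def update(self, amplitude):
        """Record a reading; return True if it exceeds ratio x the rolling mean."""
        baseline_ready = len(self.history) == self.history.maxlen
        rising = baseline_ready and amplitude > self.ratio * mean(self.history)
        self.history.append(amplitude)
        return rising
```

No warning is raised until the window is full, so the first few readings only establish the baseline.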
Power Outage
Power outages, brownouts, and voltage dips and spikes are serious problems for computing equipment. A simple hiccup in power levels, let alone a lightning strike, can cause servers to fail. In best-case scenarios, this costs precious time while machines are rebooted. In worst-case scenarios, circuitry is irreparably damaged and must be replaced.
Weaknesses of Current Monitoring Practices
In a typical business, three groups monitor the environment: network administrators, security personnel, and facility maintenance employees. Network administrators often rely on a single thermometer and subjective notions about "comfort" to control the temperature of server rooms and data centers. In addition, security personnel and facility maintenance departments monitor areas outside of the server rooms. These three groups usually attempt to coordinate their efforts, but they maintain separate systems and practices. Ultimately, network administrators are primarily responsible for protecting hardware.
This approach has the following weaknesses:
An effective server environment monitoring system addresses the weaknesses in the current practice of having personnel monitor the environment.
ENVIROMUX Server Environment Monitoring Solutions
NTI offers two server environment monitoring solutions, the ENVIROMUX-SEMS-16 and ENVIROMUX-MINI. Both units monitor critical environmental conditions (such as temperature, humidity, and water leakage) that could destroy network components in a server room. When a sensor exceeds a configurable threshold, the system will notify the selected administrators/staff via email, SNMP traps, Web-page alerts and a visual indicator (LED). The systems connect to your IP network, so they can be configured and monitored from any workstation with a Web browser.
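The general pattern of comparing a sensor reading against a configurable threshold and notifying several channels can be sketched as follows. This is a generic illustration, not the ENVIROMUX firmware or its API. The Threshold class, the function names, and the notifier callables (stand-ins for email, SNMP trap, or Web-page alerts) are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Threshold:
    sensor: str                    # label for the sensor being watched
    low: Optional[float] = None    # alert when a reading falls below this value
    high: Optional[float] = None   # alert when a reading rises above this value


def evaluate(threshold: Threshold, value: float) -> Optional[str]:
    """Return an alert message if the value crosses a limit, otherwise None."""
    if threshold.high is not None and value > threshold.high:
        return f"{threshold.sensor}: {value} is above {threshold.high}"
    if threshold.low is not None and value < threshold.low:
        return f"{threshold.sensor}: {value} is below {threshold.low}"
    return None


def process(threshold: Threshold, value: float,
            notifiers: List[Callable[[str], None]]) -> bool:
    """Send the alert, if any, to every notifier; return True when an alert was sent."""
    message = evaluate(threshold, value)
    if message is None:
        return False
    for notify in notifiers:
        notify(message)
    return True
```

For instance, process(Threshold("rack temperature (F)", low=68.0, high=74.0), 80.0, [print]) would print a single alert line, with print standing in for a real notification channel.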
Both systems provide the following benefits:
ENVIROMUX-SEMS-16
The ENVIROMUX-SEMS-16 supports the following sensors/devices:
The ENVIROMUX-SEMS-16 supports the following alerts:
ENVIROMUX-MINI
The ENVIROMUX-MINI accommodates the following sensors/devices:
The ENVIROMUX-MINI supports the following alerts: