In the previous posts (part 1 and part 2) we discussed the background and driving forces behind Green Capacity Planning. Today we will look at its monitoring aspects. Monitoring is essential to any capacity planning process, and Green Capacity Planning is no exception. For ease of reading, this topic is divided into two parts: monitoring basics and the major monitoring tools available in the market.
Historically, IT equipment was rarely monitored for its energy consumption. System administrators and other IT staff are judged by the availability and performance of the IT infrastructure: they are responsible for meeting performance SLAs and availability targets of around 99.99% in complex 24x7 environments. As a result, monitoring has focused mainly on these aspects of the infrastructure, and most of the metrics captured by native utilities or third-party tools fall into the availability or performance categories.
Facilities management teams, on the other hand, are responsible for energy, cooling, lighting and other aspects of a data center. It is their responsibility to ensure that sufficient supporting infrastructure is always available. As we have seen, power consumption is roughly linearly related to a device's utilization, so there needs to be synergy between the IT management and facilities disciplines. In practice, that is rarely the case. IT capacity planners analyze the impact of business demand on the underlying infrastructure and forecast capacity requirements, but the impact of that forecast on the supporting infrastructure is out of scope for them. Facilities teams, in turn, usually don't consider the impact of business demand on the supporting infrastructure. In short, the two disciplines work in isolation and rarely feed information to each other. The result is situations where you run out of power and a data center migration becomes necessary. Beyond the financial implications, this disrupts business services and creates unnecessary overhead that could have been avoided.
To overcome this situation, a holistic approach is required: take inputs from both facilities management and IT management and build a complete picture of data center infrastructure usage and the impact of business demand on it. As Peter Drucker put it, "If you can't measure it, you can't manage it". This applies directly to the capacity management process. To manage a data center effectively, IT managers should be able to see what is happening on both the IT and facilities sides.
This matters even more for new-generation server hardware, which has improved significantly over the years: a server's power consumption is now dynamic and depends on the workload it carries. That is good for power efficiency, but it makes anticipating a data center's energy requirements challenging. Moreover, with the ever-increasing cost of energy, the operating cost of these components is significant compared to the total operating cost of a data center. According to a Gartner report published in March 2010, energy savings from well-managed data centers can reduce operating expenses by as much as 20%.
There are broadly two ways to measure the power consumption:
The Conventional Way
Conventionally, IT managers base their energy planning on fundamentally flawed power calculations: vendor faceplate power specifications, or a de-rating of those specifications. Both lead to inaccurate energy requirements.
Historically, server power benchmarks were not available, so the only option for initial data center power planning was to rely on the power data provided by system vendors in the form of faceplate values. But using the faceplate value is flawed in the first place: it indicates the maximum power requirement of each component irrespective of its configuration or utilization, whereas a system's actual power consumption is roughly linearly correlated with its utilization. Because of this, a huge gap exists between the data center's anticipated power requirement and the actual power required by its equipment. Another common option is fixed de-rating, where an arbitrary percentage or number is subtracted from the nameplate value on the assumption that the faceplate rating is higher than actual use. For example, a server rated at 1,000 watts might be de-rated by a fixed 20 percent, meaning you assume it will consume 800 watts. In reality its power consumption depends on its utilization, and the estimated value is most often grossly inflated. As you might expect, finding the correct de-rating percentage is nearly impossible without a measurement tool: two servers of the same manufacturer and model can draw different amounts of power simply because of their utilization.
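The three estimation methods can be sketched side by side. This is a minimal illustration: the faceplate and de-rating figures come from the examples in this post, while the linear utilization model and the idle-power figure are assumptions for demonstration (real idle draw varies by configuration).

```python
# Comparing power-estimation methods. FACEPLATE_W and MAX_DRAW_W are the
# IBM x3630 M3 figures discussed below; IDLE_W is an assumed value used
# only to illustrate a linear utilization-to-power model.
FACEPLATE_W = 675   # vendor faceplate value
MAX_DRAW_W = 259    # measured draw at 100% CPU (SPEC benchmark data)
IDLE_W = 120        # assumed idle draw, for illustration only

def derated(faceplate_w, pct=0.20):
    """Fixed de-rating: subtract an arbitrary percentage from faceplate."""
    return faceplate_w * (1 - pct)

def linear_power(utilization, idle_w=IDLE_W, max_w=MAX_DRAW_W):
    """Linear model: draw scales with CPU utilization between idle and max."""
    return idle_w + (max_w - idle_w) * utilization

print(derated(FACEPLATE_W))    # 540 W: still more than double the real max draw
print(linear_power(1.0))       # 259 W at full load
print(linear_power(0.3))       # roughly 161.7 W at a typical 30% utilization
```

Even the de-rated estimate stays far above the measured maximum, which is exactly the inflation the fixed-percentage approach cannot correct for.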
The figure below depicts the power usage of an IBM x3630 M3 system and its relation to server CPU utilization. The red line shows constant power usage at 675 watts, as per the server's faceplate value. The SPEC benchmark data reveal a different story: even at 100% CPU utilization, the maximum power draw of the server is 259 watts.
Now, if we blindly use 675 watts as the basis of our data center's energy requirements, the result is a huge amount of provisioned but unused energy. Beyond the financial ramifications, there is a real risk of retrofitting or building a new data center on the assumption that the existing one has run out of gas, when in fact it has plenty of unused power available. Over-provisioning of power not only increases operational expenditure, but also leads to unnecessarily high capital expenditure (capex).
An Intelligent Way
Given that server power consumption is dynamic, we need a sophisticated way to measure a server's actual power draw based on its configuration and utilization. This is where DCIM tools play an important role. DCIM, short for Data Center Infrastructure Management, provides performance and utilization data for IT assets and physical infrastructure throughout the data center. The data collected at the low-level infrastructure layer helps domain experts (e.g., capacity planners and facilities planners) conduct intelligent analysis. According to the Gartner report "DCIM: Going Beyond IT", published in March 2010, DCIM tools were expected to grow from 1% market penetration in 2010 to 60% by 2014. DCIM doesn't replace systems performance management or facilities management systems; rather, it takes facets of each and applies them to the data center infrastructure, enabling IT capacity planners and facilities planners to analyze the overall infrastructure holistically.
Based on the technology used to collect the data, DCIM tools can be categorized as hardware-based or software-based. In the hardware-based approach, power meters or sensors are installed with every device; they measure power usage and send the data to a centralized server. However, hardware-based solutions are intrusive, expensive and time-consuming to install in large, complex data centers. Software-based solutions, on the other hand, monitor devices over the network, typically via the Simple Network Management Protocol (SNMP).
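The software-based approach boils down to a collector that polls each device and aggregates the readings. The sketch below stubs out the SNMP call itself, since the OIDs for power draw are vendor-specific (exposed in each vendor's enterprise MIB); the helper names, scale factor and sample values are all assumptions for illustration.

```python
# Sketch of a software-based collector. Live SNMP GETs are replaced with
# stubbed integer replies; many MIBs report power as an integer in
# tenths of a watt, which the scale factor below assumes.

def watts_from_snmp(raw_value, scale=10):
    """Convert a raw SNMP integer reading to watts (assumed scale)."""
    return raw_value / scale

def rack_power(readings_w):
    """Aggregate per-device readings (in watts) into a rack total."""
    return sum(readings_w)

raw_replies = [2590, 1875, 3120]  # hypothetical agent responses
readings = [watts_from_snmp(v) for v in raw_replies]
print(readings)              # [259.0, 187.5, 312.0]
print(rack_power(readings))  # 758.5 W for the rack
```

A real collector would issue SNMP GET requests on a schedule and store the time series centrally, but the conversion-and-aggregation core looks much like this.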
DCIM vendors are emerging fast, and over the last two years the vendor market has become crowded. Existing vendors are also integrating their products to offer a common tool for data center management. By capturing power consumption data at the device level, data center managers gain a more detailed view of their data centers and can make informed decisions about equipment placement, cooling efficiency, power consumption, upgrades and capacity planning. Predictive modeling is also an important component of these tools, providing a cost-effective and accurate approach for many data center designs.
That’s it for now. In the next post we will dive further into monitoring and discuss the major market players, along with their pros and cons.