In a great blog post, Andre Leibovici highlighted a default HP BIOS setting which could be hurting the performance of your VMs if your environment shows the following symptoms:
- low physical CPU utilisation
- higher than expected CPU %Ready times
Julian Wood has also blogged about this issue (Your HP blades may be underperforming) but neither goes into much detail about the fix. Having investigated, I thought I’d record it here for others’ convenience.
To check for these symptoms you could use the VI client, ESXTOP in batch mode combined with the batch processing scripts in the vMA to capture pCPU statistics from a group of servers, or PowerCLI, whichever suits your skill set.
We run HP C-class blades and after checking the VMware knowledgebase article KB1018206 and a sample of our BIOS settings we found that it applied to us too – not surprising as we don’t modify the BIOS defaults during provisioning.
Using a mixture of ESXTOP and vCenter’s performance charts I was able to confirm that the %CPU Ready was hovering around the 4% mark even when the physical host was using less than 15% pCPU. After changing the power setting the same VMs (under a similar load) dropped to under 1% CPU Ready (the change was made at 17:00 if you look at the graph).
Not necessarily a show stopper but definitely an improvement.
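One practical wrinkle when comparing figures like these: ESXTOP reports ready time as a percentage, whereas vCenter's performance charts report it as a summation in milliseconds per sample interval. The conversion described in VMware KB 2002181 (listed under Further Reading) divides the summation by the interval length. Below is a minimal sketch of that arithmetic. The function name and parameters are my own, for illustration only, and the 20-second default assumes the real-time chart; the Past Day chart uses 300-second samples.

```powershell
# Hypothetical helper (not part of PowerCLI): converts a vCenter CPU ready
# summation value (milliseconds) into a percentage of the sample interval.
function Convert-ReadySummationToPercent {
    param(
        [Parameter(Mandatory = $true)][double]$SummationMs,
        # Assumption: real-time charts sample every 20 seconds.
        [double]$IntervalSeconds = 20
    )
    # % ready = summation (ms) / (interval (s) * 1000 ms/s) * 100
    return ($SummationMs * 100) / ($IntervalSeconds * 1000)
}

# 800 ms of ready time in a 20 s real-time sample is 4% ready
Convert-ReadySummationToPercent -SummationMs 800
# 3000 ms in a 300 s Past Day sample is 1% ready
Convert-ReadySummationToPercent -SummationMs 3000 -IntervalSeconds 300
```

So a real-time chart value of around 800 ms corresponds to the 4% mark mentioned above, and anything under 200 ms is below 1%.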
For my infrastructure (with around 160 physical blades) changing them all was a time-consuming process (and could potentially be disruptive, depending on whether your ESX/i hosts are all clustered).
You can check the current power management setting in various ways:
- in the BIOS settings (slow and potentially disruptive)
- via the ILO (under Power Management, Power settings) or via the ILO CLI
- in the VI client. If the underlying BIOS is set to Dynamic Power Savings it’ll show as ‘Not Supported’, i.e. the hardware is controlling power management. Where to check depends on your version of ESX (or ESXi):
  - For a 4.0 host go to Configuration -> Processors and look at the Power Management settings.
  - For a 4.1 host go to Configuration -> Power Management and look at the Active Policy. You can also configure it using the Properties button.
- You can also use PowerCLI (ESX4 only) by querying the host’s Advanced setting ‘Power.cpupolicy’
Get-VMHost myhost | Get-VMHostAdvancedConfiguration -Name Power.CpuPolicy
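With a large number of hosts, the same query can be looped over everything vCenter knows about. The following is only a sketch under stated assumptions: it presumes an existing Connect-VIServer session and ESX 4 hosts that expose the setting, and the output property names (Host, Setting, Value) are chosen here purely for illustration. Note that newer PowerCLI releases deprecate Get-VMHostAdvancedConfiguration in favour of Get-AdvancedSetting.

```powershell
# Assumes you have already run Connect-VIServer against your vCenter.
Get-VMHost | Sort-Object Name | ForEach-Object {
    $vmhost = $_
    # Returns a hashtable of setting name -> value for this host
    $config = Get-VMHostAdvancedConfiguration -VMHost $vmhost -Name Power.CpuPolicy
    foreach ($entry in $config.GetEnumerator()) {
        # Emit one object per host so the results can be sorted or exported
        New-Object PSObject -Property @{
            Host    = $vmhost.Name
            Setting = $entry.Key
            Value   = $entry.Value
        }
    }
}
```

Piping the result to Export-Csv gives a simple record of which hosts still need attention.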
The power management setting must be changed in the BIOS, but there are a few ways to do it:
- via the console of each blade individually. This requires the server to be down.
- via the ILO for each blade individually. You can configure the setting while the server is running, but it won’t take effect until it’s restarted. This at least lets you set a group of hosts more quickly, knowing that the change will be applied at the next restart.
- by scripting the change via the ILO, as described in Damian Karlson’s blog post. If you have a large number of hosts this is probably the way forward.
If you choose to set ‘OS Control’ (which lets you use ESX to tweak the power management) you’ll require a reboot for the change to take effect. You can however set ‘HP Static High Performance Mode’ (which makes the server run at full power regardless of load) without a reboot, so this could be a short term solution if downtime is an issue.
Once power management is set to ‘OS Control’ in the BIOS (and if you’re using vSphere v4.1+) you can refine power management in the hypervisor:
- for v4.1 you’ll need to set it manually (in vCenter go to a host, Configuration -> Power Management and look at the Power Management settings)
- with vSphere 5+ you can use Host Profiles
It’s important to note that while this specific issue is only applicable to vSphere, the general power management settings should also be configured in your guest OS (for example Windows 2008’s magic performance button).
Further Reading
Using Dynamic Voltage and Frequency Scaling (DVFS) for Power Management (VMware KB 1037164)
Power management in ESXi4 – Facts and Figures
What are P-States in vSphere and how do I use them?
Windows 2008’s magic performance button
Converting between CPU summation and CPU % ready values (VMware KB 2002181)
I saw some issues in this area in a previous role that I had. I never got around to blogging about it though.
The odd thing in the situation that I encountered was that on some servers (HP DL380 G5 with AMD CPUs as I recall), even with the correct BIOS settings applied, the slow performance continued and was visible in ESXTOP.
We tried:
– Reboots
– Firmware updates
– ESX (4.0 U1 at the time) reinstalls
– Reseating the CPUs
Nothing seemed to correct it. We eventually swapped the entire chassis with an identical one not displaying the issue, plugged the original disks in and the problem went away. That provided the leverage to get replacement CPUs into another faulty server, which fixed the issue there too.
HP were as confused by this as we were but we got them to replace the CPUs in nearly a dozen other servers to clear the problem in the end.
Thanks for the extra info Michael.
We had a similar, very severe performance issue, but with an OpenStack cloud. We have several thousand servers, and it is not practical to go restarting blades that are each running a score of VMs, least of all in the busy period of the year. So I used a similar iLO script to set the BIOS on all of them to OS_CONTROL, then used cpufrequtils (cpufreq-set) to fix the CPUs at their maximum rated frequency, so I can be sure I won’t overheat the facility and cause power and cooling problems instead.
The plan is to use cpufrequtils to bring the frequency, and hence the power draw, down during off-peak times of the day and year to save on power and cooling. I will post on how it goes.
Thanks for posting, especially on an old post. Useful info.