Knowledge
- Identify resxtop/esxtop metrics related to memory and CPU
- Identify vCenter Server Performance Chart metrics related to memory and CPU
Skills and Abilities
- Troubleshoot ESX/ESXi Host and Virtual Machine CPU performance issues using appropriate metrics
- Troubleshoot ESX/ESXi Host and Virtual Machine memory performance issues using appropriate metrics
- Use Hot-Add functionality to resolve identified Virtual Machine CPU and memory performance issues
Tools & learning resources
- Product Documentation
- vSphere Resource Management Guide
- vSphere Command-Line Interface Installation and Scripting Guide
- vSphere Client
- vSphere CLI
- resxtop/esxtop
This is another objective that’s hard to quantify – experience will be the main requirement! There are some great general-purpose resources out there:
- Performance Troubleshooting in Virtual Infrastructure (TA3324, VMworld ’09)
- Trainsignal’s Troubleshooting for vSphere course
- Eric Sloof’s Advanced Troubleshooting presentation at the Dutch VMUG
- Understanding Host and Guest Memory Usage… (TA2627, VMworld ’09)
- Useful command line cheat sheet
- VMware whitepaper on Troubleshooting Performance issues
Note that resxtop (built into the vMA) does not offer the ‘replay’ mode available in ESX classic. Source: VMworld session MA6580, vMA Tips and Tricks.
Identify esxtop and vCenter metrics related to memory and CPU
See section 3.1 in the Performance chapter for a list of metrics to check.
Troubleshoot ESX/ESXi host and VM memory performance issues using appropriate metrics
Read the ESXTOP bible, which covers the metrics to look for and typical values for various problems.
Remedial actions
- Verify that VMware Tools is installed in every VM (otherwise the balloon driver won’t be active and the host will resort to swapping)
- Verify that the balloon driver is active (look for MCTL? in esxtop)
- Check for resource limits or insufficient reservations (both on the VM and any resource pools)
- If the host is in a cluster, enable DRS if not already enabled
- Check for VMs with a high reservation (compared to active memory). This may be a sign that the VM is oversized and memory is being wasted (be careful with Java and Oracle)
- Add more physical memory to the host
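The checks in the list above can be sketched as a small triage function. This is only an illustration: the `VmMemoryStats` fields, the `triage_memory` helper and the `oversize_ratio` default of 2.0 are all hypothetical names and values chosen for the example, not part of esxtop or any vSphere API. You would populate the fields by hand from esxtop or the vCenter performance charts.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class VmMemoryStats:
    """Hypothetical per-VM snapshot; field names are illustrative only."""
    name: str
    configured_mb: int
    active_mb: int
    reservation_mb: int
    limit_mb: Optional[int]       # None means no limit is set
    balloon_driver_active: bool   # what the MCTL? column reports in esxtop
    swapped_mb: int


def triage_memory(vm: VmMemoryStats, oversize_ratio: float = 2.0) -> List[str]:
    """Return the remedial checks from the list that apply to this VM.

    oversize_ratio is an assumed rule of thumb, not a VMware recommendation.
    """
    findings = []
    if not vm.balloon_driver_active:
        findings.append("balloon driver inactive: check VMware Tools")
    if vm.limit_mb is not None and vm.limit_mb < vm.configured_mb:
        findings.append("memory limit is below the configured size")
    if vm.reservation_mb > 0 and vm.reservation_mb > oversize_ratio * vm.active_mb:
        findings.append("reservation far above active memory: possibly oversized")
    if vm.swapped_mb > 0:
        findings.append("host-level swapping observed")
    return findings


# Invented example values, purely for illustration.
example = VmMemoryStats("app01", 8192, 1024, 4096, 4096, False, 128)
for finding in triage_memory(example):
    print(f"{example.name}: {finding}")
```

As the list notes, treat the oversized-reservation finding with care for Java and Oracle workloads, where low active memory does not necessarily mean the memory is wasted.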
NOTE: On Xeon 5500 (Nehalem) hosts, TPS won’t show much benefit until you overcommit memory, assuming you use large memory pages in both the guest OS and ESX (VMware KB 1021095).
NOTE: vim-cmd is only available on ESXi and ESX hosts but NOT in the vMA.
Troubleshoot ESX/ESXi host and VM CPU performance issues using appropriate metrics
Read the ESXTOP bible, which covers the metrics to look for and typical values for various problems.
Remedial actions
For clusters
- Enable DRS if not already enabled. This may alleviate hotspots.
- If DRS is already enabled:
  - add hosts
  - check the migration threshold setting. A more aggressive threshold may balance load more effectively.
For all hosts
- Enable CPU-saving features such as TSO (with TSO-enabled pNICs), hardware iSCSI initiators, large memory pages and newer vNIC drivers (VMXNET3)
- Ensure VMware Tools is installed and running in all VMs
- Right-size VMs that have been allocated vSMP for single-threaded apps
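To decide which VMs need these remedial actions, it can help to capture esxtop in batch mode (the `-b` switch writes CSV) and average the CPU ready figures offline. The sketch below makes several assumptions that you should verify against your own capture: that ready-time columns are titled like `\\host\Group Cpu(<id>:<name>)\% Ready`, and that 10% is a reasonable cut-off (a commonly quoted rule of thumb, not a figure from this guide). The function names and the sample data are invented for the example.

```python
import csv
import io
import re
from collections import defaultdict
from typing import Dict, List

# Assumed column title shape: \\host\Group Cpu(<id>:<name>)\% Ready
READY_COLUMN = re.compile(r"\\Group Cpu\(\d+:(?P<name>[^)]+)\)\\% Ready$")


def average_ready(csv_text: str) -> Dict[str, float]:
    """Average the '% Ready' value per group across all samples in the CSV."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    # Map column index -> group name for every ready-time column found.
    columns = {}
    for index, title in enumerate(header):
        match = READY_COLUMN.search(title)
        if match:
            columns[index] = match.group("name")
    totals = defaultdict(float)
    counts = defaultdict(int)
    for row in reader:
        if not row:
            continue
        for index, name in columns.items():
            totals[name] += float(row[index])
            counts[name] += 1
    return {name: totals[name] / counts[name] for name in totals}


def flag_high_ready(averages: Dict[str, float], threshold: float = 10.0) -> List[str]:
    """Return group names whose average ready time exceeds the assumed threshold."""
    return sorted(name for name, value in averages.items() if value > threshold)


# Invented sample in the assumed batch-mode layout, for illustration only.
SAMPLE = r'''"(PDH-CSV 4.0) (UTC)(0)","\\esx01\Group Cpu(1001:web01)\% Ready","\\esx01\Group Cpu(1002:db01)\% Ready","\\esx01\Group Cpu(1002:db01)\% Used"
"01/01/2010 10:00:00","2.0","14.0","50.0"
"01/01/2010 10:00:05","4.0","18.0","55.0"
'''

print(flag_high_ready(average_ready(SAMPLE)))
```

Group-level ready time is summed across a VM's vCPUs, so the threshold probably needs scaling for vSMP VMs before drawing conclusions.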
Use Hot-Add functionality to resolve identified VM CPU and memory performance issues
Covered in section 3.1