It was a typical Friday. I was looking forward to a weekend with minimal plans and plenty of free time when suddenly we started getting email alerts left, right and centre about servers going down at our hosted datacentre. First one server, then eight, then fans, power supplies and environmental alerts went ballistic. There goes the weekend, I thought…
It turned out that heavy rain had caused a leak in the roof at our datacentre (bad hosting company, go stand in the corner), resulting in water falling onto one of our production (isn’t it always?) HP bladecentres. Electronics and water obviously don’t mix well, but the HP hardware coped surprisingly well. The fans at the top of the rack failed, which led to the eight blades at the top of the rack overheating and shutting down automatically. That probably saved the data and the blade hardware.
So where does Oracle licensing fit into this? Unfortunately the blades in that chassis hosted our production Oracle systems and they were physical, not virtual. This was largely due to Oracle’s infamous support stance on VMware, as we run most other systems virtually. So because of Oracle’s desire for stack dominance I lost another night of my life to IT support.
Our recovery plan was to relocate the blades to a nearby rack which luckily had enough free capacity. Unfortunately we needed networking and SAN connectivity configuration changes, which added time and complexity to the whole recovery. Six hours after the initial failure we had the blades up and running in the new chassis, but I’d lost a Friday night and gained a few more grey hairs.
How simple could this have been? By contrast, we already had a VMware ESX cluster spanning the affected chassis and the recovery chassis. Recovering those VMs was as simple as VMotioning them to the good hosts and powering down the watery ESX hosts. About ten minutes would have done it. While virtualisation is not the solution to everything (as is often evangelised), this is one scenario where you’ve got to love the improvements it can offer. Simples!