Wednesday, February 4, 2009

Software patching is the other benefit of virtualization

Many thanks to Philip Sellers for this blog entry!

Our sys admin group seems to be constantly grappling with software patches. We feel perpetually behind and reactive to the new patches and firmware that are released, and it's a never-ending cycle. Since I joined the company a little over two and a half years ago, I've been asked to write a patch plan for our Windows servers twice, maybe three times. Unfortunately, we have never been able to make these patches happen consistently. We'll make a big push to patch, and that seems to break lots of things, which forces us to stop again and fall further behind. Lately, we don't feel we have a choice but to apply some of the recent security patches that Microsoft has released. So we're faced with what seems like a catch-22, and we have pulled the trigger and bitten the bullet.

Our last patch push was handled very well, with virtually no problems arising from the patches applied, except with our Citrix farm. Citrix Presentation Servers didn't like one or two of the patches - can't really tell you what happened there, but I know we reverted to pre-patched disks because of problems. The good news is that we are finally (mostly) up to date. The bad news - the last push was handled almost 100% manually, which anyone in our field will tell you is NOT the way to patch. It's too time-consuming, monotonous and wasteful.

What is different today, and what has allowed us to look realistically at automated patching, is our virtualization with VMware. Since the last patch plan I drew up, we've virtualized much of our datacenter. Also, most of our newer sprawl has been contained in virtual servers. Today our Windows servers are about 80% virtual and about 20% physical. We began investigating VMware's Update Manager product several months ago and we've been really impressed with the results.

Every good patch plan has a few basics that have to be included, in my opinion. First, you have to know what patches need to be applied - so you need to connect to a patch repository. There are third-party software solutions that do a great job of this for a broad group of software products. VMware's Update Manager uses Shavlik to provide much of its update database. The second thing a plan should include is fail-back and recovery. There are times when patches just don't provide the expected results, and being able to revert is always critical. Third, you should be able to control when updates are applied and minimize the amount of sys admin interaction required. For Windows, that can be accomplished via group policy and Active Directory structure or with third-party software like Update Manager. Fourth, you have to make room for exceptions. Every network has these, whether it's the mission-critical server that can't afford downtime or the self-important system with dictated uptime due to political reasons.

Since we have caught up to current patches on our systems, we've drafted a new patch plan in hopes of keeping up and never getting that far behind again. We settled upon using Microsoft's WSUS (Windows Server Update Services) and VMware's Update Manager as our two-pronged solution. These two products cover our two major categories of Windows servers - WSUS for physical servers and Update Manager for virtual servers. Both products allow for the approval of updates and for reporting against the baseline of approved updates to see which systems require patching. From there, you can begin the remediation process to bring these systems up to the baseline.
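To make the baseline idea concrete, here is a rough sketch in Python of what "reporting against the baseline" boils down to. This is not how WSUS or Update Manager work internally, and it doesn't call either product - the `Server` class, the function names and the patch IDs are all made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Server:
    # Hypothetical inventory record; not a WSUS or Update Manager object.
    name: str
    is_virtual: bool
    installed: set = field(default_factory=set)


def missing_patches(server, baseline):
    """Return the approved patches this server has not installed, sorted."""
    return sorted(baseline - server.installed)


def patch_tool(server):
    """Pick the tool per the two-pronged plan: virtual vs. physical."""
    return "Update Manager" if server.is_virtual else "WSUS"


def compliance_report(servers, baseline):
    """Map each non-compliant server name to (tool, missing patches)."""
    report = {}
    for server in servers:
        missing = missing_patches(server, baseline)
        if missing:
            report[server.name] = (patch_tool(server), missing)
    return report


if __name__ == "__main__":
    # Placeholder patch IDs, not real Microsoft bulletin numbers.
    baseline = {"PATCH-001", "PATCH-002", "PATCH-003"}
    servers = [
        Server("file01", is_virtual=True, installed={"PATCH-001"}),
        Server("sql01", is_virtual=False, installed=set(baseline)),
    ]
    for name, (tool, missing) in compliance_report(servers, baseline).items():
        print(f"{name}: remediate with {tool}, missing {', '.join(missing)}")
```

Servers that already match the baseline simply drop out of the report, which is the list you'd hand to remediation.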

Update Manager also brings the inherent benefits of virtualization to the table where patching is concerned. The Update Manager workflow and scheduler can take rollback snapshots before patching and remove them automatically afterward. This is a big capability, as we all know that patches sometimes cause problems or even fail to install. The scheduling features are robust and allow for a fully customized rollout schedule while the administrator just sits back and watches the rollout occur. As with any automation, there is a small risk of missing something during the install, but so far our experience is that the software reports back any problems so that you can give them attention individually. It's also a great solution for our DMZ, since the updates are mounted as virtual CDs and installed from there. That addresses the problem of patches filling up a server's disk because Automatic Updates downloads and executes them but never cleans them up. All in all, we feel like we've found a winner.
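For anyone who wants to picture the snapshot-then-patch workflow, here is a minimal Python sketch of the idea. It is only an illustration of the pattern, not Update Manager's actual implementation or API - the snapshot, patch, health check and revert steps are passed in as plain functions, and every name here is hypothetical.

```python
from datetime import datetime, timedelta


def patch_with_rollback(vm, patches, take_snapshot, apply_patch, health_check, revert):
    """Snapshot first, patch, and revert to the snapshot if anything fails.

    All four callables are stand-ins supplied by the caller.
    """
    snapshot = take_snapshot(vm)
    try:
        for patch in patches:
            apply_patch(vm, patch)
        if not health_check(vm):
            raise RuntimeError("post-patch health check failed")
    except Exception as exc:
        # A failed install or failed check puts the VM back where it started.
        revert(vm, snapshot)
        return {"vm": vm, "status": "rolled_back", "reason": str(exc), "snapshot": None}
    # Success: keep the snapshot for a while in case problems show up later.
    return {"vm": vm, "status": "patched", "reason": None, "snapshot": snapshot}


def expired_snapshots(snapshots, now, keep_for):
    """Return names of snapshots old enough to be removed automatically.

    `snapshots` is a list of (name, created_at) tuples.
    """
    return [name for name, created_at in snapshots if now - created_at >= keep_for]


if __name__ == "__main__":
    result = patch_with_rollback(
        "web01",
        ["PATCH-001"],  # placeholder patch ID
        take_snapshot=lambda vm: f"{vm}-pre-patch",
        apply_patch=lambda vm, patch: None,
        health_check=lambda vm: True,
        revert=lambda vm, snap: None,
    )
    print(result["status"], result["snapshot"])

    now = datetime(2009, 2, 4, 12, 0)
    snaps = [("web01-pre-patch", now - timedelta(hours=30))]
    print(expired_snapshots(snaps, now, keep_for=timedelta(hours=24)))
```

The point is the ordering: the snapshot exists before the first patch touches the system, so a failure at any step has a known-good state to return to, and old snapshots get cleaned up on a timer rather than by hand.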