Lessons learned – P2V Exchange 2007

Last weekend I did a physical to virtual conversion of some Exchange 2007 servers running on Server 2008. While everything went fine in general, there were a few lessons to be learned. There are a lot of forum threads and blog posts written about this topic, but I figured I’d put up some of my experiences anyway.

I will start by describing the environment. The old servers were running on physical Dell hardware, at a remote location. The connection to the new site was 1Gbps end-to-end. The new environment is a fresh vSphere 5.5 cluster. I used VMware Converter Standalone version 5.5.2 for the conversion. Due to the nature of the tool, and the source being a physical server, the conversion was done “hot”, with the source servers on.

The prep

I started out by doing an inventory of the source servers. Checked disk sizes, memory usage, CPU usage. Made note of each service running, and whether it was set to start automatically or not. One of the first things you notice after a conversion is that your event log isn’t a pretty sight. Certain hardware is always left over after a P2V and will have to be removed. So I also made a note of any “special” hardware on the server that would have to be removed after conversion. Things like USB devices, display adapters, disk controllers (SCSI), HBAs, network cards etc.
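An inventory like this is easy to compare before and after the conversion if it is saved in a structured form. The following is only a minimal sketch, assuming the service list was exported as CSV with Name, StartMode and State columns (for example with `wmic service get Name,StartMode,State /format:csv`). The function names `parse_inventory` and `diff_inventory` are made up for illustration.

```python
import csv
import io


def parse_inventory(csv_text):
    """Parse CSV text into {service name: (start mode, state)}.

    Assumes a header row with Name, StartMode and State columns.
    Blank lines (wmic tends to emit a leading one) are skipped.
    """
    lines = [line for line in csv_text.splitlines() if line.strip()]
    reader = csv.DictReader(io.StringIO("\n".join(lines)))
    return {row["Name"]: (row["StartMode"], row["State"]) for row in reader}


def diff_inventory(before, after):
    """Return services whose start mode or state changed, or that vanished."""
    changed = {}
    for name, old in before.items():
        new = after.get(name)  # None if the service no longer exists
        if new != old:
            changed[name] = (old, new)
    return changed
```

Running `diff_inventory` on an export taken before stopping anything and one taken on the converted VM gives a quick list of services that did not come back the way they were.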

Anything as I/O-intensive as Exchange (the same goes for SQL Server, SharePoint or Active Directory) needs to be quiet before you do something like a hot P2V. Otherwise you may end up with a non-functional virtual machine, or with inconsistent data.

I started by dismounting the mail DBs and the public folder DBs. Just to be on the safe side. They will be dismounted when you shut down the services anyway, but I guess I’m just pedantic that way. Some guides suggest that you could just dismount the databases and then start the conversion. I wanted to be safe so…

The services I stopped on a 2007 machine with the CAS, Hub, and Mailbox roles were:

– Microsoft Exchange Active Directory Topology Service
– Microsoft Exchange Transport Log Search
– Microsoft Exchange Transport
– Microsoft Exchange Service Host
– Microsoft Exchange Search Indexer
– Microsoft Exchange Replication Service
– Microsoft Exchange Mail Submission
– Microsoft Exchange Mailbox Assistants
– Microsoft Exchange File Distribution
– Microsoft Exchange Anti-spam Update
– Microsoft Exchange Information Store
– Microsoft Exchange System Attendant
– Microsoft Search (Exchange)
– IIS Admin Service
– World Wide Web Publishing Service
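Stopping that many services by hand gets tedious, so here is a minimal sketch of scripting it. It assumes `net stop` is given the display names from the list above, with `/y` to confirm stopping dependent services. The helper names and the dry-run default are my own illustration, not part of any Exchange tooling.

```python
import subprocess

# Display names taken from the list above, in the same order.
MAILBOX_SERVICES = [
    "Microsoft Exchange Active Directory Topology Service",
    "Microsoft Exchange Transport Log Search",
    "Microsoft Exchange Transport",
    "Microsoft Exchange Service Host",
    "Microsoft Exchange Search Indexer",
    "Microsoft Exchange Replication Service",
    "Microsoft Exchange Mail Submission",
    "Microsoft Exchange Mailbox Assistants",
    "Microsoft Exchange File Distribution",
    "Microsoft Exchange Anti-spam Update",
    "Microsoft Exchange Information Store",
    "Microsoft Exchange System Attendant",
    "Microsoft Search (Exchange)",
    "IIS Admin Service",
    "World Wide Web Publishing Service",
]


def build_stop_commands(services):
    """Return one `net stop` argument list per service display name."""
    return [["net", "stop", name, "/y"] for name in services]


def stop_services(services, dry_run=True):
    """Print each command; only execute it when dry_run is False."""
    for cmd in build_stop_commands(services):
        print('{} {} "{}" {}'.format(*cmd))
        if not dry_run:
            # check=False: net stop returns non-zero for a service that is
            # already stopped, and that should not abort the loop.
            subprocess.run(cmd, check=False)


if __name__ == "__main__":
    stop_services(MAILBOX_SERVICES)  # dry run: prints the commands only
```

With the default `dry_run=True` nothing is touched, so the printed commands can be reviewed first. Some of these services depend on each other, so a later `net stop` may simply report that the service is not running.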

I also stopped the services for Backup Exec and for the AV product. I’ve noticed AV products tend to mess with VMware Converter, at least in some cases.

On the Edge server I stopped the following services (in addition to BE and the AV stuff):

– Microsoft Exchange ADAM
– Microsoft Exchange Transport Log Search
– Microsoft Exchange Transport
– Microsoft Exchange Anti-spam Update
– Microsoft Exchange Credential Service

The conversion

Conversion ran on the machines themselves, using the “Powered-on machine” option and selecting “This local machine”. Pretty much default settings. Finalize synchronization after conversion. Converted the hard drives to thin. No changes to running services or anything like that. I usually don’t install VMware Tools automatically, and I don’t uninstall VMware Converter components automatically either. I don’t trust automatics, and I usually take care of those post-conversion by hand.

Conversion ran at a comfortable 20-40MB/s and was done in a reasonable time, considering it’s VMware Converter.
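To put those numbers in perspective, a rough back-of-the-envelope calculation helps. The sketch below uses the 1Gbps link and the 20-40MB/s range from this post. The 500 GB disk size in the usage note is purely hypothetical, and the helper names are made up for illustration.

```python
LINK_MBIT_PER_S = 1000  # 1Gbps link between the sites


def link_utilisation(mb_per_s, link_mbit_per_s=LINK_MBIT_PER_S):
    """Fraction of the link's raw bandwidth used at a given MB/s rate."""
    return (mb_per_s * 8) / link_mbit_per_s


def hours_to_copy(size_gb, mb_per_s):
    """Hours needed to move size_gb (decimal GB) at a steady mb_per_s."""
    return (size_gb * 1000) / mb_per_s / 3600


if __name__ == "__main__":
    for rate in (20, 40):
        print(rate, "MB/s:",
              round(link_utilisation(rate) * 100), "% of the link,",
              round(hours_to_copy(500, rate), 1), "h for 500 GB")
```

So Converter was using roughly a sixth to a third of the raw link, and a hypothetical 500 GB of data would take somewhere between about three and a half and seven hours at those rates.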

Post-Conversion

Every P2V conversion guide says: After conversion, shut down the old physical machine and disconnect it from the network to make sure it never comes online again. There is a reason for this. In my case, though, the server sat at a remote site with no OOB management (no iDRAC, iLO or the like), so there was no way to shut it down or remove it from the network completely without losing the ability to roll back. You always kind of want the option to go back to the old server, in case your conversion really goes tits up.

Anyway, the original machine was renamed, dropped from the domain and dropped from all networks except one. And in that network, I changed the IP. This way I still had a way in if I needed, but nothing to point back to the old server. Right? Wrong.

Here’s where service principal names come in. SPNs can mess things up very quickly unless you are careful. In this case, even though the old server was renamed, and removed / changed in all networks, there were still things referring to it under its original name, namely SPNs. There are a number of uses for them, for instance Kerberos authentication. An Exchange server has a number of SPN records, not just the regular HOST/server.name ones. There were also records like SMTP/ and MAIL/ and EXCHANGE/. Even though I had rebooted the renamed server, the SPNs registered under its original name had not disappeared. New ones for the new name were simply added alongside them. I didn’t want to start Exchange to see if the records would be removed/changed at that point, so I simply deleted all the SPNs on the old server’s account that still referenced the original server name. I left the ones pointing to its new name, as they would not conflict with anything.

Each of the following records existed twice on the old server’s computer account: once pointing to its current name (call it server_old) and once pointing to server (the original pre-virtualization name):

[Screenshot: the SPN records on the old server’s computer account]
Actual server names removed to protect the innocent

Prior to removing the records, the new converted virtual server would not log into the domain. The error I received upon login was: “Error: The security database on the server does not have a computer account for this workstation trust relationship.”

I was able to log in using the local account, so I knew I wasn’t completely hosed. The error message led me on a wild goose chase, though. The server had a computer account under the correct name in the domain (on all domain controllers). I tried resetting the computer account, I tried removing it, dropping the server out of the domain and then joining it back. No help.

Eventually I started looking at SPN records using ADSI Edit on one of the DCs. Under the domain context, find the computer account for the old server and look at its servicePrincipalName attribute. Remove the SPNs on the old physical server’s computer account that point to the name now used by the new virtualized server. Reboot the new virtual machine. There should be no conflicting names anywhere in the domain, and the login should now work. As it did!
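The same check can be sketched in code instead of clicking through ADSI Edit. The following is only an illustration, assuming you already have the list of SPNs registered on the renamed account (for example copied from the output of `setspn -L server_old`). The names server and server_old are the placeholders used above, and the helper functions are hypothetical. It only builds `setspn -D` commands for review and does not run anything.

```python
def spn_host(spn):
    """Return the short host name from an SPN such as SMTP/server.domain.tld:25."""
    parts = spn.split("/")
    if len(parts) < 2:
        return ""
    host = parts[1].split(":")[0]      # drop an optional :port
    return host.split(".")[0].lower()  # drop the DNS suffix, ignore case


def conflicting_spns(account_spns, reused_name):
    """SPNs on an account whose host part matches the reused server name."""
    wanted = reused_name.lower()
    return [spn for spn in account_spns if spn_host(spn) == wanted]


def removal_commands(account, account_spns, reused_name):
    """Build `setspn -D <spn> <account>` argument lists; nothing is executed."""
    return [["setspn", "-D", spn, account]
            for spn in conflicting_spns(account_spns, reused_name)]


if __name__ == "__main__":
    # Placeholder SPNs on the renamed physical server's account.
    spns = ["HOST/server_old", "HOST/server", "SMTP/server.example.local"]
    for cmd in removal_commands("server_old", spns, "server"):
        print(" ".join(cmd))
```

Matching on the short host name catches both the bare and the fully qualified form of each record, while leaving the SPNs for server_old alone.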