Wednesday 16 October 2013

Migrating ESXi4 to ESXi5 and changing vDistSw port groups without downtime

Scope: 


Moving VMs from ESXi4 to ESXi5 without downtime means you won't be updating VMware Tools as part of the move (evidently future versions won't require a reboot; they promise this is the last one that does). You can migrate to VMFS5 or upgrade the datastore in place, but that isn't covered here either.

We're moving from ESXi4 hosts and vDist Switch port groups to ESXi5 hosts with different vDist Switch labels (but the same VLANs).

The method differs between VMs running on VMFS and VMs with RDMs.  VMs with RDMs can't vMotion, but ours are part of Microsoft Clusters, so we only have a small outage while the service is failed over from one VM to another.

Prerequisites and Setup before starting:


We needed an extra ESXi4 host and an extra ESXi5 host, which I call "migration hosts" (or stepping stone hosts).  They're there just to get the vDist Switches changed without interrupting the VMs' network connections.


First we needed to move the VMkernel and vMotion interfaces off the vDist Switch so the host doesn't depend on it (we'll be removing the vDistSwitch when this host is moved to vCenter5).

Make sure you have one or two spare uplinks; you might need to remove a few spares from the vDist Switch.  Move the VMkernel configuration from the vDistSwitch to a newly created local port group from vCenter: Host > Configuration > Networking > vSphere Distributed Switch > Manage Virtual Adapters and "Migrate" (do VMkernel first, then vMotion once VMkernel is finished; they are two separate steps).


 
who said VMKernel was dead in ESXi?  (^;

ESXi4 migration host: needs local port groups created and assigned VLANs for all VMs being migrated with vMotion.  Local port groups do not need the same names as the vDist Switch port groups, but they need the same VLAN IDs so the VMs' traffic will flow.  I did this with a text file holding all my port group names and VLAN IDs and a script that generated these commands for my ESXi4 migration host.

____________________________________________________
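# Reads one port group per line. The first field is the port group name and
# the first three characters of the line are used as the VLAN ID, so names
# must start with a three-digit VLAN ID.
while read -r LINES ; do
  pgn=$(echo "${LINES}" | awk '{print $1}')
  vlanid=$(echo "${LINES}" | cut -c1-3)
  #echo $vlanid is vlanid and $pgn is portgroup name
  # Print the commands instead of running them, so they can be checked first
  echo esxcfg-vswitch -A "${pgn}" vSwitch0
  echo esxcfg-vswitch -v "${vlanid}" -p "${pgn}" vSwitch0
done < vDistPortGroups-new-only.txt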
____________________________________________________
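
As a minimal sketch of a slightly more defensive variant of the same idea: this assumes that every port group name begins with its VLAN ID followed by a non-digit (for example 101-web), and it skips any line where that isn't true instead of producing a bad command. The function name gen_pg_cmds and the sample names are made up for illustration. Like the script above, it only prints the commands.

```shell
#!/bin/sh
# Sketch: print esxcfg-vswitch commands for each port group name on stdin.
# Assumption: names start with the VLAN ID (1-4094), e.g. "101-web".
gen_pg_cmds() {
  vswitch="${1:-vSwitch0}"
  while read -r pgn _rest; do
    [ -z "$pgn" ] && continue                 # skip blank lines
    # Leading run of digits in the name, or empty if there is none
    vlanid=$(printf '%s\n' "$pgn" | sed -n 's/^\([0-9][0-9]*\).*/\1/p')
    if [ -z "$vlanid" ] || [ "$vlanid" -lt 1 ] || [ "$vlanid" -gt 4094 ]; then
      echo "# skipped: no usable VLAN ID in '$pgn'" >&2
      continue
    fi
    echo "esxcfg-vswitch -A \"$pgn\" $vswitch"
    echo "esxcfg-vswitch -v $vlanid -p \"$pgn\" $vswitch"
  done
}

# Example with made-up names; review the output before running it on a host.
gen_pg_cmds vSwitch0 <<'EOF'
101-web
250-db
mgmt
EOF
```

With the sample input, the first two names each produce an add command and a VLAN command, and "mgmt" is reported on stderr as skipped.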

The ESXi4 migration host needs to be configured with the vDistSwitch used by the VMs that will be migrated, plus the new local port groups you've created with the commands from the script above.  The ESXi4 migration host is your stepping stone from VMs on ESXi4 with vDistSw to VMs on ESXi4 with local port groups.

Run the same commands on your ESXi5 migration host to get it ready, as it will be the stepping stone from ESXi4 on local port groups to ESXi5 on local port groups (it will also be the stepping stone from ESXi5 on local port groups to ESXi5 on vDistSw).
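
Since the two migration hosts only need matching VLAN IDs, not matching names, a quick comparison can catch a VLAN that was missed on one of them. This is a rough sketch under a stated assumption: you have saved a simple two-column list (port group name, then VLAN ID) for each host in a text file. The file layout, the function name missing_vlans and the sample entries are all hypothetical.

```shell
#!/bin/sh
# Sketch: print VLAN IDs that appear in the first file but not in the second.
# Assumption: each file has one "portgroup-name vlan-id" pair per line.
missing_vlans() {
  awk 'NF < 2 { next }                              # ignore blank or short lines
       FILENAME == ARGV[1] { seen[$2] = 1; next }   # VLANs on the target host
       !($2 in seen) { print $2 }' "$2" "$1" | sort -n -u
}

# Example with made-up data: VLAN 250 exists on the ESXi4 side only.
tmp=$(mktemp -d)
printf '101-web 101\n250-db 250\n' > "$tmp/esxi4.txt"
printf 'web-local 101\n' > "$tmp/esxi5.txt"
missing_vlans "$tmp/esxi4.txt" "$tmp/esxi5.txt"    # prints 250
rm -rf "$tmp"
```

An empty result means every VLAN in the first list also exists in the second.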

Migrating VMs

ESXi4 to ESXi4 migration host:

1.  vMotion VMs from the ESXi4 host to the ESXi4 migration host.

2.  Migrate VMs to local port groups with the wizard: go to the Networking section of vCenter, highlight the vDist Switch, right-click and choose "Migrate Virtual Machine Networking" to start the wizard.  Select the vDistSwitch port groups and the VMs on your host.  I have this all mapped out in a spreadsheet, as it can be confusing and I don't want to miss any VMs or NICs.

3.  Remove the vDistSw from your ESXi4 migration host

4.  Disconnect your ESXi4 migration host from vCenter4

5.  Remove your ESXi4 migration host from vCenter4.  The warning says you will lose VM and resource pool info, but that's info in vCenter, not info on the ESXi host (read it carefully: it's accurate and worrying, but there's no risk to the running VMs).  Of course, if you have a host failure at this point you won't get any HA benefits, because removing the host from vCenter also removed it from the HA cluster.

6.  On vCenter5 (completely different hosts and vCenter and Nexus1K VMs, etc.) add the ESXi4 migration host which is running your migrating VMs.

7.  As the Nexus1K vDistSw VEM code on our ESXi4 and ESXi5 hosts is incompatible, we need an ESXi5 migration host in this step.  It needs the same local port groups configured as the ESXi4 migration host, and you just vMotion your VMs from the ESXi4 migration host to the ESXi5 migration host (controlled by vCenter5).  Now your migrating VMs are on an ESXi5 host, but they're still running on local port groups.  As the ESXi5 migration host has both local port groups and the new ESXi5 vDistSw port groups, you can use the same Migrate Virtual Machine Networking wizard to move the VMs to their final vDistSw port groups.

8.  Finally, migrate the VMs with a regular vMotion to the final "puka" ESXi5 host with the good vDistSw port groups.  The only difference between this host and the previous one is that there are no local port groups on this host.

9.  Disconnect your ESXi4 migration host from vCenter5, reconnect it to vCenter4, and "lather, rinse & repeat".
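
The spreadsheet mapping from step 2 can be checked mechanically before you start the wizard. Here is a minimal sketch, assuming the spreadsheet is exported as CSV with a header row and the columns vm,nic,old_portgroup,new_portgroup, with no commas inside fields. That layout, the function name unmapped_nics and the sample rows are hypothetical, not what my spreadsheet necessarily looks like.

```shell
#!/bin/sh
# Sketch: report VM NICs that have no target port group in a mapping CSV.
# Assumption: columns are vm,nic,old_portgroup,new_portgroup with a header row.
unmapped_nics() {
  awk -F, '{ sub(/\r$/, "") }                  # tolerate CRLF line endings
           NR > 1 && NF >= 3 && $4 == "" { print $1 " " $2 " (" $3 ")" }'
}

# Example with made-up rows: db01 has no new port group yet.
unmapped_nics <<'EOF'
vm,nic,old_portgroup,new_portgroup
web01,nic1,101-web,web-101
db01,nic1,250-db,
EOF
```

For the sample rows this prints "db01 nic1 (250-db)"; no output means every NIC in the file has a target.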

My checklist also includes making sure the HA/DRS settings are all correct and updated as new VMs are migrated into vCenter5 and cleaning up the migration hosts.


Migrating Microsoft Cluster VMs involves shutting down the passive/inactive node VM, then recording and removing the RDMs from the VM configuration (.vmx); choose "remove", not "remove and delete".  Disconnect and remove the host with the passive RDM VM from vCenter4 and add it to vCenter5.  Migrate the VM to an ESXi5 host while it is powered down, change the VLANs for the vNICs, and add the RDMs back by adding existing disks, browsing to and selecting them, making sure the same SCSI IDs are assigned.  Power up the VM and fail over the MS Cluster so you can "lather, rinse & repeat" the same steps on the other node, and you're done!  This method does mean a short interruption of service while the MS cluster is failed over.
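
For the "recording" part, one option is to dump the disk entries from the .vmx before touching anything. This is a minimal sketch, assuming disk entries in the .vmx look like scsi1:0.fileName = "name.vmdk". It lists every SCSI disk, not only the RDMs, so you still have to pick out which ones are RDMs yourself. The function name list_scsi_disks and the sample file names are made up.

```shell
#!/bin/sh
# Sketch: print "SCSI-ID backing-file" for each disk entry in a .vmx file,
# so disks can be re-added later at the same SCSI IDs.
# Assumption: entries look like  scsi1:0.fileName = "name.vmdk"
list_scsi_disks() {
  # Reads the files given as arguments, or stdin when there are none
  sed -n 's/^\(scsi[0-9][0-9]*:[0-9][0-9]*\)\.fileName *= *"\(.*\)".*$/\1 \2/p' "$@"
}

# Example with made-up .vmx lines
list_scsi_disks <<'EOF'
scsi0:0.fileName = "node1.vmdk"
scsi1:0.fileName = "node1_1.vmdk"
scsi1:0.mode = "independent-persistent"
EOF
```

Save the output somewhere off the VM's datastore so it is still at hand when you add the disks back.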

Good luck!