Friday 14 November 2014

how to see the .vmx file in esxi?

I needed to verify the chain of snapshot files for my VM. I could see the disks in vCenter, and I could see the .vmsd file, but before breaking the file lock on an old snapshot file and forcing a delete, I needed to be sure. In ESX you just ssh to the host and vi the .vmx file. ESXi doesn't allow you to do that on a running VM. So, what to do? Create a vm-support bundle of the host, then unzip and untar it. Find the vmfs directory; one more directory level below it, the VMFS datastores are listed with their .vmx files intact. Open them in your fave editor, and enjoy!
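If you'd rather not unpack the whole bundle by hand, here is a minimal Python sketch that pulls only the .vmx files out of a downloaded vm-support bundle. It assumes the bundle is an ordinary (optionally gzipped) tar archive; the function name and the paths you pass in are my own placeholders, not anything VMware ships.

```python
import tarfile
from pathlib import Path


def find_vmx_files(bundle_path, extract_dir):
    """Extract only the .vmx files from a vm-support bundle and return their paths."""
    extract_dir = Path(extract_dir)
    extract_dir.mkdir(parents=True, exist_ok=True)
    # "r:*" lets tarfile detect gzip compression (.tgz) or a plain tar by itself.
    with tarfile.open(bundle_path, "r:*") as bundle:
        members = [
            m for m in bundle.getmembers()
            if m.isfile() and m.name.endswith(".vmx")
        ]
        # Extract just the .vmx members; bundles can be large, so skip the rest.
        bundle.extractall(extract_dir, members=members)
    return sorted(extract_dir / m.name for m in members)
```

Call it with the path to the bundle and an empty scratch directory, then open the returned files in your editor of choice.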

Thursday 13 November 2014

what is vmname-vss_manifestss9.zip?

I've never seen this file before in the home directory of my VMs.  Is it related to the VSS snapshot created for the VADP backup that is having issues?

It's 32K in size, but beyond that I know nothing: a Google search turns up nothing, and neither do searches on VMware.com or support.emc.com.

No lock on the file, so I downloaded it and had a look:
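If you want to peek inside without unpacking anything, a small Python sketch like this lists what the archive holds. The function name is mine and the path is whatever you saved the download as; nothing here is specific to VMware.

```python
import zipfile


def list_manifest_zip(zip_path):
    """Return (member name, uncompressed size in bytes) for each file in the zip."""
    with zipfile.ZipFile(zip_path) as archive:
        return [(info.filename, info.file_size) for info in archive.infolist()]
```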

So, looking inside, this manifest is to do with the VSS writer and the VADP backup.  It was probably only left behind because of problems with the backup.  Lots of DLLs are listed in the writer xml file:

<?xml version="1.0"?>
<WRITER_METADATA version="1.1" xmlns="x-schema:#VssWriterMetadataInfo">
  <IDENTIFICATION dataSource="OTHER" usage="BOOTABLE_SYSTEM_STATE" friendlyName="System Writer" instanceId="b4973554-e918-490d-a887-46fc0a85c5a5" writerId="e8132975-6f93-4464-a53e-1050253ae220"/>
  <RESTORE_METHOD rebootRequired="yes" writerRestore="never" method="REPLACE_AT_REBOOT"/>
  <BACKUP_LOCATIONS>
    <FILE_GROUP componentFlags="0" selectableForRestore="no" selectable="no" notifyOnBackupComplete="no" restoreMetadata="no" caption="System Files" componentName="System Files">
      <FILE_LIST filespecBackupType="3855" recursive="yes" filespec="*" path="C:\WINDOWS\system32\CatRoot\{127D0A1D-4EF2-11D1-8608-00C04FC295EE}"/>
      <FILE_LIST filespecBackupType="3855" recursive="yes" filespec="*" path="C:\WINDOWS\system32\CatRoot\{F750E6C3-38EE-11D1-85E5-00C04FC295EE}"/>
      <FILE_LIST filespecBackupType="3855" recursive="yes" filespec="*" path="C:\WINDOWS\system32\CatRoot2\{127D0A1D-4EF2-11D1-8608-00C04FC295EE}"/>
      <FILE_LIST filespecBackupType="3855" recursive="yes" filespec="*" path="C:\WINDOWS\system32\CatRoot2\{F750E6C3-38EE-11D1-85E5-00C04FC295EE}"/>
      <FILE_LIST filespecBackupType="3855" filespec="acgenral.dll" path="c:\windows\apppatch"/>
      <FILE_LIST filespecBackupType="3855" filespec="aclayers.dll" path="c:\windows\apppatch"/>
      <FILE_LIST filespecBackupType="3855" filespec="acres.dll" path="c:\windows\apppatch"/>
      <FILE_LIST filespecBackupType="3855" filespec="acspecfc.dll" path="c:\windows\apppatch"/>
      <FILE_LIST filespecBackupType="3855" filespec="acxtrnal.dll" path="c:\windows\apppatch"/>
      <FILE_LIST filespecBackupType="3855" filespec="admwprox.dll" path="c:\windows\system32"/>
      <FILE_LIST filespecBackupType="3855" filespec="admwprox.dll" path="c:\windows\syswow64"/>
      <FILE_LIST filespecBackupType="3855" filespec="adsiis.dll" path="c:\windows\system32\inetsrv"/>
      <FILE_LIST filespecBackupType="3855" filespec="adsiis.dll" path="c:\windows\syswow64\inetsrv"/>
      <FILE_LIST filespecBackupType="3855" filespec="ahui.exe"

The backup.xml file isn't so much Windoze gobbledygook:

<WRITER_COMPONENTS writerId="a6ad56c2-b509-4e6c-bb19-49d8f43532f0" instanceId="1c8717c4-c53e-4aac-8738-b510483836f8"><COMPONENT backupSucceeded="yes" componentType="filegroup" componentName="WMI"/></WRITER_COMPONENTS>

So, the backup reports as succeeded, but these files weren't cleaned up.
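Rather than eyeballing the XML, you could pull the per-component result out with a few lines of Python. This is only a sketch: it assumes the file contains WRITER_COMPONENTS elements shaped like the fragment above, and it ignores any XML namespace in case the real file declares one (the writer file does). The function name is my own.

```python
import xml.etree.ElementTree as ET


def _local(tag):
    """Strip a '{namespace}' prefix from an element tag, if present."""
    return tag.rsplit("}", 1)[-1]


def component_results(xml_text):
    """Return (writerId, componentName, succeeded) for each COMPONENT element."""
    root = ET.fromstring(xml_text)
    results = []
    # root.iter() includes the root itself, so a bare fragment works too.
    for writer in root.iter():
        if _local(writer.tag) != "WRITER_COMPONENTS":
            continue
        for comp in writer:
            if _local(comp.tag) == "COMPONENT":
                results.append((
                    writer.get("writerId"),
                    comp.get("componentName"),
                    comp.get("backupSucceeded") == "yes",
                ))
    return results
```

Any component that comes back with False would be the first place I'd look when a backup leaves files like this behind.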

KC

Tuesday 11 November 2014

An error occurred while consolidating disks: msg.fileio.lock.(Can't consolidate VM snapshot)



1.  Tried clicking on consolidate: failed.
2.  Tried creating and deleting a snapshot: succeeded, but it still didn't allow me to consolidate the snapshot.
3.  Tried creating a snapshot with memory state unticked: same as above, no-go.
4.  Tried cloning the VM. The clone had consolidated disks, but I didn't want the outage of switching VMs, plus the hassle of a new MAC address on the cloned VM with the ghost vNIC issue.
5.  But then I tried storage vMotioning the VM.  Bingo!  Success!

The original problem was caused by the VADP backup.  It seems vSphere 5.5 handles snapshots differently, or maybe it's as I read here.

******************
UPDATE: 13/11/2014
******************

I overlooked two important facts here:

1) The old snapshots on your old disk are not cleaned up when you sVMotion to a new disk.
2) Trying to delete them from the Datastore Browser doesn't work.  Are these files still locked?  They shouldn't be, as the .vmsd file is empty and the disks referenced as active by the VM are no longer the delta VMDKs.  More to come.
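Before deleting leftovers, it's worth confirming which disks the VM really points at. Here is a rough Python sketch that reads the text of a .vmx file and lists the VMDKs it references. It assumes disk entries look like scsi0:0.fileName = "name.vmdk" and that snapshot deltas follow the usual -000001.vmdk naming; treat both as assumptions and check against your own files.

```python
import re

# Matches lines such as: scsi0:0.fileName = "myvm-000001.vmdk"
_DISK_LINE = re.compile(
    r'^\s*(\S+)\.fileName\s*=\s*"([^"]+\.vmdk)"\s*$', re.IGNORECASE
)
# Assumption: snapshot delta descriptors end in -NNNNNN.vmdk (six digits).
_DELTA_NAME = re.compile(r"-\d{6}\.vmdk$", re.IGNORECASE)


def active_disks(vmx_text):
    """Return (device, vmdk file name, looks_like_snapshot_delta) per disk entry."""
    disks = []
    for line in vmx_text.splitlines():
        match = _DISK_LINE.match(line)
        if match:
            device, file_name = match.groups()
            disks.append((device, file_name, bool(_DELTA_NAME.search(file_name))))
    return disks
```

If no entry comes back flagged as a delta, the VM is running on its base disks and any old delta files on the previous datastore are candidates for cleanup once the lock question is settled.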

I hope this helps someone avoid an outage for their VMs.

KC

Sunday 2 November 2014

Migrating the Whole Stack



Goal:  to migrate as many of our varied services and VMs as possible with the least interruption to our customers, using vMotion and storage vMotion whenever possible.  Similar to my previous post here.

Preparation
The new vBlock with all the storage, compute, network and virtualisation components above was installed fresh, so no in-place upgrades were needed. This method is simpler and carries less risk.


Prepare ESX4 migration storage

Storage: Create a VMFS3 datastore on the new vBlock.  Old vSphere v4 won’t recognise the new VMFS5, so this is one of many “hops” that add an extra step to avoid downtime on the VMs.

Prepare ESX4 migration host

Storage: Split the HBAs so one stays mapped to the old storage for the VMs before migration (unchanged), then map the new VMFS3 datastore to the second HBA.  This makes the first migration host a “bridge” between the old storage and the new storage, which the VMs step across via storage vMotion.

Prepare ESXi5 migration host 1

CPU: Put the host in a VMware cluster with EVC mode set to “Nehalem”.  This lets you vMotion the VMs from the old CPU chipset on ESX4 to the new CPU chipset on ESXi5.  This migration host is an extra step that saves you from shutting down the VM to move it to newer CPUs.

LAN: Split the pNICs on the 1st ESXi5 migration host by removing one (we have two) from the vDSwitch and assigning it to a standard virtual switch with port groups identical to the ESX4 VM networking.

Prepare ESXi5 migration host 2

LAN: Split the pNICs on the 2nd ESXi5 migration host by removing one from the vDSwitch and assigning it to a standard virtual switch with port groups identical to the ESX4 VM networking.


Steps:

1.  vMotion first batch of VMs to the IBM/NetApp/ESX4 migration host on vCenter4
disconnect migration host from vCenter4 and connect to vCenter5

2.  sVMotion VMs to temporary VMFS3 datastore on the vBlock
configure vBlock vMotion pNIC on ESX4 host to enable vMotion to vBlock host

3.  migrate VMs to ESXi5 migration host with vMotion to get from old CPU to new CPU and vSphere5

4.  migrate VMs to 2nd ESXi5 migration host with vMotion to move out of the EVC cluster and gain access to new CPU features

5.  use network migration wizard to move VMs from standard vSwitch port groups to vDSwitch port groups with same vLANs.

6.  finally, migrate VMs to “permanent” ESXi5 host 
check cluster settings for VMs  (HA, DRS) in final vSphere5 clusters

7.  reconfigure vMotion disabling vBlock pNIC and enabling IBM/DataCore configured pNIC for vMotion ready for next batch of VMs to be migrated

8.  disconnect ESX4 migration host from vCenter5 and reconnect to vCenter4.

repeat steps 1-8 until all VMs are migrated
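To keep track of which batch is on which hop, the steps above can be written down as a small runbook generator. This is just an illustrative Python sketch: the hop descriptions paraphrase the steps in this post, the host-side housekeeping (vCenter disconnects and vMotion pNIC changes) is left out, and the names Hop, HOPS and runbook are made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hop:
    method: str   # how the VM is moved
    target: str   # where it lands
    reason: str   # why this hop exists


# The per-VM hops, paraphrased from the steps in this post.
HOPS = [
    Hop("vMotion", "ESX4 migration host", "bridge between old and new storage"),
    Hop("storage vMotion", "temporary VMFS3 datastore on the vBlock", "vSphere 4 cannot use VMFS5"),
    Hop("vMotion", "ESXi5 migration host 1", "old CPU to new CPU under EVC"),
    Hop("vMotion", "ESXi5 migration host 2", "leave the EVC cluster"),
    Hop("network migration wizard", "vDSwitch port groups", "same VLANs as the standard port groups"),
    Hop("vMotion", "permanent ESXi5 host", "final cluster, then check HA and DRS"),
]


def runbook(vm_names, batch_size):
    """Return printable lines: one header per batch followed by the numbered hops."""
    lines = []
    for start in range(0, len(vm_names), batch_size):
        batch = vm_names[start:start + batch_size]
        lines.append(f"Batch {start // batch_size + 1}: {', '.join(batch)}")
        for number, hop in enumerate(HOPS, start=1):
            lines.append(f"  {number}. {hop.method} -> {hop.target} ({hop.reason})")
    return lines
```

Printing the result of runbook(["vm-a", "vm-b", "vm-c"], 2) gives one checklist per batch that you can tick off as you go.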

Next:  Exchange, SQL Server, and other VMs with RDM storage

EMC Networker BMR Bare Metal Recovery


Preparation
Disconnect the NIC of the machine you're about to recover, if it's not completely dead yet.  I always build a new machine identical to the one we're recovering.  Since they're always virtual, it's easy to set the same OS, vRAM, vNICs (MAC address, if needed), vCPUs and VMDKs.

No need for an operating system or NW client to be installed on the new client as the BMR will do that.

Steps 
EMC has a video that shows this in action here or search YouTube for the same video.
Also, see page 634 of the Networker Administration Guide.

I thought it glossed over a few details which might be of interest.

Gotcha
The lesson for me was about the version of the wizard/ISO needed.  If your Networker server is v8.1.1 and your client is still running NW764, then don't try the Windows BMR ISO for NW 8.1.  It seems to work: it loads, lets you fill out the fields for the wizard, and even formats the partition on your recovery client.  But it bombs right after trying to restore the files/folders to the partitions, with no real error message.  The logs are pretty unhelpful.

Learnings
You might need to wipe your recovery server if you need to make a few attempts. If the tool bombs and dumps you to a DOS prompt and you try to restart it, you may have issues.  We got an error that there was already a machine on the network with this IP address (even though it had not recovered the files/folders successfully at that stage).  Trashing the new recovery server's C drive from VMware and creating a new one got around this easily enough.

There doesn't seem to be a Linux version of this tool either, although you could make one easily enough.

I thought this tool might just bring back the crucial registry and disk partitions and OS needed to boot, then you might need to do another restore to get the rest of the data, but it did restore everything for me.  Nice.

Is the EMC documentation any good in your view?  There's a lot there, like:

Note:  By default, the Windows 2012 System Writer does not report Win32 Service Files as a part
of systems components. As a result, the volumes that contain Win32 Service Files are not
considered critical and the DISASTER_RECOVERY:\ save set will not include a volume that
contains files for an installed service. To configure the Windows 2012 server to report
Win32 Service Files as a part of system components, set the ReportWin32ServicesNonSystemState registry sub key to 0. Microsoft KB article 2792088 provides more information.


It mentions Windows storage spaces, storage pools, synthetic full backup as well, which I've not learned about yet.

My experience is that BMR formats and restores the C drive and one more.  You'll need to run the Networker client software to restore any other data disks in the usual way.

Another thing you probably already noticed is that BMR doesn't know anything about the virtual machine.  That includes the vNIC.  If you recover with BMR to a new VM, and the software on your server cares if the MAC address changes, then you'll need an extra step of changing the MAC address of your vNIC to "manual" and use the copy/paste