vSphere vDR 1.2 LVM limitation and workaround
Posted by JohnMurrayUK on 20th February, 2011
One of our users at virtualDCS recently had problems recovering data inside a CentOS Linux VM with the ‘File Level Recovery’ (FLR) tool from vSphere ‘VMware Data Recovery’ (vDR) release 1.2.
I won’t go into detail with regards to describing the tool, as it has been expertly described here, and documented here already.
Although the vDR appliance was reporting successful backups, the FLR utility was not mounting all the partitions when accessing a selected restore point. The virtual machine in question was running 32-bit CentOS 5.4 and had just a single vmdk, but its partition layout caused issues with vDR.
The disk was configured with a small /boot partition and two larger LVM partitions as follows:
[root@localhost ~]# fdisk -l

Disk /dev/sda: 85.8 GB, 85899345920 bytes
255 heads, 63 sectors/track, 10443 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *           1          13      104391   83  Linux
/dev/sda2              14        1305    10377990   8e  Linux LVM
/dev/sda3            1306       10443    73400985   8e  Linux LVM
It was these two type 8e (Linux LVM) partitions that vDR had an issue with.
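To check whether another VM has the same layout, the fdisk listing can be filtered for type 8e partitions. This is a minimal sketch, assuming the classic fdisk table format shown above; the function name count_lvm_partitions is made up for illustration.

```shell
#!/bin/sh
# Count partitions of type 8e (Linux LVM) in `fdisk -l` output read from stdin.
# Assumes the classic table layout, where the Id column is followed by the
# System description "Linux LVM" at the end of the line.
count_lvm_partitions() {
    awk '/^\/dev\// && / 8e +Linux LVM *$/ { n++ } END { print n + 0 }'
}

# Typical use on the guest (requires root):
#   fdisk -l /dev/sda | count_lvm_partitions
```

A result greater than 1 means the disk carries more than one LVM partition, which is the layout that failed here.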
The standard vDR FLR mount looked ok, but only recovered the non-LVM partition on /dev/sda1 as follows:
[root@localhost /opt/vdr/VMwareRestoreClient]# ./VdrFileRestore -a 172.16.1.10
(98) "Mon Feb 7 00:09:09 2011"
(99) "Tue Feb 8 02:53:03 2011"
Please input restore point to mount from list above
98
Created "/root/2011-02-07-00.09.09/Mount1"
Restore point has been mounted...
"/vcenter.domain.homelab/Datacentre One/host/Clus1/Resources/CustomerClone/CusClone1/CusClone1-WEB-1"
root mount point -> "/root/2011-02-07-00.09.09"
Please input "unmount" to terminate application and remove mount point
In order to see the details of the problem you need to run the FLR tool in verbose mode with the -v switch as follows:
[root@localhost /opt/vdr/VMwareRestoreClient]# ./VdrFileRestore -a 172.16.1.10 -v
(98) "Mon Feb 7 00:09:09 2011"
(99) "Tue Feb 8 02:53:03 2011"
Please input restore point to mount from list above
98
findRestorePointNdx: searching for 98
Restore Point 98 has been found...
"/vcenter.domain.homelab/Datacentre One/host/Clus1/Resources/CustomerClone/CusClone1/CusClone1-WEB-1"
"/SCSI-0:2/"
"Mon Feb 7 00:09:09 2011"
Initializing vix...
VixDiskLib: config options: libdir '/opt/vdr/VMwareRestoreClient/disklibpluginvcdr', tmpDir '/tmp/vmware-root'.
VixDiskLib: Could not load default plugins from /opt/vdr/VMwareRestoreClient/disklibpluginvcdr/plugins32/libdiskLibPlugin.so: Cannot open library: /opt/vdr/VMwareRestoreClient/disklibpluginvcdr/plugins32/libdiskLibPlugin.so: cannot open shared object file: No such file or directory.
DISKLIB-PLUGIN : Not loading plugin /opt/vdr/VMwareRestoreClient/disklibpluginvcdr/plugins32/libvdrplugin.so.1.0: Not a shared library.
VMware VixDiskLib (1.2) Release build-254294
Using system libcrypto, version 9080CF
VixDiskLib: Failed to load libvixDiskLibVim.so : Error = libvixDiskLibVim.so: cannot open shared object file: No such file or directory.
Msg_Reset: [msg.dictionary.load.openFailed] Cannot open file "/etc/vmware/config": No such file or directory.
----------------------------------------
PREF Optional preferences file not found at /etc/vmware/config. Using default values.
Msg_Reset: [msg.dictionary.load.openFailed] Cannot open file "/usr/lib/vmware/settings": No such file or directory.
----------------------------------------
PREF Optional preferences file not found at /usr/lib/vmware/settings. Using default values.
Msg_Reset: [msg.dictionary.load.openFailed] Cannot open file "/usr/lib/vmware/config": No such file or directory.
----------------------------------------
PREF Optional preferences file not found at /usr/lib/vmware/config. Using default values.
Msg_Reset: [msg.dictionary.load.openFailed] Cannot open file "/root/.vmware/config": No such file or directory.
----------------------------------------
PREF Optional preferences file not found at /root/.vmware/config. Using default values.
Msg_Reset: [msg.dictionary.load.openFailed] Cannot open file "/root/.vmware/preferences": No such file or directory.
----------------------------------------
PREF Failed to load user preferences.
DISKLIB-LINK : Opened 'vdr://vdr://vdrip:1.1.1.40<>vcuser:<>vcpass:<>vcsrvr:<>vmuuid:<>destid:39<>sessdate:129415109490000000<>datastore:P4500-DS05<>vmdk_name:CusClone1-WEB-1.vmdk<>oppid:4499' (0x1e): plugin, 167772160 sectors / 80 GB.
DISKLIB-LIB : Opened "vdr://vdr://vdrip:1.1.1.40<>vcuser:<>vcpass:<>vcsrvr:<>vmuuid:<>destid:39<>sessdate:129415109490000000<>datastore:P4500-DS05<>vmdk_name:CusClone1-WEB-1.vmdk<>oppid:4499" (flags 0x1e, type plugin).
DISKLIB-LIB : CREATE CHILD: "/tmp/flr-4499-w4U6cS" -- twoGbMaxExtentSparse grainSize=128
DISKLIB-DSCPTR: "/tmp/flr-4499-w4U6cS" : creation successful.
PREF early PreferenceGet(filePosix.coalesce.enable), using default
PREF early PreferenceGet(filePosix.coalesce.aligned), using default
PREF early PreferenceGet(filePosix.coalesce.count), using default
PREF early PreferenceGet(filePosix.coalesce.size), using default
PREF early PreferenceGet(aioCusClone1r.numThreads), using default
--- Mounting Virtual Disk: /tmp/flr-4499-w4U6cS ---
SNAPSHOT: IsDiskModifySafe: Scanning directory of file /tmp/flr-4499-w4U6cS for vmx files.
Disk flat file mounted under /var/run/vmware/fuse/2848693010656666867
VixMntapi_OpenDisks: Mounted disk /tmp/flr-4499-w4U6cS at /var/run/vmware/fuse/2848693010656666867/flat.
Mounting Partition 1 from disk /tmp/flr-4499-w4U6cS
Created "/root/2011-02-07-00.09.09/Mount1"
MountsDone: LVM volume detected, start: 106928640, flat file: "/var/run/vmware/fuse/2848693010656666867/flat"
MountsDone: LVM volume detected, start: 10733990400, flat file: "/var/run/vmware/fuse/2848693010656666867/flat"
System: running "lvm version 2>&1"
System: start results...
File descriptor 3 (pipe:[1356862]) leaked on lvm invocation. Parent PID 4701: sh
File descriptor 4 (pipe:[1356862]) leaked on lvm invocation. Parent PID 4701: sh
  LVM version: 2.02.46-RHEL5 (2009-09-15)
  Library version: 1.02.32 (2009-05-21)
  Driver version: 4.11.5
System: end results...
System: command "lvm version 2>&1" completed successfully
LoopMountSetup: Setup loop device for "/dev/loop1" (offset: 106928640) : "/var/run/vmware/fuse/2848693010656666867/flat"
LoopMountSetup: Setup loop device for "/dev/loop2" (offset: 2144055808) : "/var/run/vmware/fuse/2848693010656666867/flat"
System: running "lvm vgdisplay 2>&1"
System: start results...
File descriptor 3 (pipe:[1356862]) leaked on lvm invocation. Parent PID 4706: sh
File descriptor 4 (pipe:[1356862]) leaked on lvm invocation. Parent PID 4706: sh
  Couldn't find device with uuid '1KPTt2-2Kya-Wk4H-MDz7-0tgJ-a82T-N6OsIX'.
  --- Volume group ---
  VG Name VolGroup00
  System ID
  Format lvm2
  Metadata Areas 1
  Metadata Sequence No 24
  VG Access read/write
  VG Status resizable
  MAX LV 0
  Cur LV 4
  Open LV 4
  Max PV 0
  Cur PV 2
  Act PV 1
  VG Size 79.88 GB
  PE Size 32.00 MB
  Total PE 2556
  Alloc PE / Size 2556 / 79.88 GB
  Free PE / Size 0 / 0
  VG UUID KTg9lK-J48t-P6sw-03lC-TjAX-d5n6-8qcAEx
System: end results...
System: command "lvm vgdisplay 2>&1" completed successfully
LVMFindInfo: found "VG Name" -> "VolGroup00"
System: running "env LVM_SYSTEM_DIR=/tmp/flr-4499-2kqxja/ lvm pvscan /dev/loop1 /dev/loop2 2>&1"
System: start results...
File descriptor 3 (pipe:[1356862]) leaked on lvm invocation. Parent PID 4710: sh
File descriptor 4 (pipe:[1356862]) leaked on lvm invocation. Parent PID 4710: sh
  Couldn't find device with uuid '1KPTt2-2Kya-Wk4H-MDz7-0tgJ-a82T-N6OsIX'.
  PV /dev/loop1 VG VolGroup00 lvm2 [9.88 GB / 0 free]
  PV unknown device VG VolGroup00 lvm2 [70.00 GB / 0 free]
  Total: 2 [79.88 GB] / in use: 2 [79.88 GB] / in no VG: 0 [0 ]
System: end results...
System: command "env LVM_SYSTEM_DIR=/tmp/flr-4499-2kqxja/ lvm pvscan /dev/loop1 /dev/loop2 2>&1" completed successfully
System: running "env LVM_SYSTEM_DIR=/tmp/flr-4499-2kqxja/ lvm pvdisplay /dev/loop1 /dev/loop2 2>&1"
System: start results...
File descriptor 3 (pipe:[1356862]) leaked on lvm invocation. Parent PID 4714: sh
File descriptor 4 (pipe:[1356862]) leaked on lvm invocation. Parent PID 4714: sh
  Couldn't find device with uuid '1KPTt2-2Kya-Wk4H-MDz7-0tgJ-a82T-N6OsIX'.
  Couldn't find device with uuid '1KPTt2-2Kya-Wk4H-MDz7-0tgJ-a82T-N6OsIX'.
  No physical volume label read from /dev/loop2
  Failed to read physical volume "/dev/loop2"
  --- Physical volume ---
  PV Name /dev/loop1
  VG Name VolGroup00
  PV Size 9.90 GB / not usable 22.76 MB
  Allocatable yes (but full)
  PE Size (KByte) 32768
  Total PE 316
  Free PE 0
  Allocated PE 316
  PV UUID Qqk2st-jiXP-k281-A1Ug-nCtM-rn0a-I8eXlX
System: end results...
System: command "env LVM_SYSTEM_DIR=/tmp/flr-4499-2kqxja/ lvm pvdisplay /dev/loop1 /dev/loop2 2>&1" failed with error 1280
LoopDestroy: Removed loop device "/dev/loop1" (offset: 106928640) : "/var/run/vmware/fuse/2848693010656666867/flat"
LoopDestroy: Removed loop device "/dev/loop2" (offset: 10733990400) : "/var/run/vmware/fuse/2848693010656666867/flat"
LoopMountSetup: LVM mounts terminating due to fatal error
VdrVixMountDone: Failed 1
Restore point has been mounted...
"/vcenter.domain.homelab/Datacentre One/host/Clus1/Resources/CustomerClone/CusClone1/CusClone1-WEB-1"
root mount point -> "/root/2011-02-07-00.09.09"
Please input "unmount" to terminate application and remove mount point
Once again, only the /boot non-LVM partition held on /dev/sda1 was mounted. You can see from the results above that the LVM mounts failed due to a fatal error.
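The byte offsets in the verbose log can be cross-checked against the fdisk geometry. The sketch below recomputes the start of each LVM partition from its starting cylinder and then takes the low 32 bits of the result. The "LVM volume detected" offsets match the fdisk figures, but the LoopMountSetup line for /dev/loop2 shows 2144055808, which equals the real /dev/sda3 offset modulo 2^32. That would be consistent with the offset being truncated to 32 bits somewhere, though this is only an inference from the log and was not confirmed by VMware. The function names are made up for illustration, and the arithmetic assumes a shell with 64-bit integers.

```shell
#!/bin/sh
# Byte offset of a partition that starts at the given 1-based cylinder,
# using the geometry fdisk printed: 16065 sectors * 512 bytes per cylinder.
cyl_to_offset() {
    echo $(( ($1 - 1) * 16065 * 512 ))
}

# Low 32 bits of a byte offset, i.e. what would survive if the value were
# stored in a 32-bit unsigned integer (an assumption, not a confirmed cause).
low32() {
    echo $(( $1 % 4294967296 ))
}

sda2=$(cyl_to_offset 14)     # /dev/sda2 starts at cylinder 14
sda3=$(cyl_to_offset 1306)   # /dev/sda3 starts at cylinder 1306

echo "sda2 offset:      $sda2"
echo "sda3 offset:      $sda3"
echo "sda3 low 32 bits: $(low32 "$sda3")"
```

The first two values are the ones reported by MountsDone; the third is the offset the log shows for the /dev/loop2 setup.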
I wasn’t sure whether there was an undocumented incompatibility with my LVM version or fuse version, so I took the easy route and logged a call with VMware Support (SR# 1589684961). After eliminating the obvious, the ticket was escalated to a Research Engineer who was excellent (aren’t they all?). He told me that VMware was aware of an issue with multiple LVM partitions and was expecting to include a fix in an upcoming release of vDR.
That was great, but my customer needed to ensure his backup process allowed FLR restores. I had to find a workaround that could be implemented without requiring a reboot as the virtual machine in question was aiming for 100% uptime.
My plan was to add a new vmdk to the VM, migrate the data off the two existing LVM partitions, remove them both, then create a single LVM partition on the original disk and migrate the data back, before removing the temporary disk.
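The temporary disk has to hold every physical extent currently on /dev/sda2 and /dev/sda3, so it is worth doing the arithmetic first. This is a rough sketch using the "Blocks" column from fdisk and the 32 MB extent size from vgdisplay. It ignores the small LVM metadata area at the start of each physical volume, so treat the results as upper bounds. The block count used for the temporary partition is an assumed figure for an 80 GB disk with one full-size partition, not a value from the original session.

```shell
#!/bin/sh
# Physical extent size in KiB (PE Size 32.00 MB in the vgdisplay output).
PE_KIB=32768

# Whole extents that fit in a partition of the given size in 1 KiB blocks.
# LVM metadata overhead is ignored, so this is an upper bound.
extents_in() {
    echo $(( $1 / PE_KIB ))
}

# Extents that must be moved: /dev/sda2 plus /dev/sda3.
need=$(( $(extents_in 10377990) + $(extents_in 73400985) ))
echo "extents to move: $need"

# Hypothetical block count for the temporary partition (assumption).
have=$(extents_in 83883366)
echo "extents available on temporary disk: $have"

if [ "$have" -ge "$need" ]; then
    echo "temporary disk is large enough"
else
    echo "temporary disk is too small"
fi
```

The extents to move come out at 2556, which agrees with the Total PE figure vgdisplay reported for VolGroup00.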
This is the procedure I used:
***hot add new 80GB thin SCSI disk as SCSI0:1
echo "scsi add-single-device" 0 0 1 0 > /proc/scsi/scsi
***partition the new disk
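fdisk /dev/sdb
n
p
1
Accept the default first and last cylinders to use all space
***set the partition type to 8e (Linux LVM) and write the table (if fdisk reports "Selected partition 1" after t, skip the 1)
t
1
8e
w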
***prepare the new partition for LVM
pvcreate /dev/sdb1
***add the partition to the existing LVM Volume Group
vgextend VolGroup00 /dev/sdb1
***move the data off /dev/sda2 and /dev/sda3
pvmove /dev/sda2 /dev/sdb1
pvmove /dev/sda3 /dev/sdb1
***remove /dev/sda2 and /dev/sda3 from the VolGroup
vgreduce VolGroup00 /dev/sda2
vgreduce VolGroup00 /dev/sda3
***unprepare the original partitions
pvremove /dev/sda2
pvremove /dev/sda3
***delete the original partitions and create a single new bigger one
fdisk /dev/sda
d
2
d
3
n
p
2
Accept the default first and last cylinders to use all space
t
2
8e
w
***instead of rebooting to recognise the partition you can just run
partprobe
I didn’t have parted installed, so before I could probe the partitions, I had to run
yum install parted
***prepare the new partition for LVM:
pvcreate /dev/sda2
***add the partition to the existing Vol Group
vgextend VolGroup00 /dev/sda2
***next move the data back off /dev/sdb1
pvmove /dev/sdb1 /dev/sda2
***remove the temp disk from the LVM Volume Group
vgreduce VolGroup00 /dev/sdb1
***unprepare the partition
pvremove /dev/sdb1
***delete the partition
fdisk /dev/sdb
d
1
w
***remove the temporary disk from the virtual machine using vCenter then finally:
echo "scsi remove-single-device" 0 0 1 0 > /proc/scsi/scsi
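Before each vgreduce and pvremove in the steps above, it is worth confirming that pvmove really did empty the physical volume. The helper below is a minimal sketch that reads a pvs report from stdin and succeeds only if the named PV is listed with zero bytes used. The function name pv_is_empty is made up for illustration, and the input format assumes the pvs options given in the comment.

```shell
#!/bin/sh
# Succeeds only if the named PV appears in the report on stdin with zero
# bytes used. Expected input is the output of:
#   pvs --noheadings --units b --nosuffix -o pv_name,pv_used
# i.e. one "name used_bytes" pair per line.
pv_is_empty() {
    awk -v pv="$1" '
        $1 == pv { found = 1; if ($2 + 0 != 0) used = 1 }
        END { if (found && !used) exit 0; exit 1 }
    '
}

# Typical use on the guest (requires root):
#   pvs --noheadings --units b --nosuffix -o pv_name,pv_used \
#       | pv_is_empty /dev/sda2 && vgreduce VolGroup00 /dev/sda2
```

A PV that is missing from the report is treated as a failure, so a typo in the device name does not pass the check.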
I have no idea if the above will be of use to anyone, so please let me know if you find it helpful in any way. The new version of vDR will include a new version of the FLR tool anyway so let’s hope the issue is resolved in that.

