Nizel Adams

Restoring a corrupted VM datastore

Updated: Jul 24, 2021





This is IT: sometimes things go awry and you need to be prepared for it. Once in a blue moon your datastores go missing after an update or a migration that required shutting down the host, and your VMs refuse to boot.


1. Enable SSH on your host
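If you can reach the ESXi Shell directly (for example from the host console), SSH can usually be switched on from the command line instead of the web client. Here is a minimal sketch, assuming the vim-cmd utility that ships with ESXi. The DRY_RUN variable and the function names are made up for illustration and are not part of ESXi:

```shell
#!/bin/sh
# Sketch: enable and start the SSH service from the ESXi Shell.
# DRY_RUN, run and enable_esxi_ssh are illustrative names, not ESXi features.
DRY_RUN="${DRY_RUN:-1}"

run() {
    # In dry-run mode only print the command, otherwise execute it.
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

enable_esxi_ssh() {
    run vim-cmd hostsvc/enable_ssh   # allow the SSH service to run
    run vim-cmd hostsvc/start_ssh    # start the service now
}
```

By default the function only prints what it would do. Set DRY_RUN=0 before calling enable_esxi_ssh on the host to actually run the two commands.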


2. Remote into your host


PuTTY is a go-to tool for remote troubleshooting. You can use it to open a Telnet or SSH session to devices such as an ESXi host.

A. Enter your ESXi host or vSphere server's IP address, then select Open



3. Use partedUtil to gather all disk information and to see whether the entire disk, the partition table, or just your VMFS datastore is corrupted
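Here is a rough sketch of what this check can look like on the host. It assumes a GPT disk and the usual partedUtil getptbl layout: a label line, a geometry line, then one row per partition with the type name in the fifth column. The helper name list_vmfs_partitions is hypothetical:

```shell
#!/bin/sh
# On the ESXi host you would typically run (the disk name is a placeholder):
#   ls -l /vmfs/devices/disks/
#   partedUtil getptbl "/vmfs/devices/disks/<disk name>"

# list_vmfs_partitions: read getptbl output on stdin and print the number of
# every partition whose type column says "vmfs".
# Assumes a GPT label; the first two lines (label and geometry) are skipped.
list_vmfs_partitions() {
    awk 'NR > 2 && $5 == "vmfs" { print $1 }'
}
```

Pipe the getptbl output into list_vmfs_partitions. If it prints nothing, the partition table no longer lists a VMFS partition. If getptbl itself fails, the table or the whole disk is the more likely culprit.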


4. Load up a Linux machine

A. Install vmfs-tools

vmfs-tools usually isn't available from the default repositories, so you can either get it directly from GitHub or install it with the commands below:


Add RPM Fusion repositories to your system:


Fedora 22 and later:

sudo dnf install https://download1.rpmfusion.org/free/fedora/rpmfusion-free-release-$(rpm -E %fedora).noarch.rpm https://download1.rpmfusion.org/nonfree/fedora/rpmfusion-nonfree-release-$(rpm -E %fedora).noarch.rpm

Silverblue 29 and later:

sudo rpm-ostree install https://download1.rpmfusion.org/free/fedora/rpmfusion-free-release-$(rpm -E %fedora).noarch.rpm https://download1.rpmfusion.org/nonfree/fedora/rpmfusion-nonfree-release-$(rpm -E %fedora).noarch.rpm

(You will have to reboot for the RPM Fusion repositories to appear.)


RHEL 8 or compatible like CentOS

sudo dnf install --nogpgcheck https://dl.fedoraproject.org/pub/epel/epel-release-latest-8.noarch.rpm
sudo dnf install --nogpgcheck https://download1.rpmfusion.org/free/el/rpmfusion-free-release-8.noarch.rpm https://download1.rpmfusion.org/nonfree/el/rpmfusion-nonfree-release-8.noarch.rpm

On CentOS 8, also enable PowerTools (from CentOS 8.3 onward the repository ID is lowercase, powertools):

sudo dnf config-manager --enable PowerTools

On RHEL 8, also enable CodeReady Builder:

sudo subscription-manager repos --enable "codeready-builder-for-rhel-8-*-rpms"

RHEL 7 or compatible like CentOS:

sudo yum localinstall --nogpgcheck https://download1.rpmfusion.org/free/el/rpmfusion-free-release-7.noarch.rpm https://download1.rpmfusion.org/nonfree/el/rpmfusion-nonfree-release-7.noarch.rpm

RHEL 6 or compatible like CentOS:

sudo yum localinstall --nogpgcheck https://download1.rpmfusion.org/free/el/rpmfusion-free-release-6.noarch.rpm https://download1.rpmfusion.org/nonfree/el/rpmfusion-nonfree-release-6.noarch.rpm
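All of the RPM Fusion URLs above follow the same pattern, so if you manage several distributions a small helper can build them for you. This is only a sketch: the function name rpmfusion_urls and its two arguments are invented for illustration, and the URL pattern is simply the one used in the commands above:

```shell
#!/bin/sh
# rpmfusion_urls FAMILY VERSION
# FAMILY is "fedora" or "el"; VERSION is the release number (for example 8).
# Prints the free and nonfree release rpm URLs, one per line.
rpmfusion_urls() {
    family="$1"
    version="$2"
    base="https://download1.rpmfusion.org"
    for repo in free nonfree; do
        echo "$base/$repo/$family/rpmfusion-$repo-release-$version.noarch.rpm"
    done
}
```

On Fedora, for instance, the output of rpmfusion_urls fedora "$(rpm -E %fedora)" can be handed straight to sudo dnf install.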

Once the repository is installed, download the latest rpmsphere-release rpm from:

https://github.com/rpmsphere/noarch/tree/master/r

Install the rpmsphere-release rpm:

sudo rpm -Uvh rpmsphere-release*rpm

Install the vmfs-tools rpm package:

sudo dnf install vmfs-tools


For Windows users in a 100% virtual environment:

Install the Samba client tools in order to read Windows file shares, along with libguestfs-tools, which provides guestmount:

sudo yum install samba-client samba-common cifs-utils libguestfs-tools
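Before moving on it helps to confirm that everything actually landed on your PATH. Here is a minimal sketch; the function name missing_tools is made up for illustration:

```shell
#!/bin/sh
# missing_tools NAME...
# Print every named command that cannot be found on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}
```

For this guide you would call it as missing_tools vmfs-fuse guestmount virt-filesystems smbclient. No output means all four are installed.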

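Test the connection by listing the shares the server offers:

smbclient -L //X.X.X.X -U homeUser

If you open the share interactively instead (smbclient //X.X.X.X/share -U homeUser), type "exit" to leave the smb prompt.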


Create the mount points, one for the share that holds the damaged VMFS files and one for the mounted vmdk:

mkdir -p /mnt/vmfs
mkdir -p /mnt/vmdk

Mount the Windows Share:

sudo mount -t cifs -o username=<username> //X.X.X.X/share /mnt/vmfs
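It's worth confirming that the share really mounted before you point any recovery tool at it. Here is a small sketch that looks the directory up in the kernel's mount table. The function name is_mounted and its optional second argument (an alternate table file, handy for testing) are illustrative:

```shell
#!/bin/sh
# is_mounted DIR [TABLE]
# Succeeds if DIR appears as a mount point in TABLE (default: /proc/mounts).
is_mounted() {
    dir="$1"
    table="${2:-/proc/mounts}"
    # The second column of each mount table row is the mount point.
    awk -v dir="$dir" '$2 == dir { found = 1 } END { exit !found }' "$table"
}
```

For example: is_mounted /mnt/vmfs && echo "share is mounted".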

Set the SELinux boolean that lets QEMU (which guestmount runs under the hood) access Samba shares:

sudo setsebool -P virt_use_samba 1


Edit FUSE's config file so that other users can see the mounted files:

sudo vi /etc/fuse.conf


Remove the # from user_allow_other and save.
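If you'd rather not open an editor, the same change can be scripted. Here is a minimal sketch, assuming the line is present but commented out, as it is in the stock file on many distributions. The function name enable_user_allow_other is hypothetical:

```shell
#!/bin/sh
# enable_user_allow_other FILE
# Uncomment the user_allow_other line in FILE (normally /etc/fuse.conf).
# Writes through a temp file and copies back so FILE keeps its permissions.
enable_user_allow_other() {
    conf="$1"
    tmp="$(mktemp)" || return 1
    sed 's/^[[:space:]]*#[[:space:]]*user_allow_other[[:space:]]*$/user_allow_other/' "$conf" > "$tmp" &&
        cat "$tmp" > "$conf"
    status=$?
    rm -f "$tmp"
    return "$status"
}
```

Run it from a root shell against /etc/fuse.conf. All other lines, including real comments, are left untouched.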


Try to mount the vmdk using vmfs-tools first:

vmfs-fuse /mnt/vmfs/<path.vmdk> /mnt/vmdk

If that fails, mount the vmdk file using guestmount:

sudo guestmount -a <path/filename.vmdk> -i -o allow_other --ro /mnt/vmdk


If that doesn't work, check the partitions manually, then specify which one to mount:

virt-filesystems -a <path/filename.vmdk>


Your output should look something like this:

/dev/sda1
/dev/sda2
/dev/sda5
/dev/sda6
/dev/sda8


Then pass the device you want to mount with -m, trying each one in turn:

sudo guestmount -a <path/filename.vmdk> -m /dev/sda1 -o allow_other --ro /mnt/vmdk
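With more than a couple of partitions, typing each command by hand gets tedious. The sketch below turns a device list into ready-to-run guestmount commands. It only prints them, so nothing is mounted until you run a line yourself. The function name guestmount_commands is made up, and it assumes your paths contain no spaces:

```shell
#!/bin/sh
# guestmount_commands IMAGE MOUNTPOINT
# Read device names on stdin (one per line, as printed by virt-filesystems)
# and print a read-only guestmount command for each one.
# Assumes IMAGE and MOUNTPOINT contain no spaces.
guestmount_commands() {
    image="$1"
    mountpoint="$2"
    while IFS= read -r dev; do
        [ -n "$dev" ] || continue   # skip blank lines
        echo "sudo guestmount -a $image -m $dev -o allow_other --ro $mountpoint"
    done
}
```

Feed it the virt-filesystems output through a pipe, then run the printed commands one at a time. Unmount with sudo guestunmount /mnt/vmdk before trying the next partition.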


If at any point the corrupt partition shows up with an unknown status, or doesn't show up at all, don't fret: that just means the tool can't read the partition, so it doesn't know what it is. In the example below the datastore is on partition 3, but as you can see guestmount cannot find a third partition. Always remember to back up your files!


guestmount: /dev/sda1 (vfat)
guestmount: /dev/sda2 (vfat)
guestmount: /dev/sda5 (vfat)
guestmount: /dev/sda6 (vfat)
guestmount: /dev/sda7 (unknown)
guestmount: /dev/sda8 (vfat)
guestmount: /dev/sda9 (unknown)
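On a disk with many partitions you can pull out just the unreadable ones. This is a sketch that assumes the listing looks exactly like the lines above; the function name unknown_partitions is illustrative:

```shell
#!/bin/sh
# unknown_partitions: read a guestmount partition listing on stdin and print
# the devices whose filesystem was reported as "(unknown)".
unknown_partitions() {
    sed -n 's|^guestmount: \(/dev/[^ ]*\) (unknown)$|\1|p'
}
```

Fed the listing above, it prints /dev/sda7 and /dev/sda9, which are the partitions worth a closer look with the alternatives below.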


Alternative 2 - fvmfs - The open source vmfs driver


Extract the zip

Test your connection to the vmfs (this is all one line)

java.exe -jar fvmfs.jar ssh://root:password@X.X.X.X/vmfs/devices/disks/<disk name> info



Alternative 3 - Rebuild the partition


If all else fails, you can try VMware's guide to rebuilding the partition:


Alternative 4 - DiskInternals VMFS Recovery

You can use DiskInternals' VMFS Recovery software to view or repair the VMFS partition automatically:


Once installed, locate the drive that holds the corrupted datastore and choose "reader."

Navigate to the vmdk file in question, right-click on it, and choose "Expert > Mount as Disk."

Double-click the datastore you're looking for to open it. If you can see the files, they can be recovered easily.

If you can't, right-click on the partition, choose "Open partition," then select Full Recovery > VMFS and hit "Next." Choose the file types you want to recover and continue. You can also select "Fast recovery."

In order to export the files you'll have to pay for a license, which is expensive ($700-$1700 depending on the license).



Conclusion:


Unfortunately, most VMFS tools are abandonware that hasn't been updated in 5-8 years, which means almost all of them cannot read newer formats such as VMFS 5 and 6. The only up-to-date tool is DiskInternals' VMFS Recovery, which at $700 plus tax minimum is much too expensive for a lot of small businesses. It also doesn't make sense to pay $700 to recover a QA environment (we all know your production environment should have DR or at least a physical backup) or a handful of servers, unless reconfiguring them would take an extensive amount of time. With no competition for DiskInternals, those in the worst-case scenario will likely end up forking over a decent amount of cash.








 
