Removing trickier VMFS-datastores

Ok, maybe tricky isn’t the right word, but at least I couldn’t find anything written on this particular issue. Maybe it’s too simple a solution even for the VMware KB, but anyway.

I was cleaning out some local datastores (Smart Array 420 and 420i controllers) and ran into an issue where I was unable to remove the VMFS datastore because of a file in use error. It didn’t give me specifics; just told me that there were file(s) in use, and/or that the datastore was busy. After a fair amount of googling I started throwing some commands at it through the ssh. There’s a vmkfstools command that can break any existing locks, and it warns you that it will do it forcibly. So I tried that, given that there was nothing on the datastore that I couldn’t afford to lose (the point, after all, was to remove it). Despite grave warnings, vmkfstools was unable to break the lock and didn’t really give me a proper reason.

Looking at the vmkernel logs (/var/log/vmkernel.log by default), I saw the same references to files being in use, but no exact reference as to what files and where. No virtual machines were running anymore, and I had deleted most everything that I could off the datastore by hand already. There was a rather specific error message relating to corruption, and googling that got me exactly diddley. The datastore had had some problems previously, some hardware had been replaced, so there were a lot of variables and things that could have affected the case.

The solution, how ever, was much simpler. ESXi (5.1 update 1), a standalone server not attached to any cluster, was shoving logfiles onto the datastore I was trying to remove. Obviously, there would be ‘file in use’-errors. D’uh. So, from the host level, I went to the Configuration tab, and from there Advanced Settings. From there, Syslog -> If it is null (and it can be null), the logs are all reset if and when you reboot the host. If there’s a path, in the style of [datastore]/path, it’ll use that instead.

So for this particular case, I set a null path, which raises a warning that logs are being stored in a non-persistent location, but it then allowed me to delete the datastore (and/or detach it first) without issue.

I was probably thrown off by the vmkernel messages about corruption, though they may have played a part in why certain files and folders couldn’t be deleted by hand using datastore browser or the command line.

After everything was done, I redirected the logs back to one of the datastores, which clears the warning (no reboot needed here, or when I set the null path earlier).

I tried to find the specific error messages but I couldn’t. I may have them somewhere so I’ll shove them in here if I find them.

Some of the commands that helped me along were:

esxcli storage filesystem list ## This lists the filesystems that the server knows about, including their UUID, label and path. These are needed for many vmkfstools commands, so it’s a good place to start

vmkfstools -B /vmfs/devices/disks/naa.unique_disk_or_partition_goes_here ## This tries to ‘forcibly’ break any existing locks to the partition that may prevent you from proceeding. Didn’t work in my case, but also didn’t tell me anything useful..

vmkfstools -V ## re-read and reload vmfs metadata information

Some of the sites and blogs that helped me along:

VMWare KB article 1009570
VMWare KB article 2004201
VMWare KB article 2032823
VMWare KB article 1009565
VMWare KB article 2011220
VMWare KB article 2004605

Thanks to everyone who wrote those.

Leave a Reply

Your email address will not be published. Required fields are marked *