28 January 2022

Nutanix LCM – Insufficient space on ESXi scratch disk

I was running into an issue where I could not run the Nutanix LCM Inventory action on a cluster because the scratch disk on an ESXi host was too small. If you’re reading this article, you have probably run into an issue with the pre-check “test_esxi_scratch_space” too.

I’ve seen the issue a few times now, and in my experience the cause has always been that the ESXi host had its scratch disk pointed at the wrong volume. The first couple of times I saw this, the fix was to update the scratch disk from within the advanced host settings in VCSA. What I did not like about fixing the issue this way was that the host had to be rebooted for the setting to take effect. Having to schedule a maintenance window or cause downtime is never ideal. Thankfully, I learned that there is another way to repoint the scratch disk that requires no downtime, only a few lines of CLI against the ‘problem’ ESXi host.

Start by connecting via SSH to the ESXi host that is having the issue with the scratch disk.

Run the command “ls -ll /scratch” to find out which volume the scratch symlink currently points to.

root@ESXi# ls -ll /scratch
lrwxrwxrwx    1 root     root            49 May  8 23:40 /scratch -> /vmfs/volumes/5xyzxyz6-dxyzxyzb-1c73-ac1xyzxyz990

Run the “df -h” command to list all of the volumes on the host and their sizes.

root@ESXi# df -h

Filesystem   Size   Used Available Use% Mounted on
NFS          1.6T   1.4T    127.4G  92% /vmfs/volumes/OS-XXX-Repoxxx
VMFS-5      52.0G   1.1G     50.9G   2% /vmfs/volumes/NTNX-local-ds-17xyzz340111-B
vfat         4.0G  27.6M      4.0G   1% /vmfs/volumes/5xyzxyz-1234xyzz-12xy-1234xyzz1234
vfat       285.8M 205.8M     80.0M  72% /vmfs/volumes/5xyzxyz6-dxyzxyzb-1c73-ac1xyzxyz990
vfat       249.7M 152.6M     97.2M  61% /vmfs/volumes/58xyzxyz-cdxyzxyz-766a-12xyzxyz1226
vfat       249.7M 145.3M    104.4M  58% /vmfs/volumes/b4xyzxyz-80xyzxyz-9bf2-e5xyzxyzf6d0

Now that we have the current scratch disk and the sizes of all the volumes, we can check whether the scratch disk is indeed set to the volume that is 4GB in size.

In the example above we can see that the volume “/vmfs/volumes/5xyzxyz6-dxyzxyzb-1c73-ac1xyzxyz990” is only 285MB in size. That means that this current volume is far too small. No wonder we’re getting an error.
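If you would rather not eyeball the two listings, the comparison can be scripted. The following is only a minimal sketch, assuming the host's shell provides awk and readlink. The function names df_line_for and scratch_candidates, the 4000 MB cut-off and the restriction to vfat volumes are assumptions made for illustration, based on the example output above.

```shell
# Reads `df -h` output on stdin and prints the row mounted at the path in $1.
df_line_for() {
    awk -v t="$1" '$NF == t'
}

# Reads `df -h` output on stdin and prints the mount point of every vfat
# volume of at least $1 MB (default 4000, an assumed cut-off for "4GB").
scratch_candidates() {
    awk -v min="${1:-4000}" '
        $1 == "vfat" {
            n = $2 + 0                      # numeric part of the Size column
            u = substr($2, length($2))      # unit suffix: B, K, M, G or T
            if (u == "B") n /= 1024 * 1024  # normalise everything to MB
            else if (u == "K") n /= 1024
            else if (u == "G") n *= 1024
            else if (u == "T") n *= 1024 * 1024
            if (n >= min) print $NF
        }'
}

# On the host (hypothetical usage):
#   df -h | df_line_for "$(readlink /scratch)"   # row for the current scratch volume
#   df -h | scratch_candidates                   # vfat volumes big enough for scratch
```

Both functions only read text, so nothing on the host is changed by running them.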

We want to set our scratch disk to a volume that is 4GB in size. According to the list above that means we want to use the volume “/vmfs/volumes/5xyzxyz-1234xyzz-12xy-1234xyzz1234”. To set the desired scratch disk we’ll use the command “ln -sfn <volume_id> /scratch”.

root@ESXi# ln -sfn /vmfs/volumes/5xyzxyz-1234xyzz-12xy-1234xyzz1234 /scratch

If we recheck what the scratch disk is on our host, we’ll see that it is now set to the proper disk volume.

root@ESXi# ls -ll /scratch

lrwxrwxrwx    1 root     root            49 May  8 23:40 /scratch -> /vmfs/volumes/5xyzxyz-1234xyzz-12xy-1234xyzz1234
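
If you have to do this on several hosts, the repoint and the recheck can be wrapped in a small helper that refuses to touch the link when the target path does not exist. This is a sketch under assumptions, not part of any Nutanix or VMware tooling: the function name repoint_scratch and its optional second parameter (the link path, defaulting to /scratch) are hypothetical.

```shell
# Repoint a scratch symlink, but only if the new target is an existing directory.
# $1 = new target volume path
# $2 = symlink to update (hypothetical parameter, defaults to /scratch)
repoint_scratch() {
    target="$1"
    link="${2:-/scratch}"
    if [ ! -d "$target" ]; then
        echo "refusing: $target is not a directory" >&2
        return 1
    fi
    echo "old: $(readlink "$link")"
    # Same command as in the manual step: -n replaces the link itself
    # instead of creating a new link inside the directory it points to.
    ln -sfn "$target" "$link" || return 1
    echo "new: $(readlink "$link")"
}

# On the host (hypothetical usage):
#   repoint_scratch /vmfs/volumes/5xyzxyz-1234xyzz-12xy-1234xyzz1234
```

The directory check is there because “ln -sfn” will happily create a dangling link if the volume path is mistyped.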

Now that the scratch disk is properly configured on the host we can update it in VCSA and be done.

From the Host, go to Configure, then Advanced System Settings, and click “Edit”.
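Select “ScratchConfig.ConfiguredScratchLocation” (this is the editable setting; “ScratchConfig.CurrentScratchLocation” is read-only and only reports the location in use) and set it to the same path that you just configured manually on the host. Hit “Apply”, and you’ll see that the VCSA now recognizes the newly configured scratch disk.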
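
The scratch settings can also be read back from the SSH session to confirm what the host itself reports. The sketch below assumes that “vim-cmd hostsvc/advopt/view” is available on your ESXi build. The wrapper function show_scratch_settings is a hypothetical name, and it simply bails out when vim-cmd is not on the PATH.

```shell
# Print the configured and the currently active scratch location as the host reports them.
show_scratch_settings() {
    if ! command -v vim-cmd >/dev/null 2>&1; then
        echo "vim-cmd not found; run this on the ESXi host" >&2
        return 1
    fi
    for opt in ScratchConfig.ConfiguredScratchLocation \
               ScratchConfig.CurrentScratchLocation; do
        vim-cmd hostsvc/advopt/view "$opt"
    done
}

# On the host (hypothetical usage):
#   show_scratch_settings
```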

Well, now we’re done, and we didn’t even need to reboot a single physical host! You can read more about this error in Nutanix’s KB article about it.