DIF errors in ESXi log.

Have you ever seen DIF ERROR in your ESXi logs? This is something you should probably start to worry about.

If you have never heard about DIF (Data Integrity Field), it is an optional feature for disk systems and communications that extends the SCSI standard to provide end-to-end protection of user data. It protects against both media and transmission errors.

DIF extends the disk sector from 512 bytes to 520 bytes. It needs support from all elements of the infrastructure (especially storage systems and OS drivers(!)).
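To make the extra 8 bytes more concrete, here is a minimal Python sketch of how they are laid out under the T10 specification: a 2-byte guard tag (CRC-16 of the 512 data bytes, polynomial 0x8BB7), a 2-byte application tag and a 4-byte reference tag. The function names are mine, and I assume Type 1 protection, where the reference tag is the low 32 bits of the LBA. It is an illustration only, not what an HBA driver really runs.

```python
import struct

T10_DIF_POLY = 0x8BB7  # generator polynomial of CRC-16/T10-DIF


def crc16_t10dif(data: bytes) -> int:
    """Bitwise CRC-16/T10-DIF: initial value 0, no reflection, no final XOR."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ T10_DIF_POLY) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def protect_sector(data: bytes, lba: int, app_tag: int = 0) -> bytes:
    """Return the 520-byte sector: 512 data bytes + guard, app tag, ref tag."""
    if len(data) != 512:
        raise ValueError("expected a 512-byte sector")
    guard = crc16_t10dif(data)
    ref_tag = lba & 0xFFFFFFFF  # assumption: Type 1, ref tag = low 32 bits of LBA
    return data + struct.pack(">HHI", guard, app_tag, ref_tag)


def verify_sector(sector: bytes, lba: int) -> bool:
    """Check guard and reference tag of a 520-byte sector (app tag is ignored)."""
    if len(sector) != 520:
        raise ValueError("expected a 520-byte sector")
    data, pi = sector[:512], sector[512:]
    guard, _app_tag, ref_tag = struct.unpack(">HHI", pi)
    return guard == crc16_t10dif(data) and ref_tag == (lba & 0xFFFFFFFF)
```

Every element on the path (HBA, driver, array) is expected to do a check like `verify_sector`. If one of them builds or reads the tags differently, the check fails even though the user data itself is fine.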

Normally a new standard is not an issue, unless it is introduced through the back door.

I saw a situation where attaching new storage to the environment triggered bad things. Another case happened after an ESXi HBA firmware/driver upgrade. Both were caused by incorrect DIF handling by the HBA card (QLogic).

The first case was similar to the one described here: https://vnote42.net/2020/08/27/esxi-storage-connection-problems-after-installing-a-new-array/

The customer bought a new storage array. After the storage was prepared in the environment and ready to take over the load, the whole vSphere environment started to behave unpredictably.

Performance was slow, virtual machines started crashing randomly, and eventually the ESXi hosts randomly froze too.

vmkernel.log contained lots of entries like this:

DIF ERROR in cmd: 0x28 Type=0x0 lba=0xb100 actRefTag=0x1000000, expRefTag=0xb100, actAppTag=0x0, expAppTag=0x0, actGuard=0x400, expGuard=0xa671
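If you want to see quickly which tags do not match, a small Python sketch like the one below can help. It pulls the hex fields out of such a line and compares the actual (`act...`) and expected (`exp...`) values. The function names and the regular expression are my own, and I assume the field layout is exactly as in the entry above; other driver versions may log it differently.

```python
import re

SAMPLE = (
    "DIF ERROR in cmd: 0x28 Type=0x0 lba=0xb100 actRefTag=0x1000000, "
    "expRefTag=0xb100, actAppTag=0x0, expAppTag=0x0, "
    "actGuard=0x400, expGuard=0xa671"
)

# name=0x... or name: 0x... pairs, as seen in the sample entry
FIELD_RE = re.compile(r"(\w+)\s*[=:]\s*(0x[0-9a-fA-F]+)")


def parse_dif_error(line: str):
    """Return the hex fields of a DIF ERROR line as a dict, or None."""
    marker = "DIF ERROR"
    if marker not in line:
        return None
    # Only look at the text after the marker, so timestamps etc. are skipped.
    payload = line.partition(marker)[2]
    return {name: int(value, 16) for name, value in FIELD_RE.findall(payload)}


def mismatches(fields: dict) -> list:
    """List the tags whose actual value differs from the expected one."""
    return [
        tag
        for tag in ("RefTag", "AppTag", "Guard")
        if fields.get("act" + tag) != fields.get("exp" + tag)
    ]


if __name__ == "__main__":
    fields = parse_dif_error(SAMPLE)
    print(hex(fields["cmd"]), hex(fields["lba"]), mismatches(fields))
```

In the sample, cmd 0x28 is a SCSI READ(10), the expected reference tag equals the LBA (0xb100), and both the reference tag and the guard differ from what was expected, while the application tag matches.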

Please check kb: https://kb.vmware.com/s/article/80237

As described in this article, new QLogic drivers fix these errors. Another solution (if for some reason you can't upgrade) is to disable T10 DIF at the driver level:

esxcfg-module -s "ql2xt10difvendor=0" qlnativefc

Useful links:

https://kb.vmware.com/s/article/2113956

https://kb.vmware.com/s/article/80237

https://h20195.www2.hpe.com/v2/getpdf.aspx/4aa3-3516enw.pdf

https://en.wikipedia.org/wiki/Data_Integrity_Field

https://www.t10.org/ftp/t10/document.03/03-111r0.pdf

