Re: mlx5: Regression VFs fail to probe on v6.8-rc1

From: Saeed Mahameed
Date: Mon Jan 22 2024 - 14:55:50 EST


On 22 Jan 12:10, Niklas Schnelle wrote:
Hi Saeed, Hi Leon,

On current v6.8-rc1, on both s390x and an Intel x86_64 test system
with a ConnectX-6 DX, the mlx5 driver fails to probe VFs. On x86,
"echo 1 > /sys/bus/pci/devices/<dev>/sriov_numvfs" after a fresh boot
is enough to trigger it, and it is 100% reproducible.

In dmesg I see the following messages (from the Intel server but it's
basically the same on s390x):

[ 110.443950] mlx5_core 0000:6f:00.1: E-Switch: Enable: mode(LEGACY), nvfs(1), necvfs(0), active vports(2)
[ 110.546248] pci 0000:6f:08.2: [15b3:101e] type 00 class 0x020000 PCIe Endpoint
[ 110.546340] pci 0000:6f:08.2: enabling Extended Tags
[ 110.547626] pci 0000:6f:08.2: Adding to iommu group 115
[ 110.553328] mlx5_core 0000:6f:08.2: enabling device (0000 -> 0002)
[ 110.553478] mlx5_core 0000:6f:08.2: firmware version: 22.36.1010
[ 110.718748] mlx5_core 0000:6f:08.2: Rate limit: 127 rates are supported, range: 0Mbps to 97656Mbps
[ 110.730136] mlx5_core 0000:6f:08.2: Assigned random MAC address ce:a6:ec:9e:70:49
[ 110.734351] mlx5_core 0000:6f:08.2: mlx5_cmd_out_err:808:(pid 650): CREATE_TIS(0x912) op_mod(0x0) failed, status bad parameter(0x3), syndrome (0x595b5d), err(-22)
[ 110.735776] mlx5_core 0000:6f:08.2: mlx5e_create_mdev_resources:174:(pid 650): alloc tises failed, -22
[ 110.736819] mlx5_core 0000:6f:08.2: _mlx5e_probe:6076:(pid 650): mlx5e_resume failed, -22
[ 110.749146] mlx5_core.eth: probe of mlx5_core.eth.2 failed with error -22



Hi Niklas,

The following two commits were reverted from net-next, so they are missing
from the current kernel release.

I will resend them as fixes to the net branch; hopefully they will make
it into rc2 soon:
https://lore.kernel.org/netdev/20231221005721.186607-2-saeed@xxxxxxxxxx/
https://lore.kernel.org/netdev/20231221005721.186607-3-saeed@xxxxxxxxxx/
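
For re-testing once the fixes land, the reproduction step from the report
can be scripted roughly as below. This is only a sketch: the function names
are made up for illustration, the 2-second wait is an arbitrary guess, and
the check greps the whole dmesg buffer, so an older failure still in the log
would also match.

```shell
#!/bin/sh
# Sketch only; helper names are hypothetical and not part of any mlx5 tooling.

# Succeed if the kernel log on stdin contains an mlx5 Ethernet
# auxiliary-device probe failure like the one in the report.
mlx5_probe_failed() {
    grep -Eq 'mlx5_core\.eth: probe of mlx5_core\.eth\.[0-9]+ failed with error -[0-9]+'
}

# Enable one VF on the given PF and report whether its probe failed.
# Needs root. Usage: check_vf_probe 0000:6f:00.1
check_vf_probe() {
    pf="/sys/bus/pci/devices/$1"
    echo 0 > "$pf/sriov_numvfs"   # reset first so the next write is accepted
    echo 1 > "$pf/sriov_numvfs"
    sleep 2                       # assumed to be long enough for the VF to probe
    if dmesg | mlx5_probe_failed; then
        echo "VF probe failed"
        return 1
    fi
    echo "VF probe OK"
}
```

Sourcing the file and running "check_vf_probe 0000:6f:00.1" (the PF address
seen in the dmesg output above) should print "VF probe OK" on a fixed kernel.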