RE: [RFC v2 0/4] vfio/hisilicon: add acc live migration driver

From: Shameerali Kolothum Thodi
Date: Wed Feb 02 2022 - 09:35:02 EST




> -----Original Message-----
> From: Jason Gunthorpe [mailto:jgg@xxxxxxxxxx]
> Sent: 02 February 2022 13:15
> To: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@xxxxxxxxxx>
> Cc: kvm@xxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx;
> linux-crypto@xxxxxxxxxxxxxxx; alex.williamson@xxxxxxxxxx;
> mgurtovoy@xxxxxxxxxx; Linuxarm <linuxarm@xxxxxxxxxx>; liulongfang
> <liulongfang@xxxxxxxxxx>; Zengtao (B) <prime.zeng@xxxxxxxxxxxxx>;
> yuzenghui <yuzenghui@xxxxxxxxxx>; Jonathan Cameron
> <jonathan.cameron@xxxxxxxxxx>; Wangzhou (B) <wangzhou1@xxxxxxxxxxxxx>
> Subject: Re: [RFC v2 0/4] vfio/hisilicon: add acc live migration driver
>
> On Fri, Jul 02, 2021 at 10:58:45AM +0100, Shameer Kolothum wrote:
> > This series attempts to add vfio live migration support for
> > HiSilicon ACC VF devices. HiSilicon ACC VF device MMIO space
> > includes both the functional register space and migration
> > control register space. As discussed in RFCv1[0], this may create
> > security issues as these regions get shared between the Guest
> > driver and the migration driver. Based on the feedback, we tried
> > to address those concerns in this version.
> >
> > This is now based on the new vfio-pci-core framework proposal[1].
> > Understand that the framework proposal is still under discussion,
> > but really appreciate any feedback on the approach taken here
> > to mitigate the security risks.
>
> Hi, can you look at the v6 proposal for the mlx5 implementation of the
> migration API and see if it meets hisilicon acc's needs as well?
>
> https://lore.kernel.org/all/20220130160826.32449-1-yishaih@xxxxxxxxxx/

Yes, I saw that one. Thanks for that and is now looking into it.

>
> There are few topics to consider:
> - Which of the three feature sets (STOP_COPY, P2P and PRECOPY) make
> sense for this driver?

I think it will be STOP_COPY only for now. We might have PRECOPY feature once
we have the SMMUv3 HTTU support in future.

>
> I see pf_qm_state_pre_save() but didn't understand why it wanted to
> send the first 32 bytes in the PRECOPY mode? It is fine, but it
> will add some complexity to continue to do this.

That was mainly to do a quick verification between src and dst compatibility
before we start saving the state. I think probably we can delay that check
for later.

> - I think we discussed the P2P implementation and decided it would
> work for this device? Can you re-read and confirm?

In our case these devices are Integrated End Point devices and doesn't have
P2P DMA capability. Hence the FSM arcs will be limited to STOP_COPY feature
I guess. Also, since we cannot guarantee a NDMA state in STOP, my
assumption currently is the onus of making sure that no MMIO access happens
in STOP is on the user. Is that a valid assumption?

> - Are the arcs we defined going to work here as well? The current
> implementation in hisi_acc_vf_set_device_state() is very far away
> from what the v1 protocol is, so I'm having a hard time guessing,
> but..

Right. The FSM has changed a couple of times since we posted this.
I am going to rebase all that now.

> RESUMING -> STOP
> Probably vf_qm_state_resume()
>
> RUNNING -> STOP
> vf_qm_fun_restart() - that is oddly named..
>
> STOP -> RESUMING
> Seems to be a nop (likely a bug)
>
> STOP -> RUNNING
> Not implemented currenty? (also a bug)
>
> STOP -> STOP_COPY
> pf_qm_state_pre_save / vf_qm_state_save
>
> STOP_COPY -> STOP
> NOP

I will check and verify this.

> And the modification for the P2P/NO DMA is presumably just
> fun_restart too since stopping the device and stopping DMA are
> going to be the same thing here?

Yes, in our case stopping device and stopping DMA are effectively the
same thing.

>
> The mlx5 implementation linked above is a full example you can cut and
> paste from for how to implement the state function and the how to do
> the data transfer. The f_ops read/write implementation for acc looks
> trivial as it only streams the fixed size and pre-allocated 'struct
> acc_vf_data'
>
> It looks like it would be a short path to implement our v2 proposal
> and remove a lot of driver code, as we saw in mlx5.
>

Ok. These are the git repo I am using for the rework,
https://github.com/jgunthorpe/qemu/commits/vfio_migration_v2
https://github.com/jgunthorpe/linux/tree/vfio_migration_v2

Please let me know if the above are not up to date.

Also, just noted that my quick prototype is now failing
with below error,

" Error: VFIO device doesn't support migration"

Do we need to set the below before the feature query?
Or am I using a wrong Qemu/kernel repo?

--- a/hw/vfio/migration.c
+++ b/hw/vfio/migration.c
@@ -488,6 +488,7 @@ static int vfio_migration_query_flags(VFIODevice
*vbasedev, uint64_t *mig_flags)
struct vfio_device_feature_migration *mig = (void *)feature->data;

feature->argsz = sizeof(buf);
+ feature->flags = VFIO_DEVICE_FEATURE_MIGRATION | VFIO_DEVICE_FEATURE_GET;
if (ioctl(vbasedev->fd, VFIO_DEVICE_FEATURE, feature) != 0)
return -EOPNOTSUPP;

Thanks,
Shameer