Re: New subsystem for acceleration devices

From: Jason Gunthorpe
Date: Tue Aug 09 2022 - 08:43:48 EST


On Mon, Aug 08, 2022 at 11:26:11PM +0300, Oded Gabbay wrote:

> So if you want a common uAPI and a common userspace library to use it,
> you need to expose the same device character files for every device,
> regardless of the driver. e.g. you need all devices to be called
> /dev/accelX and not /dev/habanaX or /dev/nvidiaX

So, this is an interesting idea. One of the things we did in RDMA that
turned our very well is to have the user side of the kernel/user API
in a single git repo for all the drivers, including the lowest layer
of the driver-specific APIs.

It gives a reasonable target for a DRM-like test of "you must have a
userspace". Ie send your userspace and userspace documentation/tests
before your kernel side can be merged.

Even if it is just a git repo collecting and curating driver-specific
libraries under the "accel" banner it could be quite a useful
activity.

But, probably this boils down to things that look like:

device = habana_open_device()
habana_mooo(device)

device = nvidia_open_device()
nvidia_baaa(device)

> That's what I mean by abstracting all this kernel API from the
> drivers. Not because it is an API that is hard to use, but because the
> drivers should *not* use it at all.
>
> I think drm did that pretty well. Their code defines objects for
> driver, device and minors, with resource manager that will take care
> of releasing the objects automatically (it is based on devres.c).

We have lots of examples of subsystems doing this - the main thing
unique about accel is that that there is really no shared uAPI between
the drivers, and not 'abstraction' provided by the kernel. Maybe that
is the point..

> So actually I do want an ioctl but as you said, not for the main
> device char, but to an accompanied control device char.

There is a general problem across all these "thick" devices in the
kernel to support their RAS & configuration requirements and IMHO we
don't have a good answer at all.

We've been talking on and off here about having some kind of
subsystem/methodology specifically for this area - how to monitor,
configure, service, etc a very complicated off-CPU device. I think
there would be a lot of interest in this and maybe it shouldn't be
coupled to this accel idea.

Eg we already have some established mechinisms - I would expect any
accel device to be able to introspect and upgrade its flash FW using
the 'devlink flash' common API.

> an application only has access to the information ioctl through this
> device char (so it can't submit anything, allocate memory, etc.) and
> can only retrieve metrics which do not leak information about the
> compute application.

This is often being done over a netlink socket as the "second char"

Jason