Re: [PATCH 00/15] Habana Labs kernel driver

From: Dave Airlie
Date: Wed Jan 23 2019 - 17:45:33 EST


On Thu, 24 Jan 2019 at 08:32, Oded Gabbay <oded.gabbay@xxxxxxxxx> wrote:
>
> On Thu, Jan 24, 2019 at 12:02 AM Dave Airlie <airlied@xxxxxxxxx> wrote:
> >
> > Adding Daniel as well.
> >
> > Dave.
> >
> > On Thu, 24 Jan 2019 at 07:57, Dave Airlie <airlied@xxxxxxxxx> wrote:
> > >
> > > On Wed, 23 Jan 2019 at 10:01, Oded Gabbay <oded.gabbay@xxxxxxxxx> wrote:
> > > >
> > > > Hello,
> > > >
> > > > For those who don't know me, my name is Oded Gabbay (Kernel Maintainer
> > > > for AMD's amdkfd driver, worked at RedHat's Desktop group) and I have
> > > > worked at Habana Labs since its inception two and a half years ago.
> > >
> > > Hey Oded,
> > >
> > > So this creates a driver with a userspace facing API via ioctls.
> > > Although this isn't a "GPU" driver, we have a rule in the graphics
> > > drivers area for accelerators that we don't merge userspace API without
> > > an appropriate userspace user.
> > >
> > > https://dri.freedesktop.org/docs/drm/gpu/drm-uapi.html#open-source-userspace-requirements
> > >
> > > I see nothing in these accelerator drivers that makes me think we
> > > should be treating them differently.
> > >
> > > Having large closed userspaces that we have no insight into means we
> > > get suboptimal uAPIs that are locked in forever. If someone in the
> > > future creates an open source userspace, we will end up in a place
> > > where they get suboptimal behaviour because they are locked into a
> > > uAPI that we can't change.
> > >
> > > Dave.
>
> Hi Dave,
> While I always appreciate your opinion and am happy to hear it, I totally
> disagree with you on this point.
>
> First of all, as you said, this device is NOT a GPU. Hence, I wasn't
> aware that this rule might apply to this driver or to any other driver
> outside of drm. Has this rule been applied to all the current drivers
> in the kernel tree with userspace facing API via IOCTLs, which are not
> in the drm subsystem? I see the logic for GPUs as they drive the
> display of the entire machine, but this is an accelerator for a
> specific purpose, not something as generic as a GPU. I just don't see
> how one can treat them in the same way.

The logic isn't there for GPUs because we have an established library
or because GPUs are in laptops. GPUs are just where we learned the
lessons of merging things whose primary reason for being in the kernel
is to execute stuff from misc userspace stacks, where the uAPI has to
remain stable indefinitely.

a) security - without knowledge of what the accelerator can do, how can
we know that the API you expose isn't just a giant root hole?

b) uAPI stability. Without a userspace for this, there is no way for
anyone, even if in possession of the hardware, to validate that the uAPI
you provide, and are asking the kernel to commit to supporting
indefinitely, is optimal or secure. If an open source userspace appears,
is it to be limited to the API the closed userspace has created? That
limits the future unnecessarily.

> There is no way that "someone" will create a userspace
> for our H/W without the intimate knowledge of the H/W or without the
> ISA of our programmable cores. Maybe for large companies this request
> is valid, but for startups complying with this request is not realistic.

So what benefit does the Linux kernel get from having support for this
feature upstream?

If users can't access the necessary code to use it, why does this
need to be maintained in the kernel?

> To conclude, I think this approach discourages other companies from
> open sourcing their drivers and is counter-productive. I'm not sure
> you are aware of how difficult it is to convince startup management to
> open source the code...

Oh I am, but I'm even more aware of how quickly startups go away and
leave the kernel holding a lot of code we don't know how to validate
or use.

I'm open to being convinced, but I think defining new userspace
facing APIs is a task that we should take a lot more seriously going
forward to avoid the mistakes of the past.

Dave.