Re: [PATCH] habanalabs: improve error messages

From: Oded Gabbay
Date: Fri Mar 29 2019 - 13:29:36 EST


On Fri, Mar 29, 2019 at 7:29 PM Greg KH <gregkh@xxxxxxxxxxxxxxxxxxx> wrote:
>
> On Thu, Mar 28, 2019 at 09:13:13AM +0200, Oded Gabbay wrote:
> > This patch improves two error messages to help the user to
> > better understand what error occurred.
> >
> > Signed-off-by: Oded Gabbay <oded.gabbay@xxxxxxxxx>
> > ---
> > drivers/misc/habanalabs/command_submission.c | 3 ++-
> > drivers/misc/habanalabs/memory.c | 2 +-
> > 2 files changed, 3 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/misc/habanalabs/command_submission.c b/drivers/misc/habanalabs/command_submission.c
> > index f908643f871f..02c48da0b645 100644
> > --- a/drivers/misc/habanalabs/command_submission.c
> > +++ b/drivers/misc/habanalabs/command_submission.c
> > @@ -261,7 +261,8 @@ static void cs_timedout(struct work_struct *work)
> > ctx_asid = cs->ctx->asid;
> >
> > /* TODO: add information about last signaled seq and last emitted seq */
> > - dev_err(hdev->dev, "CS %d.%llu got stuck!\n", ctx_asid, cs->sequence);
> > + dev_err(hdev->dev, "User %d command submission %llu got stuck!\n",
> > + ctx_asid, cs->sequence);
> >
> > cs_put(cs);
> >
> > diff --git a/drivers/misc/habanalabs/memory.c b/drivers/misc/habanalabs/memory.c
> > index ce1fda40a8b8..39788b1cf8d0 100644
> > --- a/drivers/misc/habanalabs/memory.c
> > +++ b/drivers/misc/habanalabs/memory.c
> > @@ -109,7 +109,7 @@ static int alloc_device_memory(struct hl_ctx *ctx, struct hl_mem_in *args,
> > page_size);
> > if (!phys_pg_pack->pages[i]) {
> > dev_err(hdev->dev,
> > - "ioctl failed to allocate page\n");
> > + "Failed to allocate device memory (out of memory)\n");
>
> No need for a message at all here, right? The core should have already
> told you you had a problem.
>
> greg k-h

No, I don't think so, because this function is called for allocating
memory in the device's DRAM. So we don't pass through the Linux core
code inside.
We use the generic genalloc module to implement the device's DRAM
physical page allocator, and AFAIK, you won't get any message in case
genalloc fails to find a free memory in its pool.
Even if genalloc prints something, I would prefer to display a more
meaningful message to the user in this case. This allocation is
directly requested by the user (its part of the IOCTL code) and he
should know how it failed, no ?

Oded