Re: [PATCH] PCI/doe: Fix work struct declaration

From: Bjorn Helgaas
Date: Tue Nov 15 2022 - 17:13:08 EST


On Tue, Nov 15, 2022 at 12:54:39PM -0800, Ira Weiny wrote:
> On Tue, Nov 15, 2022 at 02:41:35PM -0600, Bjorn Helgaas wrote:
> > On Tue, Nov 15, 2022 at 12:18:38PM -0800, Ira Weiny wrote:
> > > On Tue, Nov 15, 2022 at 01:44:24PM -0600, Bjorn Helgaas wrote:
> > > > On Mon, Nov 14, 2022 at 05:19:43PM -0800, ira.weiny@xxxxxxxxx wrote:
> > > > > From: Ira Weiny <ira.weiny@xxxxxxxxx>
> > > > >
> > > > > The callers of pci_doe_submit_task() allocate the
> > > > > pci_doe_task on the stack. This causes the work structure
> > > > > to be allocated on the stack without pci_doe_submit_task()
> > > > > knowing. Work item initialization needs to be done with
> > > > > either INIT_WORK_ONSTACK() or INIT_WORK() depending on how
> > > > > the work item is allocated.
> > > > >
> > > > > Jonathan suggested creating doe task allocation macros such
> > > > > as DECLARE_CDAT_DOE_TASK_ONSTACK().[1] The issue with this
> > > > > is the work function is not known to the callers and must be
> > > > > initialized correctly.
> > > > >
> > > > > A follow up suggestion was to have an internal
> > > > > 'pci_doe_work' item allocated by pci_doe_submit_task().[2]
> > > > > This requires an allocation which could restrict the context
> > > > > where tasks are used.
> > > > >
> > > > > Compromise with an intermediate step to initialize the task
> > > > > struct with a new call pci_doe_init_task() which must be
> > > > > called prior to submit task.
> > > >
> > > > I'm not really a fan of passing a parameter to say "this struct is on
> > > > the stack" because that seems kind of error-prone and I don't know
> > > > what the consequence of getting it wrong would be. Sounds like it
> > > > *could* be some memory corruption or reading garbage data that would
> > > > be hard to debug.
> > > >
> > > > Do we have cases today where pci_doe_submit_task() can't do the
> > > > kzalloc() as in your patch at [3]?
>
> No.
>
> > > > If the current use cases allow a
> > > > kzalloc(), why not do that now and defer this until it becomes an
> > > > issue?
>
> I do like pci_doe_submit_task() handling this as an internal detail.
> I'm happy with that if you are.
>
> I was just concerned about the restriction of context. Dan
> suggested this instead of passing a gfp parameter.
>
> If you are happy with my original patch I will submit it instead.
> (With a better one liner.)

I don't know what's coming as far as pci_doe_submit_task() callers go.
If there's some imminent caller that will require atomic context, I
guess we could solve it now. But DOE doesn't really seem like an
atomic context thing to begin with, so maybe we could postpone dealing
with it.
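
To make the "allocate inside submit" idea concrete, here is a minimal
userspace sketch. It is not the patch in [3] and not the kernel DOE API:
struct work_item, struct doe_task, struct doe_work, doe_work_fn() and
doe_submit_task() are hypothetical stand-ins, calloc() stands in for
kzalloc(), and the work runs inline instead of being queued.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical userspace stand-ins; none of this is the kernel DOE API. */
struct work_item {
	void (*func)(struct work_item *work);
};

/* What the caller owns; it may live on the caller's stack. */
struct doe_task {
	int request;
	int response;
	int rv;
};

/* Internal to submit: always heap-allocated, never seen by callers. */
struct doe_work {
	struct work_item work;	/* must stay the first member */
	struct doe_task *task;
};

static void doe_work_fn(struct work_item *work)
{
	/* Valid cast because 'work' is the first member of struct doe_work. */
	struct doe_work *dw = (struct doe_work *)work;

	assert(dw->task != NULL);
	dw->task->response = dw->task->request + 1; /* placeholder "exchange" */
	dw->task->rv = 0;
	free(dw);
}

/* Returns 0 on success, -1 if the internal allocation fails. */
static int doe_submit_task(struct doe_task *task)
{
	struct doe_work *dw = calloc(1, sizeof(*dw)); /* stands in for kzalloc() */

	if (!dw)
		return -1;
	dw->work.func = doe_work_fn;
	dw->task = task;
	/* A real implementation would queue the work; here it runs inline. */
	dw->work.func(&dw->work);
	return 0;
}
```

With this shape the caller never initializes a work item, so there is no
ONSTACK vs. non-ONSTACK choice to get wrong. The cost is that submit can
only be called from a context where the allocation is allowed.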

That patch in [3] is more complicated than I expected, but I admit I
haven't looked closely.

Bjorn

> > > > > [1] https://lore.kernel.org/linux-cxl/20221014151045.24781-1-Jonathan.Cameron@xxxxxxxxxx/T/#m88a7f50dcce52f30c8bf5c3dcc06fa9843b54a2d
> > > > > [2] https://lore.kernel.org/linux-cxl/20221014151045.24781-1-Jonathan.Cameron@xxxxxxxxxx/T/#m63c636c5135f304480370924f4d03c00357be667
> > > >
> > > > [3] https://lore.kernel.org/linux-cxl/Y2AnKB88ALYm9c5L@iweiny-desk3/