Re: [PATCH 1/2] devcoredump: Remove devcoredump device if failing device is gone

From: Rodrigo Vivi
Date: Tue Jan 30 2024 - 10:16:34 EST


> > If the failing_device is gone, the 'data cookie' it used to register with
> > dev_coredumpm(... void *data,...), is also likely gone on a clean removal.
>
> That's on the user. You'll always be able to shoot yourself in the foot.
>
> > And to be honest, we shouldn't even count that the registered *read()
> > function pointer is valid anymore.
>
> That's not true: the module cannot be removed, there's a reference to it
> if you're using dev_coredumpm() correctly (which is to say: pass
> THIS_MODULE to the struct module *owner argument).
>
> > Well, we could indeed. And that would unblock our CI, but I'm afraid
> > it wouldn't protect the final user from bad memory access on a direct
> > $ cat /sys/class/devcoredump/devcd<n>/data
> >
> > Shouldn't we consider this critical itself to justify this entirely
> > removal?
>
> No? IMHO that's totally on the user. If you absolutely cannot make a
> standalone dump 'data' pointer (why not?! you can always stick the
> actual data into a vmalloc chunk and use dev_coredumpv()?)

hmm... fair enough. We would be okay here indeed since devcoredump always
free the data.

> then maybe we
> can offer ways of removing it when you need to?

well, to be honest my first local version was like that, offering
a dev_coredump_removal() that driver could request the removal
before going away.

> But I'd rather not, it
> feels weird to have a need for it.

We could change or CI and instruct our devs to always write
something to 'data' to ensure that devcoredump is deleted
before we can reload our module. Maybe that's the right
approach indeed, although I would really prefer to have
a direct way.


>
> johannes