Re: general protection fault in kernfs_add_one

From: Marcel Holtmann
Date: Tue Nov 19 2019 - 18:03:04 EST


Hi Linus,

> So looking at the decode, as usual the noise generated by KASAN isn't
> being very helpful, but it does look like at least one of the reports
> (I picked 5.2 because I don't care about 4.19 etc) is because
> 'kernfs_root(kn)' is NULL in kernfs_add_one().
>
> Looking at the reports, every single one seems to have a call chain
> that comes from vhci_write() -> vhci_get_user() ->
> vhci_create_device() -> __vhci_create_device() -> hci_register_dev()
> -> device_add() -> kobject_add().
>
> (In this case, "every single one" is by looking at the last 10 reports
> sorted by date, it wasn't exhaustive).
>
> The way it got into 'write()' can be a bit varied (splice, write, whatever).
>
> That makes me think it's bluetooth that is the problem, but it might
> be an effect of how syzbot groups the reports too, of course.
>
> Might the device have been added at the same time that the last
> previous device was removed, so that the parent was deleted as the new
> device was added? I dunno. The repro seems to be a repeated "open
> /dev/vhci, write two random bytes to it".
>
> Or might it be some "it happens after you've added enough devices that
> something overflows" issue?

A long time ago there used to be an issue with quick device remove / device add operations, but that was fixed. I am just too fuzzy on the details since it has been a while.

We also haven't touched our sysfs integration in a while and Bluetooth support is so old that this might have been bit-rotting.

I need to run the reproducer myself and see if something stands out that I can spot.
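
Going only by the description quoted above (repeatedly open /dev/vhci and write two random bytes to it), the loop would look roughly like the sketch below. This is not the actual syzbot reproducer. The function name poke_vhci, the iteration count and the error handling are made up for illustration, and I am assuming each open is closed again before the next one, which the description does not say.

```python
import os


def poke_vhci(path="/dev/vhci", iterations=100):
    """Repeatedly open `path`, write two random bytes, and close it.

    Hypothetical helper: it only mirrors the one-line description of the
    repro, not the generated syzbot program.
    Returns a list of (step, value) tuples, one per iteration.
    """
    results = []
    for _ in range(iterations):
        try:
            fd = os.open(path, os.O_RDWR)
        except OSError as e:
            # Device node missing or no permission: record and move on.
            results.append(("open_err", e.errno))
            continue
        try:
            # Two random bytes, as in the quoted description.
            written = os.write(fd, os.urandom(2))
            results.append(("write", written))
        except OSError as e:
            # The driver may reject the bytes; that is still a valid probe.
            results.append(("write_err", e.errno))
        finally:
            # Assumption: close on every iteration, so that device removal
            # and the next add follow each other quickly.
            os.close(fd)
    return results


if __name__ == "__main__":
    for step, value in poke_vhci():
        print(step, value)
```

Running it needs access to /dev/vhci (typically root with the hci_vhci module loaded). On a machine without the device node it just reports the open errors.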

Regards

Marcel