Re: [PATCH] PCI: fix use-after-free in pci_register_host_bridge

From: Qinglang Miao
Date: Mon Dec 14 2020 - 02:25:35 EST




在 2020/12/11 23:46, Rob Herring 写道:
On Fri, Nov 20, 2020 at 03:48:48PM +0800, Qinglang Miao wrote:
When put_device(&bridge->dev) being called, kfree(bridge) is inside
of release function, so the following device_del would cause a
use-after-free bug.

Fixes: 37d6a0a6f470 ("PCI: Add pci_register_host_bridge() interface")

That commit did have some problems, but this patch doesn't apply to that
commit. See commits 1b54ae8327a4 and 9885440b16b8.

Reported-by: Hulk Robot <hulkci@xxxxxxxxxx>
Signed-off-by: Qinglang Miao <miaoqinglang@xxxxxxxxxx>
---
drivers/pci/probe.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 4289030b0..82292e87e 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -991,8 +991,8 @@ static int pci_register_host_bridge(struct pci_host_bridge *bridge)
return 0;
unregister:
- put_device(&bridge->dev);
device_del(&bridge->dev);
+ put_device(&bridge->dev);

I don't think this is right.

Let's look at pci_register_host_bridge() with only the relevant
sections:

static int pci_register_host_bridge(struct pci_host_bridge *bridge)
{
...

err = device_add(&bridge->dev);
if (err) {
put_device(&bridge->dev);
goto free;
}
bus->bridge = get_device(&bridge->dev);

...
if (err)
goto unregister;
...

return 0;

unregister:
put_device(&bridge->dev);
device_del(&bridge->dev);

free:
kfree(bus);
return err;
}

The documentation for device_add says this:
* Rule of thumb is: if device_add() succeeds, you should call
* device_del() when you want to get rid of it. If device_add() has
* *not* succeeded, use *only* put_device() to drop the reference
* count.

The put_device at the end is to balance the get_device after device_add.
It will *only* decrement the use count. Then we call device_del as the
documentation says.

Rob
.
Hi, Rob

Your words make sence to me: the code is *logicly* correct here and won't raise a use-after-free bug. I do hold a misunderstanding of this one, sorry for that ~

But I still think this patch should be reconsidered:

The kdoc of device_unregister explicitly mentions the possibility that other refs might continue to exist after device_unregister was called, and *del_device* is first part of it.

By the way, 'del_device() called before put_device()' is everywhere in kernel code, like device_unregister(), pci_destroy_dev() or switchtec_pci_remove()

In fact, I can't find another place in kernel code looks like:
put_device(x);
device_del(x);

So I guess put_device() ought to be the last time we touch the object (I don't find evidence strong enough in kdoc to prove this) and putting put_device after device_del is a more natural logic.

Qinglang
.