Re: [PATCH 1/2] ipmi_ssif: Unregister i2c device only if created by ssif

From: Corey Minyard
Date: Tue Aug 28 2018 - 12:57:59 EST


On 08/28/2018 09:32 AM, George Cherian wrote:

Hi Corey,

On 08/28/2018 04:59 AM, Corey Minyard wrote:

On 08/27/2018 01:07 AM, George Cherian wrote:

Hi Corey,

On 08/24/2018 06:37 PM, Corey Minyard wrote:

On 08/24/2018 06:10 AM, George Cherian wrote:
In ssif_probe error path the i2c client is left hanging, so that
ssif_platform_remove will remove the client. But it is quite
possible that ssif would never call an i2c_new_device.
This condition would lead to kernel crash as below.
To fix this leave only the client ssif registered hanging in error
path. All other non-registered clients are set to NULL.

I'm having a hard time seeing how this could happen.

The i2c_new_device() call is only done in the case of dmi_ipmi_probe
(called from
ssif_platform_probe) or a hard-coded entry. How does
ssif_platform_remove get
called on a device that was not registered with ssif_platform_probe?


Initially I also had the same doubt but then,
ssif_adapter_hanlder is called for each i2c_dev only after initialized
is true. So we end up not calling i2c_new_device for devices probed
during the module_init time.


I spent some time looking at this, and I don't think that's what is
happening.

I think that i2c_del_driver() in cleanup_ipmi_ssif() is causing
i2c_unregister_device() to be called on all the devices, and
platform_driver_unregister() causes it to be called on the
devices that came in through the platform method. It's
a double-free.

Try reversing the order of i2c_del_driver() and platform_driver_unregister()
in cleanup_ipmi_ssif() to test this.

Reversing the call order didn't help, I am still getting the trace.

That's really strange. Calling ssif_platform_remove() should result in
i2c_del_driver() being called on the device, and thus i2c_del_driver()
should't free it. At least per you later analysis in this mail.


You are partly correct on the double free scenario. I dont see double free in normal operation. I see a double free only in probe failure case.


I have added prints in i2c_unregister_device() to print the client.
pr_err("client = %px\n", client);

In normal case, I get 2 calls to i2c_unregister_device()
Call 1: i2c-core:Â client = ffff800ada315400 => called from i2c_del_driver()
This in turn calls ssif_remove, where we set addr_info->client to NULL.

Call 2: i2c-core:Â client = 0000000000000000 => called from ssif_platform_remove()
This is fine since i2c_unregister_device is NULL aware.
This works fine without crashing .

Now in the probe failing case, I get 2 calls to i2c_unregister_device()
Call 1: i2c-core:Â client = ffff800ad9897400 => called from i2c_del_driver()
This never calls ssif_remove, since the probe failed.

Call 2: i2c-core:Â client = ffff800ad9897400 => called from ssif_platform_remove()
This is a case of double free.

Do you think the proposed patch itself is the solution or
Is it that we should really leave addr_info->client hanging in probe
error path at all?

I have been wondering that.


NB: For easy simulation of the ssif_probe failing case I just replaced
the

ssif_info->thread = kthread_run(....) with

ssif_info->thread = ERR_PTR(-4); so that the probe takes the goto out path.


Ok, that was my next step, trying to reproduce this. I can try that and look.

Quick question, though: Is this device coming through DMI? Or are you
registering this as a platform device someplace else?

-corey