Re: [PATCH v10 54/55] Input: atmel_mxt_ts: Implement synchronization during various operation

From: Dmitry Osipenko
Date: Thu Apr 02 2020 - 09:24:45 EST


02.04.2020 14:50, Wang, Jiada ÐÐÑÐÑ:
> Hi Dmitry
>
> On 2020/04/02 1:04, Dmitry Osipenko wrote:
>> 31.03.2020 13:50, Jiada Wang ÐÐÑÐÑ:
>>> From: Sanjeev Chugh <sanjeev_chugh@xxxxxxxxxx>
>>>
>>> There could be scope of race conditions when sysfs is being handled
>>> and at the same time, device removal is occurring. For example,
>>> we don't want the device removal to begin if the Atmel device
>>> cfg update is going on or firmware update is going on. In such
>>> cases, wait for device update to be completed before the removal
>>> continues.
>>>
>>> ÂÂÂÂ ThreadÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ Thread 2:
>>> =========================ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ
>>> =========================
>>> mxt_update_fw_store()ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ mxt_remove()
>>> mutex_lock(&data->lock)ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ ...
>>> mxt_initialize()ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ //Tries to acquire lock
>>> ÂÂ request_firmware_nowait()ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ mutex_lock(&data->lock)
>>> ...ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ ==>waits for lock()
>>> ...ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ .
>>> ...ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ .
>>> mutex_unlock(&data->lock)ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ .
>>> ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ //Gets lock and
>>> proceeds
>>> ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ
>>> mxt_free_input_device();
>>> ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ ...
>>> ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ
>>> mutex_unlock(&data->lock)
>>> ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ //Frees atmel driver
>>> data
>>> ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ kfree(data)
>>>
>>> If the request_firmware_nowait() completes after the driver removal,
>>> and callback is triggered. But kernel crashes since the module is
>>> already removed.
>>>
>>> This commit adds state machine to serialize such scenarios.
>>
>> Won't it be easier to bump driver's module use-count by __module_get()
>> while firmware is updating? Or remove sysfs during of mxt_remove()? >
>
> thanks for your inspiration, I will replace state machine with module
> use-count.

I'm actually now thinking that the suggestion about the module-count
wasn't very correct because this won't really help in regards to
mxt_update_fw_store() / mxt_remove() racing.

I see that mxt_remove() already invokes the mxt_sysfs_remove(), which
should block until mxt_update_fw_store() is completed, shouldn't it?

I guess the kfree(data) isn't the real cause of the problem and
something like this should help:

diff --git a/drivers/input/touchscreen/atmel_mxt_ts.c
b/drivers/input/touchscreen/atmel_mxt_ts.c
index b2edf51e1595..4e66106feeb9 100644
--- a/drivers/input/touchscreen/atmel_mxt_ts.c
+++ b/drivers/input/touchscreen/atmel_mxt_ts.c
@@ -4254,6 +4254,7 @@ static void mxt_sysfs_remove(struct mxt_data *data)
struct i2c_client *client = data->client;

sysfs_remove_group(&client->dev.kobj, &mxt_attr_group);
+ sysfs_remove_group(&client->dev.kobj, &mxt_fw_attr_group);
}

static void mxt_reset_slots(struct mxt_data *data)
@@ -4649,31 +4650,19 @@ static int mxt_remove(struct i2c_client *client)
{
struct mxt_data *data = i2c_get_clientdata(client);

- mutex_lock(&data->lock);
- if (data->e_state == MXT_STATE_UPDATING_CONFIG_ASYNC ||
- data->e_state == MXT_STATE_UPDATING_CONFIG) {
- data->e_state = MXT_STATE_GOING_AWAY;
- mutex_unlock(&data->lock);
- mxt_wait_for_completion(data, &data->update_cfg_completion,
- MXT_CONFIG_TIMEOUT);
- } else {
- data->e_state = MXT_STATE_GOING_AWAY;
- mutex_unlock(&data->lock);
- }
+ mxt_sysfs_remove(data);

- disable_irq(data->irq);
- sysfs_remove_group(&client->dev.kobj, &mxt_fw_attr_group);
if (data->reset_gpio) {
sysfs_remove_link(&client->dev.kobj, "reset");
gpiod_unexport(data->reset_gpio);
}
+
mxt_debug_msg_remove(data);
- mxt_sysfs_remove(data);
mxt_free_input_device(data);
mxt_free_object_table(data);

if (debug_state)
cancel_delayed_work_sync(&data->watchdog_work);
+ disable_irq(data->irq);

return 0;
}