Re: [PATCH 1/2] rcu: Do not release a wait-head from a GP kthread

From: Joel Fernandes
Date: Thu Mar 07 2024 - 08:14:20 EST

On 3/7/2024 7:57 AM, Uladzislau Rezki wrote:
> On Wed, Mar 06, 2024 at 05:31:31PM -0500, Joel Fernandes wrote:
>>
>>
>> On 3/5/2024 2:57 PM, Uladzislau Rezki (Sony) wrote:
>>> Fix the race below by not releasing a wait-head from the
>>> GP-kthread, as doing so can lead to the wait-head being reused
>>> while a worker can still access it, and thus to newly added
>>> callbacks being executed too early.
>>>
>>> CPU 0                                                 CPU 1
>>> -----                                                 -----
>>>
>>> // wait_tail == HEAD1
>>> rcu_sr_normal_gp_cleanup() {
>>>     // has passed SR_MAX_USERS_WAKE_FROM_GP
>>>     wait_tail->next = next;
>>>     // done_tail = HEAD1
>>>     smp_store_release(&rcu_state.srs_done_tail, wait_tail);
>>>     queue_work() {
>>>         test_and_set_bit(WORK_STRUCT_PENDING_BIT, work_data_bits(work))
>>>         __queue_work()
>>>     }
>>> }
>>>
>>>                                                       set_work_pool_and_clear_pending()
>>>                                                       rcu_sr_normal_gp_cleanup_work() {
[..]
>>>
>>> Reported-by: Frederic Weisbecker <frederic@xxxxxxxxxx>
>>> Fixes: 05a10b921000 ("rcu: Support direct wake-up of synchronize_rcu() users")
>>> Signed-off-by: Uladzislau Rezki (Sony) <urezki@xxxxxxxxx>
>>> ---
>>> kernel/rcu/tree.c | 22 ++++++++--------------
>>> 1 file changed, 8 insertions(+), 14 deletions(-)
>>>
>>> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
>>> index 31f3a61f9c38..475647620b12 100644
>>> --- a/kernel/rcu/tree.c
>>> +++ b/kernel/rcu/tree.c
>>> @@ -1656,21 +1656,11 @@ static void rcu_sr_normal_gp_cleanup(void)
>>>  	WARN_ON_ONCE(!rcu_sr_is_wait_head(wait_tail));
>>>
>>>  	/*
>>> -	 * Process (a) and (d) cases. See an illustration. Apart of
>>> -	 * that it handles the scenario when all clients are done,
>>> -	 * wait-head is released if last. The worker is not kicked.
>>> +	 * Process (a) and (d) cases. See an illustration.
>>>  	 */
>>>  	llist_for_each_safe(rcu, next, wait_tail->next) {
>>> -		if (rcu_sr_is_wait_head(rcu)) {
>>> -			if (!rcu->next) {
>>> -				rcu_sr_put_wait_head(rcu);
>>> -				wait_tail->next = NULL;
>>> -			} else {
>>> -				wait_tail->next = rcu;
>>> -			}
>>> -
>>> +		if (rcu_sr_is_wait_head(rcu))
>>>  			break;
>>> -		}
>>>
>>>  		rcu_sr_normal_complete(rcu);
>>>  		// It can be last, update a next on this step.
>>> @@ -1684,8 +1674,12 @@ static void rcu_sr_normal_gp_cleanup(void)
>>>  	smp_store_release(&rcu_state.srs_done_tail, wait_tail);
>>>  	ASSERT_EXCLUSIVE_WRITER(rcu_state.srs_done_tail);
>>>
>>> -	if (wait_tail->next)
>>> -		queue_work(system_highpri_wq, &rcu_state.srs_cleanup_work);
>>> +	/*
>>> +	 * We schedule a work in order to perform a final processing
>>> +	 * of outstanding users(if still left) and releasing wait-heads
>>> +	 * added by rcu_sr_normal_gp_init() call.
>>> +	 */
>>> +	queue_work(system_highpri_wq, &rcu_state.srs_cleanup_work);
>>>  }
>>
>> Ah, nice. So instead of allocating/freeing in the GP thread and freeing in
>> the worker, you allocate heads only in the GP thread and free them only in
>> the worker, thus essentially fixing the UAF that Frederic found.
>>
>> AFAICS, this fixes the issue.
>>
>> Reviewed-by: Joel Fernandes (Google) <joel@xxxxxxxxxxxxxxxxx>
>>
> Thank you for the review-by!
>
>> There might be a way to avoid queuing new work as a fast-path optimization, in
>> case the CBs per GP will always be < SR_MAX_USERS_WAKE_FROM_GP, but I could not
>> find a workqueue API that helps there, and work_busy() has comments saying not
>> to use it.
>>
> This is not really critical but yes, we can think of it.
>

Thanks, I have a patch that does that. I could not help but write it as soon as
I woke up in the morning ;-). It passes torture and I will push it for further
review after some more testing.
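
To be clear, the following is not that patch. It is only a rough user-space
model of the ownership rule the fix above relies on: wait-heads are claimed
only on the GP-kthread side and handed back only on the worker side. All names
here (wait_head, wait_head_get(), wait_head_put(), NR_WAIT_HEADS) are made up
for illustration and are not the ones used in kernel/rcu/tree.c.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define NR_WAIT_HEADS 5	/* hypothetical pool size */

/* Hypothetical stand-in for a wait-head slot. */
struct wait_head {
	atomic_int inuse;	/* 0 = free, 1 = claimed */
};

static struct wait_head pool[NR_WAIT_HEADS];

/* GP-kthread side only: claim the first free wait-head, or NULL if none. */
static struct wait_head *wait_head_get(void)
{
	for (size_t i = 0; i < NR_WAIT_HEADS; i++) {
		int expected = 0;

		if (atomic_compare_exchange_strong_explicit(&pool[i].inuse,
				&expected, 1,
				memory_order_acquire, memory_order_relaxed))
			return &pool[i];
	}
	return NULL;
}

/* Worker side only: hand a wait-head back once nobody can still reach it. */
static void wait_head_put(struct wait_head *wh)
{
	assert(atomic_load_explicit(&wh->inuse, memory_order_relaxed) == 1);
	atomic_store_explicit(&wh->inuse, 0, memory_order_release);
}

/* Number of wait-heads currently claimed. */
static int wait_heads_in_use(void)
{
	int n = 0;

	for (size_t i = 0; i < NR_WAIT_HEADS; i++)
		n += atomic_load_explicit(&pool[i].inuse, memory_order_relaxed);
	return n;
}
```

In this model, a put immediately followed by a get hands out the same slot
again. That is the kind of reuse that becomes a problem if the put happens
while a worker can still be walking the list through that wait-head, which is
why keeping the put on the worker side only closes the window.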

thanks,

- Joel