Re: [PATCH] 9p: fix crash when transaction killed

From: Schspa Shi
Date: Wed Nov 30 2022 - 04:46:10 EST



asmadeus@xxxxxxxxxxxxx writes:

> (fixed Christophe's address, hopefully that will do for good...)
>
> Schspa Shi wrote on Wed, Nov 30, 2022 at 10:22:44AM +0800:
>> > I'm happy to believe we have a race somewhere (even if no sane server
>> > would produce it), but right now I don't see it looking at the code.. :/
>>
>> And I think there is a race too, because syzbot has reported 9p fs
>> memory corruption multiple times.
>
> Yes, no point in denying that :)
>
>> As for the problem, p9_tag_lookup only takes the rcu_read_lock when
>> accessing the IDR. Why doesn't it take the p9_client->lock? Maybe the
>> root cause is that a lock is missing here.
>
> It shouldn't need to, but happy to try adding it.
> For the logic:
> - idr_find is RCU-safe (trusting the comment above it)
> - reqs are alloced in a kmem_cache created with SLAB_TYPESAFE_BY_RCU.
> This means that if we get a req from idr_find, even if it has just been
> freed, it either is still in the state it was freed at (hence refcount
> 0, we ignore it) or is another req coming from the same cache (if

If the req was newly allocated (i.e. it sits on a fresh page), the
refcount may not be 0, and that is a problem: a concurrent lookup could
take a reference on a req that is not fully set up yet. It seems we
can't rely on this.

We need to set the refcount to zero before adding the req to the IDR in
p9_tag_alloc.
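
To make the ordering concrete, here is a rough userspace model of what I
mean. It is only a sketch, not the kernel code: struct req, table[] and
all helper names are made up for illustration, table[] stands in for the
IDR, and C11 atomics stand in for refcount_t. The initial count of 2 is
just an illustrative value.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NTAGS 4

/* Hypothetical stand-in for the request structure. */
struct req {
	atomic_int refcount;
	uint16_t tag;
};

/* Hypothetical stand-in for the IDR: tag -> request pointer. */
static _Atomic(struct req *) table[NTAGS];

/* Take a reference only if the count is currently non-zero. */
static bool req_get_unless_zero(struct req *r)
{
	int old = atomic_load(&r->refcount);

	while (old != 0) {
		/* On failure, old is reloaded and the loop re-checks it. */
		if (atomic_compare_exchange_weak(&r->refcount, &old, old + 1))
			return true;
	}
	return false;
}

/* Zero the refcount BEFORE the request becomes visible to lookups. */
static void req_publish(struct req *r, uint16_t tag)
{
	assert(tag < NTAGS);
	atomic_store(&r->refcount, 0);
	r->tag = tag;
	atomic_store(&table[tag], r);
}

/* Called once the request is fully initialised. */
static void req_activate(struct req *r)
{
	atomic_store(&r->refcount, 2);
}

/* Lookup: ignore requests with refcount 0, then re-check the tag. */
static struct req *req_lookup(uint16_t tag)
{
	struct req *r = atomic_load(&table[tag]);

	if (!r || !req_get_unless_zero(r))
		return NULL;
	if (r->tag != tag) {
		/* Memory was reused for another request; drop our ref. */
		atomic_fetch_sub(&r->refcount, 1);
		return NULL;
	}
	return r;
}
```

Without the store of 0 in req_publish(), a lookup racing with the
allocation could see whatever count was left in fresh memory and take a
reference on a half-initialised request.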

> refcount isn't zero, we can check its tag)

As for the release case, the next request will have the same tag with
high probability. It would be better to hand out tags as an increasing
sequence, which would make it much less likely that a stale lookup hits
a reused req with a matching tag.
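
Roughly what I have in mind, again as a userspace sketch with made-up
names (struct tag_pool, tag_alloc_cyclic, tag_free) and a tiny tag
space. In the kernel, idr_alloc_cyclic() might serve this purpose, but I
have not checked how well it fits p9_tag_alloc.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_TAGS 8	/* hypothetical, deliberately small tag space */

struct tag_pool {
	bool used[MAX_TAGS];
	uint16_t next;	/* where the next search starts */
};

/*
 * Return a free tag, searching upwards from pool->next and wrapping
 * around at MAX_TAGS. Returns -1 if every tag is in use.
 */
static int tag_alloc_cyclic(struct tag_pool *p)
{
	for (int i = 0; i < MAX_TAGS; i++) {
		uint16_t t = (uint16_t)((p->next + i) % MAX_TAGS);

		if (!p->used[t]) {
			p->used[t] = true;
			p->next = (uint16_t)((t + 1) % MAX_TAGS);
			return t;
		}
	}
	return -1;
}

static void tag_free(struct tag_pool *p, int tag)
{
	assert(tag >= 0 && tag < MAX_TAGS);
	p->used[tag] = false;
}
```

With this, allocating, freeing and allocating again yields two different
tags, so a just-freed tag is not handed out again until the sequence
wraps around.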

> The refcount itself is an atomic operation so doesn't require lock.
> ... And in the off chance I hadn't considered that we're already
> dealing with a new request with the same tag here, we'll be updating
> its status so another receive for it shouldn't use it?...
>
> I don't think adding the client lock helps with anything here, but it'll
> certainly simplify this logic as we then are guaranteed not to get
> obsolete results from idr_find.
>
> Unfortunately adding a lock will slow things down regardless of
> correctness, so it might just make the race much harder to hit without
> fixing it and we might not notice that, so it'd be good to understand
> the race.


--
BRs
Schspa Shi