Re: Coccinelle rule for CVE-2019-18683

From: Alexander Popov
Date: Fri Apr 10 2020 - 20:11:06 EST


On 09.04.2020 22:41, Alexander Popov wrote:
> On 09.04.2020 01:26, Jann Horn wrote:
>> On Thu, Apr 9, 2020 at 12:01 AM Alexander Popov <alex.popov@xxxxxxxxx> wrote:
>>> CVE-2019-18683 refers to three similar vulnerabilities caused by the same
>>> incorrect approach to locking that is used in vivid_stop_generating_vid_cap(),
>>> vivid_stop_generating_vid_out(), and sdr_cap_stop_streaming().
>>>
>>> For fixes please see the commit 6dcd5d7a7a29c1e4 (media: vivid: Fix wrong
>>> locking that causes race conditions on streaming stop).
>>>
>>> These three functions are called during streaming stopping with vivid_dev.mutex
>>> locked. And they all do the same mistake while stopping their kthreads, which
>>> need to lock this mutex as well. See the example from
>>> vivid_stop_generating_vid_cap():
>>> /* shutdown control thread */
>>> vivid_grab_controls(dev, false);
>>> mutex_unlock(&dev->mutex);
>>> kthread_stop(dev->kthread_vid_cap);
>>> dev->kthread_vid_cap = NULL;
>>> mutex_lock(&dev->mutex);
>>>
>>> But when this mutex is unlocked, another vb2_fop_read() can lock it instead of
>>> the kthread and manipulate the buffer queue. That causes use-after-free.
>>>
>>> I created a Coccinelle rule that detects mutex_unlock+kthread_stop+mutex_lock
>>> within one function.
>> [...]
>>> mutex_unlock@unlock_p(E)
>>> ...
>>> kthread_stop@stop_p(...)
>>> ...
>>> mutex_lock@lock_p(E)
>>
>> Is the kthread_stop() really special here? It seems to me like it's
>> pretty much just a normal instance of the "temporarily dropping a
>> lock" pattern - which does tend to go wrong quite often, but can also
>> be correct.
>
> Right, searching without kthread_stop() gives more cases.
>
>> I think it would be interesting though to have a list of places that
>> drop and then re-acquire a mutex/spinlock/... that was not originally
>> acquired in the same block of code (but was instead originally
>> acquired in an outer block, or by a parent function, or something like
>> that). So things like this:

The following rule reported 146 matching cases, which might be interesting.

```
virtual report
virtual context

@race exists@
expression E;
position unlock_p;
position lock_p;
@@

... when != mutex_lock(E)
* mutex_unlock@unlock_p(E)
... when != schedule()
when != schedule_timeout(...)
when != cond_resched()
when != wait_event(...)
when != wait_event_timeout(...)
when != wait_event_interruptible_timeout(...)
when != wait_event_interruptible(...)
when != msleep()
when != msleep_interruptible(...)
* mutex_lock@lock_p(E)

@script:python@
unlock_p << race.unlock_p;
lock_p << race.lock_p;
E << race.E;
@@

coccilib.report.print_report(unlock_p[0], 'see mutex_unlock(' + E + ') here')
coccilib.report.print_report(lock_p[0], 'see mutex_lock(' + E + ') here\n')
```

Analysing each matching case would take a lot of time.

However, I'm focused on searching kernel security issues.
So I will filter out the code that:
- is not enabled in popular kernel configurations,
- doesn't create additional attack surface.
Then I'll take the time to analyse the rest of reported cases.

I'll inform you if I find any bug.

Best regards,
Alexander