Re: drm_vm.c:drm_mmap: possible circular locking dependency detected(was: Re: Linux 2.6.33-rc2 - Merry Christmas ...)

From: Linus Torvalds
Date: Wed Dec 30 2009 - 17:03:37 EST




On Wed, 30 Dec 2009, Eric W. Biederman wrote:

> Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> writes:
>
> > We've seen it several times (yes, mostly with drm, but it's been seen with
> > others too), and it's very annoying. It can be fixed by having very
> > careful readdir implementations, but I really would blame sysfs in
> > particular for having a very annoying lock reversal issue when used
> > reasonably.
>
> Maybe. The mnmap_sem has some interesting issues all of it's own.
> What reasonable thing is the drm doing that is causing problems?

The details are in the original thread on lkml, but it boils down to
basically (the below may not be the exact sequence, but it's close)

- drm_mmap (called with mmap_sem) takes 'dev->struct_mutex' to protect
it's own device data (very reasonable)

- drm_release takes 'dev->struct_mutex' again to protect its own data,
and calls "mtrr_del_page()" which ends up taking cpu_hotplug.lock.

Again, that doesn't sound "wrong" in any way.

- hibernate ends up with the sequence: _cpu_down (cpu_hotplug.lock) -> ..
kref_put .. -> sysfs_addrm_start (sysfs_mutex)

Again, nothing suspicious or "bad", and this part of the dependency
chain has nothing to do with the DRM code itself.

- sysfs_readdir() (and this is the big problem) holds sysfs_mutex in its
readdir implementation over the call to filldir. And filldir copies the
data to user space, so now you have sysfs_mutex -> mmap_sem.

See? None of the chains look bad. Except sysfs_readdir() obviously has
that sysfs_mutex -> mmap_sem thing, which is _very_ annoying, because now
you end up with a chain like

mmap_sem -> dev->struct_mutex -> cpu_hotplug.lock -> sysfs_mutex -> mmap_sem

and I think you'll agree that of all the lock chains, the place to break
the association is at sysfs_mutex. And the obvious place to break it would
be that last "sysfs_mutex -> mmap_sem" stage.

> > Added Eric and Greg to the cc, in case the sysfs people want to solve it.
>
> There are scalability reasons for dropping the sysfs_mutex in sysfs_readdir
> and I have some tenative patches for that. I will take a look after I
> come back from the holidays, in a couple of days. I don't understand
> the issue as described.

Ok, hopefully the above chain explains it to you, and also makes it clear
that it's rather hard to break anywhere else, and it's not somebody else
doing anything "obviously bogus".

Btw, the scalability issues with readdir() in general is why I'd be open
to try to change the rules for filldir(), and always make it a kernel
space copy. I suspect a number of users would like being able to use
spinlocks over filldir, but it's currently impossible.

However, we have a lot of filldir implementations (knfsd and several
different system call interfaces), and while some of them already use
kernel buffers (eg knfsd) others would need to allocate temporary storage
and then do a double copy. And I suspect even things like knfsd do things
like sleep and take locks, so it's possible that actually getting to the
point where filldir could be spinlock-safe would be infeasible.

Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/