Re: [Patch v2] sysfs: add lockdep class support to s_active

From: Xiaotian Feng
Date: Fri Feb 05 2010 - 02:09:59 EST


On Fri, Feb 5, 2010 at 2:42 PM, Amerigo Wang <amwang@xxxxxxxxxx> wrote:
> Recently we met a lockdep warning from sysfs during s2ram or cpu hotplug.
> As reported by several people, it is something like:
>
> [ 6967.926563] ACPI: Preparing to enter system sleep state S3
> [ 6967.956156] Disabling non-boot CPUs ...
> [ 6967.970401]
> [ 6967.970408] =============================================
> [ 6967.970419] [ INFO: possible recursive locking detected ]
> [ 6967.970431] 2.6.33-rc2-git6 #27
> [ 6967.970439] ---------------------------------------------
> [ 6967.970450] pm-suspend/22147 is trying to acquire lock:
> [ 6967.970460]  (s_active){++++.+}, at: [<c10d2941>] sysfs_hash_and_remove+0x3d/0x4f
> [ 6967.970493]
> [ 6967.970497] but task is already holding lock:
> [ 6967.970506]  (s_active){++++.+}, at: [<c10d4110>] sysfs_get_active_two+0x16/0x36
> [...]
>
> Eric already provided a patch for this[1], but it still can't fix the
> problem. Based on his work and Peter's suggestion, I wrote this patch;
> hopefully we can fix the warning completely.
>
> This patch puts sysfs s_active into two classes, one for PM and the other
> for the rest, so lockdep will distinguish them.

I don't think this patch addresses the root cause: we have a similar
warning that is not related to PM.
Nick reported it when he was trying to switch the elevator. It is
reproducible with "echo deadline >/sys/block/sdx/queue/scheduler"
while the kernel is using cfq.

[ INFO: possible recursive locking detected ]
2.6.33-rc6 #1
---------------------------------------------
sh/889 is trying to acquire lock:
(s_active){++++.+}, at: [<7820a975>] sysfs_addrm_finish+0x27/0x4e

but task is already holding lock:
(s_active){++++.+}, at: [<7820ab82>] sysfs_get_active_two+0x18/0x3e

other info that might help us debug this:
4 locks held by sh/889:
#0: (&buffer->mutex){+.+.+.}, at: [<7820984e>] sysfs_write_file+0x20/0x99
#1: (s_active){++++.+}, at: [<7820ab82>] sysfs_get_active_two+0x18/0x3e
#2: (s_active){++++.+}, at: [<7820ab91>] sysfs_get_active_two+0x27/0x3e
#3: (&q->sysfs_lock){+.+.+.}, at: [<78289e95>] queue_attr_store+0x2e/0x68

stack backtrace:
Pid: 889, comm: sh Not tainted 2.6.33-rc6 #1
Call Trace:
[<784a6966>] ? printk+0xf/0x11
[<781752a1>] print_deadlock_bug+0x99/0xa3
[<781753c6>] check_deadlock+0x11b/0x140
[<781763e5>] validate_chain+0x4ec/0x4f9
[<78176a68>] __lock_acquire+0x676/0x6cf
[<78176b64>] lock_acquire+0xa3/0xbc
[<7820a975>] ? sysfs_addrm_finish+0x27/0x4e
[<7820a37a>] sysfs_deactivate+0x6c/0xa4
[<7820a975>] ? sysfs_addrm_finish+0x27/0x4e
[<7820a975>] sysfs_addrm_finish+0x27/0x4e
[<7820aa3a>] sysfs_remove_dir+0x62/0x72
[<7829d6dd>] kobject_del+0x11/0x32
[<78283406>] __elv_unregister_queue+0x18/0x20
[<78283c66>] elevator_switch+0x6d/0x11b
[<78283d92>] elv_iosched_store+0x7e/0x9b
[<78289eb8>] queue_attr_store+0x51/0x68
[<78209894>] sysfs_write_file+0x66/0x99
[<781cd460>] vfs_write+0x8a/0x108
[<781cd578>] sys_write+0x3c/0x63
[<78125b90>] sysenter_do_call+0x12/0x36
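
To illustrate why splitting out a PM class would not silence this trace,
here is a rough userspace model of a same-class recursion check. It is only
a sketch: the names (held_locks, acquire, CLASS_NORMAL, CLASS_PM_CONTROL)
are made up for illustration, it ignores read/write state and nesting
annotations, and it is not how lockdep is actually implemented.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_HELD 8

/* Hypothetical stand-ins for the two classes in the quoted patch. */
enum lock_class { CLASS_NORMAL, CLASS_PM_CONTROL };

/* Per-task stack of the lock classes currently held. */
struct held_locks {
	enum lock_class cls[MAX_HELD];
	size_t depth;
};

/* True if a lock of class 'cls' is already held by this task. */
static bool would_report_recursion(const struct held_locks *h,
				   enum lock_class cls)
{
	for (size_t i = 0; i < h->depth; i++)
		if (h->cls[i] == cls)
			return true;
	return false;
}

/*
 * Record an acquisition. Returns false when the stack is full or when
 * the simplified check would report "possible recursive locking".
 */
static bool acquire(struct held_locks *h, enum lock_class cls)
{
	if (h->depth >= MAX_HELD || would_report_recursion(h, cls))
		return false;
	h->cls[h->depth++] = cls;
	return true;
}
```

In this model a second CLASS_NORMAL acquisition is flagged while one is
already held, whereas CLASS_PM_CONTROL is not. The elevator switch never
touches a PM attribute, so, assuming every dirent involved stays in the
normal class, the two-class split would not change the report.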

>
> 1. http://lkml.org/lkml/2010/1/10/282
>
>
> Reported-by: Benjamin Herrenschmidt <benh@xxxxxxxxxxxxxxxxxxx>
> Reported-by: Larry Finger <Larry.Finger@xxxxxxxxxxxx>
> Reported-by: Miles Lane <miles.lane@xxxxxxxxx>
> Reported-by: Heiko Carstens <heiko.carstens@xxxxxxxxxx>
> Signed-off-by: WANG Cong <amwang@xxxxxxxxxx>
> Cc: Eric W. Biederman <ebiederm@xxxxxxxxxxxx>
> Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
> Cc: Tejun Heo <tj@xxxxxxxxxx>
> Cc: Greg Kroah-Hartman <gregkh@xxxxxxx>
>
> ---
> diff --git a/fs/sysfs/dir.c b/fs/sysfs/dir.c
> index 699f371..d7de269 100644
> --- a/fs/sysfs/dir.c
> +++ b/fs/sysfs/dir.c
> @@ -354,7 +354,6 @@ struct sysfs_dirent *sysfs_new_dirent(const char *name, umode_t mode, int type)
>
>         atomic_set(&sd->s_count, 1);
>         atomic_set(&sd->s_active, 0);
> -       sysfs_dirent_init_lockdep(sd);
>
>         sd->s_name = name;
>         sd->s_mode = mode;
> diff --git a/fs/sysfs/file.c b/fs/sysfs/file.c
> index dc30d9e..97e397a 100644
> --- a/fs/sysfs/file.c
> +++ b/fs/sysfs/file.c
> @@ -24,6 +24,8 @@
>
>  #include "sysfs.h"
>
> +static struct lock_class_key sysfs_classes[SYSFS_NR_CLASSES];
> +
>  /* used in crash dumps to help with debugging */
>  static char last_sysfs_file[PATH_MAX];
>  void sysfs_printk_last_file(void)
> @@ -504,11 +506,16 @@ int sysfs_add_file_mode(struct sysfs_dirent *dir_sd,
>         struct sysfs_addrm_cxt acxt;
>         struct sysfs_dirent *sd;
>         int rc;
> +       int class;
>
>         sd = sysfs_new_dirent(attr->name, mode, type);
>         if (!sd)
>                 return -ENOMEM;
>         sd->s_attr.attr = (void *)attr;
> +       class = SYSFS_ATTR_NORMAL;
> +       if (sysfs_type(sd) == SYSFS_KOBJ_ATTR)
> +               class = sd->s_attr.attr->class;
> +       lockdep_set_class_and_name(sd, &sysfs_classes[class], "s_active");
>
>         sysfs_addrm_start(&acxt, dir_sd);
>         rc = sysfs_add_one(&acxt, sd);
> diff --git a/fs/sysfs/sysfs.h b/fs/sysfs/sysfs.h
> index cdd9377..dde4d73 100644
> --- a/fs/sysfs/sysfs.h
> +++ b/fs/sysfs/sysfs.h
> @@ -88,17 +88,6 @@ static inline unsigned int sysfs_type(struct sysfs_dirent *sd)
>         return sd->s_flags & SYSFS_TYPE_MASK;
>  }
>
> -#ifdef CONFIG_DEBUG_LOCK_ALLOC
> -#define sysfs_dirent_init_lockdep(sd)                          \
> -do {                                                           \
> -       static struct lock_class_key __key;                     \
> -                                                               \
> -       lockdep_init_map(&sd->dep_map, "s_active", &__key, 0);  \
> -} while(0)
> -#else
> -#define sysfs_dirent_init_lockdep(sd) do {} while(0)
> -#endif
> -
>  /*
>   * Context structure to be used while adding/removing nodes.
>   */
> diff --git a/include/linux/sysfs.h b/include/linux/sysfs.h
> index cfa8308..2b91b74 100644
> --- a/include/linux/sysfs.h
> +++ b/include/linux/sysfs.h
> @@ -20,6 +20,12 @@
>  struct kobject;
>  struct module;
>
> +enum sysfs_attr_lock_class {
> +       SYSFS_ATTR_NORMAL,
> +       SYSFS_ATTR_PM_CONTROL,
> +       SYSFS_NR_CLASSES,
> +};
> +
>  /* FIXME
>   * The *owner field is no longer used.
>   * x86 tree has been cleaned up. The owner
> @@ -29,6 +35,7 @@ struct attribute {
>         const char              *name;
>         struct module           *owner;
>         mode_t                  mode;
> +       enum sysfs_attr_lock_class      class;
>  };
>
>  struct attribute_group {
> diff --git a/kernel/power/power.h b/kernel/power/power.h
> index 46c5a26..67a6fe7 100644
> --- a/kernel/power/power.h
> +++ b/kernel/power/power.h
> @@ -54,13 +54,14 @@ extern int hibernation_platform_enter(void);
>  extern int pfn_is_nosave(unsigned long);
>
>  #define power_attr(_name) \
> -static struct kobj_attribute _name##_attr = {  \
> -       .attr   = {                             \
> -               .name = __stringify(_name),     \
> -               .mode = 0644,                   \
> -       },                                      \
> -       .show   = _name##_show,                 \
> -       .store  = _name##_store,                \
> +static struct kobj_attribute _name##_attr = {          \
> +       .attr   = {                                     \
> +               .name = __stringify(_name),             \
> +               .mode = 0644,                           \
> +               .class = SYSFS_ATTR_PM_CONTROL,         \
> +       },                                              \
> +       .show   = _name##_show,                         \
> +       .store  = _name##_store,                        \
>  }
>
>  /* Preferred image size in bytes (default 500 MB) */
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
>