Re: [PATCH v3] fs: record task name which froze superblock

From: Dave Chinner
Date: Sat Feb 28 2015 - 16:31:49 EST


On Sat, Feb 28, 2015 at 05:25:57PM +0300, Alexey Dobriyan wrote:
> Freezing and thawing are separate system calls, task which is supposed
> to thaw filesystem/superblock can disappear due to crash or not thaw
> due to a bug. At least record task name (we can't take task_struct
> reference) to make support engineer's life easier.
>
> Hopefully 16 bytes per superblock isn't much.
>
> TASK_COMM_LEN definition (which is userspace ABI, see prctl(PR_SET_NAME)) is
> moved to userspace exported header to not drag sched.h into every fs.h inclusion.
>
> Signed-off-by: Alexey Dobriyan <adobriyan@xxxxxxxxx>

Freeze/thaw can be nested at the block level. That means the
sb->s_writers.freeze_comm can point at the wrong process. i.e.

Task A                          Task B
freeze_bdev
  freeze_super
    freeze_comm = A
                                freeze_bdev
.....
thaw_bdev
<device still frozen>
                                <crash>

At this point, the block device will never be thawed, but the debug
field is now pointing at the wrong task, i.e. the debug helper has
not recorded the process that is actually causing the problem, and
it leads us all off on a wild goose chase down the wrong path.

IMO, debug code is only useful if it's reliable.....
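
To make the sequence above concrete, here is a minimal userspace sketch
of the nesting problem. It is a model only, not kernel code: struct
model_bdev, model_freeze_bdev() and model_thaw_bdev() are hypothetical
names, and the counter is assumed to behave like a simple block-level
nesting count where only the first freeze records a name.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define COMM_LEN 16	/* same size as TASK_COMM_LEN in the patch */

/* Hypothetical userspace model of a freezable block device. */
struct model_bdev {
	int freeze_count;		/* block-level freeze nesting depth */
	int sb_frozen;			/* 1 while the superblock is frozen */
	char freeze_comm[COMM_LEN];	/* name recorded on superblock freeze */
};

/* Only the first freezer reaches the superblock and records its name. */
static void model_freeze_bdev(struct model_bdev *b, const char *comm)
{
	if (b->freeze_count++ > 0)
		return;		/* nested freeze: recorded name is untouched */
	b->sb_frozen = 1;
	snprintf(b->freeze_comm, sizeof(b->freeze_comm), "%s", comm);
}

/* Only the last thaw unfreezes the superblock and clears the name. */
static void model_thaw_bdev(struct model_bdev *b)
{
	assert(b->freeze_count > 0);
	if (--b->freeze_count > 0)
		return;		/* someone else still holds a freeze */
	b->sb_frozen = 0;
	b->freeze_comm[0] = '\0';
}
```

In this model, freezing as "taskA", then as "taskB", then thawing once
leaves sb_frozen set and freeze_comm still reading "taskA", even though
the only outstanding freeze belongs to task B.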

> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -303,9 +303,6 @@ extern char ___assert_task_state[1 - 2*!!(
>
> #endif
>
> -/* Task command name length */
> -#define TASK_COMM_LEN 16
> -
> #include <linux/spinlock.h>
>
> /*
> --- a/include/uapi/linux/sched.h
> +++ b/include/uapi/linux/sched.h
> @@ -49,4 +49,7 @@
> */
> #define SCHED_FLAG_RESET_ON_FORK 0x01
>
> +/* Task command name length */
> +#define TASK_COMM_LEN 16
> +
> #endif /* _UAPI_LINUX_SCHED_H */

That should be a separate patch, sent to the scheduler maintainers
for review. AFAICT, it isn't part of the user API - it's not defined
in the man page which just says "can be up to 16 bytes".
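
For reference, this is roughly what the 16 byte limit looks like from
userspace today. It is a sketch only: MY_COMM_LEN and
set_and_get_comm() are hypothetical names, and the size is hard-coded
from the man page wording because no exported constant is assumed to
exist.

```c
#include <assert.h>
#include <string.h>
#include <sys/prctl.h>

/* Hypothetical local constant taken from the prctl(2) man page text. */
#define MY_COMM_LEN 16

/*
 * Set the calling thread's name, then read it back into out, which
 * must hold at least MY_COMM_LEN bytes. Returns 0 on success, -1 on
 * failure.
 */
static int set_and_get_comm(const char *name, char *out)
{
	assert(name != NULL && out != NULL);
	if (prctl(PR_SET_NAME, (unsigned long)name, 0, 0, 0) != 0)
		return -1;
	if (prctl(PR_GET_NAME, (unsigned long)out, 0, 0, 0) != 0)
		return -1;
	return 0;
}
```

A name longer than the limit is silently truncated, so the string read
back is at most 15 characters plus the terminating NUL.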

Cheers,

Dave.
--
Dave Chinner
david@xxxxxxxxxxxxx