Re: 15c8410c67 ("mm/slob.c: respect list_head abstraction layer"): WARNING: CPU: 0 PID: 1 at lib/list_debug.c:28 __list_add_valid

From: Tobin C. Harding
Date: Wed Apr 03 2019 - 00:55:00 EST


On Wed, Apr 03, 2019 at 10:00:38AM +0800, kernel test robot wrote:
> Greetings,
>
> 0day kernel testing robot got the below dmesg and the first bad commit is
>
> https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git master
>
> commit 15c8410c67adefd26ea0df1f1b86e1836051784b
> Author: Tobin C. Harding <tobin@xxxxxxxxxx>
> AuthorDate: Fri Mar 29 10:01:23 2019 +1100
> Commit: Stephen Rothwell <sfr@xxxxxxxxxxxxxxxx>
> CommitDate: Sat Mar 30 16:09:41 2019 +1100
>
> mm/slob.c: respect list_head abstraction layer
>
> Currently we reach inside the list_head. This is a violation of the layer
> of abstraction provided by the list_head. It makes the code fragile.
> More importantly it makes the code wicked hard to understand.
>
> The code logic is based on the page in which an allocation was made, we
> want to modify the slob_list we are working on to have this page at the
> front. We already have a function to check if an entry is at the front of
> the list. Recently a function was added to list.h to do the list
> rotation. We can use these two functions to reduce line count, reduce
> code fragility, and reduce cognitive load required to read the code.
>
> Use list_head functions to interact with lists thereby maintaining the
> abstraction provided by the list_head structure.
>
> Link: http://lkml.kernel.org/r/20190318000234.22049-3-tobin@xxxxxxxxxx
> Signed-off-by: Tobin C. Harding <tobin@xxxxxxxxxx>
> Cc: Christoph Lameter <cl@xxxxxxxxx>
> Cc: David Rientjes <rientjes@xxxxxxxxxx>
> Cc: Joonsoo Kim <iamjoonsoo.kim@xxxxxxx>
> Cc: Pekka Enberg <penberg@xxxxxxxxxx>
> Cc: Roman Gushchin <guro@xxxxxx>
> Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
> Signed-off-by: Stephen Rothwell <sfr@xxxxxxxxxxxxxxxx>
>
> 2e1f88301e include/linux/list.h: add list_rotate_to_front()
> 15c8410c67 mm/slob.c: respect list_head abstraction layer
> 05d08e2995 Add linux-next specific files for 20190402
> +-------------------------------------------------------+------------+------------+---------------+
> |                                                       | 2e1f88301e | 15c8410c67 | next-20190402 |
> +-------------------------------------------------------+------------+------------+---------------+
> | boot_successes                                        | 1009       | 198        | 299           |
> | boot_failures                                         | 0          | 2          | 44            |
> | WARNING:at_lib/list_debug.c:#__list_add_valid         | 0          | 2          | 44            |
> | RIP:__list_add_valid                                  | 0          | 2          | 44            |
> | WARNING:at_lib/list_debug.c:#__list_del_entry_valid   | 0          | 2          | 25            |
> | RIP:__list_del_entry_valid                            | 0          | 2          | 25            |
> | WARNING:possible_circular_locking_dependency_detected | 0          | 2          | 44            |
> | RIP:_raw_spin_unlock_irqrestore                       | 0          | 2          | 2             |
> | BUG:kernel_hang_in_test_stage                         | 0          | 0          | 6             |
> | BUG:unable_to_handle_kernel                           | 0          | 0          | 1             |
> | Oops:#[##]                                            | 0          | 0          | 1             |
> | RIP:slob_page_alloc                                   | 0          | 0          | 1             |
> | Kernel_panic-not_syncing:Fatal_exception              | 0          | 0          | 1             |
> | RIP:delay_tsc                                         | 0          | 0          | 2             |
> +-------------------------------------------------------+------------+------------+---------------+
>
> [ 2.618737] db_root: cannot open: /etc/target
> [ 2.620114] mtdoops: mtd device (mtddev=name/number) must be supplied
> [ 2.620967] slram: not enough parameters.
> [ 2.621614] ------------[ cut here ]------------
> [ 2.622254] list_add corruption. prev->next should be next (ffffffffaeeb71b0), but was ffffcee1406d3f70. (prev=ffffcee140422508).

Is this perhaps a false positive, caused by the way we hackishly remove
the list_head 'head' and insert it back into the list? That might be
confusing the validation functions.
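
For reference, here is a minimal userspace sketch of the two pieces
involved, assuming I am reading list.h and lib/list_debug.c correctly.
The helper names mirror the kernel ones but this is not the kernel code:
list_add_valid() is a stand-in for __list_add_valid() that prints and
returns false instead of WARNing, and demo_rotate() is a hypothetical
helper that exists only for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

struct list_head {
	struct list_head *next, *prev;
};

static void init_list_head(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

/* Stand-in for the debug check: @new must go between adjacent @prev and @next. */
static bool list_add_valid(struct list_head *new, struct list_head *prev,
			   struct list_head *next)
{
	if (next->prev != prev) {
		fprintf(stderr, "list_add corruption. next->prev should be prev (%p), but was %p. (next=%p).\n",
			(void *)prev, (void *)next->prev, (void *)next);
		return false;
	}
	if (prev->next != next) {
		fprintf(stderr, "list_add corruption. prev->next should be next (%p), but was %p. (prev=%p).\n",
			(void *)next, (void *)prev->next, (void *)prev);
		return false;
	}
	if (new == prev || new == next) {
		fprintf(stderr, "list_add double add: new=%p, prev=%p, next=%p.\n",
			(void *)new, (void *)prev, (void *)next);
		return false;
	}
	return true;
}

/* Insert @new just before @head; refuses to touch the list if the check fails. */
static bool list_add_tail(struct list_head *new, struct list_head *head)
{
	struct list_head *prev = head->prev;

	if (!list_add_valid(new, prev, head))
		return false;
	head->prev = new;
	new->next = head;
	new->prev = prev;
	prev->next = new;
	return true;
}

/* Unlink @entry from whatever ring its pointers currently describe. */
static void list_del_entry(struct list_head *entry)
{
	entry->prev->next = entry->next;
	entry->next->prev = entry->prev;
}

/* Remove @head and re-insert it just before @list, so @list becomes first. */
static bool list_rotate_to_front(struct list_head *list, struct list_head *head)
{
	list_del_entry(head);
	return list_add_tail(head, list);
}

static bool list_is_first(const struct list_head *list,
			  const struct list_head *head)
{
	return list->prev == head;
}

/* Hypothetical helper: build head -> a -> b -> c, then rotate b to the front. */
static bool demo_rotate(void)
{
	struct list_head head, a, b, c;
	bool ok;

	init_list_head(&head);
	ok = list_add_tail(&a, &head) && list_add_tail(&b, &head) &&
	     list_add_tail(&c, &head);
	ok = ok && list_rotate_to_front(&b, &head);

	/* Expected ring after rotation: head -> b -> c -> a -> head. */
	assert(head.next == &b && b.next == &c && c.next == &a && a.next == &head);
	return ok && list_is_first(&b, &head);
}
```

On a consistent list the rotation passes the check in this sketch, so
the sketch by itself does not show where the warning comes from.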

Tobin