Re: [PATCH v2 1/2] libnvdimm, region: fix flush hint detection crash

From: Dan Williams
Date: Wed Apr 26 2017 - 16:04:41 EST


On Wed, Apr 26, 2017 at 12:43 PM, Jeff Moyer <jmoyer@xxxxxxxxxx> wrote:
> Dan Williams <dan.j.williams@xxxxxxxxx> writes:
>
>> In the case where a dimm does not have any associated flush hints the
>> ndrd->flush_wpq array may be uninitialized leading to crashes with the
>> following signature:
>>
>> BUG: unable to handle kernel NULL pointer dereference at 0000000000000010
>> IP: region_visible+0x10f/0x160 [libnvdimm]
>>
>> Call Trace:
>> internal_create_group+0xbe/0x2f0
>> sysfs_create_groups+0x40/0x80
>> device_add+0x2d8/0x650
>> nd_async_device_register+0x12/0x40 [libnvdimm]
>> async_run_entry_fn+0x39/0x170
>> process_one_work+0x212/0x6c0
>> ? process_one_work+0x197/0x6c0
>> worker_thread+0x4e/0x4a0
>> kthread+0x10c/0x140
>> ? process_one_work+0x6c0/0x6c0
>> ? kthread_create_on_node+0x60/0x60
>> ret_from_fork+0x31/0x40
>
> Sorry for being dense, but I'm having a tough time connecting the dots
> here. How does region_visible trip over the missing (not uninitialized,
> you're actually walking off the end of the structure) flush_wpq array?

You're not dense, or at least no denser than I am: I didn't immediately
understand where this failure was coming from either. I just happened to
trigger it while running patch 2, and the current code looked unsafe by
inspection.
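
For anyone following along, here is a rough userspace sketch of the
hazard Jeff describes: a structure that ends in a flexible array, where
the array may have been allocated with zero slots. This is not the
libnvdimm code and not the patch itself. The names region_data,
num_hints, region_data_alloc, region_set_flush_hint and
region_has_flush_hint are made up for illustration, and the bounds check
is just one plausible way to keep a reader from walking off the end.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for per-region data; not the kernel structure. */
struct region_data {
	unsigned int num_hints;	/* slots actually allocated in flush_wpq[] */
	void *flush_wpq[];	/* flexible array member, may have zero slots */
};

/* Allocate the header plus exactly num_hints zeroed pointer slots. */
static struct region_data *region_data_alloc(unsigned int num_hints)
{
	struct region_data *rd;

	rd = calloc(1, sizeof(*rd) + num_hints * sizeof(rd->flush_wpq[0]));
	if (rd)
		rd->num_hints = num_hints;
	return rd;
}

/* Store a hint; the caller must stay within the allocated slots. */
static void region_set_flush_hint(struct region_data *rd, unsigned int idx,
				  void *hint)
{
	assert(rd && idx < rd->num_hints);
	rd->flush_wpq[idx] = hint;
}

/*
 * Return 1 if slot idx exists and holds a non-NULL hint, 0 otherwise.
 * Without the idx check, a region allocated with zero hints would be
 * read past the end of its allocation.
 */
static int region_has_flush_hint(const struct region_data *rd,
				 unsigned int idx)
{
	if (!rd || idx >= rd->num_hints)
		return 0;
	return rd->flush_wpq[idx] != NULL;
}
```

With region_data_alloc(0) the array has no storage at all, so any
unguarded read of flush_wpq[0] touches memory outside the allocation.
Recording the slot count next to the array lets the reader reject that
case instead.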

> Anyway, the fix looks valid.
>
> Reviewed-by: Jeff Moyer <jmoyer@xxxxxxxxxx>

Thanks!