Re: [BUG] BUG: kernel NULL pointer dereference at ttm_device_init+0xb4

From: Bhardwaj, Rajneesh
Date: Mon Jan 22 2024 - 20:36:30 EST



On 1/22/2024 7:34 PM, Steven Rostedt wrote:
On Mon, 22 Jan 2024 19:29:41 -0500
"Bhardwaj, Rajneesh" <rajneesh.bhardwaj@xxxxxxx> wrote:

In one of my previous revisions of this patch when I was experimenting,
I used something like below. Wonder if that could work in your case
and/or in general.


diff --git a/drivers/gpu/drm/ttm/ttm_device.c
b/drivers/gpu/drm/ttm/ttm_device.c

index 43e27ab77f95..4c3902b94be4 100644

--- a/drivers/gpu/drm/ttm/ttm_device.c

+++ b/drivers/gpu/drm/ttm/ttm_device.c

@@ -195,6 +195,7 @@ int ttm_device_init(struct ttm_device *bdev, struct
ttm_device_funcs *funcs,

bool use_dma_alloc, bool use_dma32){

struct ttm_global *glob = &ttm_glob;

+bool node_has_cpu = false;

int ret;

if (WARN_ON(vma_manager == NULL))

@@ -213,7 +214,12 @@ int ttm_device_init(struct ttm_device *bdev, struct
ttm_device_funcs *funcs,

bdev->funcs = funcs;

ttm_sys_man_init(bdev);

-ttm_pool_init(&bdev->pool, dev, NUMA_NO_NODE, use_dma_alloc, use_dma32);

+

+node_has_cpu = node_state(dev->numa_node, N_CPU);
Considering that qxl_ttm_init() passes in dev = NULL, the above would blow
up just the same.


I agree, I think we need something like you suggested i.e.

+ ttm_pool_init(&bdev->pool, dev, dev ? dev_to_node(dev) : NUMA_NO_NODE,
+ use_dma_alloc, use_dma32);


I am not quite sure if the above node_has_cpu change will be a better solution in general, along with the NULL pointer check as you suggested. If you prefer that, then I can send a fix otherwise, your fix looks good to me.



-- Steve


+if (node_has_cpu)

+ttm_pool_init(&bdev->pool, dev, dev->numa_node, use_dma_alloc, use_dma32);

+else

+ttm_pool_init(&bdev->pool, dev, NUMA_NO_NODE, use_dma_alloc,

+use_dma32);

bdev->vma_manager = vma_manager;

spin_lock_init(&bdev->lru_lock);


-- Steve