Re: [PATCH] drm/prime: fix extracting of the DMA addresses from a scatterlist

From: Marek Szyprowski
Date: Fri Mar 27 2020 - 05:00:42 EST


Hi Shane

On 2020-03-27 09:55, Shane Francis wrote:
> On Fri, Mar 27, 2020 at 8:24 AM Marek Szyprowski
> <m.szyprowski@xxxxxxxxxxx> wrote:
>> Scatterlist elements contains both pages and DMA addresses, but in general,
>> one cannot assume 1:1 relation between them. The sg->length is the size of
>> the physical memory chunk described by sg->page, while sg_dma_length(sg) is
>> the size of the DMA (IO virtual) chunk described by sg_dma_address(sg).
>>
>> The proper way of extracting both: pages and DMA addresses of the whole
>> buffer described by a scatterlist it to iterate independently over the
>> sg->pages/sg->length and sg_dma_address(sg)/sg_dma_len(sg) entries.
>>
>> Fixes: 42e67b479eab ("drm/prime: use dma length macro when mapping sg")
>> Signed-off-by: Marek Szyprowski <m.szyprowski@xxxxxxxxxxx>
>> ---
>> This fixes the following kernel panic observed on ARM 32bit Samsung
>> Exynos5250-based Snow Chromebook since linux-next 20200326, which
>> introduced the commit 42e67b479eab ("drm/prime: use dma length macro when
>> mapping sg"):
>>
>> [drm] Initialized panfrost 1.1.0 20180908 for 11800000.gpu on minor 0
>> [drm] Exynos DRM: using 14400000.fimd device for DMA mapping operations
>> exynos-drm exynos-drm: bound 14400000.fimd (ops fimd_component_ops)
>> exynos-drm exynos-drm: bound 14450000.mixer (ops mixer_component_ops)
>> exynos-drm exynos-drm: bound 145b0000.dp-controller (ops exynos_dp_ops)
>> exynos-drm exynos-drm: bound 14530000.hdmi (ops hdmi_component_ops)
>> [drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
>> ------------[ cut here ]------------
>> WARNING: CPU: 0 PID: 12 at mm/vmalloc.c:163 vmap_page_range_noflush+0x18c/0x1b0
>> Modules linked in:
>> CPU: 0 PID: 12 Comm: kworker/0:1 Not tainted 5.6.0-rc7-next-20200326-00060-gbb3f893b3f08 #7929
>> Hardware name: Samsung Exynos (Flattened Device Tree)
>> Workqueue: events deferred_probe_work_func
>> [<c0111f20>] (unwind_backtrace) from [<c010d128>] (show_stack+0x10/0x14)
>> [<c010d128>] (show_stack) from [<c0a78178>] (dump_stack+0xa4/0xd0)
>> [<c0a78178>] (dump_stack) from [<c01271a0>] (__warn+0xf4/0x10c)
>> [<c01271a0>] (__warn) from [<c0127268>] (warn_slowpath_fmt+0xb0/0xb8)
>> [<c0127268>] (warn_slowpath_fmt) from [<c0294fdc>] (vmap_page_range_noflush+0x18c/0x1b0)
>> [<c0294fdc>] (vmap_page_range_noflush) from [<c02952fc>] (map_vm_area+0x30/0x6c)
>> [<c02952fc>] (map_vm_area) from [<c0298df8>] (vmap+0x64/0x80)
>> [<c0298df8>] (vmap) from [<c05f71f4>] (exynos_drm_fbdev_create+0x148/0x270)
>> [<c05f71f4>] (exynos_drm_fbdev_create) from [<c05bde44>] (__drm_fb_helper_initial_config_and_unlock+0x388/0x5dc)
>> [<c05bde44>] (__drm_fb_helper_initial_config_and_unlock) from [<c05f743c>] (exynos_drm_fbdev_init+0x78/0xe0)
>> [<c05f743c>] (exynos_drm_fbdev_init) from [<c05f59f4>] (exynos_drm_bind+0x14c/0x19c)
>> [<c05f59f4>] (exynos_drm_bind) from [<c0614784>] (try_to_bring_up_master+0x208/0x2bc)
>> [<c0614784>] (try_to_bring_up_master) from [<c0614ac4>] (__component_add+0xb0/0x178)
>> [<c0614ac4>] (__component_add) from [<c05fb488>] (exynos_dp_probe+0x94/0x12c)
>> [<c05fb488>] (exynos_dp_probe) from [<c061e330>] (platform_drv_probe+0x48/0x9c)
>> [<c061e330>] (platform_drv_probe) from [<c061badc>] (really_probe+0x1c4/0x470)
>> [<c061badc>] (really_probe) from [<c061bf1c>] (driver_probe_device+0x78/0x1bc)
>> [<c061bf1c>] (driver_probe_device) from [<c0619c9c>] (bus_for_each_drv+0x74/0xb8)
>> [<c0619c9c>] (bus_for_each_drv) from [<c061b878>] (__device_attach+0xd4/0x16c)
>> [<c061b878>] (__device_attach) from [<c061aa38>] (bus_probe_device+0x88/0x90)
>> [<c061aa38>] (bus_probe_device) from [<c061af5c>] (deferred_probe_work_func+0x4c/0xd0)
>> [<c061af5c>] (deferred_probe_work_func) from [<c0149f9c>] (process_one_work+0x30c/0x880)
>> [<c0149f9c>] (process_one_work) from [<c014a568>] (worker_thread+0x58/0x5a4)
>> [<c014a568>] (worker_thread) from [<c0151a5c>] (kthread+0x154/0x19c)
>> [<c0151a5c>] (kthread) from [<c0100114>] (ret_from_fork+0x14/0x20)
>> Exception stack(0xee8fdfb0 to 0xee8fdff8)
>> dfa0: 00000000 00000000 00000000 00000000
>> dfc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
>> dfe0: 00000000 00000000 00000000 00000000 00000013 00000000
>> irq event stamp: 54037
>> hardirqs last enabled at (54055): [<c019ed50>] console_unlock+0x58c/0x6a8
>> hardirqs last disabled at (54062): [<c019e890>] console_unlock+0xcc/0x6a8
>> softirqs last enabled at (54078): [<c0101724>] __do_softirq+0x4fc/0x5f4
>> softirqs last disabled at (54089): [<c0130248>] irq_exit+0x16c/0x170
>> ---[ end trace 74519922e0e4625d ]---
>> exynos4-fb 14400000.fimd: [drm:exynos_drm_fbdev_create] *ERROR* failed to map pages to kernel space.
>> exynos-drm exynos-drm: [drm:exynos_drm_fbdev_init] *ERROR* failed to set up hw configuration.
>> ------------[ cut here ]------------
>> WARNING: CPU: 0 PID: 12 at kernel/locking/mutex-debug.c:103 mutex_destroy+0x84/0x88
>> DEBUG_LOCKS_WARN_ON(mutex_is_locked(lock))
>> Modules linked in:
>> CPU: 0 PID: 12 Comm: kworker/0:1 Tainted: G W 5.6.0-rc7-next-20200326-00060-gbb3f893b3f08 #7929
>> Hardware name: Samsung Exynos (Flattened Device Tree)
>> Workqueue: events deferred_probe_work_func
>> [<c0111f20>] (unwind_backtrace) from [<c010d128>] (show_stack+0x10/0x14)
>> [<c010d128>] (show_stack) from [<c0a78178>] (dump_stack+0xa4/0xd0)
>> [<c0a78178>] (dump_stack) from [<c01271a0>] (__warn+0xf4/0x10c)
>> [<c01271a0>] (__warn) from [<c012722c>] (warn_slowpath_fmt+0x74/0xb8)
>> [<c012722c>] (warn_slowpath_fmt) from [<c01892a4>] (mutex_destroy+0x84/0x88)
>> [<c01892a4>] (mutex_destroy) from [<c05be1a4>] (drm_fb_helper_fini.part.1+0x9c/0xd4)
>> [<c05be1a4>] (drm_fb_helper_fini.part.1) from [<c05f7464>] (exynos_drm_fbdev_init+0xa0/0xe0)
>> [<c05f7464>] (exynos_drm_fbdev_init) from [<c05f59f4>] (exynos_drm_bind+0x14c/0x19c)
>> [<c05f59f4>] (exynos_drm_bind) from [<c0614784>] (try_to_bring_up_master+0x208/0x2bc)
>> [<c0614784>] (try_to_bring_up_master) from [<c0614ac4>] (__component_add+0xb0/0x178)
>> [<c0614ac4>] (__component_add) from [<c05fb488>] (exynos_dp_probe+0x94/0x12c)
>> [<c05fb488>] (exynos_dp_probe) from [<c061e330>] (platform_drv_probe+0x48/0x9c)
>> [<c061e330>] (platform_drv_probe) from [<c061badc>] (really_probe+0x1c4/0x470)
>> [<c061badc>] (really_probe) from [<c061bf1c>] (driver_probe_device+0x78/0x1bc)
>> [<c061bf1c>] (driver_probe_device) from [<c0619c9c>] (bus_for_each_drv+0x74/0xb8)
>> [<c0619c9c>] (bus_for_each_drv) from [<c061b878>] (__device_attach+0xd4/0x16c)
>> [<c061b878>] (__device_attach) from [<c061aa38>] (bus_probe_device+0x88/0x90)
>> [<c061aa38>] (bus_probe_device) from [<c061af5c>] (deferred_probe_work_func+0x4c/0xd0)
>> [<c061af5c>] (deferred_probe_work_func) from [<c0149f9c>] (process_one_work+0x30c/0x880)
>> [<c0149f9c>] (process_one_work) from [<c014a568>] (worker_thread+0x58/0x5a4)
>> [<c014a568>] (worker_thread) from [<c0151a5c>] (kthread+0x154/0x19c)
>> [<c0151a5c>] (kthread) from [<c0100114>] (ret_from_fork+0x14/0x20)
>> Exception stack(0xee8fdfb0 to 0xee8fdff8)
>> dfa0: 00000000 00000000 00000000 00000000
>> dfc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
>> dfe0: 00000000 00000000 00000000 00000000 00000013 00000000
>> irq event stamp: 56283
>> hardirqs last enabled at (56283): [<c02b189c>] kfree+0x198/0x3e4
>> hardirqs last disabled at (56282): [<c02b17d0>] kfree+0xcc/0x3e4
>> softirqs last enabled at (56262): [<c0101724>] __do_softirq+0x4fc/0x5f4
>> softirqs last disabled at (56255): [<c0130248>] irq_exit+0x16c/0x170
>> ---[ end trace 74519922e0e4625e ]---
>> exynos-sysmmu 14640000.sysmmu: 14400000.fimd: PAGE FAULT occurred at 0x20000000
>> ------------[ cut here ]------------
>> kernel BUG at drivers/iommu/exynos-iommu.c:447!
>> Internal error: Oops - BUG: 0 [#1] PREEMPT SMP ARM
>> Modules linked in:
>> CPU: 0 PID: 52 Comm: kworker/0:2 Tainted: G W 5.6.0-rc7-next-20200326-00060-gbb3f893b3f08 #7929
>> Hardware name: Samsung Exynos (Flattened Device Tree)
>> Workqueue: events output_poll_execute
>> PC is at exynos_sysmmu_irq+0x210/0x258
>> LR is at report_iommu_fault+0x144/0x1cc
>> pc : [<c05a6e34>] lr : [<c05a13ec>] psr: a0000193
>> sp : cfafdbe8 ip : 2d495ebb fp : 00000200
>> r10: eeb187e0 r9 : 20000000 r8 : cf300000
>> r7 : c11d4fb0 r6 : eeb187c0 r5 : 00000000 r4 : c0b59f74
>> r3 : cfafc000 r2 : 00010001 r1 : 00000000 r0 : ffffffda
>> Flags: NzCv IRQs off FIQs on Mode SVC_32 ISA ARM Segment none
>> Control: 10c5387d Table: 4000406a DAC: 00000051
>> Process kworker/0:2 (pid: 52, stack limit = 0x(ptrval))
>> Stack: (0xcfafdbe8 to 0xcfafe000)
>> ...
>> [<c05a6e34>] (exynos_sysmmu_irq) from [<c01a24f4>] (__handle_irq_event_percpu+0x68/0x42c)
>> [<c01a24f4>] (__handle_irq_event_percpu) from [<c01a28e4>] (handle_irq_event_percpu+0x2c/0x7c)
>> [<c01a28e4>] (handle_irq_event_percpu) from [<c01a296c>] (handle_irq_event+0x38/0x5c)
>> [<c01a296c>] (handle_irq_event) from [<c01a7160>] (handle_level_irq+0xcc/0x150)
>> [<c01a7160>] (handle_level_irq) from [<c01a1574>] (generic_handle_irq+0x34/0x44)
>> [<c01a1574>] (generic_handle_irq) from [<c0509a5c>] (combiner_handle_cascade_irq+0x8c/0xdc)
>> [<c0509a5c>] (combiner_handle_cascade_irq) from [<c01a1574>] (generic_handle_irq+0x34/0x44)
>> [<c01a1574>] (generic_handle_irq) from [<c01a1bbc>] (__handle_domain_irq+0x7c/0xec)
>> [<c01a1bbc>] (__handle_domain_irq) from [<c050a024>] (gic_handle_irq+0x58/0x9c)
>> [<c050a024>] (gic_handle_irq) from [<c0100af0>] (__irq_svc+0x70/0xb0)
>> Exception stack(0xcfafdd00 to 0xcfafdd48)
>> dd00: c02b189c 00000000 2df3c000 00000000 cf2ed3c0 ee801cc0 60000113 ef1ddda0
>> dd20: c05d5500 00000000 cf2c9800 cf2ea8b8 00003220 cfafdd50 c02b189c c02b18a0
>> dd40: 60000113 ffffffff
>> [<c0100af0>] (__irq_svc) from [<c02b18a0>] (kfree+0x19c/0x3e4)
>> [<c02b18a0>] (kfree) from [<c05d5500>] (drm_atomic_state_default_clear+0x1b8/0x2dc)
>> [<c05d5500>] (drm_atomic_state_default_clear) from [<c05d5650>] (__drm_atomic_state_free+0x10/0x50)
>> [<c05d5650>] (__drm_atomic_state_free) from [<c05e9d88>] (drm_client_modeset_commit_atomic+0x240/0x26c)
>> [<c05e9d88>] (drm_client_modeset_commit_atomic) from [<c05e9df8>] (drm_client_modeset_commit_locked+0x44/0x1d0)
>> [<c05e9df8>] (drm_client_modeset_commit_locked) from [<c05e9fa8>] (drm_client_modeset_commit+0x24/0x40)
>> [<c05e9fa8>] (drm_client_modeset_commit) from [<c05be3ec>] (drm_fb_helper_restore_fbdev_mode_unlocked+0x58/0xa4)
>> [<c05be3ec>] (drm_fb_helper_restore_fbdev_mode_unlocked) from [<c05be468>] (drm_fb_helper_set_par+0x30/0x5c)
>> [<c05be468>] (drm_fb_helper_set_par) from [<c05be538>] (drm_fb_helper_hotplug_event.part.5+0xa4/0xbc)
>> [<c05be538>] (drm_fb_helper_hotplug_event.part.5) from [<c05ac148>] (drm_kms_helper_hotplug_event+0x24/0x30)
>> [<c05ac148>] (drm_kms_helper_hotplug_event) from [<c05ac238>] (output_poll_execute+0xb8/0x1b4)
>> [<c05ac238>] (output_poll_execute) from [<c0149f9c>] (process_one_work+0x30c/0x880)
>> [<c0149f9c>] (process_one_work) from [<c014a568>] (worker_thread+0x58/0x5a4)
>> [<c014a568>] (worker_thread) from [<c0151a5c>] (kthread+0x154/0x19c)
>> [<c0151a5c>] (kthread) from [<c0100114>] (ret_from_fork+0x14/0x20)
>> Exception stack(0xcfafdfb0 to 0xcfafdff8)
>> dfa0: 00000000 00000000 00000000 00000000
>> dfc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
>> dfe0: 00000000 00000000 00000000 00000000 00000013 00000000
>> Code: e34c00de e300119e ebee00e1 eaffff81 (e7f001f2)
>> ---[ end trace 74519922e0e4625f ]---
>> Kernel panic - not syncing: Fatal exception in interrupt
>> CPU1: stopping
>> CPU: 1 PID: 0 Comm: swapper/1 Tainted: G D W 5.6.0-rc7-next-20200326-00060-gbb3f893b3f08 #7929
>> Hardware name: Samsung Exynos (Flattened Device Tree)
>> [<c0111f20>] (unwind_backtrace) from [<c010d128>] (show_stack+0x10/0x14)
>> [<c010d128>] (show_stack) from [<c0a78178>] (dump_stack+0xa4/0xd0)
>> [<c0a78178>] (dump_stack) from [<c0110ad4>] (handle_IPI+0x3b4/0x440)
>> [<c0110ad4>] (handle_IPI) from [<c050a064>] (gic_handle_irq+0x98/0x9c)
>> [<c050a064>] (gic_handle_irq) from [<c0100af0>] (__irq_svc+0x70/0xb0)
>> Exception stack(0xee8fff58 to 0xee8fffa0)
>> ff40: c0109534 00000000
>> ff60: 2df50000 00000000 ee8fe000 c1108ee8 c1108f2c 00000002 00000000 c0de63c0
>> ff80: 00000000 c1075fe8 2d495ebb ee8fffa8 c0109534 c0109538 60000013 ffffffff
>> [<c0100af0>] (__irq_svc) from [<c0109538>] (arch_cpu_idle+0x24/0x44)
>> [<c0109538>] (arch_cpu_idle) from [<c0163a74>] (do_idle+0x1d8/0x2d4)
>> [<c0163a74>] (do_idle) from [<c0163f24>] (cpu_startup_entry+0x18/0x1c)
>> [<c0163f24>] (cpu_startup_entry) from [<401018ac>] (0x401018ac)
>> ---[ end Kernel panic - not syncing: Fatal exception in interrupt ]---
>> ---
>> drivers/gpu/drm/drm_prime.c | 30 ++++++++++++++++++------------
>> 1 file changed, 18 insertions(+), 12 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/drm_prime.c b/drivers/gpu/drm/drm_prime.c
>> index 1de2cde2277c..424db18987f6 100644
>> --- a/drivers/gpu/drm/drm_prime.c
>> +++ b/drivers/gpu/drm/drm_prime.c
>> @@ -962,27 +962,33 @@ int drm_prime_sg_to_page_addr_arrays(struct sg_table *sgt, struct page **pages,
>> unsigned count;
>> struct scatterlist *sg;
>> struct page *page;
>> - u32 len, index;
>> + u32 page_len, page_index;
>> dma_addr_t addr;
>> + u32 dma_len, dma_index;
>>
>> - index = 0;
>> + page_index = 0;
>> + dma_index = 0;
>> for_each_sg(sgt->sgl, sg, sgt->nents, count) {
>> - len = sg_dma_len(sg);
>> + page_len = sg->length;
>> page = sg_page(sg);
>> + dma_len = sg_dma_len(sg);
>> addr = sg_dma_address(sg);
>>
>> - while (len > 0) {
>> - if (WARN_ON(index >= max_entries))
>> + while (pages && page_len > 0) {
>> + if (WARN_ON(page_index >= max_entries))
>> return -1;
>> - if (pages)
>> - pages[index] = page;
>> - if (addrs)
>> - addrs[index] = addr;
>> -
>> + pages[page_index] = page;
>> page++;
>> + page_len -= PAGE_SIZE;
>> + page_index++;
>> + }
>> + while (addrs && dma_len > 0) {
>> + if (WARN_ON(dma_index >= max_entries))
>> + return -1;
>> + addrs[dma_index] = addr;
>> addr += PAGE_SIZE;
>> - len -= PAGE_SIZE;
>> - index++;
>> + dma_len -= PAGE_SIZE;
>> + dma_index++;
>> }
>> }
>> return 0;
>> --
>> 2.17.1
>>
> I have tested the above patch against my original issues with amdgpu
> and radeon drivers and everything is still working as expected.
>
> Sorry I missed this in my original patches.

No problem. Thanks for testing!

Best regards
--
Marek Szyprowski, PhD
Samsung R&D Institute Poland