Re: [PATCH v2 07/22] drm/msm: Do rpm get sooner in the submit path

From: Rob Clark
Date: Sun Oct 25 2020 - 13:39:22 EST


On Thu, Oct 22, 2020 at 1:06 AM Viresh Kumar <viresh.kumar@xxxxxxxxxx> wrote:
>
> On 20-10-20, 07:13, Rob Clark wrote:
> > On Tue, Oct 20, 2020 at 4:24 AM Viresh Kumar <viresh.kumar@xxxxxxxxxx> wrote:
> > >
> > > On 20-10-20, 12:56, Daniel Vetter wrote:
> > > > Yeah that's bad practice. Generally you shouldn't need to hold locks
> > > > in setup/teardown code, since there's no other thread which can
> > > > possible hold a reference to anything your touching anymore. Ofc
> > > > excluding quickly grabbing/dropping a lock to insert/remove objects
> > > > into lists and stuff.
> > > >
> > > > The other reason is that especially with anything related to sysfs or
> > > > debugfs, the locking dependencies you're pulling in are enormous: vfs
> > > > locks pull in mm locks (due to mmap) and at that point there's pretty
> > > > much nothing left you're allowed to hold while acquiring such a lock.
> > > > For simple drivers this is no issue, but for fancy drivers (like gpu
> > > > drivers) which need to interact with core mm) this means your
> > > > subsystem is a major pain to use.
> > > >
> > > > Usually the correct fix is to only hold your subsystem locks in
> > > > setup/teardown when absolutely required, and fix any data
> > > > inconsistency issues by reordering your setup/teardown code: When you
> > > > register as the last step and unregister as the first step, there's no
> > > > need for any additional locking. And hence no need to call debugfs
> > > > functions while holding your subsystem locks.
> > > >
> > > > The catch phrase I use for this is "don't solve object lifetime issues
> > > > with locking". Instead use refcounting and careful ordering in
> > > > setup/teardown code.
> > >
> > > This is exactly what I have done in the OPP core, the locks were taken
> > > only when really necessary, though as we have seen now I have missed
> > > that at a single place and that should be fixed as well. Will do that,
> > > thanks.
> >
> > I do have an easy enough way to repro the issue, so if you have a
> > patch I can certainly test it.
>
> Does this fix it for you ? There is one more place still left where we
> are taking the opp_table_lock while adding stuff to debugfs and that's
> not that straight forward to fix. But I didn't see that path in your
> circular dependency trace, so who knows :)

Nope, I suspect any creation of debugfs files will be problematic.

(btw, _add_opp_dev_unlocked() looks like it should be called
_add_opp_dev_locked()?)

It does look like 'struct opp_table' is already refcnt'd, so I suspect
you could replace holding opp_table_lock while calling into debugfs
with holding a reference to the opp_table instead?

BR,
-R

[ +0.074543] ======================================================
[ +0.006347] WARNING: possible circular locking dependency detected
[ +0.006349] 5.4.72 #14 Not tainted
[ +0.003501] ------------------------------------------------------
[ +0.006350] chrome/1865 is trying to acquire lock:
[ +0.004922] ffffffdd34921750 (opp_table_lock){+.+.}, at:
_find_opp_table+0x34/0x74
[ +0.007779]
but task is already holding lock:
[ +0.005989] ffffff81f0fc71a8 (reservation_ww_class_mutex){+.+.}, at:
submit_lock_objects+0x70/0x1ec
[ +0.001132] fscrypt: AES-256-CTS-CBC using implementation "cts-cbc-aes-ce"
[ +0.008156]
which lock already depends on the new lock.

[ +0.000002]
the existing dependency chain (in reverse order) is:
[ +0.000002]
-> #4 (reservation_ww_class_mutex){+.+.}:
[ +0.000009] __mutex_lock_common+0xec/0xc0c
[ +0.000004] ww_mutex_lock_interruptible+0x5c/0xc4
[ +0.000003] msm_gem_fault+0x2c/0x124
[ +0.000005] __do_fault+0x40/0x16c
[ +0.000003] handle_mm_fault+0x7cc/0xd98
[ +0.000005] do_page_fault+0x230/0x3b4
[ +0.000003] do_translation_fault+0x5c/0x78
[ +0.000004] do_mem_abort+0x4c/0xb4
[ +0.000006] el0_da+0x1c/0x20
[ +0.069917]
-> #3 (&mm->mmap_sem){++++}:
[ +0.005548] __might_fault+0x70/0x98
[ +0.004209] compat_filldir+0xf8/0x48c
[ +0.004394] dcache_readdir+0x70/0x1dc
[ +0.004394] iterate_dir+0xd4/0x180
[ +0.004119] __arm64_compat_sys_getdents+0xa0/0x19c
[ +0.005549] el0_svc_common+0xa8/0x178
[ +0.004394] el0_svc_compat_handler+0x2c/0x40
[ +0.005007] el0_svc_compat+0x8/0x10
[ +0.004205]
-> #2 (&sb->s_type->i_mutex_key#3){++++}:
[ +0.006708] down_write+0x54/0x16c
[ +0.004034] start_creating+0x68/0x128
[ +0.004392] debugfs_create_dir+0x28/0x114
[ +0.004747] opp_debug_register+0x8c/0xc0
[ +0.004657] _add_opp_dev_unlocked+0x5c/0x70
[ +0.004920] _add_opp_dev+0x38/0x58
[ +0.004118] _opp_get_opp_table+0xdc/0x1ac
[ +0.004745] dev_pm_opp_get_opp_table_indexed+0x24/0x30
[ +0.005899] dev_pm_opp_of_add_table_indexed+0x48/0x84
[ +0.005813] of_genpd_add_provider_onecell+0xc0/0x1b8
[ +0.005724] rpmhpd_probe+0x240/0x268
[ +0.004307] platform_drv_probe+0x90/0xb0
[ +0.004654] really_probe+0x134/0x2ec
[ +0.004304] driver_probe_device+0x64/0xfc
[ +0.004746] __device_attach_driver+0x8c/0xa4
[ +0.005008] bus_for_each_drv+0x90/0xd8
[ +0.004481] __device_attach+0xc0/0x148
[ +0.004480] device_initial_probe+0x20/0x2c
[ +0.004832] bus_probe_device+0x34/0x94
[ +0.004482] device_add+0x1fc/0x3b0
[ +0.004121] of_device_add+0x3c/0x4c
[ +0.004206] of_platform_device_create_pdata+0xb8/0xfc
[ +0.005811] of_platform_bus_create+0x1e4/0x368
[ +0.005185] of_platform_populate+0x70/0xbc
[ +0.004833] devm_of_platform_populate+0x58/0xa0
[ +0.005283] rpmh_rsc_probe+0x36c/0x3cc
[ +0.004481] platform_drv_probe+0x90/0xb0
[ +0.004657] really_probe+0x134/0x2ec
[ +0.004305] driver_probe_device+0x64/0xfc
[ +0.004745] __device_attach_driver+0x8c/0xa4
[ +0.005007] bus_for_each_drv+0x90/0xd8
[ +0.004480] __device_attach+0xc0/0x148
[ +0.004481] device_initial_probe+0x20/0x2c
[ +0.004833] bus_probe_device+0x34/0x94
[ +0.004481] device_add+0x1fc/0x3b0
[ +0.004119] of_device_add+0x3c/0x4c
[ +0.004206] of_platform_device_create_pdata+0xb8/0xfc
[ +0.005811] of_platform_bus_create+0x1e4/0x368
[ +0.005185] of_platform_bus_create+0x230/0x368
[ +0.005185] of_platform_populate+0x70/0xbc
[ +0.004836] of_platform_default_populate_init+0xa8/0xc0
[ +0.005986] do_one_initcall+0x1c8/0x3fc
[ +0.004572] do_initcall_level+0xb4/0x10c
[ +0.004657] do_basic_setup+0x30/0x48
[ +0.004304] kernel_init_freeable+0x124/0x1a4
[ +0.005009] kernel_init+0x14/0x104
[ +0.004119] ret_from_fork+0x10/0x18
[ +0.004205]
-> #1 (&opp_table->lock){+.+.}:
[ +0.005815] __mutex_lock_common+0xec/0xc0c
[ +0.004832] mutex_lock_nested+0x40/0x50
[ +0.004570] _add_opp_dev+0x2c/0x58
[ +0.004119] _opp_get_opp_table+0xdc/0x1ac
[ +0.004745] dev_pm_opp_get_opp_table_indexed+0x24/0x30
[ +0.005899] dev_pm_opp_of_add_table_indexed+0x48/0x84
[ +0.005814] of_genpd_add_provider_onecell+0xc0/0x1b8
[ +0.005721] rpmhpd_probe+0x240/0x268
[ +0.004306] platform_drv_probe+0x90/0xb0
[ +0.004656] really_probe+0x134/0x2ec
[ +0.004305] driver_probe_device+0x64/0xfc
[ +0.004745] __device_attach_driver+0x8c/0xa4
[ +0.005007] bus_for_each_drv+0x90/0xd8
[ +0.004481] __device_attach+0xc0/0x148
[ +0.004481] device_initial_probe+0x20/0x2c
[ +0.004832] bus_probe_device+0x34/0x94
[ +0.004481] device_add+0x1fc/0x3b0
[ +0.004119] of_device_add+0x3c/0x4c
[ +0.004206] of_platform_device_create_pdata+0xb8/0xfc
[ +0.005810] of_platform_bus_create+0x1e4/0x368
[ +0.005197] of_platform_populate+0x70/0xbc
[ +0.004832] devm_of_platform_populate+0x58/0xa0
[ +0.005284] rpmh_rsc_probe+0x36c/0x3cc
[ +0.004480] platform_drv_probe+0x90/0xb0
[ +0.004658] really_probe+0x134/0x2ec
[ +0.004301] driver_probe_device+0x64/0xfc
[ +0.004745] __device_attach_driver+0x8c/0xa4
[ +0.005007] bus_for_each_drv+0x90/0xd8
[ +0.004480] __device_attach+0xc0/0x148
[ +0.004481] device_initial_probe+0x20/0x2c
[ +0.004831] bus_probe_device+0x34/0x94
[ +0.004482] device_add+0x1fc/0x3b0
[ +0.004118] of_device_add+0x3c/0x4c
[ +0.004214] of_platform_device_create_pdata+0xb8/0xfc
[ +0.005817] of_platform_bus_create+0x1e4/0x368
[ +0.005186] of_platform_bus_create+0x230/0x368
[ +0.005188] of_platform_populate+0x70/0xbc
[ +0.004832] of_platform_default_populate_init+0xa8/0xc0
[ +0.005987] do_one_initcall+0x1c8/0x3fc
[ +0.004569] do_initcall_level+0xb4/0x10c
[ +0.004657] do_basic_setup+0x30/0x48
[ +0.004305] kernel_init_freeable+0x124/0x1a4
[ +0.005008] kernel_init+0x14/0x104
[ +0.004119] ret_from_fork+0x10/0x18
[ +0.004206]
-> #0 (opp_table_lock){+.+.}:
[ +0.005640] __lock_acquire+0xee4/0x2450
[ +0.004570] lock_acquire+0x1cc/0x210
[ +0.004305] __mutex_lock_common+0xec/0xc0c
[ +0.004833] mutex_lock_nested+0x40/0x50
[ +0.004570] _find_opp_table+0x34/0x74
[ +0.004393] dev_pm_opp_find_freq_exact+0x2c/0xdc
[ +0.005372] a6xx_gmu_resume+0xc8/0xecc
[ +0.004480] a6xx_pm_resume+0x148/0x200
[ +0.004482] adreno_resume+0x28/0x34
[ +0.004209] pm_generic_runtime_resume+0x34/0x48
[ +0.005283] __rpm_callback+0x70/0x10c
[ +0.004393] rpm_callback+0x34/0x8c
[ +0.004119] rpm_resume+0x414/0x550
[ +0.004119] __pm_runtime_resume+0x7c/0xa0
[ +0.004746] msm_gpu_submit+0x60/0x1c0
[ +0.004394] msm_ioctl_gem_submit+0xadc/0xb60
[ +0.005010] drm_ioctl_kernel+0x9c/0x118
[ +0.004569] drm_ioctl+0x27c/0x408
[ +0.004034] drm_compat_ioctl+0xcc/0xdc
[ +0.004483] __se_compat_sys_ioctl+0x100/0x206c
[ +0.005186] __arm64_compat_sys_ioctl+0x20/0x2c
[ +0.005187] el0_svc_common+0xa8/0x178
[ +0.004393] el0_svc_compat_handler+0x2c/0x40
[ +0.005009] el0_svc_compat+0x8/0x10
[ +0.004205]
other info that might help us debug this:

[ +0.008213] Chain exists of:
opp_table_lock --> &mm->mmap_sem --> reservation_ww_class_mutex

[ +0.011780] Possible unsafe locking scenario:

[ +0.006082] CPU0 CPU1
[ +0.004660] ---- ----
[ +0.004656] lock(reservation_ww_class_mutex);
[ +0.004657] lock(&mm->mmap_sem);
[ +0.006079] lock(reservation_ww_class_mutex);
[ +0.007237] lock(opp_table_lock);
[ +0.003592]
*** DEADLOCK ***

[ +0.006084] 3 locks held by chrome/1865:
[ +0.004031] #0: ffffff81edecc0d8 (&dev->struct_mutex){+.+.}, at:
msm_ioctl_gem_submit+0x264/0xb60
[ +0.009198] #1: ffffff81d0000870
(reservation_ww_class_acquire){+.+.}, at:
msm_ioctl_gem_submit+0x8e8/0xb60
[ +0.010086] #2: ffffff81f0fc71a8
(reservation_ww_class_mutex){+.+.}, at: submit_lock_objects+0x70/0x1ec
[ +0.009735]
stack backtrace:
[ +0.004482] CPU: 0 PID: 1865 Comm: chrome Not tainted 5.4.72 #14
[ +0.006173] Hardware name: Google Lazor (rev1+) with LTE (DT)
[ +0.005899] Call trace:
[ +0.002515] dump_backtrace+0x0/0x158
[ +0.003768] show_stack+0x20/0x2c
[ +0.003407] dump_stack+0xc8/0x160
[ +0.003506] print_circular_bug+0x2c4/0x2c8
[ +0.004305] check_noncircular+0x1a8/0x1b0
[ +0.004206] __lock_acquire+0xee4/0x2450
[ +0.004032] lock_acquire+0x1cc/0x210
[ +0.003768] __mutex_lock_common+0xec/0xc0c
[ +0.004305] mutex_lock_nested+0x40/0x50
[ +0.004033] _find_opp_table+0x34/0x74
[ +0.003855] dev_pm_opp_find_freq_exact+0x2c/0xdc
[ +0.004833] a6xx_gmu_resume+0xc8/0xecc
[ +0.003943] a6xx_pm_resume+0x148/0x200
[ +0.003944] adreno_resume+0x28/0x34
[ +0.003681] pm_generic_runtime_resume+0x34/0x48
[ +0.004745] __rpm_callback+0x70/0x10c
[ +0.003854] rpm_callback+0x34/0x8c
[ +0.003592] rpm_resume+0x414/0x550
[ +0.003592] __pm_runtime_resume+0x7c/0xa0
[ +0.004207] msm_gpu_submit+0x60/0x1c0
[ +0.003855] msm_ioctl_gem_submit+0xadc/0xb60
[ +0.004481] drm_ioctl_kernel+0x9c/0x118
[ +0.004031] drm_ioctl+0x27c/0x408
[ +0.003504] drm_compat_ioctl+0xcc/0xdc
[ +0.003945] __se_compat_sys_ioctl+0x100/0x206c
[ +0.004658] __arm64_compat_sys_ioctl+0x20/0x2c
[ +0.004659] el0_svc_common+0xa8/0x178
[ +0.003855] el0_svc_compat_handler+0x2c/0x40
[ +0.004480] el0_svc_compat+0x8/0x10

> diff --git a/drivers/opp/core.c b/drivers/opp/core.c
> index 2483e765318a..4cc0fb716381 100644
> --- a/drivers/opp/core.c
> +++ b/drivers/opp/core.c
> @@ -1181,6 +1181,10 @@ static void _opp_table_kref_release(struct kref *kref)
> struct opp_device *opp_dev, *temp;
> int i;
>
> + /* Drop the lock as soon as we can */
> + list_del(&opp_table->node);
> + mutex_unlock(&opp_table_lock);
> +
> _of_clear_opp_table(opp_table);
>
> /* Release clk */
> @@ -1208,10 +1212,7 @@ static void _opp_table_kref_release(struct kref *kref)
>
> mutex_destroy(&opp_table->genpd_virt_dev_lock);
> mutex_destroy(&opp_table->lock);
> - list_del(&opp_table->node);
> kfree(opp_table);
> -
> - mutex_unlock(&opp_table_lock);
> }
>
> void dev_pm_opp_put_opp_table(struct opp_table *opp_table)
>
> --
> viresh