Re: [PATCH v2] drm: avoid races with modesetting rights

From: Desmond Cheong Zhi Xi
Date: Tue Aug 17 2021 - 11:06:49 EST


On 16/8/21 9:59 pm, Daniel Vetter wrote:
On Mon, Aug 16, 2021 at 12:31 PM Desmond Cheong Zhi Xi
<desmondcheongzx@xxxxxxxxx> wrote:

On 16/8/21 5:04 pm, Daniel Vetter wrote:
On Mon, Aug 16, 2021 at 10:53 AM Desmond Cheong Zhi Xi
<desmondcheongzx@xxxxxxxxx> wrote:
On 16/8/21 2:47 am, kernel test robot wrote:
Hi Desmond,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on next-20210813]
[also build test ERROR on v5.14-rc5]
[cannot apply to linus/master v5.14-rc5 v5.14-rc4 v5.14-rc3]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url: https://github.com/0day-ci/linux/commits/Desmond-Cheong-Zhi-Xi/drm-avoid-races-with-modesetting-rights/20210815-234145
base: 4b358aabb93a2c654cd1dcab1a25a589f6e2b153
config: i386-randconfig-a004-20210815 (attached as .config)
compiler: gcc-9 (Debian 9.3.0-22) 9.3.0
reproduce (this is a W=1 build):
# https://github.com/0day-ci/linux/commit/cf6d8354b7d7953cd866fad004cbb189adfa074f
git remote add linux-review https://github.com/0day-ci/linux
git fetch --no-tags linux-review Desmond-Cheong-Zhi-Xi/drm-avoid-races-with-modesetting-rights/20210815-234145
git checkout cf6d8354b7d7953cd866fad004cbb189adfa074f
# save the attached .config to linux build tree
make W=1 ARCH=i386

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@xxxxxxxxx>

All errors (new ones prefixed by >>, old ones prefixed by <<):

ERROR: modpost: "task_work_add" [drivers/gpu/drm/drm.ko] undefined!


I'm a bit uncertain about this. Looking into the .config used, this
error seems to happen because task_work_add isn't an exported symbol,
but DRM is being compiled as a loadable kernel module (CONFIG_DRM=m).

One way to deal with this is to export the symbol, but there was a
proposed patch to do this a few months back that wasn't picked up [1],
so I'm not sure what to make of this.

I'll export the symbol as part of a v3 series, and check in with the
task-work maintainers.

Link:
https://lore.kernel.org/lkml/20210127150029.13766-3-joshi.k@xxxxxxxxxxx/ [1]

Yeah that sounds best. I have two more thoughts on the patch:
- drm_master_flush isn't used by any modules outside of drm.ko, so we
can unexport it and drop the kerneldoc (the comment is still good).
These kinds of internal functions have their declarations in
drm_internal.h - there are already a few there from drm_auth.c


Sounds good, I'll do that and move the declaration from drm_auth.h to
drm_internal.h.

- We now have 3 locks for master state, that feels a bit like
overkill. The spinlock I think we need to keep due to lock inversions,
but the master_mutex and master_rwsem look like we should be able to
merge them? I.e. anywhere we currently grab the master_mutex we could
instead grab the rwsem in either write mode (when we change stuff) or
read mode (when we just check, like in master_internal_acquire).

Thoughts?
-Daniel


Using rwsem in the places where we currently hold the mutex seems pretty
doable.
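
To make the read-mode/write-mode split concrete, here is a rough userspace
sketch. It is not the DRM code: a pthread rwlock stands in for the kernel
rwsem, and every name in it (rw_device, rw_file, rw_set_master and so on) is
hypothetical. It only shows the shape of the idea, i.e. paths that merely
check master state take the lock shared, and paths that change it take the
lock exclusively.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins, not the real struct drm_master/drm_device/drm_file. */
struct rw_master { int id; };

struct rw_device {
	pthread_rwlock_t master_rwsem;  /* plays the role of the merged lock */
	struct rw_master *master;       /* current master, NULL if none */
};

struct rw_file {
	struct rw_master *master;       /* master this file would act as */
};

static void rw_device_init(struct rw_device *dev)
{
	int rc = pthread_rwlock_init(&dev->master_rwsem, NULL);

	assert(rc == 0);
	(void)rc;
	dev->master = NULL;
}

/* Check-only path: shared (read) mode is enough. */
static bool rw_is_current_master(struct rw_device *dev, struct rw_file *file)
{
	bool ret;

	pthread_rwlock_rdlock(&dev->master_rwsem);
	ret = file->master != NULL && dev->master == file->master;
	pthread_rwlock_unlock(&dev->master_rwsem);
	return ret;
}

/* State-changing path: exclusive (write) mode. Returns 0 or -1. */
static int rw_set_master(struct rw_device *dev, struct rw_file *file)
{
	int ret = 0;

	pthread_rwlock_wrlock(&dev->master_rwsem);
	if (dev->master != NULL && dev->master != file->master)
		ret = -1;               /* someone else is already master */
	else
		dev->master = file->master;
	pthread_rwlock_unlock(&dev->master_rwsem);
	return ret;
}

/* State-changing path: exclusive (write) mode. No-op if not current master. */
static void rw_drop_master(struct rw_device *dev, struct rw_file *file)
{
	pthread_rwlock_wrlock(&dev->master_rwsem);
	if (dev->master == file->master)
		dev->master = NULL;
	pthread_rwlock_unlock(&dev->master_rwsem);
}
```

In this sketch any number of rw_is_current_master() callers can run
concurrently, while rw_set_master() and rw_drop_master() exclude both readers
and each other, which is the behaviour the master_mutex call sites would need
after a merge.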

There are some tricky bits once we add rwsem read locks to the ioctl
handler. Some ioctl functions like drm_authmagic need a write lock.

Ah yes, I only looked at the dropmaster/setmaster ioctl, and those
don't have the DRM_MASTER bit set.

In this particular case, it might make sense to break master_mutex down
into finer-grained locks, since the function doesn't change master
permissions. It just needs to prevent concurrent writes to the
drm_master.magic_map idr.

Yeah for authmagic we could perhaps just reuse the spinlock to protect
->magic_map?


Yup, I had to move the spinlock from struct drm_file to struct drm_device, but I think that should work.
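
A minimal userspace sketch of what a device-level lock guarding the magic map
could look like. This is an illustration under assumptions, not the v3 code: a
pthread mutex stands in for the kernel spinlock, a small fixed array stands in
for the magic_map idr, and all names (magic_device, magic_file, magic_get,
magic_auth) are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

#define MAGIC_MAX 8                     /* arbitrary size for the sketch */

/* Hypothetical stand-in for the per-file state. */
struct magic_file {
	bool authenticated;
	unsigned int magic;             /* 0 means no magic assigned yet */
};

/* Hypothetical stand-in for the device; the lock lives here, not in the file. */
struct magic_device {
	pthread_mutex_t lookup_lock;    /* stands in for the kernel spinlock */
	struct magic_file *magic_map[MAGIC_MAX + 1];  /* slot 0 is unused */
};

static void magic_device_init(struct magic_device *dev)
{
	int rc = pthread_mutex_init(&dev->lookup_lock, NULL);

	assert(rc == 0);
	(void)rc;
	for (size_t i = 0; i <= MAGIC_MAX; i++)
		dev->magic_map[i] = NULL;
}

/* Assign a magic to the file if it has none. Returns it, or 0 if the map is full. */
static unsigned int magic_get(struct magic_device *dev, struct magic_file *file)
{
	unsigned int magic;

	pthread_mutex_lock(&dev->lookup_lock);
	if (file->magic == 0) {
		for (unsigned int i = 1; i <= MAGIC_MAX; i++) {
			if (dev->magic_map[i] == NULL) {
				dev->magic_map[i] = file;
				file->magic = i;
				break;
			}
		}
	}
	magic = file->magic;
	pthread_mutex_unlock(&dev->lookup_lock);
	return magic;
}

/* Authenticate the file that owns this magic. Returns 0 on success, -1 otherwise. */
static int magic_auth(struct magic_device *dev, unsigned int magic)
{
	struct magic_file *file;
	int ret = -1;

	if (magic == 0 || magic > MAGIC_MAX)
		return -1;

	pthread_mutex_lock(&dev->lookup_lock);
	file = dev->magic_map[magic];
	if (file != NULL) {
		file->authenticated = true;
		dev->magic_map[magic] = NULL;   /* each magic is usable once */
		ret = 0;
	}
	pthread_mutex_unlock(&dev->lookup_lock);
	return ret;
}
```

The point of the sketch is only that the map is touched exclusively under the
one device-level lock, so the lookup and update need no help from the lock
that protects master permissions.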

For other ioctls, I'll take a closer look on a case-by-case basis.

If it's too much shuffling then I think totally fine to leave things
as-is. Just feels a bit silly to have 3 locks, one of which is an
rwlock itself, for this fairly small amount of state.
-Daniel


Agreed, there's a lot of overlap between the master_mutex and rwsem so this is a good opportunity to refactor things.

I'm cleaning up a v3 series now. There's some code movement, but most of it is fixes for potential bugs that I saw while refactoring. We can see if the new version is a better design.



---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@xxxxxxxxxxxx