Re: [PATCH 1/3] drm/panfrost: Simplify lock_region calculation

From: Steven Price
Date: Mon Aug 23 2021 - 05:40:51 EST


On 20/08/2021 22:31, Alyssa Rosenzweig wrote:
> In lock_region, simplify the calculation of the region_width parameter.
> This field is the size, but encoded as log2(ceil(size)) - 1.
> log2(ceil(size)) may be computed directly as fls(size - 1). However, we
> want to use the 64-bit versions as the amount to lock can exceed
> 32-bits.
>
> This avoids undefined behaviour when locking all memory (size ~0),
> caught by UBSAN.

It might have been useful to mention what it is that UBSAN specifically
picked up (it took me a while to spot) - but anyway I think there's a
bigger issue with it being completely wrong when size == ~0 (see below).

> Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@xxxxxxxxxxxxx>
> Reported-and-tested-by: Chris Morgan <macromorgan@xxxxxxxxxxx>
> Cc: <stable@xxxxxxxxxxxxxxx>

However, I've confirmed this returns the same values and is certainly
more simple, so:

Reviewed-by: Steven Price <steven.price@xxxxxxx>

> ---
> drivers/gpu/drm/panfrost/panfrost_mmu.c | 19 +++++--------------
> 1 file changed, 5 insertions(+), 14 deletions(-)
>
> diff --git a/drivers/gpu/drm/panfrost/panfrost_mmu.c b/drivers/gpu/drm/panfrost/panfrost_mmu.c
> index 0da5b3100ab1..f6e02d0392f4 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_mmu.c
> +++ b/drivers/gpu/drm/panfrost/panfrost_mmu.c
> @@ -62,21 +62,12 @@ static void lock_region(struct panfrost_device *pfdev, u32 as_nr,
> {
> u8 region_width;
> u64 region = iova & PAGE_MASK;
> - /*
> - * fls returns:
> - * 1 .. 32
> - *
> - * 10 + fls(num_pages)
> - * results in the range (11 .. 42)
> - */
> -
> - size = round_up(size, PAGE_SIZE);

This seems to be the first issue - ~0 will be 'rounded up' to 0.

>
> - region_width = 10 + fls(size >> PAGE_SHIFT);

fls(0) == 0, so region_width == 10.

> - if ((size >> PAGE_SHIFT) != (1ul << (region_width - 11))) {

Presumably here's where UBSAN objects - we're shifting by a negative
value, which even it it happens to works means the lock region is tiny
and certainly not what was intended! It might well be worth a:

Fixes: f3ba91228e8e ("drm/panfrost: Add initial panfrost driver")

Note for anyone following along at (working-from-) home: although this
code was cargo culted from kbase - kbase is fine because it takes a pfn
and doesn't do the round_up() stage.

Which also exposes the second bug (fixed in patch 2): a size_t isn't big
enough on 32 bit platforms (all Midgard/Bifrost GPUs have a VA size
bigger than 32 bits). Again kbase gets away with a u32 because it's a pfn.

There is potentially a third bug which kbase only recently attempted to
fix. The lock address is effectively rounded down by the hardware (the
bottom bits are ignored). So if you have mask=(1<<region_width)-1 but
(iova & mask) != ((iova + size) & mask) then you are potentially failing
to lock the end of the intended region. kbase has added some code to
handle this:

> /* Round up if some memory pages spill into the next region. */
> region_frame_number_start = pfn >> (lockaddr_size_log2 - PAGE_SHIFT);
> region_frame_number_end =
> (pfn + num_pages - 1) >> (lockaddr_size_log2 - PAGE_SHIFT);
>
> if (region_frame_number_start < region_frame_number_end)
> lockaddr_size_log2 += 1;

I guess we should too?

Steve

> - /* not pow2, so must go up to the next pow2 */
> - region_width += 1;
> - }
> + /* The size is encoded as ceil(log2) minus(1), which may be calculated
> + * with fls. The size must be clamped to hardware bounds.
> + */
> + size = max_t(u64, size, PAGE_SIZE);
> + region_width = fls64(size - 1) - 1;
> region |= region_width;
>
> /* Lock the region that needs to be updated */
>