Re: [PATCH 1/3] iommu/io-pgtable-arm: Add nents_per_pgtable in struct io_pgtable_cfg

From: Will Deacon
Date: Wed Aug 30 2023 - 18:11:32 EST


On Tue, Aug 29, 2023 at 03:15:52PM -0700, Nicolin Chen wrote:
> Meanwhile, by re-looking at Will's commit log:
> arm64: tlbi: Set MAX_TLBI_OPS to PTRS_PER_PTE
>
> In order to reduce the possibility of soft lock-ups, we bound the
> maximum number of TLBI operations performed by a single call to
> flush_tlb_range() to an arbitrary constant of 1024.
>
> Whilst this does the job of avoiding lock-ups, we can actually be a bit
> smarter by defining this as PTRS_PER_PTE. Due to the structure of our
> page tables, using PTRS_PER_PTE means that an outer loop calling
> flush_tlb_range() for entire table entries will end up performing just a
> single TLBI operation for each entry. As an example, mremap()ing a 1GB
> range mapped using 4k pages now requires only 512 TLBI operations when
> moving the page tables as opposed to 262144 operations (512*512) when
> using the current threshold of 1024.
>
> I found that I am actually not quite getting the calculation at the
> end for the comparison between 512 and 262144.
>
> For a 4K pgsize setup, MAX_TLBI_OPS is set to 512, calculated from
> 4096 / 8. Then, any VA range >= 2MB will trigger a flush_tlb_all().
> By setting the threshold to 1024, the 2MB size bumps up to 4MB, i.e.
> the condition becomes range >= 4MB.
>
> So, it seems to me that requesting a 1GB invalidation will trigger
> a flush_tlb_all() in either case of having a 2MB or a 4MB threshold?
>
> I can get that the 262144 is the number of pages in a 1GB size, so
> the number of per-page invalidations will be 262144 operations if
> there was no threshold to replace with a full-as invalidation. Yet,
> that wasn't the case since we had a 4MB threshold with an arbitrary
> 1024 for MAX_TLBI_OPS?

I think this is because you can't always batch up the entire range as
you'd like due to things like locking concerns. For example,
move_page_tables() can end up invalidating 2MiB at a time, which is
too low to trigger the old threshold and so you end up doing every single
pte individually.
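
To put numbers on that, here is a rough sketch of the arithmetic in C. It
is a simplified model and not the kernel code: the helper names
tlbi_ops_for_call() and tlbi_ops_total() are made up for illustration, the
">=" comparison follows the description quoted above, and a full flush is
counted as a single operation.

```c
#include <stdint.h>

/*
 * Hypothetical model of one flush_tlb_range()-style call: if the range
 * reaches max_ops pages, assume a single full flush is issued instead of
 * one TLBI per page.
 */
static uint64_t tlbi_ops_for_call(uint64_t range, uint64_t page_size,
				  uint64_t max_ops)
{
	if (range >= max_ops * page_size)
		return 1;		/* one full invalidation */
	return range / page_size;	/* one TLBI per page */
}

/*
 * Hypothetical model of a caller that cannot batch the whole range and
 * instead invalidates it 'chunk' bytes at a time.
 */
static uint64_t tlbi_ops_total(uint64_t total, uint64_t chunk,
			       uint64_t page_size, uint64_t max_ops)
{
	uint64_t ops = 0;

	for (uint64_t off = 0; off < total; off += chunk)
		ops += tlbi_ops_for_call(chunk, page_size, max_ops);
	return ops;
}
```

Under those assumptions, a 1GiB range walked in 2MiB chunks with 4k pages
gives tlbi_ops_total(1ULL << 30, 2ULL << 20, 4096, 1024) == 262144
(512 chunks * 512 pages each), whereas a threshold of 512 gives 512 (one
full flush per chunk). A single 1GiB call would exceed either threshold,
which is why the difference only shows up when the caller splits the range.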

Will