Re: [PATCH 1/3] x86,mm/pat: Use generic interval trees

From: Michel Lespinasse
Date: Wed Sep 04 2019 - 22:00:51 EST


Hi Davidlohr,

On Wed, Sep 4, 2019 at 5:52 PM Davidlohr Bueso <dave@xxxxxxxxxxxx> wrote:
> Ok, so for that I've added the following helper which will make the
> conversion a bit more straightforward:
>
> #define vma_interval_tree_foreach_stab(vma, root, start)
> vma_interval_tree_foreach(vma, root, start, start)

Yes, that should help with the conversion :)

> I think this also makes sense as it documents the nature of the lookup.
>
> Now for the interval-tree conversion to [a,b[ , how does the following
> look to you? Note that this should be the end result; we can discuss
> how to get there later, but I wanna make sure that it was more or less
> what you were picturing.

I do not have time for a full review right now, but I did have a quick
pass at it and it does seem to match the direction I'd like this to
take.

Please let me know if you'd like me to do a full review of this, or if
it'll be easier to do it on the individual steps once this gets broken
up.

BTW are you going to plumbers ? I am and I would love to talk to you
there, about this and about MM range locking, too.

Things I noticed in my quick pass:

> diff --git a/drivers/gpu/drm/radeon/radeon_vm.c b/drivers/gpu/drm/radeon/radeon_vm.c
> index e0ad547786e8..dc9fad8ea84a 100644
> --- a/drivers/gpu/drm/radeon/radeon_vm.c
> +++ b/drivers/gpu/drm/radeon/radeon_vm.c
> @@ -450,13 +450,14 @@ int radeon_vm_bo_set_addr(struct radeon_device *rdev,
> {
> uint64_t size = radeon_bo_size(bo_va->bo);
> struct radeon_vm *vm = bo_va->vm;
> - unsigned last_pfn, pt_idx;
> + unsigned pt_idx;
> uint64_t eoffset;
> int r;
>
> if (soffset) {
> + unsigned last_pfn;
> /* make sure object fit at this offset */
> - eoffset = soffset + size - 1;
> + eoffset = soffset + size;
> if (soffset >= eoffset) {
Should probably be if (soffset > eoffset) now (just checking for
arithmetic overflow).
> r = -EINVAL;
> goto error_unreserve;
> @@ -471,7 +472,7 @@ int radeon_vm_bo_set_addr(struct radeon_device *rdev,
> }
>
> } else {
> - eoffset = last_pfn = 0;
> + eoffset = 1; /* interval trees are [a,b[ */
Looks highly suspicious to me, and doesn't jive with the eoffset /=
RADEON_GPU_PAGE_SIZE below.
I did not look deep enough into this to figure out what would be correct though.
> }
>
> mutex_lock(&vm->mutex);
> diff --git a/drivers/infiniband/hw/hfi1/mmu_rb.c b/drivers/infiniband/hw/hfi1/mmu_rb.c
> index 14d2a90964c3..d708c45bfabf 100644
> --- a/drivers/infiniband/hw/hfi1/mmu_rb.c
> +++ b/drivers/infiniband/hw/hfi1/mmu_rb.c
> @@ -195,13 +195,13 @@ static struct mmu_rb_node *__mmu_rb_search(struct mmu_rb_handler *handler,
> trace_hfi1_mmu_rb_search(addr, len);
> if (!handler->ops->filter) {
> node = __mmu_int_rb_iter_first(&handler->root, addr,
> - (addr + len) - 1);
> + (addr + len));
> } else {
> for (node = __mmu_int_rb_iter_first(&handler->root, addr,
> - (addr + len) - 1);
> + (addr + len));
> node;
> node = __mmu_int_rb_iter_next(node, addr,
> - (addr + len) - 1)) {
> + (addr + len))) {
> if (handler->ops->filter(node, addr, len))
> return node;
> }

Weird unnecessary parentheses through this block.

> @@ -787,7 +787,7 @@ static phys_addr_t viommu_iova_to_phys(struct iommu_domain *domain,
> struct viommu_domain *vdomain = to_viommu_domain(domain);
>
> spin_lock_irqsave(&vdomain->mappings_lock, flags);
> - node = interval_tree_iter_first(&vdomain->mappings, iova, iova);
> + node = interval_tree_iter_first(&vdomain->mappings, iova, iova + 1);

I was gonna suggest a stab iterator for the generic interval tree, but
maybe not if it's only used here.

> diff --git a/include/linux/interval_tree_generic.h b/include/linux/interval_tree_generic.h
> index aaa8a0767aa3..e714e67ebdb5 100644
> --- a/include/linux/interval_tree_generic.h
> +++ b/include/linux/interval_tree_generic.h
> @@ -69,12 +69,12 @@ ITSTATIC void ITPREFIX ## _remove(ITSTRUCT *node, \
> } \
> \
> /* \
> - * Iterate over intervals intersecting [start;last] \
> + * Iterate over intervals intersecting [start;last[ \

Maybe I'm extra nitpicky, but I would suggest to use 'end' instead of
'last' when the intervals are half open: [start;end[
To me 'last' implies that it's in the interval, while 'end' doesn't
imply anything (and thus the half open convention used through the
kernel applies).
similarly ITLAST should be changed to ITEND, etc and similarly in the
various call sites (drm and others).
Again, maybe it's nitpicky but I feel changing the name also helps
verify that we don't leave behind any off-by-one use.

That said, it's really only my preference - if you think it's too
painful to change, that's OK.

--
Michel "Walken" Lespinasse
A program is never fully debugged until the last user dies.