Re: [PATCH 4/5] teach smaps_pte_range() about THP pmds

From: Andrea Arcangeli
Date: Thu Feb 10 2011 - 13:08:24 EST


On Wed, Feb 09, 2011 at 11:54:11AM -0800, Dave Hansen wrote:
> @@ -385,8 +387,16 @@ static int smaps_pte_range(pmd_t *pmd, u
>  	pte_t *pte;
>  	spinlock_t *ptl;
>
> -	split_huge_page_pmd(walk->mm, pmd);
> -
> +	if (pmd_trans_huge(*pmd)) {
> +		if (pmd_trans_splitting(*pmd)) {
> +			spin_unlock(&walk->mm->page_table_lock);
> +			wait_split_huge_page(vma->anon_vma, pmd);
> +			spin_lock(&walk->mm->page_table_lock);

The locking looks wrong. Who is taking walk->mm->page_table_lock before
this point, and where is it released? And isn't this going to deadlock
in pte_offset_map_lock() for NR_CPUS < 4, where the pte lock is
mm->page_table_lock itself? This spin_lock doesn't seem necessary to me.
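
To illustrate the deadlock concern, here is a minimal userspace sketch (not
kernel code). It assumes that without split pte locks the pte lock is the
per-mm page_table_lock, and that the lock is not recursive. Every toy_* name
is hypothetical and exists only for this illustration.

```c
#include <stdbool.h>

/* Toy stand-in for a non-recursive spinlock; not the kernel's spinlock_t. */
struct toy_lock { bool held; };

/* Returns true if the lock was free and is now held, false if it was busy. */
static bool toy_trylock(struct toy_lock *l)
{
	if (l->held)
		return false;
	l->held = true;
	return true;
}

struct toy_mm { struct toy_lock page_table_lock; };
struct toy_pt_page { struct toy_lock ptl; };

/* Pick the lock guarding a pte page: per-page when split, per-mm otherwise. */
static struct toy_lock *toy_pte_lockptr(struct toy_mm *mm,
					struct toy_pt_page *pt,
					bool split_ptlocks)
{
	return split_ptlocks ? &pt->ptl : &mm->page_table_lock;
}

/*
 * Model the patched path: page_table_lock is left held, then the pte lock
 * is requested. Returns true if the pte lock could be taken, false where a
 * real spinlock would spin forever on a lock the caller already holds.
 */
static bool toy_walk_with_lock_held(bool split_ptlocks)
{
	struct toy_mm mm = { { false } };
	struct toy_pt_page pt = { { false } };

	toy_trylock(&mm.page_table_lock);	/* the extra spin_lock() */
	return toy_trylock(toy_pte_lockptr(&mm, &pt, split_ptlocks));
}
```

Calling toy_walk_with_lock_held(false) models the NR_CPUS < 4 case: the
second acquisition targets the lock that is already held, so it fails.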

The right locking would be:

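	spin_lock(&walk->mm->page_table_lock);
	if (pmd_trans_huge(*pmd)) {
		if (pmd_trans_splitting(*pmd)) {
			/* drop the lock before waiting for the split to finish */
			spin_unlock(&walk->mm->page_table_lock);
			wait_split_huge_page(vma->anon_vma, pmd);
		} else {
			/* account the huge pmd while it cannot be split under us */
			smaps_pte_entry(*(pte_t *)pmd, addr, HPAGE_SIZE, walk);
			spin_unlock(&walk->mm->page_table_lock);
			return 0;
		}
	} else {
		/* not a huge pmd: release before the regular pte walk */
		spin_unlock(&walk->mm->page_table_lock);
	}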

I think it worked because you never ran into a pmd_trans_splitting pmd,
and you were running smaps_pte_entry() locklessly, which can race
against split_huge_page() (though it normally doesn't).
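
The lock discipline described above can be sketched as a small userspace
model (again not kernel code). The toy_* names and the page size constants
are hypothetical, chosen for illustration. The unlock on the non-huge path
is an assumption of this sketch: every path has to leave with the lock
released.

```c
#include <stdbool.h>

enum toy_pmd_state { TOY_PMD_NORMAL, TOY_PMD_HUGE, TOY_PMD_SPLITTING };

#define TOY_PAGE_SIZE	4096UL			/* illustrative value */
#define TOY_HPAGE_SIZE	(512 * TOY_PAGE_SIZE)	/* illustrative value */

struct toy_walk {
	bool lock_held;			/* models mm->page_table_lock */
	enum toy_pmd_state pmd;
	unsigned long accounted;	/* bytes accounted so far */
	bool accounted_under_lock;	/* was the huge pmd read locked? */
};

static void toy_lock(struct toy_walk *w)   { w->lock_held = true; }
static void toy_unlock(struct toy_walk *w) { w->lock_held = false; }

/* Stand-in for wait_split_huge_page(): on return the pmd is a normal one. */
static void toy_wait_split(struct toy_walk *w) { w->pmd = TOY_PMD_NORMAL; }

/* Returns true if a huge pmd was accounted and the caller is done. */
static bool toy_handle_pmd(struct toy_walk *w)
{
	toy_lock(w);
	if (w->pmd != TOY_PMD_NORMAL) {
		if (w->pmd == TOY_PMD_SPLITTING) {
			toy_unlock(w);	/* drop the lock before waiting */
			toy_wait_split(w);
		} else {
			/* stable huge pmd: account it with the lock held */
			w->accounted_under_lock = w->lock_held;
			w->accounted += TOY_HPAGE_SIZE;
			toy_unlock(w);
			return true;
		}
	} else {
		toy_unlock(w);	/* assumption: non-huge path also unlocks */
	}
	return false;	/* fall through to the per-pte walk */
}
```

In this model a TOY_PMD_HUGE walk is accounted with the lock held, and all
three states return with lock_held cleared.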

> +		} else {
> +			smaps_pte_entry(*(pte_t *)pmd, addr, HPAGE_SIZE, walk);
> +			return 0;
> +		}
> +	}
>  	pte = pte_offset_map_lock(vma->vm_mm, pmd, addr, &ptl);
>  	for (; addr != end; pte++, addr += PAGE_SIZE)
>  		smaps_pte_entry(*pte, addr, PAGE_SIZE, walk);
> _
>