Re: mmotm 2011-04-29 - wonky VmRSS and VmHWM values after swapping

From: Peter Zijlstra
Date: Tue May 10 2011 - 12:01:53 EST


On Mon, 2011-05-02 at 16:44 -0700, Andrew Morton wrote:
> On Mon, 02 May 2011 10:37:22 -0400
> Valdis.Kletnieks@xxxxxx wrote:
>
> > On Sun, 01 May 2011 20:26:54 EDT, Valdis.Kletnieks@xxxxxx said:
> > > On Fri, 29 Apr 2011 16:26:16 PDT, akpm@xxxxxxxxxxxxxxxxxxxx said:
> > > > The mm-of-the-moment snapshot 2011-04-29-16-25 has been uploaded to
> > > >
> > > > http://userweb.kernel.org/~akpm/mmotm/
> > >
> > > Dell Latitude E6500 laptop, Core2 Duo P8700, 4G RAM, 2G swap. x86_64 kernel.
> > >
> > > I was running a backup of the system to an external USB hard drive.
> >
> > Is a red herring. Am seeing it again, after only 20 minutes of uptime, and so
> > far I've only gotten 1.2G or so into the 4G ram (2.5G still free), and never
> > touched swap yet.
> >
> > Aha! I have a reproducer (found while composing this note). /bin/su will
> > reliably trigger it (4 tries out of 4, launching from a bash shell that itself
> > has sane VmRSS and VmHWM values). So it's a specific code sequence doing it
> > (probably one syscall doing something quirky).
> >
> > Now if I could figure out how to make strace look at the VmRSS after each
> > syscall, or get gdb to do similar. Any suggestions? Am open to perf/other
> > solutions as well, if anybody has one handy...
> >
>
> hm, me too. After boot, hald has a get_mm_counter(mm, MM_ANONPAGES) of
> 0xffffffffffff3c27. Bisected to Peter's
> mm-extended-batches-for-generic-mmu_gather.patch, can't see how it did
> that.
>

I haven't quite figured out how to reproduce, but does the below cure
things? If so, it should probably be folded into the first patch
(mm-mmu_gather-rework.patch?) since that is the one introducing this.

---
Subject: mm: Fix RSS zap_pte_range() accounting

Since we update the RSS counters when breaking out of the loop and
release the PTE lock, we should start with fresh deltas when we
restart the gather loop.

Reported-by: Valdis.Kletnieks@xxxxxx
Signed-off-by: Peter Zijlstra <a.p.zijlstra@xxxxxxxxx>
---
Index: linux-2.6/mm/memory.c
===================================================================
--- linux-2.6.orig/mm/memory.c
+++ linux-2.6/mm/memory.c
@@ -1120,8 +1120,8 @@ static unsigned long zap_pte_range(struc
 	spinlock_t *ptl;
 	pte_t *pte;
 
-	init_rss_vec(rss);
 again:
+	init_rss_vec(rss);
 	pte = pte_offset_map_lock(mm, pmd, addr, &ptl);
 	arch_enter_lazy_mmu_mode();
 	do {

