Re: [RFC][PATCH] /proc/pid/maps doesn't match "ipcs -m" shmid

From: Badari Pulavarty
Date: Thu Jun 07 2007 - 12:22:21 EST


On Thu, 2007-06-07 at 00:53 -0400, Albert Cahalan wrote:
> On 6/6/07, Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> wrote:
> > On Wed, 6 Jun 2007 23:27:01 -0400 "Albert Cahalan" <acahalan@xxxxxxxxx> wrote:
> > > Eric W. Biederman writes:
> > > > Badari Pulavarty <pbadari@xxxxxxxxxx> writes:
> > >
> > > >> Your recent cleanup to shm code, namely
> > > >>
> > > >> [PATCH] shm: make sysv ipc shared memory use stacked files
> > > >>
> > > >> took away one of the debugging feature for shm segments.
> > > >> Originally, shmid were forced to be the inode numbers and
> > > >> they show up in /proc/pid/maps for the process which mapped
> > > >> this shared memory segments (vma listing). That way, its easy
> > > >> to find out who all mapped this shared memory segment. Your
> > > >> patchset, took away the inode# setting. So, we can't easily
> > > >> match the shmem segments to /proc/pid/maps easily. (It was
> > > >> really useful in tracking down a customer problem recently).
> > > >> Is this done deliberately ? Anything wrong in setting this back ?
> > > >
> > > > Theoretically it makes the stacked file concept more brittle,
> > > > because it means the lower layers can't care about their inode
> > > > number.
> > > >
> > > > We do need something to tie these things together.
> > > >
> > > > So I suspect what makes most sense is to simply rename the
> > > > dentry SYSVID<segmentid>
> > >
> > > Please stop breaking things in /proc. The pmap command relys
> > > on the old behavior.
> >
> > What effect did this change have upon the pmap command? Details, please.
> >
> > > It's time to revert.
> >
> > Probably true, but we'd need to understand what the impact was.
>
> Very simply, pmap reports the shmid.
>
> albert 0 ~$ pmap `pidof X` | egrep -2 shmid
> 30050000 16384K rw-s- /dev/fb0
> 31050000 152K rw--- [ anon ]
> 31076000 384K rw-s- [ shmid=0x3f428000 ]
> 310d6000 384K rw-s- [ shmid=0x3f430001 ]
> 31136000 384K rw-s- [ shmid=0x3f438002 ]
> 31196000 384K rw-s- [ shmid=0x3f440003 ]
> 311f6000 384K rw-s- [ shmid=0x3f448004 ]
> 31256000 384K rw-s- [ shmid=0x3f450005 ]
> 312b6000 384K rw-s- [ shmid=0x3f460006 ]
> 31316000 384K rw-s- [ shmid=0x3f870007 ]
> 31491000 140K r---- /usr/share/fonts/type1/gsfonts/n021003l.pfb
> 3150e000 9496K rw--- [ anon ]

pmap seems to get the shmid from the "ino#" field of /proc/pid/maps.
It's already broken in current mainline.
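
To make the dependency concrete, here is a minimal sketch of how a tool could pull the inode field out of one /proc/pid/maps line. It is not pmap's actual code; the helper name maps_line_inode is hypothetical, and the sample line in the comment is made up to match the first shmid in Albert's output.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Extract the inode field from one /proc/<pid>/maps line of the form
 *   start-end perms offset major:minor inode [pathname]
 * Hypothetical example line (not taken from a real system):
 *   31076000-310d6000 rw-s 00000000 00:07 1061322752 /SYSV00000000 (deleted)
 * Returns 0 on success, -1 if the line does not parse.
 */
static int maps_line_inode(const char *line, unsigned long *ino)
{
	unsigned long start, end, offset;
	unsigned int dev_major, dev_minor;
	char perms[5];

	assert(line != NULL && ino != NULL);

	if (sscanf(line, "%lx-%lx %4s %lx %x:%x %lu",
		   &start, &end, perms, &offset,
		   &dev_major, &dev_minor, ino) != 7)
		return -1;
	return 0;
}
```

With the old behaviour the value returned here is the shmid itself; once the inode number is no longer forced, it is just whatever inode the backing filesystem assigned.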

But the breakage is not due to the namespaces or container effort :(
It's due to a noble effort from Eric to clean up the shm code,
take out the hacks to handle hugetlbfs, and make the code
more streamlined and readable.

If we really, really want the old behaviour, we need my one-line
patch to force the shmid as the inode# :(

BTW, I agree with Eric that it would be nice to use the shmid as part
of the name instead of forcing it to be the inode number. It should be
possible for pmap to work out the shmid from the "key" or name, shouldn't it?
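
A rough sketch of what the name-based lookup could look like on the userspace side, assuming the segment shows up in /proc/pid/maps with a pathname containing "SYSVID" followed by the decimal shmid (the format used by the sprintf in the patch in this mail). The helper name shmid_from_name and the exact pathname layout, including any " (deleted)" suffix, are assumptions.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Recover the shmid from a mapping name such as "/SYSVID1061322752 (deleted)".
 * Returns 0 on success, -1 if the name carries no SYSVID<decimal id> part
 * (for example an old-style "SYSV%08x" key-based name or a regular file).
 */
static int shmid_from_name(const char *path, int *shmid)
{
	const char *p;

	assert(path != NULL && shmid != NULL);

	p = strstr(path, "SYSVID");
	if (p == NULL)
		return -1;
	if (sscanf(p, "SYSVID%d", shmid) != 1)
		return -1;
	return 0;
}
```

A tool that wants to keep working on older kernels could try this first and fall back to the inode field when no SYSVID name is present.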

Andrew/Linus, it's up to you to figure out if it's worth breaking.
Here is the patch to base the dentry name on the shmid, so we don't
need to use the ino# to identify the shmid.
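
For reference, the shmid values in the pmap output quoted above (0x3f428000, 0x3f430001, ...) are consistent with an id of the form seq * 32768 + index, which is what shm_buildid() produces if the sequence multiplier is 32768. The sketch below is a userspace illustration only; the multiplier value and the helper names build_shmid and split_shmid are assumptions, not kernel API.

```c
#include <assert.h>

/* Assumed sequence multiplier (the maximum number of ipc ids). */
#define SEQ_MULTIPLIER 32768

struct shmid_parts {
	int index;	/* slot in the ids array */
	int seq;	/* sequence number of that slot */
};

/* Combine a slot index and a sequence number into a user-visible shmid. */
static int build_shmid(int index, int seq)
{
	assert(index >= 0 && index < SEQ_MULTIPLIER);
	assert(seq >= 0);
	return SEQ_MULTIPLIER * seq + index;
}

/* Split a user-visible shmid back into slot index and sequence number. */
static struct shmid_parts split_shmid(int shmid)
{
	struct shmid_parts parts;

	assert(shmid >= 0);
	parts.index = shmid % SEQ_MULTIPLIER;
	parts.seq = shmid / SEQ_MULTIPLIER;
	return parts;
}
```

This is why the id has to exist before the file is created in the patch: the name passed to shmem_file_setup() is derived from the built id, not from the key.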

Thanks,
Badari

Instead of basing the dentry name on the shm "key", base it on the
"shmid", so it shows up clearly in /proc/pid/maps. Earlier
we were forcing the ino# to match the shmid.

Signed-off-by: Badari Pulavarty <pbadari@xxxxxxxxxx>
Index: linux-2.6.22-rc4/ipc/shm.c
===================================================================
--- linux-2.6.22-rc4.orig/ipc/shm.c 2007-06-04 17:57:25.000000000 -0700
+++ linux-2.6.22-rc4/ipc/shm.c 2007-06-06 13:43:36.000000000 -0700
@@ -364,6 +364,14 @@ static int newseg (struct ipc_namespace
 		return error;
 	}

+	error = -ENOSPC;
+	id = shm_addid(ns, shp);
+	if(id == -1)
+		goto no_id;
+
+	/* Build an id, so we can use it for filename */
+	shp->id = shm_buildid(ns, id, shp->shm_perm.seq);
+
 	if (shmflg & SHM_HUGETLB) {
 		/* hugetlb_zero_setup takes care of mlock user accounting */
 		file = hugetlb_zero_setup(size);
@@ -377,34 +385,28 @@ static int newseg (struct ipc_namespace
 		if ((shmflg & SHM_NORESERVE) &&
 				sysctl_overcommit_memory != OVERCOMMIT_NEVER)
 			acctflag = 0;
-		sprintf (name, "SYSV%08x", key);
+		sprintf (name, "SYSVID%d", shp->id);
 		file = shmem_file_setup(name, size, acctflag);
 	}
 	error = PTR_ERR(file);
 	if (IS_ERR(file))
 		goto no_file;

-	error = -ENOSPC;
-	id = shm_addid(ns, shp);
-	if(id == -1)
-		goto no_id;
-
 	shp->shm_cprid = current->tgid;
 	shp->shm_lprid = 0;
 	shp->shm_atim = shp->shm_dtim = 0;
 	shp->shm_ctim = get_seconds();
 	shp->shm_segsz = size;
 	shp->shm_nattch = 0;
-	shp->id = shm_buildid(ns, id, shp->shm_perm.seq);
 	shp->shm_file = file;

 	ns->shm_tot += numpages;
 	shm_unlock(shp);
 	return shp->id;

-no_id:
-	fput(file);
 no_file:
+	shm_rmid(ns, shp->id);
+no_id:
 	security_shm_free(shp);
 	ipc_rcu_putref(shp);
 	return error;

