Re: getrusage vs /proc/pid/stat?

From: Dan Kegel (dank@kegel.com)
Date: Mon Jun 18 2001 - 16:20:20 EST


Pete Wyckoff wrote:
>
> dank@kegel.com said:
> > I'd like to monitor CPU, memory, and I/O utilization in a
> > long-running multithreaded daemon under kernels 2.2, 2.4,
> > and possibly also Solaris (#ifdefs are ok).
>
> getrusage() isn't really the system call you want for this.

I'll buy that. Looks like a lot of unices don't implement that
call fully, and Linux is one of them.

What is the proper way to measure the total CPU time used by a
multithreaded program? The only way I can see is to sum the utime
and stime fields of /proc/pid/stat across the threads of interest
(see below). Is this the easiest way, or am I missing something?
(Forgive the C++. I'd recode this in C if it were for general use.)

========= LinuxTimes.h ==========
#include <sys/times.h>
#include <pthread.h>

/*--------------------------------------------------------------------------
 Source and test case at http://www.kegel.com/lt.tar.gz

 Monitor the CPU usage of a bunch of threads in the same process.
 This is a simulation of the system call times()
 providing traditional semantics under LinuxThreads.
 On e.g. Solaris, you don't need this; you just call the standard times().
--------------------------------------------------------------------------*/
class LinuxTimes {
    static const int MAXTHREADS = 100;

    /// number of threads being monitored
    int m_nthreads;

    /// fd open to /proc/pid/stat for each thread
    int m_proc_pid_stat_fd[MAXTHREADS];

    /// make addSelf threadsafe
    pthread_mutex_t m_lock;

public:

    LinuxTimes() { m_nthreads = 0; pthread_mutex_init(&m_lock, NULL); }

    /**
     New threads call this to add themselves to the group.
     Threadsafe.
     @return 0 on success, Unix error code on failure
     */
    int addSelf();

    /**
     Calculate user and system time accumulated by all threads in group.
     Return result in tms_utime and tms_stime fields of given struct tms.
     Similar to 'man 2 times' on Solaris (where all CPU time of all threads
     is counted as CPU time towards the process).
     @return 0 on success, Unix error code on failure
     */
    int times(struct tms *buf);
};

========= LinuxTimes.cc ==========

#include "LinuxTimes.h"
#include <ctype.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/**
 New threads call this to add themselves to the group.
 Threadsafe.
 @return 0 on success, Unix error code on failure
 */
int LinuxTimes::addSelf()
{
    int fd;
    char fname[256];
    int err = 0;

    if (pthread_mutex_lock(&m_lock))
        return EINVAL;

    if (m_nthreads >= MAXTHREADS) {
        err = E2BIG;
    } else {
        // Under LinuxThreads, each thread has its own pid, so getpid()
        // identifies the calling thread
        snprintf(fname, sizeof(fname), "/proc/%d/stat", (int)getpid());
        fd = open(fname, O_RDONLY);
        if (fd == -1)
            err = errno;
        else {
            m_proc_pid_stat_fd[m_nthreads++] = fd;
        }
    }

    if (pthread_mutex_unlock(&m_lock))
        return EINVAL;

    return err;
}

/* Skip to char after nth whitespace. Returns NULL on failure. */
static const char *skipNspace(const char *p, int n)
{
    while (*++p)
        if (isspace((unsigned char)*p) && ! --n)
            return p+1;
    return NULL;
}

/**
 Calculate user and system time accumulated by all threads in group.
 Return result in tms_utime and tms_stime fields of given struct tms.
 Similar to 'man 2 times' on Solaris (where all CPU time of all threads
 is counted as CPU time towards the process).
 @return 0 on success, Unix error code on failure
 */
int LinuxTimes::times(struct tms *buf)
{
    int i;
    int nread;

    buf->tms_utime = 0;
    buf->tms_stime = 0;
    for (i=0; i<m_nthreads; i++) {
        char scratch[1024]; // FIXME: is that big enough?
        int fd = m_proc_pid_stat_fd[i];

        // rewind to start of stat file. (Not all /proc entries support this.)
        if (lseek(fd, 0, SEEK_SET))
            return EBADF;

        // Read in ASCII data and parse out utime and stime fields
        // (see 'man proc')
        nread = read(fd, scratch, sizeof(scratch)-1);
        if (nread < 50) // FIXME: cheesy
            return EINVAL;
        scratch[nread] = 0;
        
        // Skip to end of command field. Search for the last ')' so that
        // an executable with ')' in its filename does not confuse us.
        const char *p = strrchr(scratch, ')');
        if (!p || p[1] == '\0' || p[2] == '\0')
            return EINVAL;
        p += 2;

        // Skip to utime field
        p = skipNspace(p, 11);
        if (!p) return EINVAL;
        buf->tms_utime += strtoul(p, NULL, 10);

        // Skip to stime field
        p = skipNspace(p, 1);
        if (!p) return EINVAL;
        buf->tms_stime += strtoul(p, NULL, 10);
    }

    return 0;
}

==============
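
One fragile spot in this kind of parsing is the command field, which
sits in parentheses and may itself contain spaces or ')'. Below is a
minimal sketch, separate from the class above, of pulling utime and
stime out of a single stat line by anchoring on the last ')'. The
names parseStatTimes and parseStatTimesExample are hypothetical, the
sample line is made up, and the field order is assumed to be the one
described in 'man proc'.

```cpp
#include <cassert>
#include <cstdio>
#include <cstring>

// Hypothetical helper: extract utime and stime (in clock ticks) from the
// text of one /proc/<pid>/stat line. Returns true on success.
bool parseStatTimes(const char *line, unsigned long *utime,
                    unsigned long *stime)
{
    // The command name may contain spaces or ')', so find the LAST ')'.
    const char *p = std::strrchr(line, ')');
    if (!p)
        return false;
    ++p; // now just past the command field

    // Assumed field order after the command (see 'man proc'):
    //   state ppid pgrp session tty_nr tpgid flags
    //   minflt cminflt majflt cmajflt utime stime ...
    // The '*' conversions are parsed and discarded; only the last two
    // are stored, so sscanf returns 2 on success.
    int n = std::sscanf(p,
        " %*c %*d %*d %*d %*d %*d %*u %*u %*u %*u %*u %lu %lu",
        utime, stime);
    return n == 2;
}

// Usage sketch on a made-up line whose command name contains ") ".
void parseStatTimesExample()
{
    const char *sample =
        "1234 (my prog) v2) S 1 1234 1234 0 -1 4194304 10 0 0 0 250 75 0 0";
    unsigned long ut = 0, st = 0;
    bool ok = parseStatTimes(sample, &ut, &st);
    assert(ok && ut == 250 && st == 75);
    (void)ok; // silence unused warning when NDEBUG is defined
}
```

The values are in clock ticks, the same unit struct tms uses, so
dividing by sysconf(_SC_CLK_TCK) should give seconds.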

Thanks,
Dan

-- 
"A computer is a state machine.
 Threads are for people who can't program state machines."
         - Alan Cox