Linux job accounting (CSA)

From: Marlys Kohnke (kohnke@sgi.com)
Date: Fri Jun 16 2000 - 13:46:14 EST


     Los Alamos National Laboratory (LANL) and SGI are working
together to provide a job accounting package on Linux. This
accounting solution, called Comprehensive System Accounting (CSA),
provides the ability to track system resource utilization per
job and charge back the cost of those resources to users.
Please see http://oss.sgi.com/projects/csa for information on
the proposed kernel changes and a CSA overview.

     CSA is a set of kernel changes, C programs and shell scripts
that provide methods for collecting per-task resource usage data,
monitoring disk usage, and charging fees to specific login
accounts. CSA takes this per-task accounting information and
combines it outside of the kernel by job identifier (jid) within
system boot uptime periods. Another project, Process
Aggregates (PAGG), is providing the kernel job infrastructure
needed by CSA (http://oss.sgi.com/projects/pagg).
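
     To make the "combine by jid within boot periods" step concrete,
here is a rough user-space sketch. It is not the CSA code (which is not
released yet); the struct layouts, field names, and the combine_by_job()
helper are all made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-task record; field names are illustrative only. */
struct task_rec {
    uint64_t boot_time;  /* identifies the boot uptime period */
    uint64_t jid;        /* job identifier */
    uint64_t cpu_usec;   /* CPU time used by the task */
    uint64_t chars_io;   /* characters read plus written */
};

/* Hypothetical per-job total built from the task records. */
struct job_total {
    uint64_t boot_time;
    uint64_t jid;
    uint64_t cpu_usec;
    uint64_t chars_io;
    unsigned ntasks;
};

/*
 * Fold task records into per-job totals keyed by (boot_time, jid).
 * Returns the number of distinct jobs written to out, or (size_t)-1
 * if out has room for fewer than the number of jobs found.
 */
size_t combine_by_job(const struct task_rec *recs, size_t nrecs,
                      struct job_total *out, size_t max_jobs)
{
    size_t njobs = 0;

    for (size_t i = 0; i < nrecs; i++) {
        size_t j;

        /* Linear search keeps the sketch short; a hash would scale better. */
        for (j = 0; j < njobs; j++)
            if (out[j].boot_time == recs[i].boot_time &&
                out[j].jid == recs[i].jid)
                break;

        if (j == njobs) {           /* first task seen for this job */
            if (njobs == max_jobs)
                return (size_t)-1;
            out[j].boot_time = recs[i].boot_time;
            out[j].jid = recs[i].jid;
            out[j].cpu_usec = 0;
            out[j].chars_io = 0;
            out[j].ntasks = 0;
            njobs++;
        }

        out[j].cpu_usec += recs[i].cpu_usec;
        out[j].chars_io += recs[i].chars_io;
        out[j].ntasks++;
    }
    return njobs;
}
```

     Keying on the boot period as well as the jid keeps a jid that is
reused after a reboot from being merged into an older job.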

     Job accounting is important to production sites. As these sites
install large Linux systems, they need the enterprise style accounting
provided by CSA. Since numerous other Linux sites may not be
interested in job accounting, we're proposing that most of the
kernel code for CSA be contained in a loadable kernel module.
The new resource usage counters can also be used by performance
tools like sar and Performance Co-Pilot (PCP). These counters have
value outside of CSA and should be available regardless of
whether CSA is in use.

     Additional task resource usage counters are being proposed for
the number of characters read/written, blocks read/written, block
I/O wait time, number of read/write syscalls, physical and virtual
memory highwater marks, and physical and virtual memory integrals.
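
     As a sketch of what such counters and their update helpers might
look like: the struct and function names below are hypothetical, the
units (pages, ticks) are assumptions, and "integral" is assumed to mean
memory size summed over elapsed time. The real patch may differ.

```c
#include <stdint.h>

/* Hypothetical per-task counters mirroring the list above. */
struct res_counters {
    uint64_t rchar, wchar;              /* characters read/written */
    uint64_t rblk, wblk;                /* blocks read/written */
    uint64_t blkio_wait_ticks;          /* block I/O wait time */
    uint64_t syscr, syscw;              /* read/write syscall counts */
    uint64_t rss_hiwater, vm_hiwater;   /* high-water marks, in pages */
    uint64_t rss_integral, vm_integral; /* page-ticks (assumed unit) */
};

/* Charge one read syscall that transferred nbytes characters. */
static inline void account_read(struct res_counters *c, uint64_t nbytes)
{
    c->syscr++;
    c->rchar += nbytes;
}

/*
 * Sample memory usage: raise the high-water marks if needed and add
 * (current size * ticks since last sample) to the integrals.
 */
static inline void account_mem(struct res_counters *c, uint64_t rss,
                               uint64_t vm, uint64_t delta_ticks)
{
    if (rss > c->rss_hiwater)
        c->rss_hiwater = rss;
    if (vm > c->vm_hiwater)
        c->vm_hiwater = vm;
    c->rss_integral += rss * delta_ticks;
    c->vm_integral += vm * delta_ticks;
}
```

     Dividing an integral by the elapsed ticks gives an average memory
size over the task's lifetime, which is one reason such a counter is
useful to performance tools as well as to accounting.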

     These new counters plus a couple of inline functions add about
60 lines of kernel code. That number doesn't include adding new CSA
structures to the existing Linux acct.h file or the new loadable kernel
module. The CSA source code will be available as soon as the
LANL and SGI lawyers come to final agreement on how and when
we will open source the code.

     A new acctctl syscall is needed to allow the kernel to
provide the following services related to CSA:

1) enable, disable, and report the status of daemon and record
        accounting types
2) provide system accounting file name to kernel; allow switching
        to a new file (monitoring of file size is done outside
        of the kernel)
3) set memory and cpu threshold values (end-of-process accounting
        records written only if usage exceeds these values)
4) start and stop user job accounting (ja command is used to write
        accounting records for the current job to a user
        specified file in addition to the system accounting file)
5) provide daemon accounting records from system daemons like tape and
        workload management to the kernel to write to the system
        accounting file
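
     One way the services above could map onto request codes, plus
the threshold test from item 3, is sketched below. None of these names
come from the actual interface; the enum values, the struct, and
should_write_eop() are invented here. The sketch also assumes that a
threshold of 0 means "not set" and that a record must exceed every
threshold that is set, which the list above does not spell out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical request codes, grouped by the numbered services above. */
enum acctctl_req {
    AC_START, AC_STOP, AC_STATUS,  /* 1: enable, disable, status */
    AC_SET_FILE,                   /* 2: name or switch accounting file */
    AC_SET_THRESHOLD,              /* 3: memory and cpu thresholds */
    AC_JA_START, AC_JA_STOP,       /* 4: user job accounting (ja) */
    AC_WRITE_DAEMON_REC            /* 5: daemon accounting record */
};

/* Hypothetical threshold settings; 0 is assumed to mean "not set". */
struct acct_thresholds {
    uint64_t mem_kb;    /* memory threshold */
    uint64_t cpu_usec;  /* cpu time threshold */
};

/*
 * Decide whether an end-of-process record should be written.
 * Assumption: usage must exceed every threshold that is set.
 */
bool should_write_eop(const struct acct_thresholds *t,
                      uint64_t mem_hiwater_kb, uint64_t cpu_usec)
{
    if (t->mem_kb && mem_hiwater_kb <= t->mem_kb)
        return false;
    if (t->cpu_usec && cpu_usec <= t->cpu_usec)
        return false;
    return true;
}
```

     With no thresholds set, every process gets a record; setting
them lets a site drop records for short-lived, small processes.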

     CSA will also use the resource usage counters that are currently
available in the kernel and which are used by the GNU process accounting
package. There will be no intermingling of accounting records between
these two packages. Each will write records into its own accounting
file. Each package will have its own set of user and administrator
commands to process its own accounting records and generate reports.
CSA will be a superset of GNU process accounting, but a site
could choose to run both concurrently during a transition period.

     The initial prototype kernel code is done and accounting records
are being written. The commands haven't been ported yet. There's
still plenty of work to do, so please let me know if you're
interested in helping provide Linux job accounting.

     I'd appreciate any comments on the kernel counters and
guidance on whether the project should use a loadable kernel module
or follow the GNU accounting model in kernel/acct.c and use configuration
#ifdefs to manage compilation and execution of the kernel code.
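
     For discussion, the loadable module option could look roughly
like the hook below: the core kernel keeps a function pointer that is
NULL until a module registers itself. This is a user-space sketch with
hypothetical names (task_info, register_acct_hook, sample_eop, and so
on), and it leaves out the locking a real kernel hook would need.

```c
#include <stddef.h>

/* Hypothetical summary of an exiting task. */
struct task_info {
    unsigned long jid;
    unsigned long cpu_usec;
};

typedef void (*acct_hook_t)(const struct task_info *);

/* NULL while no accounting module is loaded. */
static acct_hook_t acct_eop_hook;

/* Called by the module at load time; fails if a hook is already set. */
int register_acct_hook(acct_hook_t fn)
{
    if (acct_eop_hook)
        return -1;
    acct_eop_hook = fn;
    return 0;
}

/* Called by the module at unload time. */
void unregister_acct_hook(void)
{
    acct_eop_hook = NULL;
}

/* Core-kernel side: costs one pointer test when no module is loaded. */
void task_exit_accounting(const struct task_info *t)
{
    acct_hook_t fn = acct_eop_hook;

    if (fn)
        fn(t);
}

/* Stand-in for the module's end-of-process handler. */
static unsigned long records_written;
static unsigned long total_cpu_usec;

static void sample_eop(const struct task_info *t)
{
    records_written++;
    total_cpu_usec += t->cpu_usec;
}
```

     The #ifdef approach would instead compile the call in or out at
build time, so sites that don't want accounting pay nothing at run
time but need a rebuilt kernel to turn it on.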

----
Marlys Kohnke			Silicon Graphics Inc.
kohnke@sgi.com			655F Lone Oak Drive
(651)683-5324			Eagan, MN 55121
