According to the discussion on the lse-tech mailing list, at least two steps are required to improve accounting.
First, we need to improve the accounting structure. The current BSD accounting structure does not carry enough information. Metrics computed by the CSA module can be added to BSD accounting. According to other discussions (such as Andi Kleen's comment on the patch I wrote to add CSA I/O values to BSD accounting, http://lkml.org/lkml/2004/8/2/70), the current method for obtaining block and character read/write metrics is not accurate, since most writes end up being accounted to pdflush.
Second, we need to be able to manage groups of processes, since per-job accounting is clearly a major accounting improvement. I don't know if "job" is the right noun. The property needed here is that if a process is in a container, its children will be in the same container.
In this project we use the process events connector to catch the creation of a new process and we propose a user space solution for managing groups of processes. Therefore, all of the enhanced accounting is done in user space. Here is the Enhanced Linux System Accounting scope.
The work can be split into the following parts:
Process events connector
The process events connector reports process events to user space. It uses the netlink mechanism, and your kernel must be built with the following configuration options:
    Device Drivers --->
        Connector - unified userspace <-> kernelspace linker --->
            <*> Connector unified userspace <-> kernelspace linker
            [*]   Report process events to userspace
This option is available since 2.6.15.
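As a rough illustration of how a user space program could talk to the connector, here is a minimal Python sketch. The numeric constants and structure layouts are taken from the kernel headers <linux/netlink.h>, <linux/connector.h> and <linux/cn_proc.h>; the function names (build_listen_request, parse_fork_event, listen) are made up for this example and are not part of ELSA.

```python
import os
import socket
import struct

# Values from <linux/netlink.h>, <linux/connector.h> and <linux/cn_proc.h>.
NETLINK_CONNECTOR = 11
NLMSG_DONE = 3
CN_IDX_PROC = 1
CN_VAL_PROC = 1
PROC_CN_MCAST_LISTEN = 1
PROC_EVENT_FORK = 0x00000001

NLMSGHDR = struct.Struct("=IHHII")    # len, type, flags, seq, pid
CN_MSG = struct.Struct("=IIIIHH")     # id.idx, id.val, seq, ack, len, flags
EVENT_HDR = struct.Struct("=IIQ")     # what, cpu, timestamp_ns
FORK_EVENT = struct.Struct("=iiii")   # parent pid, parent tgid, child pid, child tgid


def build_listen_request(pid):
    """Build the netlink message asking the connector to start reporting."""
    payload = struct.pack("=I", PROC_CN_MCAST_LISTEN)
    cn = CN_MSG.pack(CN_IDX_PROC, CN_VAL_PROC, 0, 0, len(payload), 0)
    total = NLMSGHDR.size + len(cn) + len(payload)
    return NLMSGHDR.pack(total, NLMSG_DONE, 0, 0, pid) + cn + payload


def parse_fork_event(datagram):
    """Return (parent_tgid, child_tgid) for a fork event, otherwise None."""
    offset = NLMSGHDR.size + CN_MSG.size
    if len(datagram) < offset + EVENT_HDR.size + FORK_EVENT.size:
        return None  # too short to be a process event (e.g. a netlink error)
    what, _cpu, _timestamp = EVENT_HDR.unpack_from(datagram, offset)
    if what != PROC_EVENT_FORK:
        return None
    _ppid, parent_tgid, _cpid, child_tgid = FORK_EVENT.unpack_from(
        datagram, offset + EVENT_HDR.size)
    return parent_tgid, child_tgid


def listen():
    """Yield (parent, child) pairs forever. Linux only; normally needs root."""
    sock = socket.socket(socket.AF_NETLINK, socket.SOCK_DGRAM, NETLINK_CONNECTOR)
    sock.bind((os.getpid(), CN_IDX_PROC))
    sock.send(build_listen_request(os.getpid()))
    while True:
        event = parse_fork_event(sock.recv(4096))
        if event is not None:
            yield event
```

A caller would simply loop over listen(), for example "for parent, child in listen(): print(parent, child)". The kernel also reports thread creation as a fork event; in that case both thread group IDs are identical.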
User space daemon
The user space daemon listens to the netlink messages sent by the process events connector. This way, it is notified whenever a fork happens, and with this information it can manage groups of processes. It communicates with the high-level application through sockets for local interprocess communication. The high-level application can therefore send requests to add a process to a job (i.e. a group of processes) or remove it from one, and it can also send a request to get information about the current jobs. Thus, the daemon is under the control of the high-level application.
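The request side of the daemon could look roughly like the sketch below. The text does not specify the wire format used between the daemon and the high-level application, so the ADD/DEL/LIST text commands, the reply strings and the function names are purely hypothetical and only illustrate the idea of request handling over a local (AF_UNIX) socket.

```python
import socket


def handle_request(jobs, line):
    """Apply one request to the job table (job id -> set of pids); return the reply.

    Hypothetical commands: "ADD <job> <pid>", "DEL <job> <pid>" and "LIST".
    """
    parts = line.split()
    if parts == ["LIST"]:
        entries = []
        for job, pids in sorted(jobs.items()):
            entries.append("%d:%s" % (job, ",".join(str(p) for p in sorted(pids))))
        return " ".join(entries) or "EMPTY"
    if len(parts) == 3 and parts[0] in ("ADD", "DEL"):
        try:
            job_id, pid = int(parts[1]), int(parts[2])
        except ValueError:
            return "ERROR"
        if parts[0] == "ADD":
            jobs.setdefault(job_id, set()).add(pid)
        else:
            jobs.get(job_id, set()).discard(pid)
        return "OK"
    return "ERROR"


def serve_one(conn, jobs):
    """Read one request from a connected AF_UNIX socket and send the reply."""
    request = conn.recv(1024).decode("ascii", "replace")
    conn.sendall(handle_request(jobs, request).encode("ascii"))
```

In a real daemon, serve_one would be called for each client accepted on a listening AF_UNIX socket; for a quick experiment, socket.socketpair() gives two connected local sockets without touching the file system.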
The daemon has another task to accomplish. When it receives a fork message from the process events connector, it checks whether the process that initiated the fork belongs to a job; if it does, the daemon adds the child to the same job. That's the main property of containers.
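The inheritance rule described above can be sketched in a few lines of Python. The JobTable class and its method names are invented for this illustration; they show the rule itself, not the daemon's actual data structures.

```python
class JobTable:
    """Tracks which job each process belongs to (at most one job per pid)."""

    def __init__(self):
        self.job_of = {}  # pid -> job id

    def add(self, job_id, pid):
        """Put a process into a job, as requested by the high-level application."""
        self.job_of[pid] = job_id

    def remove(self, pid):
        """Take a process out of its job; unknown pids are ignored."""
        self.job_of.pop(pid, None)

    def on_fork(self, parent, child):
        """Apply the container property: a child joins its parent's job.

        Returns the job id the child was added to, or None if the parent
        is not in any job.
        """
        job_id = self.job_of.get(parent)
        if job_id is not None:
            self.job_of[child] = job_id
        return job_id

    def members(self, job_id):
        """Return the sorted list of pids currently in the given job."""
        return sorted(pid for pid, job in self.job_of.items() if job == job_id)
```

For example, after add(1, 100), a fork of 100 into 101 followed by a fork of 101 into 102 leaves all three processes in job 1, while a fork by a process outside any job changes nothing.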
Per-process accounting information
This is not a part of ELSA, but ELSA uses it. Per-process accounting information is provided by a separate mechanism such as BSD accounting or CSA accounting. Discussions about per-process accounting are ongoing, and people at SGI are working to add their accounting values to BSD process accounting.
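To give an idea of what BSD process accounting provides, here is a sketch that decodes records from an accounting file. It assumes the version 3 record format (struct acct_v3 from <linux/acct.h>) written on a little-endian machine; other kernels may write a different record version. The function names and the dictionary keys are chosen for this example only.

```python
import struct

# struct acct_v3: flag, version, tty, exitcode, uid, gid, pid, ppid, btime,
# etime (float), eight comp_t counters, 16-byte command name. 64 bytes total.
ACCT_V3 = struct.Struct("<BBHIIIIIIf8H16s")


def decode_comp_t(value):
    """Expand a comp_t: 13-bit mantissa with a 3-bit base-8 exponent."""
    return (value & 0x1FFF) << (3 * (value >> 13))


def parse_acct_v3(record):
    """Decode one 64-byte record into a dictionary."""
    fields = ACCT_V3.unpack(record)
    if fields[1] & 0x7F != 3:
        raise ValueError("not a version 3 accounting record")
    utime, stime, mem, io, rw, minflt, majflt, swaps = (
        decode_comp_t(v) for v in fields[10:18])
    return {
        "uid": fields[4], "gid": fields[5],
        "pid": fields[6], "ppid": fields[7],
        "btime": fields[8],              # process creation time (epoch seconds)
        "etime_ticks": fields[9],        # elapsed time in clock ticks
        "utime_ticks": utime, "stime_ticks": stime,
        "io_chars": io, "rw_blocks": rw,
        "comm": fields[18].split(b"\0", 1)[0].decode("ascii", "replace"),
    }


def read_acct_file(path):
    """Yield one dictionary per complete record found in the file."""
    with open(path, "rb") as f:
        while True:
            chunk = f.read(ACCT_V3.size)
            if len(chunk) < ACCT_V3.size:
                break
            yield parse_acct_v3(chunk)
```

The location of the accounting file depends on the distribution and on how accounting was switched on, so the path passed to read_acct_file is left to the caller.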
User space applications
The job manager
'jobmng' is the interface to manage groups of processes.
A webmin module to visualize data
The webmin module 'Enhanced Linux System Accounting' will use the information provided by the user space job daemon together with the information provided by a per-process accounting mechanism (like BSD process accounting) to provide per-group accounting information.
There is also a script called 'elsa' that allows the visualization of accounting data.
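Combining job membership with per-process records to obtain per-group accounting can be sketched as follows. The input shapes are assumptions made for this example: job_of maps a pid to its job id, and each record is a dictionary with "pid", "utime" and "stime" keys. Neither is the actual format of the job file or of the accounting file.

```python
from collections import defaultdict


def per_job_totals(job_of, records):
    """Sum per-process accounting values over the processes of each job.

    job_of  -- mapping pid -> job id (hypothetical view of the job data)
    records -- iterable of dicts with "pid", "utime" and "stime" keys
    Processes that belong to no job are skipped.
    """
    totals = defaultdict(lambda: {"processes": 0, "utime": 0, "stime": 0})
    for rec in records:
        job_id = job_of.get(rec["pid"])
        if job_id is None:
            continue
        entry = totals[job_id]
        entry["processes"] += 1
        entry["utime"] += rec["utime"]
        entry["stime"] += rec["stime"]
    return dict(totals)
```

Because the kernel reuses process IDs, a real tool would have to match on more than the pid alone (for instance the process start time) when records cover a long period.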
        KERNEL SPACE      |              USER SPACE
                          |
                          |       ---------------------
 1. Process Events    NETLINK    | 2. Userspace Daemon | AF_UNIX
    connector     <------------->|    jobs manager     |<-------
                          |       ---------------------         |
                          |                 |                   |
                          |                 |              ------------
                          |                 |             | 4a. jobmng |
                          |                 |              ------------
                          |                 |    ************
                          |                  --->* Job file *
                          |                      ************
                          |                            |
                          |       **************       |     ------------
 3. Accounting Data ------------->* Accounting *       ---->| 4b. Webmin |
 (BSD and/or CSA)         |       *    File    *----------->| interface  |
                          |       **************             ------------

 Legend:

    no box   ---> Not part of ELSA

     ------
    |      | ---> Applications that are provided by ELSA
     ------

    ******
    *    *   ---> File used by ELSA
    ******
All comments are welcome,
The ELSA Team