> Smaller cache footprint is always a win. E.g. when your user program and
> the kernel manage to run both in L2 for a working set you save a lot of time.
> The key to that is small overall memory/cache footprint of the kernel
> (2.0 was much better in that than 2.4)
I've attached a patch (against 2.4.0-test5) for the page cache, which
demonstrates a technique to reduce the length of re-searching of a chain
after its guarding lock has been temporarily dropped (say, for a blocking
operation), and the re-search is expected to fail.
While the patch is only for the page cache, something similar could be
done for the inode, dcache, and buffer cache.
A common operation is to search a chain, looking for a certain matching
object. If the object isn't found, the chain's guard is dropped, a new
object allocated, then the chain is re-lock and re-searched for the object
before inserting the newly allocated one (to catch races in inserting the
same object). Nothing special here.
For most chains, objects are _always_ inserted at the chain's
head. This means, after the initial search, a new object can only appear
between the chain's head and what was the first object in the chain. As
the chain maybe empty, or the first object may disappear, the continued
existence of the first object cannot be relied upon after the guard is
By inserting a marker (cookie) as the first object (after the original
search and while the guarding lock is still held) gives a well known
point at which to terminate a subsequent search (assuming it doesn't match
before reaching the marker).
By taking the memory for the cookie off the kernel stack, it doesn't
require an allocation. The cookie only needs to contain the linkage
pointers and the members checked during a search, so it doesn't need to be
the full size of the objects held in the chain; setting the members in
the cookie so that it cannot match any valid searches. Of course, avoid
taking a large amount of memory from the k-stack.
As the cookie comes from k-stack, its virtual address uniquely
identifies it from other cookies which may be in use on the same chain
(assuming no clever mapping tricks are being used on the k-stack - eg. for
each CPU, the stack of the running task is always mapped to the same virt
address, with other (non-running) tasks having their stacks unmapped).
Of course, a cookie could be allocated from kmalloc() (and friends) when
the code assumes it will need it in the near furture.
For the page-cache, I'm using a cookie when re-searching a chain and do
not expect to find the page. That is, after a failed search, I insert a
page-cookie into the chain's head. After allocating a new page, I can
limit the re-search upto the cookie - no rocket science here.
Obviously, there is no point in limiting a search which is expected to
succeed, so I haven't used a cookie for every re-search of the page-cache.
I haven't benchmarked the patch; perhaps the overhead in adding/removing
the cookie is more than is saved - needs a proper investigation.
In the page-cache, I'm using a full page structure as a cookie. This
could be cutdown, but would then require keeping the full page struct and
the cookie page struct in sync.
As well as limiting re-searches, cookies can be used as markers during
walks - basically, restart points in a chain after dropping, and then
retaking, the chain's guard.
I can't believe no one has thought of doing this before (I've seen it in
other OSes I've worked on). Is the cost of the cookies too high to make
it worthwhile? Or are they simply too ugly?
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to email@example.com
Please read the FAQ at http://www.tux.org/lkml/
This archive was generated by hypermail 2b29 : Mon Aug 07 2000 - 21:00:05 EST