Re: [PATCH 1/9] clockpro-nonresident.patch

From: Peter Zijlstra
Date: Sat Dec 31 2005 - 04:54:16 EST


On Fri, 2005-12-30 at 23:13 -0200, Marcelo Tosatti wrote:
> On Fri, Dec 30, 2005 at 11:42:44PM +0100, Peter Zijlstra wrote:
> >
> > From: Peter Zijlstra <a.p.zijlstra@xxxxxxxxx>
> >
> > Originally started by Rik van Riel, I heavily modified the code
> > to suit my needs.
> >
> > The nonresident code approximates a clock but sacrifices precision in order
> > to accomplish faster lookups.
> >
> > The actual datastructure is a hash of small clocks, so that, assuming an
> > equal distribution by the hash function, each clock has comparable order.
> >
> > TODO:
> > - remove the ARC requirements.
> >
> > Signed-off-by: Peter Zijlstra <a.p.zijlstra@xxxxxxxxx>
>
> <snip>
>
> > + *
> > + *
> > + * Modified to work with ARC like algorithms who:
> > + * - need to balance two FIFOs; |b1| + |b2| = c,
> > + *
> > + * The bucket contains four single linked cyclic lists (CLOCKS) and each
> > + * clock has a tail hand. By selecting a victim clock upon insertion it
> > + * is possible to balance them.
> > + *
> > + * The first two lists are used for B1/B2 and a third for a free slot list.
> > + * The fourth list is unused.
> > + *
> > + * The slot looks like this:
> > + * struct slot_t {
> > + * u32 cookie : 24; // LSB
> > + * u32 index : 6;
> > + * u32 listid : 2;
> > + * };
>
> 8 and 16 bit accesses are slower than 32 bit on i386 (Arjan pointed this out sometime ago).
>
> Might be faster to load a full word and shape it as necessary, will see if I can do
> something instead of talking. ;)

everything is 32bit except for the hands, but yes, this code needs to be
redone.

> > +/*
> > + * For interactive workloads, we remember about as many non-resident pages
> > + * as we have actual memory pages. For server workloads with large inter-
> > + * reference distances we could benefit from remembering more.
> > + */
>
> This comment is bogus. Interactive or server loads have nothing to do
> with the inter reference distance. To the contrary, interactive loads
> have a higher chance to contain large inter reference distances, and
> many common server loads have strong locality.
>
> <snip>

Happy to drop it, Rik?

> > +++ linux-2.6-git/include/linux/swap.h
> > @@ -152,6 +152,31 @@ extern void out_of_memory(gfp_t gfp_mask
> > /* linux/mm/memory.c */
> > extern void swapin_readahead(swp_entry_t, unsigned long, struct vm_area_struct *);
> >
> > +/* linux/mm/nonresident.c */
> > +#define NR_b1 0
> > +#define NR_b2 1
> > +#define NR_free 2
> > +#define NR_lost 3
>
> What is the meaning of "NR_lost" ?

should have read, NR_unused, it is the available fourth hand which is
unused. I just put it there for completeness sake and remember
struggling with the name while doing it, guess I should've taken that as
a hint.

--
Peter Zijlstra <a.p.zijlstra@xxxxxxxxx>

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/