Re: PG_zero

From: Nick Piggin
Date: Tue Nov 02 2004 - 20:24:30 EST


Andrea Arcangeli wrote:
On Tue, Nov 02, 2004 at 02:41:15PM -0800, Martin J. Bligh wrote:

Eh? I don't see how that matters at all. After the DMA transfer, all the
cache lines will have to be invalidated in every CPU's cache anyway, so
it's guaranteed to be stone-dead zero-degrees-kelvin cold. I don't see why
how hot it becomes afterwards is relevant.


If the cold page becomes hot, it means the hot pages in the hot
quicklist will become colder. The cache size is limited, so if something
becomes hot, something else will become cold.

The only difference is that the hot pages will become cold during the
DMA if we return a hot page, or the hot pages will become cold while
the CPU touches the data of the previously cold page, if we return a
cold page. Or are you worried that the cache snooping is measurable?

I believe the hot-cold thing is mostly important for the hot
allocations, not for the cold ones. That the hot allocations are served
in strict LIFO order truly matters, but the cold allocations are a grey
area.

What kind of slowdown can you measure if you drop __GFP_COLD entirely?

Don't get me wrong, __GFP_COLD makes perfect sense since it costs so
little that it's most certainly worth the branch in the allocator, but I
don't think the hot pages are worth a _reservation_, since they'll
become cold anyway after the I/O has completed; so we could have
returned a hot page in the first place without slowing down in the
buddy to get it.
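(For reference, a rough C sketch of the hot/cold quicklist scheme being
argued about; it is not the kernel's actual code -- the struct layout,
names and fallback behaviour are made up for illustration, only the
__GFP_COLD branch reflects the idea described above:)

	#include <linux/list.h>
	#include <linux/mm.h>

	struct pcp_list {
		struct list_head pages;	/* free pages cached on this CPU */
		int count;		/* pages currently on the list */
	};

	struct per_cpu_pagesets {
		struct pcp_list hot;	/* recently freed, likely cache hot */
		struct pcp_list cold;	/* freed cold (e.g. by reclaim) */
	};

	static struct page *pcp_alloc(struct per_cpu_pagesets *pset,
				      unsigned int gfp_flags)
	{
		/* __GFP_COLD: the caller will only DMA into the page, so
		 * it does not care about cache warmth and takes a cold one. */
		struct pcp_list *list = (gfp_flags & __GFP_COLD) ?
						&pset->cold : &pset->hot;
		struct page *page;

		if (!list->count)
			return NULL;	/* caller falls back to the buddy */

		/* LIFO: hand out the most recently freed page first, the
		 * one most likely to still be in the CPU cache. */
		page = list_entry(list->pages.next, struct page, lru);
		list_del(&page->lru);
		list->count--;
		return page;
	}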


I see what you mean. You could be correct that it would model cache
behaviour better to just have the last N freed "hot" pages in LIFO
order on the list, and allocate cold pages from the other end of it.

You still don't want cold freeing to pollute this list, *but* you do
still want to batch up cold freeing to amortise the buddy's lock
acquisition.

You could do that with just one list, if you gave cold frees a small
extra allowance for batching when the list is full.
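Something like the following (again only a sketch; the watermark values
and the free_to_buddy()/trim_cold_tail() helpers are hypothetical):

	#define PCP_HIGH	64	/* normal high watermark (made-up value) */
	#define PCP_COLD_EXTRA	8	/* small extra allowance for cold frees */

	struct pcp_single {
		struct list_head pages;	/* head = hottest, tail = coldest */
		int count;
	};

	static void pcp_free(struct pcp_single *pcp, struct page *page, int cold)
	{
		if (cold) {
			/* Cold free: batch at the tail so it never displaces
			 * a hot page, but only up to the small allowance. */
			if (pcp->count >= PCP_HIGH + PCP_COLD_EXTRA) {
				free_to_buddy(page);		/* hypothetical */
				return;
			}
			list_add_tail(&page->lru, &pcp->pages);
		} else {
			/* Hot free: LIFO at the head; overflow is trimmed
			 * from the cold tail back into the buddy. */
			if (pcp->count >= PCP_HIGH)
				trim_cold_tail(pcp);		/* hypothetical */
			list_add(&page->lru, &pcp->pages);
		}
		pcp->count++;
	}

	static struct page *pcp_alloc(struct pcp_single *pcp, int cold)
	{
		struct page *page;

		if (!pcp->count)
			return NULL;		/* refill from the buddy */

		/* Hot allocations take the head, cold ones take the tail. */
		page = cold ? list_entry(pcp->pages.prev, struct page, lru)
			    : list_entry(pcp->pages.next, struct page, lru);
		list_del(&page->lru);
		pcp->count--;
		return page;
	}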


If the DMA is to pages that are hot in the CPU's cache - it's WORSE ... we
have more work to do in terms of cacheline invalidates. Mmm ... in terms
of DMAs, we're talking about disk reads (i.e. new page allocations) - we're
both on the same page there, right?


The DMA snoops the cache for the cacheline invalidate, but I don't think
it's measurable.

I would really like to see the performance difference of disabling the
__GFP_COLD thing for the allocations, forcing picking from the head
of the list (and always freeing the cold pages at the tail); I doubt you
will measure anything.


I think you still want to take them off the cold end. Taking a
really cache hot page and having it invalidated is worse than
having some cachelines out of your combined pool of hot pages
pushed out when you heat the cold page.
