Re: [RFC][PATCH 2/9] deadlock prevention core

From: Daniel Phillips
Date: Fri Aug 18 2006 - 17:23:20 EST

Andrew Morton wrote:
Daniel Phillips wrote:
Andrew Morton wrote:'s runtime configurable.

So we default to "less than the best" because we are too lazy to fix the
network starvation issue properly? Maybe we don't really need a mempool for
struct bio either, isn't that one rather like the reserve pool we propose
for sk_buffs? (And for that matter, isn't a bio rather more like an sk_buff
than one would think from the code we have written?)

unavailable for write caching just to get around the network receive
starvation issue?

No, it's mainly to avoid latency: to prevent tasks which want to allocate
pages from getting stuck behind writeback.

This seems like a great help for some common loads, but a regression for
other not-so-rare loads. Regardless of whether or not marking 60% of memory
as unusable for write caching is a great idea, it is not a compelling reason
to fall for the hammer/nail trap.

What happens if some in kernel user grabs 68% of kernel memory to do some
very important thing, does this starvation avoidance scheme still work?

Well something has to give way. The process might get swapped out a bit,
or it might stall in the page allocator because of all the dirty memory.

It is that "something has to give way" that makes me doubt the efficacy of
our global write throttling scheme as a solution for network receive memory
starvation. As it is, the thing that gives way may well be the network
stack. Avoiding this hazard via a little bit of simple code is the whole
point of our patch set.

Yes, we did waltz off into some not-so-simple code at one point, but after
discussing it, Peter and I agree that the simple approach actually works
well enough, so we just need to trim down the patch set accordingly and
thus prove that our approach really is nice, light and tight.

Specifically, we do not actually need any fancy sk_buff sub-allocator
(either ours or the one proposed by davem). We already got rid of the per
device reserve accounting in order to finesse away the sk_buff->dev oddity
and shorten the patch. So the next rev will be rather lighter than the


To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at
Please read the FAQ at