Re: VFS: file-max limit 50044 reached

From: Linus Torvalds
Date: Mon Oct 17 2005 - 10:43:04 EST




On Mon, 17 Oct 2005, Dipankar Sarma wrote:
>
> Agreed. It is not designed to work that way, so there must be
> a bug somewhere and I am trying to track it down. It could very well
> be that at maxbatch=10 we are just queueing at a rate far too high
> compared to processing.

That sounds sane.

I suspect that the real fix for 2.6.14 might be to update maxbatch to be
much higher by default.

The thing is, that batching really is fundamentally wrong. If we have a
thousand thing to free, we can't just free ten of them, and leave the 990
others to wait for next time. I realize people want real-time, but
if it's INCORRECT, then real-time isn't real-time.

I just checked: increasing "maxbatch" from 10 to 10000 does fix the
problem.

> This I am not sure, it is Linus' call. I am just trying to do the
> right thing - fix the real problem.

It sure looks like the batch limiter is the fundamental problem.

Instead of limiting the batching, we should likely try to avoid the RCU
lists getting huge in the first place - ie do the RCU callback processing
more often if the list is getting longer.

So I suspect that the _real_ fix is:

- for 2.6.14: remove the batching limig (or just make it much higher for
now)

- post-14: work on making sure rcu callbacks are done in a more timely
manner when the rcu queue gets long. This would involve TIF_RCUPENDING
and whatever else to make sure that we have timely quiescent periods,
and we do the RCU callback tasklet more often if the queue is long.

Hmm?

Linus
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/