Re: Copying to loop device hangs up everything

From: Andrew Morton (akpm@zip.com.au)
Date: Thu Dec 20 2001 - 02:41:47 EST


Andrea Arcangeli wrote:
>
> ...
> > The thing I don't like about the Andrea+Momchil approach is that it
> > exposes the risk of flooding the machine with dirty data. A scheme
>
> it doesn't, balance_dirty() has to work only at the highlevel.
> sync_page_buffers also is no problem, we'll try again later in those
> GFP_NOIO allocations.

Not so. The loop thread *copies* the data. We must throttle it,
otherwise the loop thread gobbles all memory and the box dies. This
is trivial to demonstrate.

> furthmore you don't even address the writepage from loop thread on the
> loop queue.

How can this deadlock? The only path to those buffers is
via the page, and the page is locked.
 
> The final fix should be in rc2aa1 that I will release in a jiffy. It
> takes care now of both the VM and balance_dirty().
>
> this is the incremental fix against rc1aa1:
>

No. Your patch removes *all* loop thread throttling, it doesn't even start
IO (thus removing the throttling which request starving would provide)
and doesn't even wake up bdflush.

If you set nfract to 70%, nfract_sync to 80% and do a big write, the
machine falls into a VM coma within 15 seconds. The same happens
with both my patches :-(

And it's not legitimate to say "don't do that". If we can't survive
those settings, we don't have a solution. We need to throttle writes
*more*, not less.

I'll keep poking at it. If you have any more suggestions/patches,
please toss them over...

-
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Sun Dec 23 2001 - 21:00:21 EST