bug in ATOMIC swapout (make_request()) hidden without oom-arca

Andrea Arcangeli (andrea@e-mind.com)
Thu, 8 Oct 1998 16:18:54 +0200 (CEST)


There's a bug in the swapout code when we can't sleep (when __GFP_WAIT is
not set).

If we need memory inside an irq and __get_free_pages() runs
try_to_free_pages(GFP_ATOMIC,...) (because the irq happened while we are
OOM), the swapout path can end up executing __get_request_wait() (in
make_request()), which sleeps. This is obviously harmful and causes an
Oops, because we are scheduling inside an irq.

(The irq was el3_interrupt(), which runs dev_alloc_skb(), which does an
alloc_skb(), which does a kmalloc(GFP_ATOMIC), which runs
kmem_cache_grow(), which runs __get_free_pages(GFP_ATOMIC).)
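The path above can be modelled in user space to make the failure visible.
This is only a sketch, not kernel code: every model_* name, the
model_state struct and the MODEL_GFP_* values are made up for
illustration, and the "sleep" is just a counter.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel flags; values are illustrative. */
#define MODEL_GFP_WAIT   0x01u
#define MODEL_GFP_ATOMIC 0x00u	/* no WAIT bit: the caller must not sleep */

struct model_state {
	bool in_irq;		/* true while "inside" an interrupt handler */
	int  free_pages;	/* 0 models the OOM condition */
	int  free_requests;	/* 0 models an exhausted request table */
	int  sleeps_in_irq;	/* illegal sleeps detected by the model */
};

/* Models __get_request_wait(): it always waits for a free request. */
static void model_get_request_wait(struct model_state *s)
{
	if (s->in_irq)
		s->sleeps_in_irq++;	/* real kernel: scheduling in irq */
	s->free_requests = 1;		/* pretend a request was released */
}

/* Models make_request() for a swapout write (not read-ahead). */
static void model_make_request(struct model_state *s)
{
	if (s->free_requests == 0)
		model_get_request_wait(s);
	assert(s->free_requests > 0);
	s->free_requests--;
}

/* Models try_to_free_pages(): swapping a page out queues block I/O. */
static void model_try_to_free_pages(struct model_state *s, unsigned gfp_mask)
{
	(void)gfp_mask;		/* the mask is never consulted on this path */
	model_make_request(s);
	s->free_pages++;
}

/* Models __get_free_pages(): when OOM it first tries to free pages. */
static bool model_get_free_pages(struct model_state *s, unsigned gfp_mask)
{
	if (s->free_pages == 0)
		model_try_to_free_pages(s, gfp_mask);
	if (s->free_pages == 0)
		return false;
	s->free_pages--;
	return true;
}

/* Models the irq handler's kmalloc(GFP_ATOMIC); returns illegal sleeps. */
static int model_irq_alloc(struct model_state *s)
{
	s->in_irq = true;
	(void)model_get_free_pages(s, MODEL_GFP_ATOMIC);
	s->in_irq = false;
	return s->sleeps_in_irq;
}
```

In this model, calling model_irq_alloc() with free_pages == 0 and
free_requests == 0 reports one sleep inside the irq; with free requests
available it reports none, which matches the bug showing up only when
memory is exhausted and the request table is full.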

I think an easy _workaround_ (probably not the right thing to do, and
untested, because I don't know how ll_rw_blk works yet) could be to change
make_request() from:

	if (!req) {
		if (rw_ahead) {
			unlock_buffer(bh);
			return;
		}
		req = __get_request_wait(max_req, bh->b_rdev);
	}

to:

	if (!req) {
		if (rw_ahead || in_interrupt()) {
			unlock_buffer(bh);
			return;
		}
		req = __get_request_wait(max_req, bh->b_rdev);
	}
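To see what the changed branch would do, here is a small user-space
sketch of it. It is an assumption-laden model, not ll_rw_blk code: the
sketch_* structs and function names are hypothetical, the in_irq field
stands in for in_interrupt(), and the request table is reduced to a
counter.

```c
#include <assert.h>
#include <stdbool.h>

struct sketch_buffer {
	bool locked;	/* models the buffer lock held while submitting */
	bool queued;	/* true once a request was attached */
};

struct sketch_queue {
	int  free_requests;	/* free slots in the request table */
	bool in_irq;		/* stands in for in_interrupt() */
	int  sleeps;		/* times we had to wait for a slot */
};

/* Models __get_request_wait(): blocks until a slot is available. */
static void sketch_get_request_wait(struct sketch_queue *q)
{
	assert(!q->in_irq);	/* sleeping from an irq is the reported bug */
	q->sleeps++;
	q->free_requests = 1;
}

/*
 * Models the changed branch of make_request().
 * Returns true if the buffer was queued, false if it was dropped.
 */
static bool sketch_make_request(struct sketch_queue *q,
				struct sketch_buffer *bh, bool rw_ahead)
{
	bh->locked = true;
	if (q->free_requests == 0) {
		if (rw_ahead || q->in_irq) {
			/* Same exit as read-ahead: give up, do not sleep. */
			bh->locked = false;
			return false;
		}
		sketch_get_request_wait(q);
	}
	q->free_requests--;
	bh->queued = true;
	return true;
}
```

The sketch also shows the cost of the workaround: in irq context the
buffer is unlocked but never queued, so the write simply does not happen
and the caller frees nothing through this path.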

Unfortunately I can't continue to work on this swapout bug: I really must
stop hacking now, since tomorrow I have a math exam, argh (I must learn at
least some proofs to answer the free-answer questions).

Right now nobody can see whether the kernel is able to run when we are
near OOM, because without my oom-[56789] patch the kernel simply deadlocks
(due to kswapd and get_free_pages) when we are _near_ OOM.

So I really think that my latest patch should be put in ASAP, to allow
people to test OOM, produce Oopses and discover bugs on many setups (I
think it's better to see a report of an Oops than a useless "2.1 deadlock
when I reach oom" message). The only problem I was able to reproduce in
several days of heavy testing of my patches is the __get_request_wait() in
interrupt described in detail above (and without my patch that problem is
hidden very well).

You should find my latest patch (oom-9) on the list. (oom-9 continues to
run kswapd SCHED_FIFO _only_ when it sleeps in _kswapd()_, and it seems to
work fine this way.)

Comments?

Andrea[s] Arcangeli

PS. My patch also fixes kmem_cache_reap(), which may be run by
do_try_to_free_page() only if __GFP_WAIT is set (note the down() at the top
of that shrink function). This bug too was hidden very well, for the _same_
reasons: without my patch no irq function ever ends up running
do_try_to_free_page().
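The rule in the PS can be sketched the same way. This is a user-space
model of the idea only, not the code from the oom patch: the reap_*
names and the REAP_GFP_WAIT value are hypothetical stand-ins, and the
down() is modelled as a counter.

```c
#include <assert.h>
#include <stdbool.h>

#define REAP_GFP_WAIT 0x01u	/* hypothetical stand-in for __GFP_WAIT */

struct reap_model {
	int semaphore_downs;	/* times the reap path took its semaphore */
	int pages_freed;
};

/* Models kmem_cache_reap(): it starts with a down(), so it may sleep. */
static int reap_kmem_cache_reap(struct reap_model *r)
{
	r->semaphore_downs++;
	r->pages_freed++;
	return 1;
}

/*
 * Models the relevant step of do_try_to_free_page(): only callers that
 * are allowed to sleep may enter the reap path.
 */
static int reap_do_try_to_free_page(struct reap_model *r, unsigned gfp_mask)
{
	bool may_sleep = (gfp_mask & REAP_GFP_WAIT) != 0;

	assert(r);
	if (may_sleep)
		return reap_kmem_cache_reap(r);
	return 0;	/* atomic caller: skip the sleeping shrinker */
}
```

With a mask that lacks the wait bit the model never touches the
semaphore; with the bit set it reaps as before.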
