Re: Bug in buffer.c

Jeremy Katz (katzj@linuxpower.org)
Mon, 21 Jun 1999 21:36:51 -0400 (EDT)


On Tue, 22 Jun 1999, Andrea Arcangeli wrote:

> On Mon, 21 Jun 1999, Jeremy Katz wrote:
>
> >kernel BUG at buffer.c:1236!
>
> This seems to be a bug in the debugging code. We should remove all the
> "owner" debugging code since it looks like it can race. We end up changing
> the owner field just to keep UnlockPage from Oopsing, but by changing it we
> allow the irq handler to Oops. I don't think doing:
>
> [..]
> cli();
> page->owner = (int)current;
> UnlockPage(page);
> sti();
> [..]
>
> is a good idea :)
>
> Anyway, in clean 2.3.7 ll_rw_block gets called without the big kernel
> lock. We should hold it, because add_request will refile the buffer, and
> to avoid deadlocking we can't grab the big kernel lock while we hold the
> io_request_lock. I just fixed all these bugs in 2.3.7_andrea1 (and I would
> like to know if you can reproduce bad things with it).
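
For anyone following the locking argument quoted above, here is a minimal
userspace sketch of the ordering rule: take the outer lock before the inner
one, and never try to grab the outer lock while holding the inner one. This is
an analogy only, not kernel code. The pthread mutexes big_lock and io_lock are
hypothetical stand-ins for the big kernel lock and io_request_lock, and
submit_request() and run_demo() are made-up names for illustration.

```c
#include <assert.h>
#include <pthread.h>

#define MAX_THREADS 16

/* Hypothetical stand-ins: big_lock plays the big kernel lock,
 * io_lock plays io_request_lock. */
static pthread_mutex_t big_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t io_lock = PTHREAD_MUTEX_INITIALIZER;

static long queued;   /* requests queued, guarded by io_lock */
static long refiled;  /* buffers "refiled", guarded by big_lock */

/* Every caller takes the locks in the same order: outer, then inner.
 * The refile step needs the outer lock, so it must already be held
 * before the inner lock is taken. */
static void submit_request(void)
{
    pthread_mutex_lock(&big_lock);  /* outer lock first */
    pthread_mutex_lock(&io_lock);   /* inner lock second */
    queued++;
    refiled++;                      /* safe: big_lock is already held */
    pthread_mutex_unlock(&io_lock);
    pthread_mutex_unlock(&big_lock);
}

static void *worker(void *arg)
{
    long n = *(long *)arg;
    for (long i = 0; i < n; i++)
        submit_request();
    return NULL;
}

/* Runs nthreads workers, each submitting per_thread requests.
 * Returns the total number of refiles, or -1 on a bad argument. */
long run_demo(int nthreads, long per_thread)
{
    pthread_t tid[MAX_THREADS];
    int created = 0;

    if (nthreads < 1 || nthreads > MAX_THREADS)
        return -1;

    queued = 0;
    refiled = 0;
    for (int i = 0; i < nthreads; i++) {
        if (pthread_create(&tid[i], NULL, worker, &per_thread) != 0)
            break;
        created++;
    }
    for (int i = 0; i < created; i++)
        pthread_join(tid[i], NULL);

    /* Both counters are bumped under both locks, so they must agree. */
    assert(queued == refiled);
    return refiled;
}
```

If any path instead took io_lock first and then tried to grab big_lock, two
threads could each end up holding one lock while waiting for the other, which
is the deadlock the quoted text wants to avoid.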

Does a complete lockup while running md5sum count as bad things? :)
Completely locked up, sysrq did nothing, nothing logged in syslog.

As to whether running a copy works, I was unable to find out because the
PAGE_BUG(page) call caused an oops and a segfault. The oops was preceded
by many "attempt to access beyond end of device" messages.

> Now I'll start deleting the "owner" checking.

Let me know once you get this up and I'll try again. :P

Jeremy

-- 
Jeremy Katz
http://linuxpower.org
