Re: [PATCH] [13/16] HWPOISON: The high level memory error handlerin the VM v3

From: Wu Fengguang
Date: Thu May 28 2009 - 10:50:52 EST


On Thu, May 28, 2009 at 09:45:20PM +0800, Andi Kleen wrote:
> On Thu, May 28, 2009 at 02:08:54PM +0200, Nick Piggin wrote:

[snip]

> >
> > BTW. I don't know if you are checking for PG_writeback often enough?
> > You can't remove a PG_writeback page from pagecache. The normal
> > pattern is lock_page(page); wait_on_page_writeback(page); which I
>
> So pages can be in writeback without being locked? I still
> wasn't able to find such a case (in fact unless I'm misreading
> the code badly the writeback bit is only used by NFS and a few
> obscure cases)

Yes the writeback page is typically not locked. Only read IO requires
to be exclusive. Read IO is in fact page *writer*, while writeback IO
is page *reader* :-)

The writeback bit is _widely_ used. test_set_page_writeback() is
directly used by NFS/AFS etc. But its main user is in fact
set_page_writeback(), which is called in 26 places.

> > think would be safest
>
> Okay. I'll just add it after the page lock.
>
> > (then you never have to bother with the writeback bit again)
>
> Until Fengguang does something fancy with it.

Yes I'm going to do it without wait_on_page_writeback().

The reason truncate_inode_pages_range() has to wait on writeback page
is to ensure data integrity. Otherwise if there comes two events:
truncate page A at offset X
populate page B at offset X
If A and B are all writeback pages, then B can hit disk first and then
be overwritten by A. Which corrupts the data at offset X from user's POV.

But for hwpoison, there are no such worries. If A is poisoned, we do
our best to isolate it as well as intercepting its IO. If the interception
fails, it will trigger another machine check before hitting the disk.

After all, poisoned A means the data at offset X is already corrupted.
It doesn't matter if there comes another B page.

Thanks,
Fengguang
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/