Re: [RFC][PATCH] HWPOISON: remove the unsafe __set_page_locked()

From: Nick Piggin
Date: Mon Sep 28 2009 - 00:30:18 EST


On Mon, Sep 28, 2009 at 06:11:08AM +0200, Andi Kleen wrote:
> On Mon, Sep 28, 2009 at 04:57:41AM +0200, Nick Piggin wrote:
> > On Mon, Sep 28, 2009 at 03:19:43AM +0200, Andi Kleen wrote:
> > > > There is no real rush AFAIKS to fix this one single pagecache site
> > > > while we have problems with slab allocators and all other unaudited
> > > > places that nonatomically modify page flags with an elevated
> > >
> > > hwpoison ignores slab pages.
> >
> > "ignores" them *after* it has already written to page flags?
> > By that time it's too late.
>
> Yes, currently the page lock comes first. The only exception
> is for page count == 0 pages. I suppose we could move the slab check
> up, but then it only helps when slab is set.

Yes, so it misses other potential non-atomic page flag manipulations.


> So if you make slab use refcount == 0 pages that would help.

Yes it would help here and also help with the pagecache part too,
and most other cases I suspect. I have some patches to do this at
home so I'll post them when I get back.


> > Well it's fundamentally badly buggy, rare or not. We could avoid
>
> Let's put it like this -- any access to the poisoned cache lines
> in that page will trigger a panic anyways.

Well yes, although maybe people who care about this feature will
care more about getting a reliable panic than about risking
random data corruption. I guess an ECC failure coinciding with
the race window could be some orders of magnitude less likely
than other sources of bugs ;) but I still don't like using that
argument to allow known bugs -- it leads to interesting things
if we take it to its conclusion.
