Re: [PATCH] [16/19] HWPOISON: Enable .remove_error_page for migration aware file systems

From: Nick Piggin
Date: Wed Aug 12 2009 - 04:06:00 EST


On Tue, Aug 11, 2009 at 09:17:56AM +0200, Andi Kleen wrote:
> On Tue, Aug 11, 2009 at 12:50:59PM +0900, Hidehiro Kawai wrote:
> > > An application
> > > that doesn't handle current IO errors correctly will also
> > > not necessarily handle hwpoison correctly (it's not better and not worse)
> >
> > This is my main concern. I'd like to prevent re-corruption even if
> > applications don't have good manners.
>
> I don't think there's much we can do if the application doesn't
> check for IO errors properly. What would you do if it doesn't
> check for IO errors at all? If it checks for IO errors it simply
> has to check for them on all IO operations -- if they do
> they will detect hwpoison errors correctly too.

But they will quite possibly do the wrong thing, i.e. try to re-sync the
same page again, or try to write the page to a new location, etc.

This is the whole problem with -EIO semantics I brought up.


> > That is why I suggested this:
> > >>(2) merge this patch with new panic_on_dirty_page_cache_corruption
>
> You probably mean panic_on_non_anonymous_dirty_page_cache
> Normally anonymous memory is dirty.
>
> > >> sysctl
>
> It's unclear to me this special mode is really desirable.
> Does it bring enough value to the user to justify the complexity
> of another exotic option? The case is relatively exotic,
> as in dirty write cache that is mapped to a file.
>
> Try to explain it in documentation and you see how ridiculous it sounds;
> it simply doesn't have clean semantics
>
> ("In case you have applications with broken error IO handling on
> your mission critical system ...")

Not broken error handling. It is very simple: if the application
assumes EIO is an error in sending dirty data to disk, rather than
an error in the data itself (which I think may be a common
assumption), then it could have a problem.

Suppose a database, for example, responds to EIO by trying to write the
data to another location and then recording it in a list of failed IOs
before halting. When it restarts, it might try again to write out
these failed IOs (eg. to give the administrator a chance to fix IO
devices). A completely made-up scenario, but it is not outlandish,
and it would cause bad data corruption.
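
To make the scenario concrete, here is a minimal sketch in C of the two
ways an application might interpret a failed fsync(). It assumes the
poisoned dirty page is reported to the application as EIO from fsync();
the enum and all function names are hypothetical and exist only for
illustration, they are not part of the patch or of any kernel interface.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <unistd.h>

/* Hypothetical verdicts; none of these names come from the patch. */
enum flush_verdict {
    FLUSH_OK,            /* data reached the device */
    FLUSH_DATA_SUSPECT,  /* EIO: the cached copy itself may be bad */
    FLUSH_OTHER_ERROR    /* any other failure, e.g. EBADF or ENOSPC */
};

/* Map an fsync() result (return value plus errno) to a verdict. */
static enum flush_verdict classify_flush(int ret, int err)
{
    assert(ret == 0 || ret == -1);  /* fsync() returns only 0 or -1 */
    if (ret == 0)
        return FLUSH_OK;
    if (err == EIO)
        return FLUSH_DATA_SUSPECT;
    return FLUSH_OTHER_ERROR;
}

/* Flush fd and classify the outcome. */
static enum flush_verdict flush_and_classify(int fd)
{
    int ret = fsync(fd);
    return classify_flush(ret, ret == 0 ? 0 : errno);
}

/*
 * The policy from the scenario above: every failure is assumed to be
 * a device problem, so the cached data is written out again elsewhere.
 * With a poisoned page this replays corrupted data.
 */
static bool naive_may_replay(enum flush_verdict v)
{
    return v != FLUSH_OK;
}

/*
 * A more conservative policy (an assumption, not something the thread
 * agreed on): after EIO the cached copy is not trusted and must be
 * rebuilt from another source, such as a journal, before any rewrite.
 */
static bool cautious_may_replay(enum flush_verdict v)
{
    return v == FLUSH_OTHER_ERROR;
}
```

The only difference between the two policies is the EIO case: the
naive one replays the cached data, the cautious one refuses to.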

Mission critical servers will *definitely* want to panic on dirty
page corruption, IMO, because by definition they should be able to
tolerate panic. But if they do not know about this change to -EIO
semantics, then it is quite possible for it to cause problems.
