Re: Fw: crash on x86_64 - mm related?

From: Linus Torvalds
Date: Thu Jan 05 2006 - 16:18:09 EST




On Thu, 5 Jan 2006, Ryan Richter wrote:
>
> Another one. I can't keep running this kernel - nearly all of our
> backup tapes are erased now. If a drive were to fail today, we would
> lose hundreds of GB of irreplaceable data. I'm going back to 2.6.11.3
> until we have a full set of backups again.

Yeah, don't trash your backups.

If/when you try something again, how about these two trivial one-liners?

I'm not 100% sure the mapcount sanity check is strictly speaking right (no
locking between mapcount/pagecount comparison), but the page count really
should never fall below the mapcount, so aside from races that I don't
think can be triggered in practice, this might be very useful to find
where those pesky page counts suddenly disappear..

Right now we get the oops "too late" - something has decremented the page
count way too far, but we don't know what it was, and the actual function
that triggers it _seems_ to be harmless.

The other part of the patch is to clear the sgl[i].page pointer when the
page is being freed, in case somebody tries to free the same thing twice.
I don't see how that could happen either, but...
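
The idea can be sketched in user space as follows. Everything here is
hypothetical ("struct demo_page", "struct demo_sg", "demo_unmap_pages"),
and unlike the st.c change, which has no NULL test and would simply oops
on the second pass, this model counts the entries it finds already
cleared so the effect can be observed.

```c
#include <stddef.h>

/* Hypothetical stand-ins for struct page and a scatterlist entry. */
struct demo_page { int count; };
struct demo_sg { struct demo_page *page; };

/*
 * Drop one reference per entry and clear the pointer as we go.
 * Returns how many entries had already been cleared, i.e. how many
 * releases in this call were repeats of an earlier one.
 */
static int demo_unmap_pages(struct demo_sg *sgl, int nr_pages)
{
	int already_released = 0;

	for (int i = 0; i < nr_pages; i++) {
		struct demo_page *page = sgl[i].page;

		sgl[i].page = NULL;	/* a second pass now sees NULL, not a stale page */
		if (page == NULL) {
			already_released++;
			continue;
		}
		page->count--;		/* stands in for page_cache_release() */
	}
	return already_released;
}
```

Calling it twice on the same array returns 0 the first time and nr_pages
the second time, and the page counts are only decremented once.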

The patch is untested in every way. No guarantees.

Linus

---
diff --git a/drivers/scsi/st.c b/drivers/scsi/st.c
index c4aade8..18e60e1 100644
--- a/drivers/scsi/st.c
+++ b/drivers/scsi/st.c
@@ -4481,6 +4481,7 @@ static int sgl_unmap_user_pages(struct s
 	for (i=0; i < nr_pages; i++) {
 		struct page *page = sgl[i].page;
 
+		sgl[i].page = NULL;
 		if (dirtied)
 			SetPageDirty(page);
 		/* FIXME: cache flush missing for rw==READ
diff --git a/include/linux/mm.h b/include/linux/mm.h
index a06a84d..daf504d 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -299,6 +299,7 @@ struct page {
 #define put_page_testzero(p)				\
 	({						\
 		BUG_ON(page_count(p) == 0);		\
+		BUG_ON(page_count(p) <= page_mapcount(p));	\
 		atomic_add_negative(-1, &(p)->_count);	\
 	})
