Re: Patch "page-allocator: preserve PFN ordering when __GFP_COLD is set" fails on my system

From: Mel Gorman
Date: Thu Aug 13 2009 - 04:39:17 EST


On Wed, Aug 12, 2009 at 02:40:30PM -0400, Arnaud Faucher wrote:
> I have a rather similar problem on a driver that I try to keep
> up-to-date with recent kernel versions
> (http://code.ximeta.com/trac-ndas/ticket/1110#comment:30). The NDAS
> hardware is an ethernet-enabled disk controller on one chip, kind of a
> cheap iSCSI.
>
> In my case there is no oops: the symptoms are that the read blocks seem
> to be swapped or full of garbage.
>
> After investigation in the NDAS code, the bug triggers when the driver
> tries to merge adjacent requests before sending them to the controller.
> I had to disable this merge in order to restore normal behavior, at the
> expense of a reduced efficiency.
>

That is a very interesting point and one I hadn't considered. The point
of the patch was to help drivers that merge adjacent requests if they
happen to be physically contiguous. The reported bug that led to the
patch was a regression in which allocated memory was no longer
physically contiguous, so requests were not being merged.
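
To make the adjacency condition concrete, here is a minimal userspace
sketch of the kind of test a merging driver has to get right. It is not
taken from the NDAS driver or from the block layer; struct seg,
segs_contiguous() and seg_try_merge() are hypothetical names used only
for illustration. The assumption is that two segments may be merged
only when the second starts at the physical byte immediately after the
first one ends.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical segment descriptor: physical start address and length in bytes. */
struct seg {
	uint64_t phys;
	size_t len;
};

/* True only if b begins at the byte immediately after a ends. */
static bool segs_contiguous(const struct seg *a, const struct seg *b)
{
	return a->phys + a->len == b->phys;
}

/*
 * Fold b into a when the two are physically contiguous.
 * Returns true if a was extended, false if a was left untouched.
 */
static bool seg_try_merge(struct seg *a, const struct seg *b)
{
	assert(a != NULL && b != NULL);

	if (!segs_contiguous(a, b))
		return false;

	a->len += b->len;
	return true;
}
```

A test that compares only page indexes in one direction, or that assumes
buffers adjacent in virtual memory are also adjacent physically, would
accept merges this check rejects, and the merged transfer would then
read or write the wrong memory.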

> > After this oops, system startup continues. Then the next oops occurs:
> >
> > This one is new, since I try to mount the connected SD card.
> >
>
> Mel's buffer overrun theory seems to apply in the NDAS driver case,
> where the original requests adjacency test seems faulty.
>
> May it also be the cause of the SD mounting crash ?
>

It's a possibility. If it's not an overrun, it's possible that the automatic
merging code is buggy as well.

Juergen, is the disk controller on your machine capable of merging
requests? If so, can you disable it and see if the bug still occurs
please?

--
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab