Re: [git pull] drm: previous pull req + 1.

From: Andrew Lutomirski
Date: Sun Jun 21 2009 - 14:50:43 EST


On Sun, Jun 21, 2009 at 1:13 PM, Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
>
>
> On Sun, 21 Jun 2009, Linus Torvalds wrote:
>>
>> Dave - no amount of userspace differences make a corrupted page table
>> acceptable.
>>
>> This needs to be fixed. No excuses. Kernel crashes are never an issue of
>> "you used the wrong user space".
>
> So "corrupted page table" means that one of the reserved bits was set, and
> we get a page fault with the PF_RSVD bit on in the error code.
>
> Looking at the debug output, it says
>
>        PGD        12148a067
>        PUD        12148b067
>        PMD        121496067
>        PTE ffffc90011780237
>
> where the top-level entries look fine, but the PTE is total crap. It looks
> like it has filled in the page frame number with a virtual address rather
> than with an actual page.
>
> The PTE _should_ look like this:
>
>  - bit 63: NX
>  - bits 62-52: zero (available to sw, but I don't think we use them)
>  - bits 51-47: zero (reserved)
>  - bits 46-12: page frame
>  - bits 11-0: protection and PAT bits etc (bits 8-7 are also reserved)
>
> and that PTE clearly does not match.
>
> Strictly speaking, that "47-bit" physical address is purely theoretical. I
> think existing CPUs are limited to 40 bits or so, so there are even more
> reserved bits.
>
> Anyway, here's a totally UNTESTED patch that hopefully gives a warning on
> where exactly we set the invalid bits. Andy, mind trying it out? You
> should get the warning much earlier, and it should have a much more useful
> back-trace.

Your patch worked. Photo attached.

--Andy

Attachment: radeon_modeset_crash_2.jpg
Description: JPEG image
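
The bit layout described in the quoted message can be checked by hand
against the reported entries. Below is a minimal sketch in Python (not
kernel code) that splits a 64-bit entry into the fields listed there. The
masks follow the layout as stated in the message rather than a hardware
manual, and the names decode_pte and has_reserved_bits are hypothetical
helpers made up for this illustration.

```python
# Field masks, taken from the layout given in the quoted message.
NX_BIT = 1 << 63                      # bit 63
SW_MASK = ((1 << 11) - 1) << 52       # bits 62-52, available to software
RSVD_MASK = ((1 << 5) - 1) << 47      # bits 51-47, reserved
FRAME_MASK = ((1 << 35) - 1) << 12    # bits 46-12, page frame
FLAGS_MASK = (1 << 12) - 1            # bits 11-0, protection/PAT etc.


def decode_pte(pte):
    """Split a 64-bit page table entry into the fields named above."""
    return {
        "nx": (pte & NX_BIT) >> 63,
        "sw": (pte & SW_MASK) >> 52,
        "reserved": (pte & RSVD_MASK) >> 47,
        "frame": (pte & FRAME_MASK) >> 12,
        "flags": pte & FLAGS_MASK,
    }


def has_reserved_bits(pte):
    """True if any of bits 51-47 is set (the reserved range in the message)."""
    return (pte & RSVD_MASK) != 0


if __name__ == "__main__":
    # Values copied from the debug output quoted in the message.
    entries = [
        ("PGD", 0x12148A067),
        ("PUD", 0x12148B067),
        ("PMD", 0x121496067),
        ("PTE", 0xFFFFC90011780237),
    ]
    for name, value in entries:
        fields = decode_pte(value)
        print(
            f"{name} {value:#018x}: "
            f"nx={fields['nx']} sw={fields['sw']:#x} "
            f"reserved={fields['reserved']:#x} "
            f"frame={fields['frame']:#x} flags={fields['flags']:#x} "
            f"reserved_set={has_reserved_bits(value)}"
        )
```

Under these assumptions the three upper-level entries decode with the
reserved field at zero, while the PTE has all five reserved bits and all
eleven software bits set, which is what a kernel virtual address written
into the page frame field would produce.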