RE: [PATCH] VM, x86, PAT: Change implementation of is_linear_pfn_mapping

From: Pallipadi, Venkatesh
Date: Wed Mar 11 2009 - 23:21:22 EST




>-----Original Message-----
>From: Pallipadi, Venkatesh [mailto:venkatesh.pallipadi@xxxxxxxxx]
>Sent: Wednesday, March 11, 2009 5:32 PM
>To: Frans Pop
>Cc: mingo@xxxxxxx; thellstrom@xxxxxxxxxx; Linux kernel mailing list; Siddha, Suresh B; Nick Piggin; ebiederm@xxxxxxxxxxxx
>Subject: Re: [PATCH] VM, x86, PAT: Change implementation of is_linear_pfn_mapping
>
>On Wed, 2009-03-11 at 15:09 -0700, Frans Pop wrote:
>> Pallipadi, Venkatesh wrote:
>> > Use of vma->vm_pgoff to identify the pfnmaps that are fully
>> > mapped at mmap time is broken. vm_pgoff is set by generic mmap
>> > code even for cases where drivers are setting up the mappings
>> > at fault time.
>> >
>> > The problem was originally reported here.
>> > http://marc.info/?l=linux-kernel&m=123383810628583&w=2
>> >
>> > Change is_linear_pfn_mapping logic to overload VM_NONLINEAR
>> > flag along with VM_PFNMAP to mean full PFNMAP setup at mmap
>> > time.
>> >
>> > Acked-by: Thomas Hellstrom <thellstrom@xxxxxxxxxx>
>> > Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@xxxxxxxxx>
>> > Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
>>
>> I've applied this patch on top of v2.6.29-rc7-143-g99adcd9 [1] and
>> since then I've had my system, or rather X/KDE, hang several times.
>> The last time the problem seems to have been KDE's kicker. I was
>> running a kernel compile in a konsole window and that just continued
>> and finished, but the keyboard was completely dead.
>> I could still ssh in from another box. 'ps' would show the top
>> processes, but hang as well at some point (in the middle of listing
>> KDE processes).
>>
>> The hang was with pat enabled. I've now booted with nopat.
>
>Frans,
>
>Thanks for testing this. I don't seem to reproduce this on any of my
>test systems with this patch on either tip or latest git. Do you see
>the hang on every boot or once in a while? Are things stable with nopat?
>
>> The log shows (full log attached):
>> kernel: BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
>> kernel: IP: [<ffffffff80322504>] prio_tree_remove+0x9c/0xcc
>> kernel: PGD 7cab1067 PUD 7d644067 PMD 0
>> kernel: Oops: 0000 [#1] SMP
>> kernel: last sysfs file: /sys/class/power_supply/C23D/charge_full
>> kernel: CPU 1
>> kernel: Pid: 5415, comm: kicker Not tainted 2.6.29-rc7 #4 HP Compaq 2510p Notebook PC
>> kernel: RIP: 0010:[<ffffffff80322504>] [<ffffffff80322504>] prio_tree_remove+0x9c/0xcc
>> [...]
>> kernel: Call Trace:
>> kernel: [<ffffffff803225df>] prio_tree_insert+0xab/0x22a
>> kernel: [<ffffffff8027e90d>] vma_prio_tree_insert+0x23/0xc2
>> kernel: [<ffffffff802864af>] __vma_link_file+0x70/0x72
>> kernel: [<ffffffff80286c15>] vma_link+0x7d/0xab
>> kernel: [<ffffffff802881ea>] mmap_region+0x313/0x479
>> kernel: [<ffffffff80288646>] do_mmap_pgoff+0x2f6/0x35c
>> kernel: [<ffffffff802ea99a>] do_shmat+0x28a/0x36c
>> kernel: [<ffffffff802eaa8d>] sys_shmat+0x11/0x1c
>> kernel: [<ffffffff8020c25b>] system_call_fastpath+0x16/0x1b
>>
>> From the symptoms I strongly suspect this patch to be the culprit.
>>
>> [1] Together with some other patches (mainly Rafael's latest patchset
>> for "Rework disabling of interrupts during suspend-resume"), but I
>> doubt any of those are related to this issue.
>>
>
>Nothing obvious strikes me with this patch and above OOPs. Can you

Thinking about it a bit more, the use of the VM_NONLINEAR flag in
this patch may conflict with some expectation in the mm code, which
may be what results in the oops above. Let me spend some more time
on this and get back to you.
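
For reference, the two checks discussed in this thread can be sketched
as a small userspace model. This is an illustration only, not the patch
itself: struct toy_vma, the two helper names and the flag bit values
are hypothetical stand-ins, and the "old" check is an assumption based
on the patch description (VM_PFNMAP plus a non-zero vm_pgoff).

```c
/* Placeholder bit values for illustration; not copied from kernel headers. */
#define VM_PFNMAP    0x1UL
#define VM_NONLINEAR 0x2UL

/* Toy stand-in for the two vm_area_struct fields discussed here. */
struct toy_vma {
	unsigned long vm_flags;
	unsigned long vm_pgoff;
};

/*
 * Check keyed on vm_pgoff (assumed shape of the old logic).
 * Generic mmap code sets vm_pgoff too, so a non-zero value does not
 * prove the driver mapped everything at mmap time.
 */
static int is_linear_pfn_mapping_pgoff(const struct toy_vma *vma)
{
	return (vma->vm_flags & VM_PFNMAP) && vma->vm_pgoff != 0;
}

/*
 * Check keyed on flags only, as the patch description proposes:
 * both VM_PFNMAP and VM_NONLINEAR must be set.
 */
static int is_linear_pfn_mapping_flags(const struct toy_vma *vma)
{
	const unsigned long both = VM_PFNMAP | VM_NONLINEAR;

	return (vma->vm_flags & both) == both;
}
```

In this model, a VMA with VM_PFNMAP set and a non-zero vm_pgoff that is
populated at fault time passes the first check but not the second. The
sketch says nothing about how the rest of mm treats VM_NONLINEAR, which
is the open question above.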

Thanks,
Venki