BUG: Bad page state in process pmemtest pfn:1418800 (when running pmemtest) in 2.6.32 kernel

From: Sadasivan Shaiju (shshaiju)
Date: Fri Mar 06 2015 - 14:25:07 EST



Hi,

I am getting the following crash on a 2.6.32 kernel when running pmemtest. Is there a patch available for this?


[root@localhost sysdiag]# free -m
             total       used       free     shared    buffers     cached
Mem:        775315       5390     769924          0         90        231
-/+ buffers/cache:       5068     770246
Swap:            0          0          0
[root@localhost sysdiag]# ./pmemrun -l 1
Cisco Memory test script Version 1.6 (01/13/2015)
Starting ipmi drivers: [ OK ]
running for 1 loops, limit 1 correctable errors
ECC Error Check
DIMM slots : 48
DIMM installed : 48
CECC Errors detected: 0
DIMM count mismatch: DIMM's installed : 48 DIMM error counters found: 0

TEMP_SENS_FRONT : 32 C
TEMP_SENS_REAR : 37 C
P1_TEMP_SENS : 47 C
P2_TEMP_SENS : 46 C
P3_TEMP_SENS : 47 C
P4_TEMP_SENS : 47 C
MB 736549
total MB 775315
using MB 736549 , Bytes: 772327604224

PmemTest: Build date Mon Nov 18 14:37:38 PST 2013
PmemTest: Cisco Memory Test Tool, Version 1.2


start: Fri Jan 1 00:52:03 PST 2010
Running Memory Read/Write Test (pmemtest) on 736549 MByte for 1 loops
BUG: Bad page state in process pmemtest pfn:1418800
BUG: Bad page state in process pmemtest pfn:1418600
BUG: Bad page state in process pmemtest pfn:14185b7
BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
IP: [<ffffffff8112c520>] bad_page+0xd0/0x160
PGD c024687067 PUD c02bf35067 PMD 0
Oops: 0000 [#1] SMP
last sysfs file: /sys/devices/system/cpu/cpu143/topology/physical_package_id
CPU 2
Modules linked in: cpufreq_ondemand acpi_cpufreq freq_table mperf ipv6 iTCO_wdt iTCO_vendor_support enic ipmi_devintf power_meter lpc_ich mfd_core shpchp sg ext3 jbd mbcache dm_snapshot squashfs sr_mod cdrom fnic libfcoe libfc scsi_transport_fc scsi_tgt usb_storage sd_mod crc_t10dif xhci_hcd megaraid_sas wmi dm_mirror dm_region_hash dm_log dm_mod [last unloaded: scsi_wait_scan]

Pid: 9325, comm: pmemtest Not tainted 2.6.32-431.cisco.x86_64 #3 Cisco Systems Inc UCSB-B420-M4/UCSB-B420-M4
RIP: 0010:[<ffffffff8112c520>] [<ffffffff8112c520>] bad_page+0xd0/0x160
RSP: 0018:ffff88c02bf5b978 EFLAGS: 00010282
RAX: 0000000000000000 RBX: ffffea0046555000 RCX: 00000000000048bb
RDX: 0000000100009c46 RSI: 0000000000000086 RDI: 0000000000000246
RBP: ffff88c02bf5b988 R08: 0000000000000000 R09: 0000000000000004
R10: 0000000000000006 R11: 0000000000000001 R12: ffff88c02bf5a000
R13: ffff88000001b0c0 R14: 00000000000028d5 R15: ffffea0046555000
FS: 00007f7a3152f700(0000) GS:ffff880162840000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000000008 CR3: 000000c01616b000 CR4: 00000000001407e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process pmemtest (pid: 9325, threadinfo ffff88c02bf5a000, task ffff88c030baeae0)
Stack:
ffff88c02bf5a000 0000000000000200 ffff88c02bf5bab8 ffffffff8112d8f2
<d> ffff88c024d40340 00000000ffffffff ffffea00a85b0610 ffffea00a85b05e8
<d> 00000001ffffffff ffff88303795d1c0 000000022bf5ba88 0000000000000006
Call Trace:
[<ffffffff8112d8f2>] get_page_from_freelist+0x762/0x870
[<ffffffff8112f3a3>] __alloc_pages_nodemask+0x113/0x8d0
[<ffffffff81527bd0>] ? thread_return+0x4e/0x76e
[<ffffffff81060b13>] ? perf_event_task_sched_out+0x33/0x70
[<ffffffff81167b9a>] alloc_pages_vma+0x9a/0x150
[<ffffffff8118337d>] do_huge_pmd_anonymous_page+0x14d/0x3b0
[<ffffffff8117f7d5>] ? follow_trans_huge_pmd+0xe5/0xf0
[<ffffffff8114b350>] handle_mm_fault+0x2f0/0x300
[<ffffffff8114b48a>] __get_user_pages+0x12a/0x430
[<ffffffff8100bb8e>] ? apic_timer_interrupt+0xe/0x20
[<ffffffff8114d1c0>] __mlock_vma_pages_range+0xb0/0x200
[<ffffffff8114d5cb>] mlock_fixup+0x16b/0x200
[<ffffffff81175ecd>] ? free_percpu+0x9d/0x130
[<ffffffff8114d919>] do_mlock+0xc9/0x100
[<ffffffff8114da98>] sys_mlock+0xb8/0x100
[<ffffffff8100b072>] system_call_fastpath+0x16/0x1b
Code: 00 00 48 c1 fa 03 48 81 c6 78 06 00 00 48 0f af d0 31 c0 e8 75 b0 3f 00 48 8b 13 4c 8b 4b 18 48 89 d8 44 8b 43 0c 66 85 d2 78 6c <8b> 48 08 41 83 c0 01 48 89 de 48 c7 c7 60 0b 7c 81 31 c0 e8 4a
RIP [<ffffffff8112c520>] bad_page+0xd0/0x160
RSP <ffff88c02bf5b978>
CR2: 0000000000000008
---[ end trace 76755f03dbce0879 ]---
BUG: unable to handle kernel NULL pointer dereference
Kernel panic - not syncing: Fatal exception
at 000000000000000c
Pid: 9325, comm: pmemtest Tainted: G D --------------- 2.6.32-431.cisco.x86_64 #3
IP:Call Trace:
[<ffffffff8112c520>] bad_page+0xd0/0x160
PGD 9029beb067 [<ffffffff815274ba>] ? panic+0xa7/0x16f
PUD 902a570067 PMD 0
[<ffffffff8152b7f4>] ? oops_end+0xe4/0x100
Oops: 0000 [#2] SMP
last sysfs file: /sys/devices/system/cpu/cpu143/topology/physical_package_id
[<ffffffff8104a00b>] ? no_context+0xfb/0x260
CPU 8
Modules linked in: cpufreq_ondemand acpi_cpufreq [<ffffffff8104a295>] ? __bad_area_nosemaphore+0x125/0x1e0
freq_table mperf ipv6 iTCO_wdt iTCO_vendor_support enic [<ffffffff8104a363>] ? bad_area_nosemaphore+0x13/0x20
ipmi_devintf power_meter [<ffffffff8104aabf>] ? __do_page_fault+0x31f/0x480
lpc_ich mfd_core shpchp sg ext3 jbd [<ffffffff8128c3f6>] ? vsnprintf+0x336/0x5e0
mbcache dm_snapshot squashfs sr_mod [<ffffffff81136951>] ? lru_cache_add_lru+0x21/0x40
cdrom fnic libfcoe libfc scsi_transport_fc scsi_tgt usb_storage [<ffffffff810a1287>] ? down_trylock+0x37/0x50
sd_mod crc_t10dif xhci_hcd megaraid_sas wmi [<ffffffff8152d71e>] ? do_page_fault+0x3e/0xa0
dm_mirror dm_region_hash dm_log dm_mod [last unloaded: scsi_wait_scan]

[<ffffffff8152aad5>] ? page_fault+0x25/0x30
Pid: 9331, comm: pmemtest Tainted: G D --------------- 2.6.32-431.cisco.x86_64 #3 Cisco Systems Inc UCSB-B420-M4 [<ffffffff8112c520>] ? bad_page+0xd0/0x160
/UCSB-B420-M4
RIP: 0010:[<ffffffff8112c520>] [<ffffffff8112c520>] bad_page+0xd0/0x160
[<ffffffff8112c50d>] ? bad_page+0xbd/0x160
RSP: 0018:ffff889034b21978 EFLAGS: 00010282
RAX: 0000000000000004 RBX: ffffea0046554008 RCX: 00000000000048bc
RDX: 0000000000009c40 RSI: 0000000000000086 RDI: 0000000000000246
[<ffffffff8112d8f2>] ? get_page_from_freelist+0x762/0x870
RBP: ffff889034b21988 R08: 0000000000000001 R09: 0000000000000000
R10: 0000000000000006 R11: 0000000000000001 R12: ffff889034b20000
R13: ffff88000001b0c0 R14: 00000000000028d5 R15: ffffea004654e000
FS: 00007f7a3152f700(0000) GS:ffff880162900000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 000000000000000c CR3: 000000902f0ef000 CR4: 00000000001407e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[<ffffffff8112f3a3>] ? __alloc_pages_nodemask+0x113/0x8d0
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process pmemtest (pid: 9331, threadinfo ffff889034b20000, task ffff88902eacd540)
Stack:
[<ffffffff81527bd0>] ? thread_return+0x4e/0x76e
ffff889034b20000 0000000000000200 [<ffffffff81060b13>] ? perf_event_task_sched_out+0x33/0x70
ffff889034b21ab8 ffffffff8112d8f2 [<ffffffff81167b9a>] ? alloc_pages_vma+0x9a/0x150

<d> ffffffff811422b9 [<ffffffff8118337d>] ? do_huge_pmd_anonymous_page+0x14d/0x3b0
0000000000000001 [<ffffffff8117f7d5>] ? follow_trans_huge_pmd+0xe5/0xf0
ffffea00a84a66f0 [<ffffffff8114b350>] ? handle_mm_fault+0x2f0/0x300
ffffea00a84a66c8 [<ffffffff8114b48a>] ? __get_user_pages+0x12a/0x430

<d> 0000000134b21ae8 ffff8830372410c0 [<ffffffff8100bb8e>] ? apic_timer_interrupt+0xe/0x20
0000000234b21a88 [<ffffffff8114d1c0>] ? __mlock_vma_pages_range+0xb0/0x200
000000000000001a
Call Trace:
[<ffffffff8114d5cb>] ? mlock_fixup+0x16b/0x200
[<ffffffff8112d8f2>] get_page_from_freelist+0x762/0x870
[<ffffffff81175ecd>] ? free_percpu+0x9d/0x130
[<ffffffff811422b9>] ? zone_statistics+0x99/0xc0
[<ffffffff8114d919>] ? do_mlock+0xc9/0x100
[<ffffffff8112f3a3>] __alloc_pages_nodemask+0x113/0x8d0
[<ffffffff8114da98>] ? sys_mlock+0xb8/0x100
[<ffffffff81527bd0>] ? thread_return+0x4e/0x76e
[<ffffffff8100b072>] ? system_call_fastpath+0x16/0x1b
[<ffffffff81060b13>] ? perf_event_task_sched_out+0x33/0x70
[<ffffffff81167b9a>] alloc_pages_vma+0x9a/0x150
[<ffffffff8118337d>] do_huge_pmd_anonymous_page+0x14d/0x3b0
[<ffffffff8117f7dd>] ? follow_trans_huge_pmd+0xed/0xf0
[<ffffffff8114b350>] handle_mm_fault+0x2f0/0x300
[<ffffffff8114b48a>] __get_user_pages+0x12a/0x430
[<ffffffff8114d1c0>] __mlock_vma_pages_range+0xb0/0x200
[<ffffffff8114d5cb>] mlock_fixup+0x16b/0x200
[<ffffffff81175ecd>] ? free_percpu+0x9d/0x130
[<ffffffff8114d919>] do_mlock+0xc9/0x100
[<ffffffff8114da98>] sys_mlock+0xb8/0x100
[<ffffffff8100b072>] system_call_fastpath+0x16/0x1b
Code: 00 00 48 c1 fa 03 48 81 c6 78 06 00 00 48 0f af d0 31 c0 e8 75 b0 3f 00 48 8b 13 4c 8b 4b 18 48 89 d8 44 8b 43 0c 66 85 d2 78 6c <8b> 48 08 41 83 c0 01 48 89 de 48 c7 c7 60 0b 7c 81 31 c0 e8 4a
RIP [<ffffffff8112c520>] bad_page+0xd0/0x160
RSP <ffff889034b21978>
CR2: 000000000000000c
------------[ cut here ]------------
WARNING: at arch/x86/kernel/smp.c:118 native_smp_send_reschedule+0x5c/0x60() (Tainted: G D --------------- )
Hardware name: UCSB-B420-M4
Modules linked in: cpufreq_ondemand acpi_cpufreq freq_table mperf ipv6 iTCO_wdt iTCO_vendor_support enic ipmi_devintf power_meter lpc_ich mfd_core shpchp sg ext3 jbd mbcache dm_snapshot squashfs sr_mod cdrom fnic libfcoe libfc scsi_transport_fc scsi_tgt usb_storage sd_mod crc_t10dif xhci_hcd megaraid_sas wmi dm_mirror dm_region_hash dm_log dm_mod [last unloaded: scsi_wait_scan]
Pid: 9325, comm: pmemtest Tainted: G D --------------- 2.6.32-431.cisco.x86_64 #3
Call Trace:
<IRQ> [<ffffffff81071e27>] ? warn_slowpath_common+0x87/0xc0
[<ffffffff81071e7a>] ? warn_slowpath_null+0x1a/0x20
[<ffffffff81030b6c>] ? native_smp_send_reschedule+0x5c/0x60
[<ffffffff81055678>] ? resched_task+0x68/0x80
[<ffffffff8105a360>] ? check_preempt_wakeup+0x1c0/0x260
[<ffffffff8106941b>] ? enqueue_task_fair+0xfb/0x100
[<ffffffff8105572c>] ? check_preempt_curr+0x7c/0x90
[<ffffffff81065c25>] ? try_to_wake_up+0x215/0x3e0
[<ffffffff81065e02>] ? default_wake_function+0x12/0x20
[<ffffffff81054839>] ? __wake_up_common+0x59/0x90
[<ffffffff81065e02>] ? default_wake_function+0x12/0x20
[<ffffffff8109b2b6>] ? autoremove_wake_function+0x16/0x40
[<ffffffff81054839>] ? __wake_up_common+0x59/0x90
[<ffffffff81058d48>] ? __wake_up+0x48/0x70
[<ffffffff810acde0>] ? tick_sched_timer+0x0/0xc0
[<ffffffff81072837>] ? printk_tick+0x47/0x50
[<ffffffff8108433d>] ? update_process_times+0x4d/0x90
[<ffffffff810ace46>] ? tick_sched_timer+0x66/0xc0
[<ffffffff810ec944>] ? __rcu_process_callbacks+0x54/0x350
[<ffffffff8109f9ee>] ? __run_hrtimer+0x8e/0x1a0
[<ffffffff810a6e0f>] ? ktime_get_update_offsets+0x4f/0xd0
[<ffffffff8109fd56>] ? hrtimer_interrupt+0xe6/0x260
[<ffffffff81031f1d>] ? local_apic_timer_interrupt+0x3d/0x70
[<ffffffff81531365>] ? smp_apic_timer_interrupt+0x45/0x60
[<ffffffff8100bb93>] ? apic_timer_interrupt+0x13/0x20
<EOI> [<ffffffff8152755f>] ? panic+0x14c/0x16f
[<ffffffff815274ec>] ? panic+0xd9/0x16f
[<ffffffff8152b7f4>] ? oops_end+0xe4/0x100
[<ffffffff8104a00b>] ? no_context+0xfb/0x260
[<ffffffff8104a295>] ? __bad_area_nosemaphore+0x125/0x1e0
[<ffffffff8104a363>] ? bad_area_nosemaphore+0x13/0x20
[<ffffffff8104aabf>] ? __do_page_fault+0x31f/0x480
[<ffffffff8128c3f6>] ? vsnprintf+0x336/0x5e0
[<ffffffff81136951>] ? lru_cache_add_lru+0x21/0x40
[<ffffffff810a1287>] ? down_trylock+0x37/0x50
[<ffffffff8152d71e>] ? do_page_fault+0x3e/0xa0
[<ffffffff8152aad5>] ? page_fault+0x25/0x30
[<ffffffff8112c520>] ? bad_page+0xd0/0x160
[<ffffffff8112c50d>] ? bad_page+0xbd/0x160
[<ffffffff8112d8f2>] ? get_page_from_freelist+0x762/0x870
[<ffffffff8112f3a3>] ? __alloc_pages_nodemask+0x113/0x8d0
[<ffffffff81527bd0>] ? thread_return+0x4e/0x76e
[<ffffffff81060b13>] ? perf_event_task_sched_out+0x33/0x70
[<ffffffff81167b9a>] ? alloc_pages_vma+0x9a/0x150
[<ffffffff8118337d>] ? do_huge_pmd_anonymous_page+0x14d/0x3b0
[<ffffffff8117f7d5>] ? follow_trans_huge_pmd+0xe5/0xf0
[<ffffffff8114b350>] ? handle_mm_fault+0x2f0/0x300
[<ffffffff8114b48a>] ? __get_user_pages+0x12a/0x430
[<ffffffff8100bb8e>] ? apic_timer_interrupt+0xe/0x20
[<ffffffff8114d1c0>] ? __mlock_vma_pages_range+0xb0/0x200
[<ffffffff8114d5cb>] ? mlock_fixup+0x16b/0x200
[<ffffffff81175ecd>] ? free_percpu+0x9d/0x130
[<ffffffff8114d919>] ? do_mlock+0xc9/0x100
[<ffffffff8114da98>] ? sys_mlock+0xb8/0x100
[<ffffffff8100b072>] ? system_call_fastpath+0x16/0x1b
---[ end trace 76755f03dbce087a ]---
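
In case it helps with triage, here is a small sketch (not part of pmemtest) that converts the pfn values from the "Bad page state" lines into physical addresses and checks whether each one sits on a huge-page boundary. It assumes 4 KiB base pages and 2 MiB huge pages, which is the usual x86_64 configuration but has not been confirmed on this machine. The helper names are made up for illustration.

```python
# Assumption: 4 KiB base pages (PAGE_SHIFT = 12), as is usual on x86_64.
PAGE_SHIFT = 12
# Assumption: 2 MiB huge pages, i.e. 512 base pages (order 9).
HPAGE_PMD_ORDER = 9


def pfn_to_phys(pfn: int) -> int:
    """Return the physical address of the first byte of page frame `pfn`."""
    return pfn << PAGE_SHIFT


def is_huge_page_aligned(pfn: int) -> bool:
    """Return True if `pfn` is the first frame of a 2 MiB aligned block."""
    return pfn & ((1 << HPAGE_PMD_ORDER) - 1) == 0


if __name__ == "__main__":
    # pfn values copied from the "BUG: Bad page state" lines (hexadecimal).
    for text in ("1418800", "1418600", "14185b7"):
        pfn = int(text, 16)
        print(
            f"pfn {text}: phys {pfn_to_phys(pfn):#x}, "
            f"2 MiB aligned: {is_huge_page_aligned(pfn)}"
        )
```

Under those assumptions the first two pfns are 2 MiB aligned and the third is not. The physical addresses may help map the bad pages to a NUMA node or DIMM.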


Regards,
Shaiju.