Re: [slab corruption] BUG key_jar: Poison overwritten

From: Vegard Nossum
Date: Sat Jan 17 2009 - 16:43:48 EST


On Sat, Jan 17, 2009 at 9:33 PM, David Howells <dhowells@xxxxxxxxxx> wrote:
> Ingo Molnar <mingo@xxxxxxx> wrote:
>
>> The problem was not reproducible - i tried the same config once more and
>> it didnt produce the bug. (this system has no known hardware flukes)
>
> Having thought about it some more, we can't necessarily pin the blame on the
> key management code. I think it more likely due to the previous owner of the
> page as it's the guard poisoning that got corrupted, not the allocated key
> struct. The problem is we don't know who that was. I wonder if it's
> practical to store a recent history of page allocs and releases in a circular
> buffer.

This is what I think:

[ 44.482064] INFO: 0xf5f320c0-0xf5f320c0. First byte 0x6a instead of 0x6b
...
[ 44.482064] Object 0xf5f320c0: 6a 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b jkkkkkkkkkkkkkkk

Only the first byte has been changed. It changed from 0x6b to 0x6a.
Smells like a decrement. And yes, this particular cache allocates
'struct key's, which has an 'atomic_t usage' as its first member. So a
refcounting bug, most likely (key_put() was called too many times). Or
a missing key_get() somewhere?

It also seems noteworthy that no other data has been changed since the
last key_put() (i.e. since the refcount hit zero).

[ 44.482064] Bytes b4 0xf5f320b0: 7c 05 ff ff 5a 5a 5a 5a 5a 5a 5a
5a 5a 5a 5a 5a |.ïïZZZZZZZZZZZZ

...I guess "bytes b4" means "bytes before"?


Vegard

--
"The animistic metaphor of the bug that maliciously sneaked in while
the programmer was not looking is intellectually dishonest as it
disguises that the error is the programmer's own creation."
-- E. W. Dijkstra, EWD1036
¢éì®&Þ~º&¶¬–+-±éÝ¥Šw®žË±Êâmébžìdz¹Þ)í…æèw*jg¬±¨¶‰šŽŠÝj/êäz¹ÞŠà2ŠÞ¨è­Ú&¢)ß«a¶Úþø®G«éh®æj:+v‰¨Šwè†Ù>Wš±êÞiÛaxPjØm¶Ÿÿà -»+ƒùdš_