Re: [bug] __nf_ct_refresh_acct(): WARNING: at lib/list_debug.c:30 __list_add+0x7d/0xad()

From: Eric Dumazet
Date: Wed Jun 17 2009 - 11:29:48 EST


Patrick McHardy wrote:
> Patrick McHardy wrote:
>> Eric Dumazet wrote:
>>> Patrick McHardy wrote:
>>>> No, before it is confirmed, it's only visible to the CPU handling
>>>> the initial packet of a connection. Confirmation is the step that
>>>> makes it visible to other CPUs.
>>>
>>> Thanks Patrick, I missed this, and your patch seems fine now :)
>>
>> Thanks for your help, I'll send it to Dave later today.
>
> I'm having some trouble figuring out the exact events that would
> lead to the timer base corruption. Ingo, could you please test
> this patch to make sure it also fixes the problem?
>
>

;)

The sequence of events can be described as follows:

CPU1                                                CPU2

/* __nf_conntrack_confirm() */
__nf_conntrack_hash_insert(ct, hash, repl_hash);
// now 'ct' is visible by other cpus
                                                    // search conntrack and find ct
// timeout.expires becomes absolute here
ct->timeout.expires += jiffies;
add_timer(&ct->timeout);

                                                    /* __nf_ct_refresh_acct() */
                                                    if (!nf_ct_is_confirmed(ct)) {
                                                        // we *believe* timeout.expires
                                                        // is not yet in use by timer code
                                                        // and is still a relative quantity.
                                                        // We want to 'update' it but we should not !
                                                        ct->timeout.expires = extra_jiffies; << CORRUPTION >>
                                                    } else {
// too late :(
set_bit(IPS_CONFIRMED_BIT, &ct->status);



This is how I understand the problem, but I may be wrong.
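
For anyone who wants to step through that interleaving outside the
kernel, here is a minimal single-threaded sketch in plain C. It is not
the netfilter code: every toy_* name is hypothetical, the timer is
reduced to an expires value plus a pending flag, and the two CPUs are
simply replayed in the order shown in the diagram above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel objects; NOT the real definitions. */
struct toy_timer {
    unsigned long expires;  /* relative before arming, absolute after */
    bool pending;           /* true once the timer has been queued */
};

struct toy_conn {
    struct toy_timer timeout;
    bool hashed;            /* models "visible to other CPUs" */
    bool confirmed;         /* models IPS_CONFIRMED_BIT */
};

/* CPU1, first half of the confirm path: publish, then arm the timer. */
static void toy_confirm_publish_and_arm(struct toy_conn *ct, unsigned long now)
{
    assert(!ct->confirmed);      /* confirmation happens only once */
    ct->hashed = true;           /* models __nf_conntrack_hash_insert() */
    ct->timeout.expires += now;  /* relative -> absolute */
    ct->timeout.pending = true;  /* models add_timer() */
}

/* CPU1, second half: the confirmed flag is only set at the very end. */
static void toy_confirm_set_flag(struct toy_conn *ct)
{
    ct->confirmed = true;
}

/*
 * CPU2: the refresh path. Returns true when it overwrote expires on a
 * timer that was already pending, which is the corruption in question.
 */
static bool toy_refresh(struct toy_conn *ct, unsigned long extra_jiffies)
{
    if (!ct->confirmed) {
        bool corrupts = ct->timeout.pending;
        ct->timeout.expires = extra_jiffies;  /* plain store, no timer API */
        return corrupts;
    }
    /* Confirmed case: would go through the timer API; elided here. */
    return false;
}

/* Replays the interleaving from the diagram in program order. */
static bool toy_replay_race(unsigned long now, unsigned long timeout,
                            unsigned long extra_jiffies,
                            unsigned long *final_expires)
{
    struct toy_conn ct = { .timeout = { .expires = timeout, .pending = false } };
    bool corrupted;

    toy_confirm_publish_and_arm(&ct, now);          /* CPU1 */
    corrupted = toy_refresh(&ct, extra_jiffies);    /* CPU2 sneaks in */
    toy_confirm_set_flag(&ct);                      /* CPU1, too late */

    *final_expires = ct.timeout.expires;
    return corrupted;
}
```

Calling toy_replay_race(1000, 30, 5, &expires) returns true and leaves
expires at 5: the timer is pending, yet its expiry has been overwritten
with a small relative value instead of an absolute one. Calling
toy_refresh() on a connection that has not been published yet returns
false, which is the case the !confirmed branch was written for.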
