Re: [bug] __nf_ct_refresh_acct(): WARNING: at lib/list_debug.c:30 __list_add+0x7d/0xad()

From: Patrick McHardy
Date: Wed Jun 17 2009 - 11:34:31 EST


Eric Dumazet wrote:
Patrick McHardy wrote:
I'm having some trouble figuring out the exact events that would
lead to the timer base corruption. Ingo, could you please test
this patch to make sure it also fixes the problem?

;)

The event can be described as follows:

CPU1                                              CPU2

/* __nf_conntrack_confirm() */
__nf_conntrack_hash_insert(ct, hash, repl_hash);
// now 'ct' is visible by other cpus
                                                  // search conntrack and find ct
// timeout.expires becomes absolute here
ct->timeout.expires += jiffies;
add_timer(&ct->timeout);

                                                  /* __nf_ct_refresh_acct() */
                                                  if (!nf_ct_is_confirmed(ct)) {
                                                      // we *believe* timeout.expires
                                                      // is not yet in use by timer code
                                                      // and is still a relative quantity.
                                                      // We want to 'update' it but we should not !
                                                      ct->timeout.expires = extra_jiffies; << CORRUPTION >>
                                                  } else {
// too late :(
set_bit(IPS_CONFIRMED_BIT, &ct->status);

This is how I understood the problem, but I may be wrong?

That's one case that can happen, but it wouldn't corrupt the
timer base AFAICS. Also, the call path shows that it actually went
into the mod_timer_pending() path *and* timer_pending() was true.
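
For readers following the thread, the interleaving quoted above can be
replayed deterministically in user space. This is only a sketch under
stated assumptions: struct fake_timer, struct fake_conn, fake_jiffies and
the cpu1_*/cpu2_* helpers are hypothetical stand-ins, not the kernel's
structures or functions, and the steps run sequentially in the quoted
order rather than on real CPUs. It shows only that a relative value can
overwrite an already-absolute expires field on an armed timer; it does not
reproduce the list_debug warning or any timer base corruption.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space stand-ins; NOT the kernel's structures. */
struct fake_timer {
    unsigned long expires;  /* relative before arming, absolute after */
    bool pending;           /* true once the timer has been armed */
};

struct fake_conn {
    struct fake_timer timeout;
    bool hashed;     /* models visibility after the hash insert */
    bool confirmed;  /* models IPS_CONFIRMED_BIT */
};

/* Assumed fixed clock value standing in for jiffies. */
static const unsigned long fake_jiffies = 1000;

/* CPU1, step 1: the entry becomes visible to lookups on other CPUs. */
static void cpu1_hash_insert(struct fake_conn *ct)
{
    ct->hashed = true;
}

/* CPU1, step 2: expires turns absolute and the timer is armed. */
static void cpu1_arm_timer(struct fake_conn *ct)
{
    ct->timeout.expires += fake_jiffies;
    ct->timeout.pending = true;
}

/* CPU1, step 3: the confirmed flag is set only after the timer is armed. */
static void cpu1_set_confirmed(struct fake_conn *ct)
{
    ct->confirmed = true;
}

/* CPU2: refresh path; returns true if it took the "unconfirmed" branch. */
static bool cpu2_refresh(struct fake_conn *ct, unsigned long extra_jiffies)
{
    if (!ct->confirmed) {
        /* Assumes expires is still relative; here that is no longer true. */
        ct->timeout.expires = extra_jiffies;
        return true;
    }
    return false;
}

/* Replays the quoted interleaving and returns the final expires value. */
unsigned long replay_interleaving(struct fake_conn *ct,
                                  unsigned long initial_timeout,
                                  unsigned long extra_jiffies)
{
    bool raced;

    ct->timeout.expires = initial_timeout;  /* relative at this point */
    ct->timeout.pending = false;
    ct->hashed = false;
    ct->confirmed = false;

    cpu1_hash_insert(ct);
    assert(ct->hashed);           /* CPU2's lookup can now find ct */
    cpu1_arm_timer(ct);           /* expires == initial_timeout + fake_jiffies */
    raced = cpu2_refresh(ct, extra_jiffies);
    assert(raced);                /* confirmed bit was not yet set */
    cpu1_set_confirmed(ct);       /* too late */

    return ct->timeout.expires;
}
```

Calling replay_interleaving(&ct, 30, 50) on a struct fake_conn leaves the
model timer pending with expires equal to the relative value passed by the
refresh path, which lies below the model clock. A real reproduction would
need two CPUs and the actual conntrack code paths.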

