Re: INFO: suspicious RCU usage in rcu_torture_writer()

From: Fengguang Wu
Date: Thu Aug 30 2012 - 11:31:34 EST


On Mon, Aug 27, 2012 at 11:17:51AM -0700, Paul E. McKenney wrote:
> On Mon, Aug 27, 2012 at 12:40:52PM +0800, Fengguang Wu wrote:
> > On Sat, Aug 25, 2012 at 05:01:49PM -0700, Paul E. McKenney wrote:
> > > On Sat, Aug 25, 2012 at 11:36:23AM +0800, Fengguang Wu wrote:
> > > > Greetings,
> > > >
> > > > I got this warning on 3.6.0-rc2. Full dmesg/config attached.
> > > >
> > > > [ 3.051375] Initializing RT-Tester: OK
> > > > [ 3.052491] rcu-torture:--- Start of test: nreaders=2 nfakewriters=4 stat_interval=0 verbose=0 test_no_idle_hz=0 shuffle_interval=3 stutter=5 irqreader=1 fqs_duration=0 fqs_holdoff=0 fqs_stutter=3 test_boost=1/1 test_boost_interval=7 test_boost_duration=4 shutdown_secs=0 onoff_interval=0 onoff_holdoff=0
> > > > [ 3.059084]
> > > > [ 3.059451] ===============================
> > > > [ 3.060454] [ INFO: suspicious RCU usage. ]
> > > > [ 3.061482] 3.6.0-rc2-00010-g4c58c42 #59 Not tainted
> > > > [ 3.062686] -------------------------------
> > > > [ 3.063744] /c/kernel-tests/src/stable/kernel/rcutorture.c:990 suspicious rcu_dereference_check() usage!
> > > >
> > > > 982 do {
> > > > 983 schedule_timeout_uninterruptible(1);
> > > > 984 rp = rcu_torture_alloc();
> > > > 985 if (rp == NULL)
> > > > 986 continue;
> > > > 987 rp->rtort_pipe_count = 0;
> > > > 988 udelay(rcu_random(&rand) & 0x3ff);
> > > > 989 old_rp = rcu_dereference_check(rcu_torture_current,
> > > > >990 current == writer_task);
> > > > 991 rp->rtort_mbtest = 1;
> > > > 992 rcu_assign_pointer(rcu_torture_current, rp);
> > > > 993 smp_wmb(); /* Mods to old_rp must follow rcu_assign_pointer() */
> > > > 994 if (old_rp) {
> > >
> > >
> > > Does the following clear this up?
> >
> > Sorry, I'm still trying to reproduce this. It must be a rare bug,
> > because it only showed up in several of the tens of thousands of test
> > boots. I've done nearly 1000 boots trying to reproduce it, but have
> > not caught it yet. Let's run it for more time...
>
> I will push the fix up for 3.7, if something else is happening, we can
> debug when it comes up. ;-)

Good idea! Since it's a really hard-to-reproduce problem that only
shows up after thousands of boots, it's easier to push the obvious fix
first. If it turns out not to be really fixed, the bug will show up
again some day and be caught by the test system ;-)

Thanks,
Fengguang