Re: [4.10, panic, regression] iscsi: null pointer deref at iscsi_tcp_segment_done+0x20d/0x2e0

From: Dave Chinner
Date: Thu Dec 22 2016 - 22:52:31 EST


On Fri, Dec 23, 2016 at 09:33:36AM +1100, Dave Chinner wrote:
> On Fri, Dec 23, 2016 at 09:15:00AM +1100, Dave Chinner wrote:
> > On Thu, Dec 22, 2016 at 01:10:19PM -0800, Linus Torvalds wrote:
> > > Ok, so the numa issue was a red herring. With that fixed:
> > >
> > > On Thu, Dec 22, 2016 at 1:06 PM, Dave Chinner <david@xxxxxxxxxxxxx> wrote:
> > > >
> > > > Better, but still bad. average files/s is not up to 200k files/s,
> > > > so still a good 10-15% off where it should be. xfs_repair is back
> > > > down to 10-15% off where it should be, too. bulkstat still fires off
> > > > a bad page reference count warning, iscsi still panics immediately.
> > >
> > > Do you have CONFIG_BLK_WBT enabled, perhaps?
> >
> > Ok, yes, that's enabled. Let me go turn it off and see what happens.
>
> Numbers are still all over the place.
>
> FSUse%        Count         Size    Files/sec     App Overhead
> .....
>      0     28800000            0     228175.5         21928812
>      0     30400000            0     167880.5         39628229
>      0     32000000            0     124289.5         41420925
>      0     33600000            0     150577.9         35382318
>      0     35200000            0     216535.4         16072628
>      0     36800000            0     233414.4         11846654
>      0     38400000            0     213812.0         13356633
>      0     40000000            0     175905.7         53012015
>      0     41600000            0     157028.7         34700794
>      0     43200000            0     138829.1         50282461
>
> And the average is now back down to 185k files/s. repair runtime is
> unchanged and still 10-15% off...
>
> I've got to run away for a few hours right now, but I'll retest the
> 4.9 + xfs for-next branch when I get back to see if the problem is
> my current config or whether there really is a perf problem lurking
> somewhere....

Well, I'm not sure now. Taking that config back to 4.9 gave
results of 210k files/s. A bit faster, but still not the 230k
files/s I'm expecting. So I'm missing something in the .config
at this point, though I'm not sure what.

FWIW, updating from 4.9 to the 4.10 tree, this happened:

$ make oldconfig; make -j 32
scripts/kconfig/conf --oldconfig Kconfig
*
* Restart config...
*
*
* General setup
*

Yup, it definitely did something unexpected. And almost silently, I
might add - I didn't notice this the first time around, and wouldn't
have noticed it this time if I wasn't looking for something strange
to happen.

As it is, I still haven't found what magic config option is
taking away that 10-15% of performance. There's no unexpected debug
options set, and it's nothing obvious in the fs/block layer config.
I'll keep looking for the moment...
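
One way to narrow down which option changed is to compare the two
.config files symbol by symbol rather than eyeballing them. What
follows is only a minimal sketch, assuming the 4.9 and 4.10 configs
have been saved as two separate files; the helper names parse_config
and diff_configs are made up for illustration, and the kernel tree's
own scripts/diffconfig does a similar job.

```python
import re
import sys

# "CONFIG_FOO=value" lines, and the "# CONFIG_FOO is not set" form that
# kconfig writes for disabled symbols.
_SET = re.compile(r"^(CONFIG_\w+)=(.*)$")
_UNSET = re.compile(r"^# (CONFIG_\w+) is not set$")


def parse_config(text):
    """Return {symbol: value} for the contents of a .config file.

    Symbols recorded as "is not set" are stored with the value "n".
    """
    opts = {}
    for line in text.splitlines():
        line = line.strip()
        m = _SET.match(line)
        if m:
            opts[m.group(1)] = m.group(2)
            continue
        m = _UNSET.match(line)
        if m:
            opts[m.group(1)] = "n"
    return opts


def diff_configs(old_text, new_text):
    """Return a sorted list of (symbol, old_value, new_value) differences.

    A value of None means the symbol does not appear in that file at all.
    """
    old, new = parse_config(old_text), parse_config(new_text)
    changes = []
    for name in sorted(old.keys() | new.keys()):
        before, after = old.get(name), new.get(name)
        if before != after:
            changes.append((name, before, after))
    return changes


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage (file names are hypothetical): script.py config-4.9 config-4.10
    with open(sys.argv[1]) as f_old, open(sys.argv[2]) as f_new:
        for name, before, after in diff_configs(f_old.read(), f_new.read()):
            print(f"{name}: {before} -> {after}")
```

Symbols that only exist in one of the two files show up with None on
the other side, which is where an option added or restarted by
oldconfig would be expected to appear.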

Cheers,

Dave.


--
Dave Chinner
david@xxxxxxxxxxxxx