Re: BUG: BISECTED: in squashfs_xz_uncompress() (Was: RCU stalls in squashfs_readahead())

From: Paul E. McKenney
Date: Mon Nov 21 2022 - 21:07:45 EST


On Mon, Nov 21, 2022 at 05:04:36AM +0100, Mirsad Goran Todorovac wrote:
> On 20. 11. 2022. 20:21, Paul E. McKenney wrote:
> > > And what about Mr. Robert Elliott's observation about calling cond_resched()?
> > >
> > > > How big can these readahead sizes be? Should one of the loops include
> > > > cond_resched() calls?
> > >
> > > That is IMHO better than allowing 21000 millisecond stalls on a core (or more of them).
> > >
> > > I don't think it is correct to stay in kernel mode for more than a timer unit
> > > without yielding the CPU. It creates stalls in multimedia and audio (chirps like on scratched
> > > CD-ROMs). This is especially noticeable with a KASAN build.
> > >
> > > Since Firefox and most snaps are using squashfs as compressed ROFS, Firefox appears
> > > to perform worse than Chrome since snaps were introduced.
> > >
> > > IMHO, if we want something like realtime and multimedia processing (which is the specific
> > > area of my research), it seems that anything trying to hold processor for 21000 ms (21 secs)
> > > is either buggy or deliberately malicious. 20 ms is quite enough work for a thread
> > > in one allotted timeslot.
> > >
> > > I do not agree with Mr. Lougher's observation that I am thrashing my laptop. I think that
> > > a system has to endure stress and torture testing. I was raised on Digital MicroVAX systems
> > > on Ultrix which compiled lab at a time in memory that would today sound funny. :)
> >
> > I personally think that it would be great if you were to work to decrease
> > the Linux kernel's latency. Doing so would not be fixing a regression,
> > but I personally would welcome it. Others might have different opinions,
> > but please do CC me on any resulting patches.
> >
> > And I will see your MicroVAX and raise you a videogame written on a
> > PDP-12 whose fastest instruction executed in 1.6 microseconds (-not-
> > nanoseconds!). ;-)
>
> I'm afraid that I would lose in Far Cry miserably if my cores
> decided to all lock up for 21 secs. :-(

Agreed, 21 seconds is an improvement over the earlier 60 seconds, but
still a very long time. Me, I come from DYNIX/ptx, where the equivalent
to the RCU CPU stall warning was 1.5 seconds. On the other hand, it
is also the case that DYNIX/ptx had nowhere near the variety of drivers
and subsystems, nor did it scale anywhere near as far as Linux does today.

But you only need one CPU to lock up for 21 seconds to get an RCU CPU
stall warning, not all of them. ;-)

> > You can see a couple of people playing the game on a PDP-12 in
> > a computer museum: https://www.rcsri.org/collection/pdp-12/
> >
> > > Besides, this is the very idea behind the MG-LRU algorithm commit, to test eviction of
> > > memory pages in the system with heavy load and low on memory.
> > >
> > > I will probably test your commits, but now I have to do my own evening ritual, unwinding,
> > > and knowledge and memory consolidation (called "sleep").
> >
> > And yes, sleep is often one of the best debugging tools available.
> >
> > > I appreciate your many commits on kernel.org and I hope I do not sound like
> > > I am thinking you are a village idiot :(
> > >
> > > I am trying to adhere to the Code of Conduct with mutual respect and politeness.
> >
> > Skepticism is not necessarily a bad thing, especially given that I
> > am not immune from errors and confusion. Me, I just thought you were
> > forcefully reporting the regression, so I forcefully pointed you at the
> > fix for that regression.
> >
> > Again, I have absolutely no objection to your improving the kernel's
> > response time.
>
> This is at present just wishful thinking, as I lack your 30 years of
> experience with the kernel and RCU update system. I am only beginning to realise
> why it is more efficient than the traditional locking, and IMHO it should
> avoid locking up cores instead of increasing the number of complaints.

Just to set the record straight, RCU does not normally lock up any of
the cores. Instead, RCU detects that cores have been locked up.

Give or take the occasional bug in RCU, of course!

> But even if the Linux kernel source is magically "memory mapped" into my
> mind, I still do not see how it could be done. My Linux kernel learning curve
> has not yet climbed that high, but I have no doubts that it is designed by
> Intelligent Designers who are very witty people, and not village idiots ;-)

There is the school of thought that claims that the Linux kernel is
driven by evolutionary forces rather than intelligent design. And as
we all know, evolutionary forces are driven by random changes, which
absolutely anyone could make.

One approach is to start with a less aggressive reduction of the RCU CPU
stall timeout, say from 21 seconds to 15 seconds instead of all the
way down to 20 milliseconds. This could allow you to ease into the
latency-reduction work.

Alternatively, consider that response time is a property of the
entire system plus the environment that it runs in. So I suspect that
the Android folks are accompanying that 20-millisecond timeout with
some restrictions on what the on-phone workloads are permitted to do.
Maybe ask the Android guys what those restrictions are and loosen them
slightly, again allowing you to ease into the latency-reduction work.

> > > I know that the Linux kernel is about 30 million lines by now, and according to the
> > > security experts we should expect 30,000 bugs in such a solid piece of written code
> > > (one per thousand lines). Only Mr. Thorsten mentioned 950 unresolved in the "open" list.
> >
> > At least 30,000 bugs, of which we know of maybe 950. ;-)
>
> So I see no point in banning the kernel from screaming to logs that it had
> core stalls that needed a physical NMI to recover from, or they would potentially
> last much longer.

Sometimes an NMI does get the CPUs back on track. Sometimes the RCU CPU
stall warning is a symptom of the CPU having gotten too old and failing.
Most often, though, it is a sign of some sort of lockup, a too-long
RCU read-side critical section, or as Robert Elliott noted, the lack of
a cond_resched().

But please keep in mind that cond_resched() helps only in kernels built
with CONFIG_PREEMPTION=n.
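
For illustration only, here is a rough userspace sketch of the kind of
loop Robert was asking about: a long-running loop that offers to give up
the CPU once per block of work. This is not the squashfs code. The names
process_block(), process_all(), cond_resched_stub() and yield_count are
hypothetical, and the stub uses sched_yield() purely so that the example
builds outside the kernel. In the kernel, the call at that point in the
loop would be cond_resched().

```c
#include <sched.h>
#include <stddef.h>

/* Counts how often the loop offered to yield (for illustration only). */
static int yield_count;

/* Hypothetical userspace stand-in for the kernel's cond_resched(). */
static void cond_resched_stub(void)
{
	yield_count++;
	sched_yield();
}

/* Hypothetical per-block work, standing in for handling one block. */
static unsigned long process_block(const unsigned char *buf, size_t len)
{
	unsigned long sum = 0;

	for (size_t i = 0; i < len; i++)
		sum += buf[i];
	return sum;
}

/* Walk a buffer in fixed-size blocks, offering to yield after each one. */
static unsigned long process_all(const unsigned char *buf, size_t len,
				 size_t block)
{
	unsigned long total = 0;

	if (block == 0)
		return 0;	/* avoid a loop that never advances */

	for (size_t off = 0; off < len; off += block) {
		size_t n = (len - off < block) ? len - off : block;

		total += process_block(buf + off, n);
		cond_resched_stub();	/* one yield point per block */
	}
	return total;
}
```

The point of the pattern is that the time between yield points is
bounded by the cost of one block rather than by the size of the whole
request.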

> > > Knowing all of this is difficult, but I still believe in open source and open systems
> > > interconnected.
> >
> > If it was easy, where would be the challenge?
>
> AFAIK, the point I was taught in life was obedience, not overcoming challenges.

Perhaps early in life I was ordered to overcome challenges? If so, then
my overcoming them would be a matter of obedience. ;-)

> > > Of course, I always remember a proverb "Who hath despised the day of the small beginnings?"
> > >
> > > Hope this helps. My $0.02.
> >
> > I think we are good. ;-)
>
> Yes, you guys do an amazing job of keeping 30 million lines of code organised
> and making some sense. I will cut the smalltalk as I know you are a busy man.
> If I make progress and actually produce any patches fixing these lockups and
> stalls, I will be sure to include you in CC: as you requested.

Looking forward to seeing what you come up with!

Thanx, Paul