Re: cgroups-related hard lockup in 4.14?

From: Tejun Heo
Date: Wed Dec 20 2017 - 18:24:21 EST


On Thu, Dec 21, 2017 at 12:59:23AM +0200, Dan Aloni wrote:
> Hi,
>
> Using netconsole, I was able to capture a hard lockup that seems to be
> related to cgroups, on a Fedora kernel based on v4.14.4.
>
> By my analysis, of the 16 CPUs below, 14 are spinning on css_set_lock,
> one is inside css_task_iter_advance, and the last one is stuck trying
> to send an IPI, I guess because all other CPUs are spinning.
>
> To add some context, I have been experiencing deadlocks on various
> machines since 4.13, and this is the first time I was able to
> capture one. It takes a few days to reproduce while idling or doing
> random work, and I have not yet come up with precise steps that
> trigger it reliably.
>
> I can try out patches in order to get more info on this issue.

Can you please try the following patch?

https://marc.info/?l=linux-cgroups&m=151378281708793&q=raw

Thanks.

--
tejun