Re: Glibc recvmsg from kernel netlink socket hangs forever

From: Guenter Roeck
Date: Fri Mar 11 2016 - 18:50:47 EST


On Fri, Mar 11, 2016 at 11:33:17AM -0800, Jun Wang wrote:
> > On 09/25/2015 08:55 AM, Herbert Xu wrote:
> >> On Thu, Sep 24, 2015 at 10:34:10PM -0700, Guenter Roeck wrote:
> >>>
> >>> Any idea what may be needed for 4.1 ?
> >>> I am currently trying https://patchwork.ozlabs.org/patch/473041/,
> >>
> >> This patch should not make any difference on 4.1 and later because
> >> 4.1 is where I rewrote rhashtable resizing and it should work (or
> >> if it is broken then the latest kernel should be broken too).
> >>
> > Yes, applying (only) the above patch to 4.1 didn't help.
> >
> >>> but I have no idea if that will help with the problem we are seeing there.
> >>
> >> Having looked at your message again I don't think the issue I
> >> alluded to is relevant since the symptom there ought to be a
> >> straight kernel lock-up as opposed to just a user-space one because
> >> you will end up with the kernel sending a message to itself.
> >>
> >> And the fact that 4.2 works is more indicative as the bug is
> >> present in both 4.1 and 4.2.
> >>
> >> I'll try to reproduce this in 4.1 as time permits but no promises.
> >>
> >
> > I applied your patches (and a few additional netlink changes from 4.2)
> > to our 4.1 branch. I'll let you know if it makes a difference for us.
> >
> > Thanks,
> > Guenter
>
> Guenter,
>
> Which additional netlink changes from 4.2 did you apply? We still see
> the problem with your test program on 4.1.12, which has the
> following two patches mentioned by Herbert Xu on this thread.
>
Jun,

Sorry, I don't recall, and I no longer have access to the kernel since
I now work for a different company. I do recall that we had to apply
additional patches later, but I don't remember details.

Guenter