Re: HP ProLiant DL360p Gen8 hangs with Linux 4.13+.

From: Laurence Oberman
Date: Mon Jan 15 2018 - 08:02:30 EST


On Mon, 2018-01-15 at 20:17 +0800, Ming Lei wrote:
> On Sun, Jan 14, 2018 at 06:40:40PM -0500, Laurence Oberman wrote:
> > On Thu, 2018-01-04 at 14:32 -0800, Vinson Lee wrote:
> > > Hi.
> > >
> > > HP ProLiant DL360p Gen8 with Smart Array P420i boots to the login
> > > prompt and hangs with Linux 4.13 or later. I cannot log in on
> > > console
> > > or SSH into the machine. Linux 4.12 and older boot fine.
> > >
> > >
> >
> > ...
> >
> > ...
> >
> > This issue bit me for two straight days.
> > I was testing Mike Snitzer's combined tree, and this commit had
> > crept into the latest combined tree.
> >
> > commit 84676c1f21e8ff54befe985f4f14dc1edc10046b
> > Author: Christoph Hellwig <hch@xxxxxx>
> > Date:   Fri Jan 12 10:53:05 2018 +0800
> >
> >     genirq/affinity: assign vectors to all possible CPUs
> >
> >     Currently we assign managed interrupt vectors to all present
> >     CPUs.  This works fine for systems where we only online/offline
> >     CPUs.  But in case of systems that support physical CPU hotplug
> >     (or the virtualized version of it) this means the additional
> >     CPUs covered for in the ACPI tables or on the command line are
> >     not catered for.  To fix this we'd either need to introduce new
> >     hotplug CPU states just for this case, or we can start assigning
> >     vectors to possible but not present CPUs.
> >
> >     Reported-by: Christian Borntraeger <borntraeger@xxxxxxxxxx>
> >     Tested-by: Christian Borntraeger <borntraeger@xxxxxxxxxx>
> >     Tested-by: Stefan Haberland <sth@xxxxxxxxxxxxxxxxxx>
> >     Fixes: 4b855ad37194 ("blk-mq: Create hctx for each present CPU")
> >     Cc: linux-kernel@xxxxxxxxxxxxxxx
> >     Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> >     Signed-off-by: Christoph Hellwig <hch@xxxxxx>
> >     Signed-off-by: Jens Axboe <axboe@xxxxxxxxx>
> >
> > The reason I never suspected this commit as the cause of the latest
> > hang is that I have used Linus' tree all the way to 4.15-rc7 with no
> > issues.
> >
> > Vinson's report against 4.13 and later did not make sense to me
> > because I had not seen the hang until this weekend.
> >
> > I checked, and the commit is in Linus' tree, but it is not an issue
> > in the generic 4.15-rc7 for me.
>
> Hi Laurence,
>
> Wrt. your issue, I have investigated a bit and found that it happens
> because one irq vector may be assigned only offline CPUs, so it may
> not be the same issue as Vinson's.
>
> The following patch should address your issue; I may prepare a formal
> version if no one objects to this approach.
>
> Thomas, Christoph, could you take a look at this patch?
>
> ---
>  kernel/irq/affinity.c | 69 +++++++++++++++++++++++++++++++++++----------------
>  1 file changed, 47 insertions(+), 22 deletions(-)
>
> diff --git a/kernel/irq/affinity.c b/kernel/irq/affinity.c
> index a37a3b4b6342..dfc1f6a9c488 100644
> --- a/kernel/irq/affinity.c
> +++ b/kernel/irq/affinity.c
> @@ -94,6 +94,39 @@ static int get_nodes_in_cpumask(cpumask_var_t *node_to_possible_cpumask,
>  	return nodes;
>  }
> 
> +/*
> + * Spread the affinity of @nmsk into at most @max_irqmsks irq vectors,
> + * and store the result in @irqmsk.
> + */
> +static int irq_vecs_spread_affinity(struct cpumask *irqmsk,
> +				    int max_irqmsks,
> +				    struct cpumask *nmsk,
> +				    int max_ncpus)
> +{
> +	int v, ncpus;
> +	int vecs_to_assign, extra_vecs;
> +
> +	/* Calculate the number of cpus per vector */
> +	ncpus = cpumask_weight(nmsk);
> +	vecs_to_assign = min(max_ncpus, ncpus);
> +
> +	/* Account for rounding errors */
> +	extra_vecs = ncpus - vecs_to_assign * (ncpus / vecs_to_assign);
> +
> +	for (v = 0; v < min(max_irqmsks, vecs_to_assign); v++) {
> +		int cpus_per_vec = ncpus / vecs_to_assign;
> +
> +		/* Account for extra vectors to compensate rounding errors */
> +		if (extra_vecs) {
> +			cpus_per_vec++;
> +			--extra_vecs;
> +		}
> +		irq_spread_init_one(irqmsk + v, nmsk, cpus_per_vec);
> +	}
> +
> +	return v;
> +}
> +
>  /**
>   * irq_create_affinity_masks - Create affinity masks for multiqueue spreading
>   * @nvecs:	The total number of vectors
> @@ -104,7 +137,7 @@ static int get_nodes_in_cpumask(cpumask_var_t *node_to_possible_cpumask,
>  struct cpumask *
>  irq_create_affinity_masks(int nvecs, const struct irq_affinity *affd)
>  {
> -	int n, nodes, cpus_per_vec, extra_vecs, curvec;
> +	int n, nodes, curvec;
>  	int affv = nvecs - affd->pre_vectors - affd->post_vectors;
>  	int last_affv = affv + affd->pre_vectors;
>  	nodemask_t nodemsk = NODE_MASK_NONE;
> @@ -154,33 +187,25 @@ irq_create_affinity_masks(int nvecs, const struct irq_affinity *affd)
>  	}
> 
>  	for_each_node_mask(n, nodemsk) {
> -		int ncpus, v, vecs_to_assign, vecs_per_node;
> +		int vecs_per_node;
> 
>  		/* Spread the vectors per node */
>  		vecs_per_node = (affv - (curvec - affd->pre_vectors)) / nodes;
> 
> -		/* Get the cpus on this node which are in the mask */
> -		cpumask_and(nmsk, cpu_possible_mask, node_to_possible_cpumask[n]);
> 
> -		/* Calculate the number of cpus per vector */
> -		ncpus = cpumask_weight(nmsk);
> -		vecs_to_assign = min(vecs_per_node, ncpus);
> -
> -		/* Account for rounding errors */
> -		extra_vecs = ncpus - vecs_to_assign * (ncpus / vecs_to_assign);
> -
> -		for (v = 0; curvec < last_affv && v < vecs_to_assign;
> -		     curvec++, v++) {
> -			cpus_per_vec = ncpus / vecs_to_assign;
> -
> -			/* Account for extra vectors to compensate rounding errors */
> -			if (extra_vecs) {
> -				cpus_per_vec++;
> -				--extra_vecs;
> -			}
> -			irq_spread_init_one(masks + curvec, nmsk, cpus_per_vec);
> -		}
> +		/* spread non-online possible cpus */
> +		cpumask_andnot(nmsk, node_to_possible_cpumask[n], cpu_online_mask);
> +		irq_vecs_spread_affinity(&masks[curvec], last_affv - curvec,
> +					 nmsk, vecs_per_node);
> 
> +		/*
> +		 * spread online possible cpus to make sure each vector
> +		 * can get one online cpu to handle
> +		 */
> +		cpumask_and(nmsk, node_to_possible_cpumask[n], cpu_online_mask);
> +		curvec += irq_vecs_spread_affinity(&masks[curvec],
> +						   last_affv - curvec,
> +						   nmsk, vecs_per_node);
>  		if (curvec >= last_affv)
>  			break;
>  		--nodes;
> --
> 2.9.5
>
>

Hello Ming

I will test the patch. I did not spend a lot of time checking whether
this weekend's stalls were an exact match to Vinson's; I just knew that
pulling that patch out resolved it.
Perhaps this explains why I was not seeing this on generic 4.15-rc7.
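
As a sanity check on the per-vector rounding in
irq_vecs_spread_affinity(), I put the same arithmetic into a small
standalone userspace program. This is only my own sketch, not the
kernel code: the spread() function, its names, and the printf are mine,
the cpumask handling is left out, and only the extra_vecs compensation
is mimicked.

/*
 * Userspace sketch of the cpus-per-vector rounding used in the patch
 * above.  Masks are faked with plain counts; only the extra_vecs
 * compensation mirrors the kernel logic.
 */
#include <stdio.h>

/* Distribute ncpus across at most max_vecs vectors, biggest groups first. */
static int spread(int ncpus, int max_vecs)
{
	int v, vecs_to_assign, extra_vecs;

	if (!ncpus)
		return 0;

	vecs_to_assign = max_vecs < ncpus ? max_vecs : ncpus;
	/* CPUs left over after an even ncpus / vecs_to_assign split */
	extra_vecs = ncpus - vecs_to_assign * (ncpus / vecs_to_assign);

	for (v = 0; v < vecs_to_assign; v++) {
		int cpus_per_vec = ncpus / vecs_to_assign;

		/* Hand one extra CPU to the first extra_vecs vectors */
		if (extra_vecs) {
			cpus_per_vec++;
			--extra_vecs;
		}
		printf("vector %d gets %d cpus\n", v, cpus_per_vec);
	}
	return v;
}

int main(void)
{
	/* e.g. 10 possible CPUs on a node and 4 vectors -> 3, 3, 2, 2 */
	spread(10, 4);
	return 0;
}

With 10 CPUs on a node and 4 vectors this prints 3, 3, 2, 2, which is
the distribution I would expect the patch to hand out per node.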

Thanks
Laurence