Re: Re: [linus:master] [sched/eevdf] 2227a957e1: BUG:kernel_NULL_pointer_dereference,address

From: Chen Yu
Date: Mon Feb 19 2024 - 07:36:01 EST


On 2024-01-31 at 20:28:19 +0800, Abel Wu wrote:
> On 1/31/24 8:10 PM, Tiwei Bie Wrote:
> > On 1/30/24 6:13 PM, Abel Wu wrote:
> > > On 1/30/24 3:24 PM, kernel test robot Wrote:
> > > >
> > > > [  512.079810][ T8305] BUG: kernel NULL pointer dereference, address: 0000002c
> > > > [  512.080897][ T8305] #PF: supervisor read access in kernel mode
> > > > [  512.081636][ T8305] #PF: error_code(0x0000) - not-present page
> > > > [  512.082337][ T8305] *pde = 00000000
> > > > [  512.082829][ T8305] Oops: 0000 [#1] PREEMPT SMP
> > > > [  512.083407][ T8305] CPU: 1 PID: 8305 Comm: watchdog Tainted: G        W        N 6.7.0-rc1-00006-g2227a957e1d5 #1 819e6d1a8b887f5f97adb4aed77d98b15504c836
> > > > [  512.084986][ T8305] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
> > > > [ 512.086203][ T8305] EIP: set_next_entity (fair.c:?)
> > >
> > > There was actually a NULL-test in pick_eevdf() before this commit,
> > > but I removed it by intent as I found it impossible to be NULL after
> > > examining 'all' the cases.
> > >
> > > Also cc Tiwei who once proposed to add this check back.
> > > https://lore.kernel.org/all/20231208112100.18141-1-tiwei.btw@xxxxxxxxxxxx/
> >
> > Thanks for cc'ing me. That's the case I worried about and why I thought
> > it might be worthwhile to add the sanity check back. I just sent out a
> > new version of the above patch with updated commit log and error message.
>
> I assuming the real problem is why it *can* be NULL at first place.
> IMHO the NULL check with a fallback selection doesn't solve this, but
> it indeed avoids kernel panic which is absolutely important.

I tried to reproduce this issue locally but with no luck. It might
be related to lkp's environment so I'm thinking of adding the following
change into lkp's test repo to help narrow down: when the pick_eevdf() finds
an NULL candidate, print the whole rb-tree, including each entity's vruntime,
min_vruntime, deadline, etc. So we can figure out why we failed to find a proper
entity, thoughts?

Example output:

[ 325.425017] cfs_rq avg_vruntime:-798130176 avg_load:8192 min_vruntime:29296269335
[ 325.425247] current se 0xff11000105858080 is on_rq, deadline:29299081075 min_vruntime:29296537185 vruntime:29297356572, load:1024
[ 325.425514] Traverse rb-tree from left to right
[ 325.425635] se 0xff11000101a28080 deadline:29293835648 min_vruntime:29290835648 vruntime:29290835648 eligible
[ 325.425876] se 0xff11000113113000 deadline:29298566205 min_vruntime:29290835648 vruntime:29297726206 non-eligible
[ 325.426121] se 0xff110001036cb000 deadline:29298613981 min_vruntime:29290835648 vruntime:29297094477 non-eligible
[ 325.426358] se 0xff11000118148080 deadline:29298616409 min_vruntime:29297255015 vruntime:29297255015 non-eligible
[ 325.426597] se 0xff11000111523000 deadline:29299066492 min_vruntime:29290835648 vruntime:29296949653 non-eligible
[ 325.426836] se 0xff11000106173000 deadline:29299152426 min_vruntime:29296537185 vruntime:29296537185 non-eligible
[ 325.427085] se 0xff11000105f03000 deadline:29299269335 min_vruntime:29296269335 vruntime:29296269335 eligible
[ 325.427322] se 0xff110001043c0080 deadline:29299707737 min_vruntime:29296707737 vruntime:29296707737 non-eligible
[ 325.427558] Traverse rb-tree from topdown
[ 325.427648] left se 0xff110001036cb000 deadline:29298613981 min_vruntime:29290835648 vruntime:29297094477 left half eligible
[ 325.427917] left se 0xff11000113113000 deadline:29298566205 min_vruntime:29290835648 vruntime:29297726206 left half eligible
[ 325.428171] left se 0xff11000101a28080 deadline:29293835648 min_vruntime:29290835648 vruntime:29290835648 left half eligible
[ 325.428440] middle se 0xff11000101a28080 deadline:29293835648 min_vruntime:29290835648 vruntime:29290835648 eligible
[ 325.428679] Found best:0xff11000101a28080


diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 533547e3c90a..ddacb3dc91b9 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -856,6 +856,69 @@ struct sched_entity *__pick_first_entity(struct cfs_rq *cfs_rq)
return __node_2_se(left);
}

+static void debug_print_eevdf(struct cfs_rq *cfs_rq)
+{
+ struct sched_entity *se, *curr = cfs_rq->curr;
+ struct sched_entity *best = NULL;
+ struct rb_node *node = cfs_rq->tasks_timeline.rb_root.rb_node;
+ struct rb_node *n;
+
+ printk("cfs_rq avg_vruntime:%lld avg_load:%lld min_vruntime:%lld\n",
+ cfs_rq->avg_vruntime, cfs_rq->avg_load, cfs_rq->min_vruntime);
+
+ if (curr && curr->on_rq)
+ printk("current on_rq se 0x%lx, deadline:%lld min_vruntime:%lld vruntime:%lld, load:%ld\n",
+ (unsigned long)curr, curr->deadline, curr->min_vruntime,
+ curr->vruntime, scale_load_down(curr->load.weight));
+
+ printk("Traverse rb-tree from left to right\n");
+ for (n = rb_first(&cfs_rq->tasks_timeline.rb_root); n != NULL; n = rb_next(n)) {
+ bool eli;
+
+ se = __node_2_se(n);
+ eli = entity_eligible(cfs_rq, se);
+ printk(" se 0x%lx deadline:%lld min_vruntime:%lld vruntime:%lld %s\n",
+ (unsigned long)se, se->deadline, se->min_vruntime, se->vruntime,
+ eli ? "eligible" : "non-eligible");
+ }
+ printk("Traverse rb-tree from topdown\n");
+ while (node) {
+ struct rb_node *left = node->rb_left;
+ struct sched_entity *se_left = __node_2_se(left);
+ bool eli = false;
+
+ if (left) {
+ eli = vruntime_eligible(cfs_rq,
+ __node_2_se(left)->min_vruntime);
+
+ printk(" left se 0x%lx deadline:%lld min_vruntime:%lld vruntime:%lld %s\n",
+ (unsigned long)se_left, se_left->deadline, se_left->min_vruntime,
+ se_left->vruntime,
+ eli ? "left half eligible" : "left half non-eligible");
+ }
+
+ if (left && eli) {
+ node = left;
+ continue;
+ }
+
+ se = __node_2_se(node);
+
+ eli = entity_eligible(cfs_rq, se);
+ printk(" middle se 0x%lx deadline:%lld min_vruntime:%lld vruntime:%lld %s\n",
+ (unsigned long)se, se->deadline, se->min_vruntime, se->vruntime,
+ eli ? "eligible" : "non-eligible");
+
+ if (eli) {
+ best = se;
+ break;
+ }
+
+ node = node->rb_right;
+ }
+ printk("Found best:0x%lx\n", (unsigned long)best);
+}
+
/*
* Earliest Eligible Virtual Deadline First
*
@@ -937,6 +1000,9 @@ static struct sched_entity *pick_eevdf(struct cfs_rq *cfs_rq)
if (!best || (curr && entity_before(curr, best)))
best = curr;

+ if (unlikely(!best))
+ debug_print_eevdf(cfs_rq);
+
return best;
}

--
2.34.1

thanks,
Chenyu