Re: [PATCH 1/2] ras: fix an off-by-one error in __find_elem()

From: Luck, Tony
Date: Wed Apr 17 2019 - 17:15:54 EST


On Tue, Apr 16, 2019 at 07:37:41PM -0700, Cong Wang wrote:
> On Tue, Apr 16, 2019 at 7:31 PM Cong Wang <xiyou.wangcong@xxxxxxxxx> wrote:
> > Yes it is, I have a stacktrace in production which clearly shows
> > del_elem.isra.1+0x34/0x40, unlike the one I triggered via fake
> > PFN's. I can show you if you want, it is on 4.14, so very unlikely
> > it is interesting to anyone here.
>
> Correct myself:
>
> The one triggered via fake PFN's also shows
> del_elem.constprop.1+0x39/0x40.
>
> Sorry for the mistake, I kinda read into another crash...

Ok. Now I follow what you are saying. I'll try to rephrase
here.

1) Initial setup is that we filled the array. Set the threshold
to a low value, and then repeatedly added a new pfn (with a
value higher than any in the array).

2) First time:
We find the array is full, so delete one item. Add the new pfn
to the end (in slot MAX_ELEM-1)

3) Next time. Array is full. Delete one item (assume not the one
for our target pfn). This moved our target to slot MAX_ELEM-2.

4) Next time. Array isn't full. We get a hit on our pfn. Bump
the count. Find that it exceeded the threshold, so ask to
offline it. We delete this from the array. It is a tail
deletion so we just decrement ca->n.

5) Final time. Array isn't full. When we look for our pfn in
the array it isn't in the array[0,ca->n) range. So we ought
to return ENOKEY. But we don't. When we fall out of the
search loop:

136 while (min < max) {
137 int tmp = (max + min) >> 1;
138
139 this_pfn = PFN(ca->array[tmp]);
140
141 if (this_pfn < pfn)
142 min = tmp + 1;
143 else if (this_pfn > pfn)
144 max = tmp;
145 else {
146 min = tmp;
147 break;
148 }
149 }

We exit because "min" is not less than "max". We assign
"*to" with the index of the available slot.

151 if (to)
152 *to = min;

This line is just wrong. "min" is still within the allocated
bounds of the array, but the value is "ca->n", i.e. the first
unused slot. The "garbage" value we pick up from here is what
was last assigned to the slot when it was valid. Because we have
been adding the same pfn over and over, the value we pick up is
the pfn that we "deleted" in step "4" above. But remember we
deleted it just by adjusting ca->n, not by overwriting this
entry.

154 this_pfn = PFN(ca->array[min]);


So now this test succeeds, when it should fail. We return
an index of "ca->n".
156 if (this_pfn == pfn)
157 return min;
158
159 return -ENOKEY;

Things go downhill fast from here. We update the count for
this entry. Find it is above the threshold, so offline the
page and try to delete.

226 static void del_elem(struct ce_array *ca, int idx)
227 {
228 /* Save us a function call when deleting the last element. */

idx is equal to ca->n. So this is "if (-1)" ... so we decide
we need to memmove some entries. The array indices "idx" and "idx+1"
are not out of bounds for the allocated size, but they aren't what we
want. The real problem is the count passed to memmove.

229 if (ca->n - (idx + 1))
230 memmove((void *)&ca->array[idx],
231 (void *)&ca->array[idx + 1],
232 (ca->n - (idx + 1)) * sizeof(u64));
233
234 ca->n--;
235 }


So I'm now liking this version of the patch by Cong:


diff --git a/drivers/ras/cec.c b/drivers/ras/cec.c
index 2d9ec378a8bc..b7cb0b9b1346 100644
--- a/drivers/ras/cec.c
+++ b/drivers/ras/cec.c
@@ -204,10 +204,12 @@ static int __find_elem(struct ce_array *ca, u64 pfn, unsigned int *to)
if (to)
*to = min;

- this_pfn = PFN(ca->array[min]);
+ if (min < ca->n) {
+ this_pfn = PFN(ca->array[min]);

- if (this_pfn == pfn)
- return min;
+ if (this_pfn == pfn)
+ return min;
+ }

return -ENOKEY;
}

Reviewed-by: Tony Luck <tony.luck@xxxxxxxxx>