Re: [PATCH] x86: memtest: fix compile warning

From: Yinghai Lu
Date: Thu Jun 11 2009 - 13:19:39 EST


On Thu, Jun 11, 2009 at 7:21 AM, Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote:
> On Thu, 11 Jun 2009, Andreas Herrmann wrote:
>
>> Commit c9690998ef48ffefeccb91c70a7739eebdea57f9
>> (x86: memtest: remove 64-bit division) introduced the following compile warning:
>>
>>  arch/x86/mm/memtest.c: In function 'memtest':
>>  arch/x86/mm/memtest.c:56: warning: comparison of distinct pointer types lacks a cast
>>  arch/x86/mm/memtest.c:58: warning: comparison of distinct pointer types lacks a cast
>>
>> Signed-off-by: Andreas Herrmann <andreas.herrmann3@xxxxxxx>
>> ---
>>  arch/x86/mm/memtest.c |    4 ++--
>>  1 files changed, 2 insertions(+), 2 deletions(-)
>>
>> Sorry.
>> Please apply.
>
> I applied it already, but zapped it right away, as it is bad style to
> do the type casting in the loops. The proper fix is below.
>
> But aside from that, this code is confusing.
>
>        start_phys_aligned = ALIGN(start_phys, incr);
>
> Why do we have to fiddle with the alignment? Are you really seeing e820
> entries which are not 8-byte aligned?
>
>        for (p = start; p < end; p++, start_phys_aligned += incr) {
>                if (*p == pattern)
>                        continue;
>                if (start_phys_aligned == last_bad + incr) {
>                        last_bad += incr;
>                        continue;
>                }
>                if (start_bad)
>                        reserve_bad_mem(pattern, start_bad, last_bad + incr);
>                start_bad = last_bad = start_phys_aligned;
>        }
>        if (start_bad)
>                reserve_bad_mem(pattern, start_bad, last_bad + incr);
>
> I really had to look more than once to understand what the heck
> start_phys_aligned and last_bad + incr are doing. Really
> non-intuitive.
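
For readers following the loop quoted above, here is a minimal userspace
sketch of the same coalescing logic. It is not the kernel code: scan(),
record_bad(), struct bad_range and MAX_RANGES are hypothetical names, and
record_bad() only stands in for reserve_bad_mem() by storing each range in
a small array. As in the quoted loop, a start_bad of 0 means "no bad range
open yet", so the sketch assumes the scanned region does not begin at
physical address 0.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_RANGES 16	/* hypothetical cap, for the sketch only */

struct bad_range {
	uint64_t start;	/* first bad byte */
	uint64_t end;	/* one past the last bad byte */
};

static struct bad_range ranges[MAX_RANGES];
static size_t nr_ranges;

/* Stand-in for reserve_bad_mem(): records [start, end) instead of reserving. */
static void record_bad(uint64_t start, uint64_t end)
{
	if (nr_ranges < MAX_RANGES) {
		ranges[nr_ranges].start = start;
		ranges[nr_ranges].end = end;
		nr_ranges++;
	}
}

/*
 * Walk n words starting at "physical" address phys and merge runs of
 * adjacent mismatching words into single ranges.
 */
static void scan(const uint64_t *buf, size_t n, uint64_t pattern, uint64_t phys)
{
	const uint64_t incr = sizeof(pattern);
	uint64_t start_bad = 0, last_bad = 0;
	size_t i;

	assert(buf != NULL);

	for (i = 0; i < n; i++, phys += incr) {
		if (buf[i] == pattern)
			continue;		/* word is good */
		if (phys == last_bad + incr) {
			last_bad += incr;	/* extends the open bad run */
			continue;
		}
		if (start_bad)			/* close the previous run */
			record_bad(start_bad, last_bad + incr);
		start_bad = last_bad = phys;	/* open a new run */
	}
	if (start_bad)				/* close the final run */
		record_bad(start_bad, last_bad + incr);
}
```

Each isolated bad run costs one record_bad() call, which is why flaky
memory with many short runs leads to many separate reservations.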
>
> But the reserve_bad_mem() semantics are even more scary:
>
> - if you hit flaky memory, which gives you bad and good results here
>  and there, you call reserve_bad_mem() an unbounded number of times,
>  which is likely to overflow the early reservation space and panic
>  the machine. You need to keep track of those events somehow (e.g. in
>  a bitmap) so you can detect such problems and mark the whole affected
>  region bad in one go.
If one pass finds a bad range, that range is reserved.
The second pass uses find_e820_area_size() to get a new range, so the
bad one will not be used again.
>
> - you call reserve_early() which calls __reserve_early(....,
>  overrun_ok = 0) so if you do the default multi pattern scan and each
>  run sees the same region of broken memory you will trigger the
>  "Overlapping early reservations" panic in __reserve_early() when you
>  reserve that region the second time. Why do you run the test twice
>  when the first one failed already ? Also there is no need to do the
>  wipeout run in that case, which will trigger it as well!

The current problem is that we could run out of slots in the res_reserve
array. The solution would be to make the res_reserve array dynamic:
when no free slot can be found, use find_e820_area() to get an array of
double the size, copy the old array into the new one, and then free the
old one.
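
A rough userspace sketch of that doubling scheme, only to illustrate the
idea. It assumes malloc()/free() in place of find_e820_area() and the
early reservation code, which the real change would have to use instead;
struct res, struct res_table and res_table_add() are hypothetical names,
and the initial size of 4 slots is arbitrary.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

struct res {
	uint64_t start;
	uint64_t end;
};

struct res_table {
	struct res *slots;
	size_t used;	/* slots in use */
	size_t cap;	/* slots allocated */
};

/*
 * Add [start, end) to the table. When no slot is free, allocate a table
 * of double the size, copy the old entries over, then free the old table.
 * Returns 0 on success, -1 if the bigger table cannot be allocated.
 */
static int res_table_add(struct res_table *t, uint64_t start, uint64_t end)
{
	assert(t != NULL);

	if (t->used == t->cap) {
		size_t new_cap = t->cap ? t->cap * 2 : 4;
		struct res *bigger = malloc(new_cap * sizeof(*bigger));

		if (!bigger)
			return -1;	/* old table is left untouched */
		if (t->used)
			memcpy(bigger, t->slots, t->used * sizeof(*bigger));
		free(t->slots);
		t->slots = bigger;
		t->cap = new_cap;
	}

	t->slots[t->used].start = start;
	t->slots[t->used].end = end;
	t->used++;
	return 0;
}
```

The old table stays valid until the copy is complete, so a failed
allocation leaves the existing reservations intact.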

YH