Re: [PATCH v4] string.h: Add str_has_prefix() helper function

From: Namhyung Kim
Date: Sat Dec 22 2018 - 22:06:03 EST


On Sat, Dec 22, 2018 at 12:19:11PM -0500, Steven Rostedt wrote:
> On Sun, 23 Dec 2018 01:46:05 +0900
> Namhyung Kim <namhyung@xxxxxxxxxx> wrote:
>
>
> > > What I meant is the case where a string is allocated at the end of a
> > > page and the next page is marked as not present. A read into that page
> > > will cause a page fault. Since memcmp() does not stop at the '\0', it
> > > can read into that not-present memory and trigger a fault. That read
> > > won't be in the exception table, so it will then BUG.
> >
> > Why doesn't it stop at the '\0' if one has it and the other doesn't?
> > It's not because it's '\0', it's because they are different. The '\0'
> > should be in the previous page (otherwise it's already a BUG), so it
> > should be detected and stopped before going to the next page IMHO.
> >
>
> Because memcmp() isn't required to test byte by byte. In fact, most
> implementations don't, which is why memcmp() is faster than strncmp().
>
> It can be checking in 8-byte chunks or more (although perhaps not
> likely). Perhaps there's an arch instruction that lets you compare 32
> bytes at a time. If the size passed to memcmp() is 32 or more, the
> implementation is allowed to read 32 bytes at a time from both src and
> dst. If there was a '\0' followed by not-present memory, you will
> still get that fault.

I thought such an implementation would check the alignment and not cross
a page boundary in a single read. But that is the implementation's choice,
and I found that glibc's default implementation, when given a misaligned
pointer, also reads the next chunk and uses shifts to form an aligned
chunk. So for safety it'd be better to use strcmp().
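
To make the concern concrete, here is a minimal userspace sketch, not the
kernel code itself. The names prefix_len() and demo_page_edge() are
hypothetical and only for illustration. It assumes a POSIX system with
mmap()/mprotect(), and it assumes the helper is built on strncmp(), which
does not compare anything after a '\0'. A short string is placed at the
very end of a readable page, directly in front of an inaccessible page,
and then compared against a longer prefix:

```c
#define _DEFAULT_SOURCE		/* for MAP_ANONYMOUS under strict -std= modes */
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical stand-in for the helper under discussion: returns
 * strlen(prefix) if str starts with prefix, 0 otherwise.
 */
static size_t prefix_len(const char *str, const char *prefix)
{
	size_t len = strlen(prefix);

	/* strncmp() stops at the first '\0' in either string. */
	return strncmp(str, prefix, len) == 0 ? len : 0;
}

/*
 * Put "ab\0" in the last three bytes of a readable page whose successor
 * is PROT_NONE, then test it against a much longer prefix.
 * Returns -1 if the mapping could not be set up, otherwise the result
 * of prefix_len().
 */
static long demo_page_edge(void)
{
	long page = sysconf(_SC_PAGESIZE);
	char *map, *s;
	long ret;

	if (page < 3)
		return -1;

	map = mmap(NULL, 2 * (size_t)page, PROT_READ | PROT_WRITE,
		   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (map == MAP_FAILED)
		return -1;

	/* Any access to the second page now faults. */
	if (mprotect(map + page, (size_t)page, PROT_NONE) != 0) {
		munmap(map, 2 * (size_t)page);
		return -1;
	}

	s = map + page - 3;
	memcpy(s, "ab", 3);	/* 'a', 'b', '\0' end exactly at the page edge */

	ret = (long)prefix_len(s, "abcdefghijklmnopqrstuvwxyz0123456789");

	munmap(map, 2 * (size_t)page);
	return ret;
}
```

Replacing strncmp() with memcmp(str, prefix, len) in prefix_len() would
ask for 36 bytes starting at s, of which only 3 are mapped. Whether that
actually faults depends on how the particular memcmp() reads its input,
which is exactly why it cannot be relied upon here.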

Thanks for your time and the explanation,
Namhyung