Re: [PATCH] arm64: mm: disable PAN during caches_clean_inval_user_pou

From: Brandt, Oliver - Lenze
Date: Tue Jan 09 2024 - 03:41:58 EST


On Mon, 2024-01-08 at 17:58 +0000, Mark Rutland wrote:
> On Mon, Jan 08, 2024 at 04:37:37PM +0000, Brandt, Oliver - Lenze wrote:
> > > On Mon, Jan 08, 2024 at 01:00:39PM +0000, Brandt, Oliver - Lenze wrote:
> > > > Using the cacheflush() syscall from 32-bit user space fails when
> > > > ARM64_PAN is enabled. We'll get an endless loop:
> > > >
> > > > 1. executing "dc cvau, x2" results in raising an abort
> > > > 2. the abort handler does not fix the cause of the abort and
> > > > returns to step 1.
> > > >
> > > > Disabling PAN for the duration of the cache maintenance fixes this.
> > >
> > > Hmm... the ARM ARM says PSTATE.PAN is not supposed to affect DC CVAU.
> > >
> > > Looking at the latest ARM ARM (ARM DDI 0487J.a), R_PMTWB states:
> > >
> > > > The PSTATE.PAN bit has no effect on all of the following:
> > > >
> > > > o Instruction fetches.
> > > > o Data cache instructions, except DC ZVA.
> > > > o If FEAT_PAN2 is not implemented, then address translation instructions.
> > > > o If FEAT_PAN2 is implemented, then the address translation instructions
> > > > other than AT S1E1RP and AT S1E1WP.
> > >
> > > So IIUC, DC CVAU shouldn't be affected by PAN.
> >
> > Oops... Sorry, I didn't notice this.
>
> No worries; this is not at all obvious!
>
> > > This could be a CPU bug; which CPU are you seeing this with?
> >
> > I stumbled across this while using Intel's simulator "Simics" with a
> > model of the upcoming "Agilex5 socfpga". The "Agilex5" is a SoC
> > containing two Cortex-A76 and two Cortex-A55 cores.
>
> Ah, so it could be a bug in Simics, then.
>

I think so now, too. It's not the first bug we've found in Simics, but
it's the first one in the CPU models we use.

> > We are expecting the real silicon in a couple of weeks. It seems like a
> > good idea to check on silicon first. Sorry to bother you with this.
>
> Just to make sure, I ran a quick test on an AML-905D3-CC board (quad-core
> Cortex-A55), and AFAICT we're not taking unexpected faults. Log below,
> including the test case.
>
> If you do see problems on silicon, please let us know!
>

I will. Thanks a lot for spending your time on this!

> Mark.

Oliver

>
> ---->8----
> mark@flodeboller:~/test/aarch32-cacheflush$ sudo dmesg | grep -i access
> [ 0.010476] CPU features: detected: Privileged Access Never
> mark@flodeboller:~/test/aarch32-cacheflush$ cat test.c
> #include <stdio.h>
>
> void cacheflush(void *start, size_t size)
> {
>         printf("Attempting flush of [%p..%p]\n", start, start + size);
>         __builtin___clear_cache(start, start + size);
> }
>
> int main(int argc, char *argv[])
> {
>         static char buf[4096];
>
>         cacheflush(buf, sizeof(buf));
>
>         cacheflush(NULL, sizeof(buf));
>
>         return 0;
> }
> mark@flodeboller:~/test/aarch32-cacheflush$ arm-linux-gnueabihf-gcc test.c -o test -O3
> mark@flodeboller:~/test/aarch32-cacheflush$ file test
> test: ELF 32-bit LSB pie executable, ARM, EABI5 version 1 (SYSV), dynamically linked, interpreter /lib/ld-linux-armhf.so.3, BuildID[sha1]=a53713f6b623b9b7c29cee4dc615fb7d43a0dcb6, for GNU/Linux 3.2.0, not stripped
> mark@flodeboller:~/test/aarch32-cacheflush$ strace ./test
> execve("./test", ["./test"], 0xffffd7e09890 /* 25 vars */ <unfinished ...>
> [ Process PID=7682 runs in 32 bit mode. ]
> strace: WARNING: Proper structure decoding for this personality is not supported, please consider building strace with mpers support enabled.
> <... execve resumed>) = 0
> brk(NULL) = 0x222b000
> access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)
> openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = 3
> statx(3, "", AT_STATX_SYNC_AS_STAT|AT_NO_AUTOMOUNT|AT_EMPTY_PATH, STATX_BASIC_STATS, {stx_mask=STATX_BASIC_STATS|STATX_MNT_ID, stx_attributes=0, stx_mode=S_IFREG|0644, stx_size=31402, ...}) = 0
> mmap2(NULL, 31402, PROT_READ, MAP_PRIVATE, 3, 0) = 0xf7b0b000
> close(3) = 0
> openat(AT_FDCWD, "/lib/arm-linux-gnueabihf/libc.so.6", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = 3
> read(3, "\177ELF\1\1\1\3\0\0\0\0\0\0\0\0\3\0(\0\1\0\0\0i\344\1\0004\0\0\0"..., 512) = 512
> statx(3, "", AT_STATX_SYNC_AS_STAT|AT_NO_AUTOMOUNT|AT_EMPTY_PATH, STATX_BASIC_STATS, {stx_mask=STATX_BASIC_STATS|STATX_MNT_ID, stx_attributes=0, stx_mode=S_IFREG|0755, stx_size=1102644, ...}) = 0
> mmap2(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xf7b09000
> mmap2(NULL, 1139660, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0xf79f2000
> mmap2(0xf7afc000, 12288, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x109000) = 0xf7afc000
> mmap2(0xf7aff000, 37836, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0xf7aff000
> close(3) = 0
> set_tls(0xf7b09ce0) = 0
> set_tid_address(0xf7b09848) = 7682
> set_robust_list(0xf7b0984c, 12) = 0
> rseq(0xf7b09cc0, 0x20, 0, 0xe7f5def3) = 0
> mprotect(0xf7afc000, 8192, PROT_READ) = 0
> mprotect(0x572000, 4096, PROT_READ) = 0
> mprotect(0xf7b31000, 4096, PROT_READ) = 0
> ugetrlimit(RLIMIT_STACK, {rlim_cur=8192*1024, rlim_max=RLIM_INFINITY}) = 0
> munmap(0xf7b0b000, 31402) = 0
> statx(1, "", AT_STATX_SYNC_AS_STAT|AT_NO_AUTOMOUNT|AT_EMPTY_PATH, STATX_BASIC_STATS, {stx_mask=STATX_BASIC_STATS|STATX_MNT_ID, stx_attributes=0, stx_mode=S_IFCHR|0620, stx_size=0, ...}) = 0
> getrandom("\x79\xe4\xe7\x57", 4, GRND_NONBLOCK) = 4
> brk(NULL) = 0x222b000
> brk(0x224c000) = 0x224c000
> write(1, "Attempting flush of [0x573040..0"..., 41Attempting flush of [0x573040..0x574040]
> ) = 41
> cacheflush(0x573040, 0x574040, 0) = 0
> write(1, "Attempting flush of [(nil)..0x10"..., 36Attempting flush of [(nil)..0x1000]
> ) = 36
> cacheflush(0, 0x1000, 0) = -1 EFAULT (Bad address)
> exit_group(0) = ?
> +++ exited with 0 +++
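
For re-checking on silicon without strace, a variant of the test case above could call the cacheflush syscall directly, so the return value is visible to the program itself (__builtin___clear_cache() returns nothing). This is only a rough sketch, not something that was run in this thread: flush_range() and report_flush() are hypothetical helper names, and it assumes a 32-bit Arm build where the libc headers define __ARM_NR_cacheflush. On any other target it falls back to the builtin and cannot report errors.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/syscall.h>

/*
 * Hypothetical helper (not from the thread): flush [start, start + size).
 * Returns 0 on success, or -1 with errno set when the syscall path is used.
 */
int flush_range(void *start, size_t size)
{
        uintptr_t begin = (uintptr_t)start;
        uintptr_t end = begin + size;

#if defined(__arm__) && defined(__ARM_NR_cacheflush)
        /* 32-bit Arm Linux: call the cacheflush syscall; flags must be 0. */
        return (int)syscall(__ARM_NR_cacheflush, (long)begin, (long)end, 0L);
#else
        /* Assumed fallback for other targets: the builtin reports no errors. */
        __builtin___clear_cache((char *)begin, (char *)end);
        return 0;
#endif
}

/* Hypothetical helper: run one flush and print the result like the log does. */
int report_flush(const char *label, void *start, size_t size)
{
        int ret;

        printf("Attempting flush of %s [%p, +%zu bytes]\n", label, start, size);
        errno = 0;
        ret = flush_range(start, size);
        if (ret == 0)
                printf("  -> ok\n");
        else
                printf("  -> failed: %s\n", strerror(errno));
        return ret;
}
```

Calling report_flush("buf", buf, sizeof(buf)) on a static buffer and then report_flush("NULL", NULL, 4096) from main() corresponds to the two calls in the log above. If the kernel loops forever in the fault handler, as described in the original report, the first call would presumably never return, so the missing "-> ok" line is itself the signal. The NULL case is only meaningful on the syscall path; the fallback path may fault in user space on some targets.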