Re: tmpfs returns incorrect data on concurrent pread() and truncate()

From: Hugh Dickins
Date: Tue Nov 01 2016 - 19:52:11 EST


On Wed, 26 Oct 2016, Jakob Unterwurzacher wrote:

> tmpfs seems to be incorrectly returning 0-bytes when reading from
> a file that is concurrently being truncated.

That is an interesting observation, and you got me worried;
but in fact, it is not a tmpfs problem: if we call it a
problem at all, it's a VFS problem or a userspace problem.

You chose a ratio of 3 preads to 1 ftruncate in your program below:
let's call that the Unterwurzacher Ratio, 3 for tmpfs; YMMV, but for
me 4 worked well to show the same issue on ramfs, and 15 on ext4.

The Linux VFS does not serialize reads against writes or truncation
very strictly: it's unusual to need that serialization, and most
users prefer maximum speed to the additional locking, or intermediate
buffering, that would be required to avoid the issue you've seen.

>
> This is causing crashes in gocryptfs, a cryptographic FUSE overlay,
> when it reads a nonce from disk that should absolutely positively
> never be all-zero.

I don't think that there will be much appetite for changing the
kernel's VFS to prevent that. I hope that gocryptfs can provide
the serialization that it needs for itself, or otherwise handle
those zeroes without crashing.

(Though if something is free to truncate at an awkward moment,
then I imagine it can also mess things up by writing wrong data.)

>
> I have written a reproducer in C that triggers this issue on
> both boxes I tested, Linux 4.7.2 and 3.16.7, both amd64.
>
> It can be downloaded from here:
>
> https://gist.github.com/rfjakob/d01281c737db38075767f90bf03fc475
>
> or, alternatively, I have attached it to this email at the bottom.
>
> The reproducer:
> 1) Creates a 10MB file filled with 'x' at /dev/shm/x
> 2) Spawns a thread that truncates the file 3 bytes at a time
> 3) Spawns another thread that pread()s the file 1 byte at a time
> starting from the top
> 4) Prints "wrong data" whenever the pread() gets something that
> is not 'x' or an empty result.

Nice work, thank you, this helped me a lot.

Hugh

>
> Example run:
>
> $ gcc -Wall -lpthread truncate_read.c && ./a.out
> wrong data: 0
> wrong data: 0
> wrong data: 0
> wrong data: 0
> wrong data: 0
> wrong data: 0
> wrong data: 0
> wrong data: 0
> wrong data: 0
> wrong data: 0
> [...]
>
>
> Best regards,
> Jakob
>
>
> ---------------------------------------------------------------------
> truncate_read.c
> ------------------------------8<---------------------------------------
>
>
> // Compile and run:
> // gcc -Wall -lpthread truncate_read.c && ./a.out
>
> #include <sys/types.h>
> #include <sys/stat.h>
> #include <fcntl.h>
> #include <stdlib.h>
> #include <string.h>
> #include <unistd.h>
> #include <stdio.h>
> #include <pthread.h>
>
> int fd;
> int l = 10*1024*1024;
> pthread_t tid[2];
>
> void* read_thread(void *arg){
> int o,n;
> char b;
> for(o=l; o>0; o--) {
> b = 'a';
> n = pread(fd, &b, 1, o);
> if(n==0) {
> continue;
> }
> if(b != 'x') {
> printf("wrong data: %x\n", b);
> }
> }
> return NULL;
> }
>
> void* truncate_thread(void *arg){
> // Parent = Truncater
> int o,n;
> // "3" seems to be the sweet spot to trigger the most errors.
> for(o=l; o>0; o-=3) {
> n = ftruncate(fd, o);
> if(n!=0) {
> perror("ftruncate err");
> }
> }
> return NULL;
> }
>
> int main(int argc, char *argv[]) {
> fd = open("/dev/shm/x", O_RDWR | O_TRUNC | O_CREAT, 0666);
> if(fd < 0) {
> printf("open failed\n");
> exit(1);
> }
> char* x = malloc(l);
> memset(x, 'x', l);
> write(fd, x, l);
>
> pthread_create(&(tid[0]), NULL, &read_thread, NULL);
> pthread_create(&(tid[1]), NULL, &truncate_thread, NULL);
>
> void *res;
> pthread_join(tid[0], &res);
> pthread_join(tid[1], &res);
>
> return 0;
> }