Re: [PATCH v4 29/33] x86/mm: try VMA lock-based page fault handling first

From: Jiri Slaby
Date: Fri Jun 30 2023 - 04:43:47 EST


On 30. 06. 23, 10:28, Jiri Slaby wrote:
> 2348 clone3({flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID, child_tid=0x7fcaa5882990, parent_tid=0x7fcaa5882990, exit_signal=0, stack=0x7fcaa5082000, stack_size=0x7ffe00, tls=0x7fcaa58826c0} => {parent_tid=[2351]}, 88) = 2351
> 2350  <... clone3 resumed> => {parent_tid=[2372]}, 88) = 2372
> 2351  <... clone3 resumed> => {parent_tid=[2354]}, 88) = 2354
> 2351  <... clone3 resumed> => {parent_tid=[2357]}, 88) = 2357
> 2354  <... clone3 resumed> => {parent_tid=[2355]}, 88) = 2355
> 2355  <... clone3 resumed> => {parent_tid=[2370]}, 88) = 2370
> 2370  mmap(NULL, 262144, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0 <unfinished ...>
> 2370  <... mmap resumed>)               = 0x7fca68249000
> 2372  <... clone3 resumed> => {parent_tid=[2384]}, 88) = 2384
> 2384  <... clone3 resumed> => {parent_tid=[2388]}, 88) = 2388
> 2388  <... clone3 resumed> => {parent_tid=[2392]}, 88) = 2392
> 2392  <... clone3 resumed> => {parent_tid=[2395]}, 88) = 2395
> 2395  write(2, "runtime: marked free object in s"..., 36 <unfinished ...>

I.e., IIUC, all of these are threads (CLONE_VM). Thread 2370 mapped an anonymous region at 0x7fca68249000 - 0x7fca68288fff (262144 bytes), and the Go runtime in thread 2395 for some reason thinks that 0x7fca6824bec8, which lies in that region, is "bad".
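
To double-check the range arithmetic, here is a minimal sketch. The constants are copied from the strace output and the address mentioned above; the helper names are made up for illustration and are not part of any tool.

```python
# Values taken from the quoted strace output and the discussion above.
MMAP_BASE = 0x7fca68249000   # return value of mmap() in thread 2370
MMAP_LEN = 262144            # length argument of that mmap() call (0x40000)
ADDR = 0x7fca6824bec8        # address the Go runtime complains about


def mapping_end(base, length):
    """Return the last byte covered by a mapping of `length` bytes at `base`."""
    return base + length - 1


def in_mapping(addr, base, length):
    """Return True if `addr` falls inside [base, base + length)."""
    return base <= addr < base + length


if __name__ == "__main__":
    print(hex(mapping_end(MMAP_BASE, MMAP_LEN)))   # 0x7fca68288fff
    print(in_mapping(ADDR, MMAP_BASE, MMAP_LEN))   # True
    print(hex(ADDR - MMAP_BASE))                   # 0x2ec8
```

So the address sits 0x2ec8 bytes into the mapping, i.e. within its third 4 KiB page, assuming 4 KiB pages.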

As was pointed out to me, this might just as well be a failure of Go's inter-thread communication (or the like). It might only be more exposed now with VMA-based locks, as we can do more in parallel.

There are older, hard-to-reproduce bugs in Go with similar symptoms (we sometimes see this error now too):
https://github.com/golang/go/issues/15246

Or that 2016 bug is a red herring. Hard to tell...

thanks,
--
js
suse labs