Avi Kivity wrote: Basic problem is that you can get a process which you can't interrupt (in most cases can't kill) which has resources tied up. Given the choice between surprising a process with an EINTR or killing it during a reboot to get the system usable again, I would rather surprise. Applications should not assume that write() (or other syscalls) can't return EINTR. Not all filesystems have a bounded-time backing store.
The distinction between 'fast' (filesystem) and 'slow' (terminals and pipes) blocking syscalls goes back to the earliest days of Unix, and is part of the ABI. Most filesystem syscalls are not documented to ever return EINTR.
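For illustration, here is roughly what "not assuming write() can't return EINTR" looks like on the application side. This is only a minimal sketch: the helper name write_all is made up for the example, and it assumes a blocking file descriptor where retrying after an interruption is acceptable.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not from any library): write all len bytes to fd.
 * Retries when write() is interrupted by a signal before transferring
 * anything (EINTR) and continues after short writes.
 * Returns 0 on success, -1 on any other error with errno set by write(). */
int write_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;

    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted before any data moved: retry */
            return -1;          /* real error: let the caller decide */
        }
        p += n;                 /* short write: advance past what was written */
        len -= (size_t)n;
    }
    return 0;
}
```

Code that calls write() once and treats any short count or -1 as fatal is the kind that would be "surprised" by an interruptible filesystem; a loop like this one would not be.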
'soft' has its own problems, namely false positives when someone steps on the network cable, temporarily blocking packet flow, or when using a clustered server which may take some time to recover from a fault.
Sure. It's the basic problem of trying to make network access transparent by hiding the failure modes. You either need to put up with spurious timeouts caused by transient failures, or unbounded blocking on real failures.