Re: [PATCH 31/31] aio: implement io_pgetevents

From: Arnd Bergmann
Date: Wed Jan 10 2018 - 06:03:32 EST


On Wed, Jan 10, 2018 at 9:11 AM, Christoph Hellwig <hch@xxxxxx> wrote:
> On Tue, Jan 09, 2018 at 11:16:16PM +0100, Arnd Bergmann wrote:
>> Hmm, these two new syscall entry points turn into four when we add in
>> support for 64-bit time_t, as we'd have to support all combinations of 32/64
>> bit aio_context_t and time_t.
>
> At least they'll also replace plain old io_getevents :)

Ah, right, so we'd save two other calls in the process: the native 32-bit
io_getevents with compat_timespec, and the compat io_getevents with
64-bit timespec.

>> Would it be better to start this interface out by defining it using a 64-bit
>> timeout structure? The downside would be that the user space syscall
>> wrappers have to start out with a conversion, if we don't do it, then
>> the opposite conversion would have to get added later.
>
> Which structure do you want? In the end applications using libaio
> or even the syscalls directly (like seastar) are a special breed, so
> they could probably just deal with whatever structure we want to pass.

I'd suggest passing a variant of timespec with two 64-bit members.
Deepa has posted patches for this structure in the past and was planning
to do a new version (with minor changes from review) soon, but we
can just as well use it in your patch if that gets merged first.

If we merge io_pgetevents quickly (before the bulk of the y2038 syscall
conversion), I'd say we should use

	struct __kernel_timespec64 {
		__s64 tv_sec;
		__s64 tv_nsec;
	};
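
To make the wrapper-side conversion concrete, here is a minimal sketch of
what a user space wrapper might do before passing the timeout to the
kernel. It is not taken from any posted patch: struct example_timespec64
stands in for the structure proposed above (with int64_t in place of
__s64 so that it builds outside the kernel), and to_example_timespec64()
is a hypothetical helper name.

```c
#include <assert.h>
#include <stdint.h>
#include <time.h>

/*
 * Stand-in for the proposed 64-bit timeout structure.
 * Assumption: int64_t replaces __s64 so this compiles in user space.
 */
struct example_timespec64 {
	int64_t tv_sec;
	int64_t tv_nsec;
};

/* Two 64-bit members should occupy 16 bytes with no padding. */
static_assert(sizeof(struct example_timespec64) == 16,
	      "expected two 64-bit fields without padding");

/*
 * Hypothetical helper: widen the libc 'struct timespec' (whose members
 * may be 32 bits wide on a 32-bit ABI) into the fixed 64-bit layout.
 */
static struct example_timespec64
to_example_timespec64(const struct timespec *ts)
{
	struct example_timespec64 out;

	out.tv_sec = (int64_t)ts->tv_sec;
	out.tv_nsec = (int64_t)ts->tv_nsec;
	return out;
}
```

With this approach the widening happens once in the wrapper, and the
kernel only ever sees one layout.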

The tv_nsec type is unfortunately much trickier than it should be:
C11 and POSIX require it to be 'long', so user space needs to define the 64-bit
'struct timespec' with internal padding in the right places depending
on endianness, and the kernel has to be careful about either zeroing
the upper half or checking it for being zeroed by user space depending
on whether we come from a 32-bit or 64-bit task, but I'm fairly sure
we have that part worked out by now.
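
As a rough sketch of the two halves of that problem (illustrative only,
not the code from Deepa's series): example_user_timespec shows how a
32-bit user space could lay out a 'long' tv_nsec plus padding so that it
overlays the 64-bit field, and example_normalize() shows the kernel-side
handling. All names here are hypothetical, the __BYTE_ORDER__ macros
assume GCC or Clang, and the choice of zeroing for 32-bit tasks versus
plain range checking for 64-bit tasks is my reading of the paragraph
above.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the kernel-side structure with two 64-bit members. */
struct example_kts64 {
	int64_t tv_sec;
	int64_t tv_nsec;
};

/*
 * Hypothetical 32-bit user space layout: tv_nsec is only 32 bits wide
 * (modelling 'long'), so padding fills the unused half. The padding
 * goes before tv_nsec on big-endian and after it on little-endian, so
 * that tv_nsec always overlays the low half of the 64-bit field.
 */
struct example_user_timespec {
	int64_t tv_sec;
#if __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
	int32_t pad;
	int32_t tv_nsec;
#else
	int32_t tv_nsec;
	int32_t pad;
#endif
};

static_assert(sizeof(struct example_user_timespec) ==
	      sizeof(struct example_kts64), "layouts must have equal size");
static_assert(offsetof(struct example_user_timespec, tv_sec) == 0,
	      "tv_sec must come first");

/*
 * Hypothetical kernel-side check. For a 32-bit task the upper half of
 * tv_nsec is padding that user space may have left uninitialized, so
 * it is cleared. For a 64-bit task the full value is validated, which
 * rejects any nonzero upper half through the range check.
 */
static int example_normalize(struct example_kts64 *ts, int task_is_32bit)
{
	if (task_is_32bit)
		ts->tv_nsec &= INT64_C(0xffffffff);

	if (ts->tv_nsec < 0 || ts->tv_nsec >= 1000000000)
		return -EINVAL;
	return 0;
}
```

A 32-bit caller with garbage in the padding is accepted after masking,
while the same bit pattern from a 64-bit caller fails with -EINVAL.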

Arnd