Re: [PATCH v3 0/4] open/accept directly into io_uring fixed file table

From: Jens Axboe
Date: Mon Aug 23 2021 - 15:40:29 EST


On 8/23/21 1:13 PM, Josh Triplett wrote:
> On Sat, Aug 21, 2021 at 08:18:12PM -0600, Jens Axboe wrote:
>> On 8/21/21 9:52 AM, Pavel Begunkov wrote:
>>> Add an optional feature to open/accept directly into io_uring's fixed
>>> file table bypassing the normal file table. Same behaviour if as the
>>> snippet below, but in one operation:
>>>
>>> sqe = prep_[open,accept](...);
>>> cqe = submit_and_wait(sqe);
>>> io_uring_register_files_update(uring_idx, (fd = cqe->res));
>>> close((fd = cqe->res));
>>>
>>> The idea in pretty old, and was brough up and implemented a year ago
>>> by Josh Triplett, though haven't sought the light for some reasons.
>>>
>>> The behaviour is controlled by setting sqe->file_index, where 0 implies
>>> the old behaviour. If non-zero value is specified, then it will behave
>>> as described and place the file into a fixed file slot
>>> sqe->file_index - 1. A file table should be already created, the slot
>>> should be valid and empty, otherwise the operation will fail.
>>>
>>> we can't use IOSQE_FIXED_FILE to switch between modes, because accept
>>> takes a file, and it already uses the flag with a different meaning.
>>>
>>> since RFC:
>>> - added attribution
>>> - updated descriptions
>>> - rebased
>>>
>>> since v1:
>>> - EBADF if slot is already used (Josh Triplett)
>>> - alias index with splice_fd_in (Josh Triplett)
>>> - fix a bound check bug
>>
>> With the prep series, this looks good to me now. Josh, what do you
>> think?
>
> I would still like to see this using a union with the `nofile` field in
> io_open and io_accept, rather than overloading the 16-bit buf_index
> field. That would avoid truncating to 16 bits, and make less work for
> expansion to more than 16 bits of fixed file indexes.
>
> (I'd also like that to actually use a union, rather than overloading the
> meaning of buf_index/nofile.)

Agree, and in fact there's room in the open and accept command parts, so
we can just make it a separate entry there instead of using ->buf_index.
Then just pass in the index to io_install_fixed_file() instead of having
it pull it from req->buf_index.

> I personally still feel that using non-zero to signify index-plus-one is
> both error-prone and not as future-compatible. I think we could do
> better with no additional overhead. But I think the final call on that
> interface is up to you, Jens. Do you think it'd be worth spending a flag
> bit or using a different opcode, to get a cleaner interface? If you
> don't, then I'd be fine with seeing this go in with just the io_open and
> io_accept change.

I'd be inclined to go the extra opcode route instead, as the flag only
really would make sense to requests that instantiate file descriptors.
For this particular case, we'd need 3 new opcodes for
openat/openat2/accept, which is probably a worthwhile expenditure.

Pavel, what do you think? Switch to using a different opcode for the new
requests, and just grab some space in io_open and io_accept for the fd
and pass it in to install.

I do think that'd end up being less hackish and easier to grok for a
user.

--
Jens Axboe