Re: [PATCH v3 3/3] man2/fincore.2: document general description about fincore(2)

From: Dave Hansen
Date: Mon Jul 07 2014 - 18:34:34 EST


On 07/07/2014 01:59 PM, Naoya Horiguchi wrote:
> On Mon, Jul 07, 2014 at 12:08:12PM -0700, Dave Hansen wrote:
>> On 07/07/2014 11:00 AM, Naoya Horiguchi wrote:
>>> +.SH RETURN VALUE
>>> +On success,
>>> +.BR fincore ()
>>> +returns 0.
>>> +On error, \-1 is returned, and
>>> +.I errno
>>> +is set appropriately.
>>
>> Is this accurate? From reading the syscall itself, it looked like it
>> did this:
>>
>>> + * Return value is the number of pages whose data is stored in fc->buffer.
>>> + */
>>> +static long do_fincore(struct fincore_control *fc, int nr_pages)
>>
>> and:
>>
>>> +SYSCALL_DEFINE6(fincore, int, fd, loff_t, start, long, nr_pages,
>> ...
>>> +	while (fc.nr_pages > 0) {
>>> +		memset(fc.buffer, 0, fc.buffer_size);
>>> +		ret = do_fincore(&fc, min(step, fc.nr_pages));
>>> +		/* Reached the end of the file */
>>> +		if (ret == 0)
>>> +			break;
>>> +		if (ret < 0)
>>> +			break;
>> ...
>>> +	}
>> ...
>>> +	return ret;
>>> +}
>>
>> Which seems to mean that, on a given pass through the loop, you might
>> end up returning the result of that *single* call to do_fincore()
>> instead of the aggregate of the entire syscall.
>>
>> So, it can return <0 on failure, 0 on success, or an essentially
>> random >0 number on success.
>
> We don't break this while loop if do_fincore() returned a positive value
> unless copy_to_user() fails. And in that case ret is set to -EFAULT.
> So I think sys_fincore() never returns a positive value.

OK, that makes sense as I'm reading it again.

>> Why not just use the return value for something useful instead of
>> hacking in the extras->nr_entries stuff?
>
> Hmm, I got the opposite complaint previously: that we shouldn't
> interpret the return value differently depending on the flag.
> And I'd like to keep the extra argument for future extensibility.
> For example, if we want to collect pages only with a specific
> set of page flags, this extra argument will be necessary.

Couldn't it simply be the number of elements that it wrote into the
buffer, or even the number of bytes?
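
To make that concrete, here is a rough userspace sketch of the convention
I have in mind. It is only a model, not the patch's code: model_fincore(),
its parameters, and the "resident" array standing in for the page cache
are all made up for illustration. The point is just that the return value
alone tells the caller how much of the buffer is valid, including the
short count at end of file.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical userspace model, NOT the fincore(2) patch itself.
 * "resident" stands in for the page cache: resident[i] != 0 means
 * page i of a file that is file_pages long is in core.
 *
 * Returns the number of entries written to buf (fewer than nr_pages
 * if the range runs past end of file), or a negative errno value.
 */
static long model_fincore(const unsigned char *resident, size_t file_pages,
			  size_t start, size_t nr_pages,
			  unsigned char *buf, size_t buf_entries)
{
	size_t written = 0;

	if (!resident || !buf)
		return -EFAULT;
	if (nr_pages > buf_entries)
		return -EINVAL;
	if (start >= file_pages)
		return 0;	/* nothing to report at or past EOF */

	/* Stop at the requested count or at end of file, whichever is first. */
	while (written < nr_pages && written < file_pages - start) {
		buf[written] = resident[start + written] ? 1 : 0;
		written++;
	}
	return (long)written;
}

/* Small self-check showing how a caller would read the return value. */
static void model_fincore_selftest(void)
{
	const unsigned char resident[5] = { 1, 0, 1, 1, 0 };
	unsigned char buf[4] = { 0 };

	/* Range fully inside the file: all 3 requested entries are written. */
	assert(model_fincore(resident, 5, 0, 3, buf, 4) == 3);
	assert(buf[0] == 1 && buf[1] == 0 && buf[2] == 1);

	/* Range crossing EOF: only pages 3 and 4 exist, so 2 comes back. */
	assert(model_fincore(resident, 5, 3, 4, buf, 4) == 2);

	/* Start at EOF: nothing is written. */
	assert(model_fincore(resident, 5, 5, 4, buf, 4) == 0);

	/* A request larger than the buffer is rejected. */
	assert(model_fincore(resident, 5, 0, 5, buf, 4) == -EINVAL);
}
```

With something like this, a caller never needs a separate nr_entries
field to know how many entries to look at. Returning a byte count
instead would be the same idea with the count scaled by the entry size.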
