Re: [PATCH v2 0/2] Add epoll round robin wakeup mode

From: Jason Baron
Date: Tue Feb 17 2015 - 15:33:28 EST


On 02/17/2015 02:46 PM, Andy Lutomirski wrote:
> On Tue, Feb 17, 2015 at 11:33 AM, Jason Baron <jbaron@xxxxxxxxxx> wrote:
>> When we are sharing a wakeup source among multiple epoll fds, we end up with
>> thundering herd wakeups, since there is currently no way to add to the
>> wakeup source exclusively. This series introduces 2 new epoll flags:
>> EPOLLEXCLUSIVE, for adding to a wakeup source exclusively, and
>> EPOLLROUNDROBIN, which is used in conjunction with EPOLLEXCLUSIVE to evenly
>> distribute the wakeups. This patch was originally motivated by a desire to
>> improve wakeup balance and cpu usage for a listen socket() shared amongst
>> multiple epoll fd sets.
>>
>> See: http://lwn.net/Articles/632590/ for previous test program and testing
>> results.
>>
>> Epoll manpage text:
>>
>> EPOLLEXCLUSIVE
>> Provides exclusive wakeups when attaching multiple epoll fds to a
>> shared wakeup source. Must be specified with an EPOLL_CTL_ADD operation.
>>
>> EPOLLROUNDROBIN
>> Provides balancing for exclusive wakeups when attaching multiple epoll
>> fds to a shared wakeup source. Depends on EPOLLEXCLUSIVE being set and
>> must be specified with an EPOLL_CTL_ADD operation.
>>
>> Thanks,
> What permissions do you need on the file descriptor to do this? This
> will be the first case where a poll-like operation has side effects,
> and that's rather weird IMO.
>

So in the case where you have both non-exclusive and exclusive
waiters, all of the non-exclusive waiters will continue to get woken
up. However, I think you're getting at having multiple exclusive
waiters and potentially 'starving' out other exclusive waiters.

In general, I think wait queues are associated with a 'struct file',
so unless you are sharing your fd table, this shouldn't be an issue.
However, there may be cases where this is not true, in which case
perhaps we could limit this to CAP_SYS_ADMIN...
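
For reference, here is a rough sketch of how an epoll fd would attach to the
shared wakeup source using the flags from the manpage text above. It is only
an illustration under a few assumptions: the helper names exclusive_mask()
and add_exclusive() are made up for this example, the fallback value for
EPOLLEXCLUSIVE is the one mainline headers ended up using (it may not match
this series), and EPOLLROUNDROBIN is only used if the installed headers
happen to define it, which stock headers do not.

```c
#include <stdint.h>
#include <sys/epoll.h>

/* Assumption: fall back to the mainline value if the libc headers are old. */
#ifndef EPOLLEXCLUSIVE
#define EPOLLEXCLUSIVE (1u << 28)
#endif

/* Hypothetical helper: build the event mask for an exclusive waiter. */
uint32_t exclusive_mask(int round_robin)
{
	uint32_t mask = EPOLLIN | EPOLLEXCLUSIVE;

#ifdef EPOLLROUNDROBIN
	/* Only available with headers from this patch series. */
	if (round_robin)
		mask |= EPOLLROUNDROBIN;
#else
	(void)round_robin;
#endif
	return mask;
}

/*
 * Hypothetical helper: attach shared_fd (e.g. a listen socket) to epfd
 * exclusively. Per the manpage text, the flags are only accepted with
 * EPOLL_CTL_ADD. Returns the epoll_ctl() result (0 or -1 with errno set).
 */
int add_exclusive(int epfd, int shared_fd, int round_robin)
{
	struct epoll_event ev = { 0 };

	ev.events = exclusive_mask(round_robin);
	ev.data.fd = shared_fd;
	return epoll_ctl(epfd, EPOLL_CTL_ADD, shared_fd, &ev);
}
```

Each worker would call add_exclusive() once on its own epoll fd with the
same shared fd; non-exclusive waiters added without the flag keep getting
woken as before.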

Thanks,

-Jason
