Re: F_SETOWN_TID: F_SETOWN was thread-specific for a while

From: Oleg Nesterov
Date: Mon Aug 17 2009 - 13:09:42 EST


Sorry for the late reply.

And I am a bit confused.

On 08/11, Jamie Lokier wrote:
>
> Oleg Nesterov wrote:
> > Agreed, this looks a bit odd. But at least this is documented. From
> > man 2 fcntl:
> >
> > By using F_SETSIG with a nonzero value, and setting SA_SIGINFO
> > for the signal handler (see sigaction(2)), extra information
> > about I/O events is passed to the handler in a siginfo_t
> > structure. If the si_code field indicates the source is
> > SI_SIGIO, the si_fd field gives the file descriptor associated
> > with the event. Otherwise, there is no indication which file
> > descriptors are pending,
> >
> > Not sure if it is safe to change the historical behaviour.
>
> The change in 2.6.12 breaks some code of mine, which uses RT queued
> I/O signals on multiple threads but as far as I know it's not used
> anywhere now.
>
> In the <= 2.4 era, there were lots of web servers and benchmarks using
> queued I/O signals for scalable event-driven I/O, but I don't know of
> any implementation that dared do it with multiple threads, except mine.
>
> It was regarded as "beware ye who enter here" territory, which I can
> attest to from the long time it took to get it right and the multitude
> of kernel bugs and version changes needing to be worked around.
>
> Since 2.6, everyone uses epoll which is much better, except that
> occasionally SIGIO comes in handy when an async notification is
> required.
>
> So the change in 2.6.12 does break something that probably isn't much
> used, but it's too late now.

So, you seem to agree we should not change this odd behaviour?

> Occasionally thread-specific SIGIO (or
> F_SETSIG) is useful; F_SETOWN_TID makes that nice and clear.

Great. If you agree with F_SETOWN_TID, could you look at Peter's
next patch

"[PATCH 3/2 -v4] fcntl: F_[SG]ETOWN_EX"
http://marc.info/?l=linux-kernel&m=124956452125468

and ack it?
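
For reference, a minimal sketch of thread-directed ownership, assuming the
interface as it is documented in fcntl(2) for Linux 2.6.32 and later
(F_SETOWN_EX with struct f_owner_ex and F_OWNER_TID). The helper name
own_fd_by_this_thread() is hypothetical and only illustrates the call; it is
not taken from the patch.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: ask the kernel to deliver this fd's I/O signals
 * (SIGIO, or the signal chosen with F_SETSIG) to the calling thread only.
 * Returns 0 on success, -1 with errno set on failure.
 */
int own_fd_by_this_thread(int fd)
{
    struct f_owner_ex owner;

    owner.type = F_OWNER_TID;                /* owner is a single thread */
    owner.pid  = (pid_t)syscall(SYS_gettid); /* kernel thread ID of caller */

    return fcntl(fd, F_SETOWN_EX, &owner);
}
```

The descriptor still needs O_ASYNC set with F_SETFL before any signal is
generated; that step is left out to keep the sketch short.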

> I would drop the pseudo-"bug compatible" behaviour of using negative
> tid to mean pid; that's pointless.

Done.

> I'd also make F_GETOWN return an
> error when F_SETOWN_TID has been used,

This is not trivial: F_GETOWN can't return an error, because a negative
result already means PIDTYPE_PGID.
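
To make the sign convention concrete, here is a small sketch of how a caller
has to decode an F_GETOWN return value. The struct and function names
(getown_result, decode_getown) are hypothetical, introduced only for
illustration.

```c
#include <sys/types.h>

/* Hypothetical illustration type; not part of any kernel or libc API. */
struct getown_result {
    int   is_pgrp; /* 1 if the value names a process group, 0 otherwise */
    pid_t id;      /* the process ID or process group ID */
};

/*
 * Decode an F_GETOWN return value by its sign: a positive value is a
 * process ID, a negative value is a negated process group ID.
 */
struct getown_result decode_getown(int ret)
{
    struct getown_result r;

    r.is_pgrp = ret < 0;
    r.id = ret < 0 ? -ret : ret;
    return r;
}
```

Note that -1 decodes as process group 1, which a caller cannot tell apart
from the usual -1 error return. That leaves no room in the return value for
a new error case.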

> and F_GETOWN_TID return an
> error when F_SETOWN has been used.

F_GETOWN_EX does this even better.

Oleg.
