Re: [PATCH net-next v7 15/16] net: ethtool: ts: Let the active time stamping layer be selectable

From: Jakub Kicinski
Date: Mon Nov 20 2023 - 14:59:13 EST


On Mon, 20 Nov 2023 21:00:23 +0200 Vladimir Oltean wrote:
> Well, first of all, given my understanding of the "laws of physics",
> I think something has to give in your use case description. I can't
> see how on RX, the NIC can decide in advance whether to provide low
> rate MAC timestamps for packets going to a socket and high rate DMA
> timestamps for packets going to another socket. It can either provide
> MAC timestamps, or DMA timestamps, or an unreliable, unpresentable to
> user space, mix.

Rx time stamping is configured by filters. Is there a problem with the
user specifying that they want "true" timestamps for PTP/NTP packets,
and "dma" timestamps for all the rest?

Maybe we can extend struct scm_timestamping to carry an indication of
which stamp ended up in ts[2], but that's less important to me than
the ability to configure the thing. Right now, as I said, mlx5 uses
an ethtool priv flag :(
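
To make the ts[2] point concrete, a minimal sketch of how user space
pulls the hardware stamp out of a received message today. The helper
name get_hw_stamp() is hypothetical; struct scm_timestamping and the
SCM_TIMESTAMPING cmsg type are the existing uAPI. Nothing in the cmsg
says whether ts[2] was taken at the PHY, the MAC or at DMA.

```c
#include <stddef.h>
#include <string.h>
#include <time.h>
#include <sys/socket.h>
#include <linux/errqueue.h>   /* struct scm_timestamping */

#ifndef SCM_TIMESTAMPING      /* some libc headers only define SO_TIMESTAMPING */
#define SCM_TIMESTAMPING SO_TIMESTAMPING
#endif

/*
 * Hypothetical helper: copy the raw hardware stamp (ts[2]) out of the
 * control data of a message returned by recvmsg().
 * Returns 0 on success, -1 if no SCM_TIMESTAMPING cmsg is present.
 */
static int get_hw_stamp(struct msghdr *msg, struct timespec *out)
{
    struct cmsghdr *cm;

    for (cm = CMSG_FIRSTHDR(msg); cm; cm = CMSG_NXTHDR(msg, cm)) {
        struct scm_timestamping tss;

        if (cm->cmsg_level != SOL_SOCKET ||
            cm->cmsg_type != SCM_TIMESTAMPING)
            continue;
        if (cm->cmsg_len < CMSG_LEN(sizeof(tss)))
            continue;   /* too short to hold three timespecs */

        /* CMSG_DATA() is not guaranteed to be aligned, so copy out. */
        memcpy(&tss, CMSG_DATA(cm), sizeof(tss));
        *out = tss.ts[2];   /* ts[0] is the software stamp, ts[1] is unused */
        return 0;
    }
    return -1;
}
```

This assumes the socket enabled SO_TIMESTAMPING with
SOF_TIMESTAMPING_RX_HARDWARE | SOF_TIMESTAMPING_RAW_HARDWARE and the
device was configured to stamp the packet in the first place.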

> But maybe I'm wrong and there are NICs which can do that filtering.
> If such NIC exists, then I guess a SOF_TIMESTAMPING_RX_DMA flag should
> be added to the socket layer, and the NIC driver provides timestamps
> according to the skb->sk->sk_tsflags, and that problem is completely out
> of scope for Köry's patch set - and implicitly compatible with it, since
> as you say, the device-wide timestamping layer - PHC index - does not
> really change.

IDK. Maybe the sniffles I picked up at LPC are clouding my judgment
but to me this patch set is shaped too much by current implementation
and not enough by what it's modeling. It basically exposes to user
space the "mux" for choosing NETDEV vs PHYLIB.

There are multiple time stamping points as the packet moves through
the pipeline. Expose them so that SIOC[GS]HWTSTAMP can target each
one individually.

> If I'm not wrong and the MAC-or-DMA timestamp selection is NIC-wide
> (which diverges from your problem description),

Nope.

> then neither Köry's work
> nor my "everything is a phc_index" proposal will bring your use case to
> fruition without further work. Here I would avoid speculating, because a
> lot will depend upon the details which you haven't really given.

What are the details you'd like? PTP gets stamped at the PHY/MAC,
the rest gets stamped at DMA. mlx5 achieves this by splitting the
PTP traffic to a separate queue pair, and configuring that qp to
capture PHY/MAC stamps, AFAIU.

> One question will be whether, in the case of "NIC-wide DMA timestamps",
> DMA timestamps should be presented as hardware timestamps - struct
> scm_timestamping[2] from CMSG_DATA() - or as their own thing, that user
> space needs explicit support for - by parsing a new cmsg level/type.
> If DMA timestamps won't look to user space like hardware timestamps,
> then the use case is again out of scope for Köry's work, as far as I see
> it.
>
> Another simple question is - if NICs do this today - probably by giving
> the "unrepresentable mix" to user space in an implicit, hardcoded and
> very fine tuned way such that nobody bats an eye - then what is there
> more to support? Are you looking at extra UAPI as a way to legitimize
> hacks, or do you feel there is extra control that applications can gain?

I don't understand what you're asking me.

DMA timestamping is becoming increasingly important. Read any
congestion control paper from the last 5 years and chances are
it will be using delay as a signal. If we're extending uAPI
for HW stamping we should make sure to cater to CC use cases.