Re: [PATCH] [RFC v3] packet: experimental support for 64-bit timestamps

From: Willem de Bruijn
Date: Tue Nov 28 2017 - 17:29:45 EST


On Tue, Nov 28, 2017 at 3:32 PM, Arnd Bergmann <arnd@xxxxxxxx> wrote:
> As I noticed in my previous patch to remove the 'timespec' usage in
> the packet socket, the timestamps in the packet socket are slightly
> inefficient as they convert a nanosecond value into seconds/nanoseconds
> or seconds/microseconds.
>
> This adds two new socket options for the timestamp to resolve that:
>
> PACKET_SKIPTIMESTAMP sets a flag to indicate whether to generate
> timestamps at all. When this is set, all timestamps are hardcoded to
> zero, which saves a few cycles for the conversion and the access of
> the hardware clocksource. The idea was taken from pktgen, which has an
> F_NO_TIMESTAMP option for the same purpose.
>
> PACKET_TIMESTAMP_NS64 changes the interpretation of the time stamp fields:
> instead of having 32 bits for seconds plus 32 bits for nanoseconds or
> microseconds, we now always send down 64 bits worth of nanoseconds when
> this flag is set.
>
> Link: https://patchwork.kernel.org/patch/10077199/
> Suggested-by: Willem de Bruijn <willemdebruijn.kernel@xxxxxxxxx>
> Signed-off-by: Arnd Bergmann <arnd@xxxxxxxx>

This works. Another option would be to add a PACKET_TIMESTAMP_EX
with the semantics we discussed previously + fail hard when any undefined
bits are set. I don't feel strong either way, we don't intend to extend further.

If taking this approach, it might be good to split into separate patches, one
for each flag?

> -static __u32 tpacket_get_timestamp(struct sk_buff *skb, struct timespec64 *ts,
> +static __u32 tpacket_get_timestamp(struct sk_buff *skb, __u32 *hi, __u32 *lo,
> unsigned int flags)

Argument flags is no longer used.

> {
> + struct packet_sock *po = pkt_sk(skb->sk);
> struct skb_shared_hwtstamps *shhwtstamps = skb_hwtstamps(skb);
> + ktime_t stamp;
> + u32 type;
> +
> + if (po->tp_skiptstamp)
> + return 0;
>
> if (shhwtstamps &&
> - (flags & SOF_TIMESTAMPING_RAW_HARDWARE) &&
> - ktime_to_timespec64_cond(shhwtstamps->hwtstamp, ts))
> - return TP_STATUS_TS_RAW_HARDWARE;
> + (po->tp_tstamp & SOF_TIMESTAMPING_RAW_HARDWARE) &&
> + shhwtstamps->hwtstamp) {
> + stamp = shhwtstamps->hwtstamp;
> + type = TP_STATUS_TS_RAW_HARDWARE;
> + } else if (skb->tstamp) {
> + stamp = skb->tstamp;
> + type = TP_STATUS_TS_SOFTWARE;
> + } else {
> + return 0;
> + }
>
> - if (ktime_to_timespec64_cond(skb->tstamp, ts))
> - return TP_STATUS_TS_SOFTWARE;
> + if (po->tp_tstamp_ns64) {
> + __u64 ns = ktime_to_ns(stamp);
>
> - return 0;
> + *hi = upper_32_bits(ns);
> + *lo = lower_32_bits(ns);
> + } else {
> + struct timespec64 ts = ktime_to_timespec64(stamp);
> +
> + *hi = ts.tv_sec;
> + if (po->tp_version == TPACKET_V1)

Very minor: may want to invert test to make newer the protocols the
likely branch.

> static __u32 __packet_set_timestamp(struct packet_sock *po, void *frame,
> struct sk_buff *skb)
> {
> union tpacket_uhdr h;
> - struct timespec64 ts;
> - __u32 ts_status;
> + __u32 ts_status, hi, lo;
>
> - if (!(ts_status = tpacket_get_timestamp(skb, &ts, po->tp_tstamp)))
> + if (!(ts_status = tpacket_get_timestamp(skb, &hi, &lo, po->tp_tstamp)))
> return 0;
>
> h.raw = frame;
> - /*
> - * versions 1 through 3 overflow the timestamps in y2106, since they
> - * all store the seconds in a 32-bit unsigned integer.
> - * If we create a version 4, that should have a 64-bit timestamp,
> - * either 64-bit seconds + 32-bit nanoseconds, or just 64-bit
> - * nanoseconds.
> - */

Probably no need to introduce this in patch 1/2 when removing it in 2/2.

> @@ -2191,8 +2226,8 @@ static int tpacket_rcv(struct sk_buff *skb, struct net_device *dev,
> unsigned long status = TP_STATUS_USER;
> unsigned short macoff, netoff, hdrlen;
> struct sk_buff *copy_skb = NULL;
> - struct timespec64 ts;
> __u32 ts_status;
> + __u32 hi, lo;

since this function is not time-specific, the context of hi and lo is not
immediately obvious here. tstamp_hi, tstamp_lo? Or even __u32
tstamp[2] and have tpacket_get_timestamp and packet_get_time take
one fewer argument.