Re: [PATCH net-next 6/7] net: stmmac: Fix comment about default addend calculation

From: Andrew Halaney
Date: Wed Aug 30 2023 - 17:42:23 EST


On Wed, Aug 30, 2023 at 01:16:31PM +0300, Serge Semin wrote:
> On Tue, Aug 29, 2023 at 10:01:20AM -0500, Andrew Halaney wrote:
> > On Sun, Aug 27, 2023 at 03:02:07AM +0300, Serge Semin wrote:
> > > Hi Andrew
> > >
> > > On Thu, Aug 24, 2023 at 01:32:57PM -0500, Andrew Halaney wrote:
> > > > The comment neglects that freq_div_ratio is the ratio between
> > > > the subsecond increment frequency and the clk_ptp_rate frequency.
> > > >
> > > > Signed-off-by: Andrew Halaney <ahalaney@xxxxxxxxxx>
> > > > ---
> > > > drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 10 ++++++----
> > > > 1 file changed, 6 insertions(+), 4 deletions(-)
> > > >
> > > > diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> > > > index dfead0df6163..64185753865f 100644
> > > > --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> > > > +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> > > > @@ -853,10 +853,12 @@ int stmmac_init_tstamp_counter(struct stmmac_priv *priv, u32 systime_flags)
> > > > /* Store sub second increment for later use */
> > > > priv->sub_second_inc = sub_second_inc;
> > > >
> > >
> > > > - /* calculate default addend value:
> > > > - * formula is :
> > > > - * addend = (2^32)/freq_div_ratio;
> > > > - * where, freq_div_ratio = 1e9ns/sub_second_inc
> > > > + /* Calculate default addend so the accumulator overflows (2^32) in
> > > > + * sub_second_inc (ns). The addend is added to the accumulator
> > > > + * every clk_ptp cycle.
> > > > + *
> > > > + * addend = (2^32) / freq_div_ratio
> > > > + * where, freq_div_ratio = (1e9ns / sub_second_inc) / clk_ptp_rate
> > > > */
> > > > temp = div_u64(NSEC_PER_SEC, sub_second_inc);
> > > > temp = temp << 32;
> > >
> > > I am not well familiar with the way PTP works but at my naked eyes the
> > > calculation implemented here looks a bit different than what is
> > > described in the comment.
> > >
> > > Basically config_sub_second_increment(clk_ptp_rate, sub_second_inc)
> > > returns clk_ptp_rate period in nanoseconds or twice that period, or have it
> > > scaled up on 0.465. So we have one of the next formulae:
> > > X1 = NSEC_PER_SEC / clk_ptp_rate
> > > X2 = 2 * NSEC_PER_SEC / clk_ptp_rate
> > > X3 = X1 / 0.465
> > > X4 = X2 / 0.465
> >
>
> > X5 = PTP_SSIR_SSINC_MAX (0xFF) is a case as well to consider
>
> I noticed that option too, but then I thought it must have been not
> that much probable to be considered as a real case seeing it's a
> boundary case. The clamping happens if
> if (X1 > 255 || X2 > 255 || X3 > 255 || X4 > 255)
> X5 = 255
> so in the worst case PTP-rate period in nanoseconds multiplied by 4.3
> must be greater than 255 which is equivalent to X1 >= 60. It means
> PTP clock rate must be greater than 16.6MHz to avoid the clamping. In
> the best case - 3.9MHz. I doubted that these limits are crossed in
> reality. But in anyways you are right saying that it still needs to be
> taken into account in case if the implemented algo would be a subject
> for optimizations.
>
> > >
> > > Then stmmac_init_tstamp_counter() handles the retrieved period in the
> > > next manner:
> > > temp = div_u64(NSEC_PER_SEC, sub_second_inc); // Convert back to frequency
> > > temp = temp << 32; // multiply by 2^32
> > > addend = div_u64(temp, priv->plat->clk_ptp_rate); // Divide by clk_ptp_rate
> > >
> > > The code above is equivalent:
> > >
> > > addend = ((NSEC_PER_SEC / X) * 2^32 ) / clk_ptp_rate =
> > > (2^32 * NSEC_PER_SEC / X) / clk_ptp_rate =
> > > 2^32 / (clk_ptp_rate / (NSEC_PER_SEC / X))
> > >
> > > AFAICS this doesn't match to what is in the comment (X = sub_second_inc).
> > > freq_div_ratio gets to be inverted. Does it?
> >
>
> > You're right, my comment needs to be inverted to match all of the above
> > (which is a great recap, thank you!).
>
> Good. Then an hour spent for decyphering of that stuff wasn't a waste
> of time after all.)
>
> >
> > >
> > > Substituting X to the formulae above we'll have just four possible results:
> > > addend1 = 2^32
> > > addend2 = 2^32 / 2
> > > addend3 = 0.465 * 2^32
> > > addend4 = 0.465 * 2^32 / 2
> >
> > addend5 = 2^32 / (clk_ptp_rate / (NSEC_PER_SEC / 0xFF))
> >
> > I think that would be the PTP_SSIR_SSINC_MAX case (X5) I inserted above
> >
> > >
> > > So basically clk_ptp_rate is irrelevant (neglecting all the
> > > integer divisions rounding). Is that what implied by the implemented
> > > algo?
> > >
> > > Am I missing something? (it's quite possible since it's long past
> > > midnight already.)
> >
> > I believe you've captured everything, minus the one conditional I added.
> >
> > I think because of that conditional we can't just nicely code up some
> > contants here independent of sub_second_inc. Now I can blame the morning
> > and not enough coffee, do you see anything wrong with that thought
>
> I am not that much aware of the PTP internals but it just seems weird
> to have clk_ptp_rate not affecting anything except the boundary case.
> Do you have a DW *MAC HW databook with the PTP-engine chapter
> describing the way the System Time Register Module works?

Unfortunately I do not have that documentation (or admittedly much
experience in the area).

>
> > process? I'm all ears for suggestions for cleaning this up, especially
> > since others like Richard have indicated that it could use some love,
>
> * I would have said more definitive - some _hard_ love.)
>
> > but right now I'm hung up thinking the best I can do is fix the bad
> > comment in this patch.
>
> Just at the first very swift glance:
> 1. See attached patch.
> 2. Exporting stmmac_init_tstamp_counter() isn't necessary. It doesn't
> seem like being utilized anywhere except in the stmmac_main.c module.
> 3. stmmac_hwtimestamp-based abstraction seems redundant since: just a
> single PTP implementation is provided; DW GMAC, DW XGMAC and DW QoS
> Eth PTP implementations don't seem like very much different (XGMAC and
> QoS Eth seems to have some additional features but the basics looks
> the same). Moreover developing a HW-abstraction without having all the
> IP-core databooks at hands and having at least two different engines
> description seems like a needless over-complication of the code. I
> have doubts it was possible to create a comprehensive enough
> sub-module to be suitable for the real and any other not yet known PTP
> engine.)
> 4. For the same reason as 2. splitting up the PTP support into two
> files seems redundant. stmmac_hwtstamp.c content can be moved to
> stmmac_ptp.c .
> 5. ...
>
> 3 and 5 imply bulky and delicate work which I would have attempted
> only after much deeper PTP engine studying in all the DW *MAC IP-cores
> (I might have missed something) and only having a real PTP-charged
> device at hands.
>

Thanks, this is a good list to brainstorm with. I agree 3 and 5 might
not be advisable for someone who doesn't have good mastery over the
hardware, etc.