RE: [PATCH 1/2] RISC-V: Probe for unaligned access speed

From: David Laight
Date: Fri Jun 30 2023 - 04:30:08 EST


...
> Yeah, one thing I could do is disable interrupts, measure the cycle
> count of doing an individual iteration, do this N times, and take the
> minimum value as the time to compare. In the end I'll then have two
> numbers to compare, like I do in this patch. In theory the variance on
> that should be really tight. N will have to depend on the overall
> amount of time I'm taking so as not to shut interrupts off for very
> long. Let me experiment with this and see how the results look.
> -Evan

I doubt you'll need many iterations or a long test.

You can do tests in userspace without disabling pre-emption
or interrupts - the large/silly values they generate are
easily ignored.

I suspect you'll get enough info from something like:
	unsigned long x[2];
	volatile unsigned long *p = (void *)((unsigned char *)x + 1);
	full_cpu_barrier();
	start = rdtsc();
	full_cpu_barrier();
	*p; *p; *p; *p; *p; *p; *p; *p;
	*p; *p; *p; *p; *p; *p; *p; *p;
	full_cpu_barrier();
	elapsed = rdtsc() - start;
Once the i-cache is loaded the result should be pretty constant.
For aligned addresses I'd expect each extra '*p' to cost
one more clock.
With hardware support for misaligned transfers it should be at
most 2 clocks each (test on x86 and it will be 1 clock).
The emulated version will take 100s or 1000s of clocks.

I'm not sure how much of a cpu barrier you need.
It definitely needs to wait for all the memory accesses
and the rdtsc() to complete.
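
For anyone wanting to try this from userspace, here is a minimal sketch of
the idea above, not a tested kernel patch. Everything named here is an
assumption made for illustration: read_counter(), time_loads() and
min_elapsed() are hypothetical helpers, full_cpu_barrier() is given one
possible definition (mfence + lfence on x86-64, a seq_cst fence elsewhere,
which may well be weaker than what is really needed), and non-x86 builds fall
back to clock_gettime(), so they report nanoseconds rather than clocks.
Dereferencing a misaligned pointer is formally undefined behaviour in C, and
on a machine with no misaligned support at all it may simply fault.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__x86_64__)
#include <x86intrin.h>
/* Assumed barrier: drain memory accesses, then stop later instructions
 * (including the counter read) from starting early. */
static inline void full_cpu_barrier(void) { _mm_mfence(); _mm_lfence(); }
static inline uint64_t read_counter(void) { return __rdtsc(); }
#else
#include <stdatomic.h>
#include <time.h>
/* Fallback for other architectures (assumes a POSIX system): a generic
 * fence and a nanosecond clock instead of a cycle counter. */
static inline void full_cpu_barrier(void)
{
	atomic_thread_fence(memory_order_seq_cst);
}
static inline uint64_t read_counter(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000000000u + (uint64_t)ts.tv_nsec;
}
#endif

/* Time 16 back-to-back volatile loads from addr. */
static uint64_t time_loads(const void *addr)
{
	const volatile unsigned long *p = addr;
	uint64_t start, elapsed;

	full_cpu_barrier();
	start = read_counter();
	full_cpu_barrier();
	(void)*p; (void)*p; (void)*p; (void)*p;
	(void)*p; (void)*p; (void)*p; (void)*p;
	(void)*p; (void)*p; (void)*p; (void)*p;
	(void)*p; (void)*p; (void)*p; (void)*p;
	full_cpu_barrier();
	elapsed = read_counter() - start;
	return elapsed;
}

/* Repeat the measurement and keep the smallest value, so that runs
 * disturbed by interrupts or pre-emption are ignored. offset is the byte
 * offset into an aligned buffer: 0 is aligned, 1 is misaligned. */
static uint64_t min_elapsed(size_t offset, int runs)
{
	static unsigned long x[2];
	uint64_t best = UINT64_MAX;

	assert(offset < sizeof(unsigned long) && runs > 0);
	for (int i = 0; i < runs; i++) {
		uint64_t t = time_loads((unsigned char *)x + offset);

		if (t < best)
			best = t;
	}
	return best;
}
```

Comparing min_elapsed(0, n) against min_elapsed(1, n) for a modest n gives
the two numbers to compare. Both include the fixed cost of the barriers and
counter reads, so it is the difference between them that matters.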

David
