[PATCH v2] clarify usage expectations for cnt32_to_63()

From: Nicolas Pitre
Date: Sun Nov 09 2008 - 00:28:22 EST


Currently, all existing users of cnt32_to_63() are fine since the CPU
architectures where it is used don't do read access reordering, and user
mode preemption is disabled already. It is nevertheless a good idea to
better elaborate usage requirements wrt preemption, and use an explicit
memory barrier on SMP to avoid different CPUs accessing the counter
value in the wrong order. On UP a simple compiler barrier is
sufficient.

Signed-off-by: Nicolas Pitre <nico@xxxxxxxxxxx>
---

On Sun, 9 Nov 2008, Nicolas Pitre wrote:

> On Sat, 8 Nov 2008, Nicolas Pitre wrote:
>
> > On Sat, 8 Nov 2008, Mathieu Desnoyers wrote:
> >
> > > I *think* that smp_rmb() would be enough, supposing the access to memory
> > > is done in program order wrt local interrupts in UP. This is basically
> > > Steven's question, which has not received any clear answer yet. I'd like
> > > to know what others think about it.
> >
> > In the mean time a pure rmb() is the safest thing to do now. Once we
> > can convince ourselves that out-of-order reads are always rolled back
> > upon the arrival of an interrupt then this could be relaxed.
>
> After thinking about it some more, a smp_rmb() must be correct.
>
> On UP, that would be completely insane if an exception didn't resume
> the whole sequence since the CPU cannot presume anything when returning
> from it.
>
> If the instruction flows says:
>
> READ A
> READ B
>
> And speculative execution makes for B to be read before A, and you get
> an interrupt after B was read but before A was read, then the program
> counter may only point at READ A upon returning from the exception and B
> will be read again. Doing otherwise would require the CPU to remember
> any reordering that it might have performed upon every exceptions which
> is completely insane.
>
> So smp_rmb() it shall be.

diff --git a/include/linux/cnt32_to_63.h b/include/linux/cnt32_to_63.h
index 8c0f950..7605fdd 100644
--- a/include/linux/cnt32_to_63.h
+++ b/include/linux/cnt32_to_63.h
@@ -16,6 +16,7 @@
#include <linux/compiler.h>
#include <linux/types.h>
#include <asm/byteorder.h>
+#include <asm/system.h>

/* this is used only to give gcc a clue about good code generation */
union cnt32_to_63 {
@@ -53,11 +54,19 @@ union cnt32_to_63 {
* needed increment. And any race in updating the value in memory is harmless
* as the same value would simply be stored more than once.
*
- * The only restriction for the algorithm to work properly is that this
- * code must be executed at least once per each half period of the 32-bit
- * counter to properly update the state bit in memory. This is usually not a
- * problem in practice, but if it is then a kernel timer could be scheduled
- * to manage for this code to be executed often enough.
+ * The restrictions for the algorithm to work properly are:
+ *
+ * 1) this code must be called at least once per each half period of the
+ * 32-bit counter;
+ *
+ * 2) this code must not be preempted for a duration longer than the
+ * 32-bit counter half period minus the longest period between two
+ * calls to this code.
+ *
+ * Those requirements ensure proper update to the state bit in memory.
+ * This is usually not a problem in practice, but if it is then a kernel
+ * timer should be scheduled to manage for this code to be executed often
+ * enough.
*
* Note that the top bit (bit 63) in the returned value should be considered
* as garbage. It is not cleared here because callers are likely to use a
@@ -68,9 +77,10 @@ union cnt32_to_63 {
*/
#define cnt32_to_63(cnt_lo) \
({ \
- static volatile u32 __m_cnt_hi; \
+ static u32 __m_cnt_hi; \
union cnt32_to_63 __x; \
__x.hi = __m_cnt_hi; \
+ smp_rmb(); \
__x.lo = (cnt_lo); \
if (unlikely((s32)(__x.hi ^ __x.lo) < 0)) \
__m_cnt_hi = __x.hi = (__x.hi ^ 0x80000000) + (__x.hi >> 31); \
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/