Re: [cpuops cmpxchg double V2 1/4] Generic support forthis_cpu_cmpxchg_double

From: Mathieu Desnoyers
Date: Fri Jan 07 2011 - 13:04:29 EST

Next message: Sedat Dilek: "Re: linux-next: build failure after merge of the block tree"
Previous message: Iain Paton: "Re: 2.6.37 vlans on bnx2 not functional, panic with tcpdump"
In reply to: Christoph Lameter: "Re: [cpuops cmpxchg double V2 1/4] Generic support forthis_cpu_cmpxchg_double"
Next in thread: Christoph Lameter: "Re: [cpuops cmpxchg double V2 1/4] Generic support forthis_cpu_cmpxchg_double"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

* Christoph Lameter (cl@xxxxxxxxx) wrote:
> On Thu, 6 Jan 2011, H. Peter Anvin wrote:
>
> > On 01/06/2011 12:45 PM, Christoph Lameter wrote:
> > > Introduce this_cpu_cmpxchg_double. this_cpu_cmpxchg_double() allows the
> > > comparision between two consecutive words and replaces them if there is
> > > a match.
> > >
> > > bool this_cpu_cmpxchg_double(pcp1, pcp2,
> > > old_word1, old_word2, new_word1, new_word2)
> > >
> > > this_cpu_cmpxchg_double does not return the old value (difficult since
> > > there are two words) but a boolean indicating if the operation was
> > > successful.
> > >
> > > The first percpu variable must be double word aligned!
> >
> > I really truly hate this interface. The whole notion of passing two
> > pointers where only one is really used, is just painful.
>
> Well the second cpu variable is just there for correctness checking. That
> way we can verify that the types are compatible for the second word and
> that the second word was properly placed relative to the first one. It
> also helps the reader in the source because it shows explicitly which two
> words are modified by the operation.
>
> If you look at the prior patch series at the use case you will see that
> s->cpu_slab->tid was not referred to in the cmpxchg operation but
> implicitly modified. That is bad and was the initial motivation to require
> both to be specified.
>
> During debugging I had a couple of problems with the way the compiler
> placed those two fields that only became evident after painful debugging
> sessions. With the checks possible only because both are specifiec there
> is an explicit failure if either of the two cpu variables is specified in
> such a way that the cmpxchg32b cannot work.

I have to admit that I also hate the current interface for two reasons:

a) the way it splits the target address in two.

b) the loss of the value read (the fact that the only current user of this API
does not need the value returned seems like a very weak argument to define an
API).

I'm probably missing important compiler trickiness here, but why are we not
rather relying on forced compiler alignment to declare the target address type ?
e.g.:

typedef struct {
unsigned long v[2];
} __attribute__((aligned(2 * sizeof(long)))) cmpxchg_double_t;

For the usage you are doing within the allocator code, you can use an union (the
bitfield sizes are completely arbitrary).

union alloc_cd {
cmpxchg_double_t cd;
struct {
void *ptr;
unsigned long seq:48;
unsigned long cpu:16;
} v;
};

So this_cpu_cmpxchg_double would expect the type "cmpxchg_double_t *" as its
first argument.

One way to manage to return the value read would be to pass a pointer to
"old_word1-2" as argument rather than the value itself, and return the value
read in these locations. Anyway, we won't need the old words for comparison with
the return value, because the returned "bool" takes care of telling us if they
differ.

Adding the types just for clarity sake (should be handled differently in your
macro scheme of course), this_cpu_cmpxchg_double would look like:

typedef struct {
unsigned long v[2];
} __attribute__((aligned(2 * sizeof(long)))) cmpxchg_double_t;

bool this_cpu_cmpxchg_double(cmpxchg_double_t *pcp,
unsigned long *old_ret_word1,
unsigned long *old_ret_word2,
unsigned long new_word1,
unsigned long new_word2)

Thoughts ?

Thanks,

Mathieu

--
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

Next message: Sedat Dilek: "Re: linux-next: build failure after merge of the block tree"
Previous message: Iain Paton: "Re: 2.6.37 vlans on bnx2 not functional, panic with tcpdump"
In reply to: Christoph Lameter: "Re: [cpuops cmpxchg double V2 1/4] Generic support forthis_cpu_cmpxchg_double"
Next in thread: Christoph Lameter: "Re: [cpuops cmpxchg double V2 1/4] Generic support forthis_cpu_cmpxchg_double"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]