RE: [PATCH v2] x86: modernize sync_bitops.h

From: David Laight
Date: Wed Nov 21 2018 - 10:53:03 EST




> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@xxxxxxxx]
> Sent: 21 November 2018 14:42
> To: David Laight
> Cc: mingo@xxxxxxx; tglx@xxxxxxxxxxxxx; Boris Ostrovsky; Juergen Gross; linux-kernel@xxxxxxxxxxxxxxx;
> hpa@xxxxxxxxx
> Subject: RE: [PATCH v2] x86: modernize sync_bitops.h
>
> >>> On 21.11.18 at 14:49, <David.Laight@xxxxxxxxxx> wrote:
> > From: Jan Beulich
> >> Sent: 21 November 2018 13:03
> >>
> >> >>> On 21.11.18 at 12:55, <David.Laight@xxxxxxxxxx> wrote:
> >> > From: Jan Beulich
> >> >> Sent: 21 November 2018 10:11
> >> >>
> >> >> Add missing insn suffixes and use rmwcc.h just like was (more or less)
> >> >> recently done for bitops.h as well.
> >> >
> >> > Why? bts (etc) on memory don't really have an 'operand size'.
> >>
> >> Of course they do - depending on operand size they operate on
> >> 2-, 4-, or 8-byte quantities. When the second operand is a
> >> register, the suffix is redundant (but doesn't hurt), but when
> >> the second operand is an immediate, the assembler (in AT&T
> >> syntax) has no way of knowing what operand size you mean.
> >
> > You need to RTFM.
>
> Excuse me? How about you look at this table from the SDM
> (format of course comes out better in the .pdf):
>
> 0F AB /r BTS r/m16, r16 MR Valid Valid Store selected bit in CF flag and set.
> 0F AB /r BTS r/m32, r32 MR Valid Valid Store selected bit in CF flag and set.
> REX.W + 0F AB /r BTS r/m64, r64 MR Valid N.E. Store selected bit in CF flag and set.
> 0F BA /5 ib BTS r/m16, imm8 MI Valid Valid Store selected bit in CF flag and set.
> 0F BA /5 ib BTS r/m32, imm8 MI Valid Valid Store selected bit in CF flag and set.
> REX.W + 0F BA /5 ib BTS r/m64, imm8 MI Valid N.E. Store selected bit in CF flag and set.
>
> Please read manuals yourself before making such statements.
>
> > Regardless of the 'operand size' the 'bit' instructions do a 32 bit aligned
> > 32 bit wide read/modify/write cycle.
> >
> > The 'operand size' does affect whether the bit number (which is signed)
> > comes from %cl (8 bits), %cx (16 bits), %rcx (32 bits) or (%ecx) 64 bits.
> > But that is implicit in the register name used.
>
> There is no form with %cl as operand. Instead there are forms with
> an immediate operand.

See also section 3.1.1.9 (in the copy of 325462.pdf I have):
(edited out references to the figures)

Bit(BitBase, BitOffset) â Returns the value of a bit within a bit string.
The bit string is a sequence of bits in memory or a register. Bits are
numbered from low-order to high-order within registers and within memory
bytes. If the BitBase is a register, the BitOffset can be in the range 0
to [15, 31, 63] depending on the mode and register size.

If BitBase is a memory address, the BitOffset can range has different ranges
depending on the operand size (see Table 3-2).

The addressed bit is numbered (Offset MOD 8) within the byte at address
(BitBase + (BitOffset DIV 8)) where DIV is signed division with rounding
towards negative infinity and MOD returns a positive number.

Table 3-2. Range of Bit Positions Specified by Bit Offset Operands
Operand Size Immediate BitOffset Register BitOffset
16 0 to 15 2**15 to 2**15 â 1
32 0 to 31 2**31 to 2**31 â 1
64 0 to 63 2**63 to 2**63 â 1

For Bit test it says:

When accessing a bit in memory, the processor may access 4 bytes starting from
the memory address for a 32-bit operand size, using by the following relationship:
Effective Address + (4 â (BitOffset DIV 32))
Or, it may access 2 bytes starting from the memory address for a 16-bit operand, using this relationship:
Effective Address + (2 â (BitOffset DIV 16))
It may do so even when only a single byte needs to be accessed to reach the given bit.

(That is the text I remember being in the old docs, but I don't think it
used to refer to the address being aligned.
My 8086, 286 and 386 books are all at home.)

The other instructions (eg btc) say:

If the bit base operand specifies a memory location, the operand represents the
address of the byte in memory that contains the bit base (bit 0 of the specified byte)
of the bit string. The range of the bit position that can be
referenced by the offset operand depends on the operand size.

So the 'operand size' gives the size of the bit offset, not the size of
the memory cycle.

Maybe I'll run some instructions against a PCIe slave and monitor the TLPs.

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)