Re: [PATCH 16/19] x86, mpx: support 32-bit binaries on 64-bit kernel

From: Thomas Gleixner
Date: Mon May 18 2015 - 17:53:47 EST


On Fri, 8 May 2015, Dave Hansen wrote:

>
> From: Dave Hansen <dave.hansen@xxxxxxxxxxxxxxx>
>
> Changes from v20:
>
> * Fix macro confusion between BD and BT
> * Add accessor for bt_entry_size_bytes()

Forgot to say this earlier. Please put the changes after the changelog
itself, i.e. after the '---' separator.
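
As a purely illustrative layout (everything above the '---' becomes the
commit message; the version notes below it are dropped when the patch
is applied):

    x86, mpx: support 32-bit binaries on 64-bit kernel

    <changelog that should end up in the commit message>

    Signed-off-by: Dave Hansen <dave.hansen@xxxxxxxxxxxxxxx>
    ---
    Changes from v20:
     * Fix macro confusion between BD and BT
     * Add accessor for bt_entry_size_bytes()

    <diffstat and diff>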

> -#ifdef CONFIG_X86_64
> -
> -/* upper 28 bits [47:20] of the virtual address in 64-bit used to
> - * index into bounds directory (BD).
> +/*
> + * The upper 28 bits [47:20] of the virtual address in 64-bit
> + * are used to index into bounds directory (BD).
> + *
> + * The directory is 2G (2^31) in size, and with 8-byte entries
> + * it has 2^28 entries.
> */
> -#define MPX_BD_ENTRY_OFFSET 28
> -#define MPX_BD_ENTRY_SHIFT 3
> -/* bits [19:3] of the virtual address in 64-bit used to index into
> - * bounds table (BT).
> +#define MPX_BD_SIZE_BYTES_64 (1UL<<31)
> +/* An entry is a long, so 8 bytes and a shift of 3 */

I can see the 8 bytes, but where is the shift constant?
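
If the shift is worth mentioning at all, one way to make it explicit
and keep the two in sync (sketch only; the _SHIFT_ name is not in the
posted patch):

    /* An entry is a long: shift of 3, i.e. 8 bytes */
    #define MPX_BD_ENTRY_SHIFT_64	3
    #define MPX_BD_ENTRY_BYTES_64	(1UL << MPX_BD_ENTRY_SHIFT_64)
    #define MPX_BD_NR_ENTRIES_64	(MPX_BD_SIZE_BYTES_64 >> MPX_BD_ENTRY_SHIFT_64)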

> +#define MPX_BD_ENTRY_BYTES_64 8
> +#define MPX_BD_NR_ENTRIES_64 (MPX_BD_SIZE_BYTES_64/MPX_BD_ENTRY_BYTES_64)

> +
> +/*
> + * The 32-bit directory is 4MB (2^22) in size, and with 4-byte
> + * entries it has 2^20 entries.
> */
> +#define MPX_BD_SIZE_BYTES_32 (1UL<<22)
> +/* An entry is a long, so 4 bytes and a shift of 2 */

Ditto.

> +#define MPX_BD_ENTRY_BYTES_32 4
> +#define MPX_BD_NR_ENTRIES_32 (MPX_BD_SIZE_BYTES_32/MPX_BD_ENTRY_BYTES_32)

Otherwise this macro zoo looks good.

> +static inline int bt_entry_size_bytes(struct mm_struct *mm)
> +{
> + if (is_64bit_mm(mm))
> + return MPX_BT_ENTRY_BYTES_64;
> + else
> + return MPX_BT_ENTRY_BYTES_32;
> +}
> +
> +/*
> + * Take a virtual address and turns it in to the offset in bytes
> + * inside of the bounds table where the bounds table entry
> + * controlling 'addr' can be found.
> + */
> +static unsigned long mpx_get_bt_entry_offset_bytes(struct mm_struct *mm,
> + unsigned long addr)
> +{
> + unsigned long bt_table_nr_entries;
> + unsigned long offset = addr;
> +
> + if (is_64bit_mm(mm)) {
> + /* Bottom 3 bits are ignored on 64-bit */
> + offset >>= 3;
> + bt_table_nr_entries = MPX_BT_NR_ENTRIES_64;
> + } else {
> + /* Bottom 2 bits are ignored on 32-bit */
> + offset >>= 2;
> + bt_table_nr_entries = MPX_BT_NR_ENTRIES_32;
> + }
> + /*
> + * We know the size of the table in to which we are
> + * indexing, and we have eliminated all the low bits
> + * which are ignored for indexing.
> + *
> + * Mask out all the high bits which we do not need
> + * to index in to the table.
> + */
> + offset &= (bt_table_nr_entries-1);

.... entries - 1);

And you might explain why nr_entries - 1 is a proper mask,
i.e. nr_entries is always a power of 2.
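
To make that concrete: the *_NR_ENTRIES_* constants are a power-of-2
table size divided by a power-of-2 entry size (e.g. the 64-bit
directory, 2^31 / 8 = 2^28 entries), so nr_entries - 1 is a contiguous
mask of the low index bits. A comment along these lines would do
(sketch only):

    /*
     * The tables always have a power-of-2 number of entries, so
     * nr_entries - 1 is a contiguous mask of the low index bits.
     */
    offset &= (bt_table_nr_entries - 1);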

> + /*
> + * We now have an entry offset in terms of *entries* in
> + * the table. We need to scale it back up to bytes.
> + */
> + offset *= bt_entry_size_bytes(mm);

You could store the scale value in the if () construct above, but I
guess the compiler can figure that out as well :)
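
For illustration, the variant being suggested (the entry_size_bytes
local is made up here; the constants are the ones from the patch):

    unsigned long bt_table_nr_entries;
    unsigned long entry_size_bytes;
    unsigned long offset = addr;

    if (is_64bit_mm(mm)) {
            /* Bottom 3 bits are ignored on 64-bit */
            offset >>= 3;
            bt_table_nr_entries = MPX_BT_NR_ENTRIES_64;
            entry_size_bytes = MPX_BT_ENTRY_BYTES_64;
    } else {
            /* Bottom 2 bits are ignored on 32-bit */
            offset >>= 2;
            bt_table_nr_entries = MPX_BT_NR_ENTRIES_32;
            entry_size_bytes = MPX_BT_ENTRY_BYTES_32;
    }
    offset &= (bt_table_nr_entries - 1);
    offset *= entry_size_bytes;
    return offset;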

> + return offset;
> +}
> +
> +/*
> + * Total size of the process's virtual address space
> + * Use a u64 because 4GB (for 32-bit) won't fit in a long.
> + *
> + * __VIRTUAL_MASK does not work here. It only covers the
> + * user address space and the tables cover the *entire*
> + * virtual address space supported on the CPU.
> + */
> +static inline unsigned long long mm_virt_space(struct mm_struct *mm)
> +{
> + if (is_64bit_mm(mm))
> + return 1ULL << 48;

cpu_info->x86_virt_bits will tell you the proper value

> + else
> + return 1ULL << 32;

And for a 32-bit kernel, 32 might be wrong because with PAE you have 36
bits.
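
A sketch of the 64-bit side of that (illustrative only; whether the
32-bit case should stay at 1ULL << 32 is exactly the open question
above):

    static inline unsigned long long mm_virt_space(struct mm_struct *mm)
    {
            /* x86_virt_bits is the CPU-reported virtual address width */
            if (is_64bit_mm(mm))
                    return 1ULL << boot_cpu_data.x86_virt_bits;
            else
                    return 1ULL << 32;
    }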

> +/*
> + * How much virtual address space does a single bounds
> + * directory entry cover?
> + */
> +static inline unsigned long bd_entry_virt_space(struct mm_struct *mm)
> +{
> + if (is_64bit_mm(mm))
> + return mm_virt_space(mm) / MPX_BD_NR_ENTRIES_64;
> + else
> + return mm_virt_space(mm) / MPX_BD_NR_ENTRIES_32;
> +}
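
(With the constants quoted above this works out to 2^48 / 2^28 = 2^20,
i.e. 1MB of virtual address space per 64-bit BD entry, and
2^32 / 2^20 = 2^12 = 4KB per 32-bit entry, assuming the address-space
sizes hard-coded in mm_virt_space().)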
> +
> +/*
> + * Return an offset in terms of bytes in to the bounds
> + * directory where the bounds directory entry for a given
> + * virtual address resides.
> + *
> + * This has to be in bytes because the directory entries
> + * are different sizes on 64/32 bit.
> + */
> +static unsigned long mpx_get_bd_entry_offset(struct mm_struct *mm,
> + unsigned long addr)
> +{
> + /*
> + * There are several ways to derive the bd offsets. We
> + * use the following approach here:
> + * 1. We know the size of the virtual address space
> + * 2. We know the number of entries in a bounds table
> + * 3. We know that each entry covers a fixed amount of
> + * virtual address space.
> + * So, we can just divide the virtual address by the
> + * virtual space used by one entry to determine which
> + * entry "controls" the given virtual address.
> + */
> + if (is_64bit_mm(mm)) {
> + int bd_entry_size = 8; /* 64-bit pointer */
> + /*
> + * Take the 64-bit addressing hole in to account.
> + * This is a noop on 32-bit since it has no hole.

But a 32-bit kernel will not take this code path because
is_64bit_mm(mm) evaluates to false.

> + */
> + addr &= ~(mm_virt_space(mm) - 1);
> + return (addr / bd_entry_virt_space(mm)) * bd_entry_size;
> + } else {
> + int bd_entry_size = 4; /* 32-bit pointer */
> + return (addr / bd_entry_virt_space(mm)) * bd_entry_size;
> + }
> + /*
> + * The two return calls above are exact copies. If we
> + * pull out a single copy and put it in here, gcc won't
> + * realize that we're doing a power-of-2 divide and use
> + * shifts. It uses a real divide. If we put them up
> + * there, it manages to figure it out (gcc 4.8.3).

Can't we provide the shift values from bd_entry_virt_space() so we
don't have to worry about gcc versions being more or less clever?
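
For illustration, something like this (the _shift() helper name is made
up; 48/32 follow mm_virt_space() and 28/20 are log2 of the
MPX_BD_NR_ENTRIES_* constants, so the divides become explicit shifts
and the address-hole masking above stays as-is):

    static inline int bd_entry_virt_space_shift(struct mm_struct *mm)
    {
            if (is_64bit_mm(mm))
                    return 48 - 28;	/* 1MB of virt space per BD entry */
            else
                    return 32 - 20;	/* 4KB of virt space per BD entry */
    }

    ...
            return (addr >> bd_entry_virt_space_shift(mm)) * bd_entry_size;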

Thanks,

tglx