Re: [PATCH v10 02/13] x86/insn-eval: Compute linear address in several utility functions

From: Ricardo Neri
Date: Fri Nov 03 2017 - 20:41:43 EST


On Fri, Nov 03, 2017 at 11:17:49AM +0100, Ingo Molnar wrote:
>
> * Ricardo Neri <ricardo.neri-calderon@xxxxxxxxxxxxxxx> wrote:
>
> > On Thu, Nov 02, 2017 at 09:51:08AM +0100, Ingo Molnar wrote:
> > >
> > > * Ricardo Neri <ricardo.neri-calderon@xxxxxxxxxxxxxxx> wrote:
> > >
> > > > + /*
> > > > + * -EDOM means that we must ignore the address_offset. In such a case,
> > > > + * in 64-bit mode the effective address relative to the RIP of the
> > > > + * following instruction.
> > > > + */
> > > > + if (*regoff == -EDOM) {
> > > > + if (user_64bit_mode(regs))
> > > > + tmp = (long)regs->ip + insn->length;
> > > > + else
> > > > + tmp = 0;
> > > > + } else if (*regoff < 0) {
> > > > + return -EINVAL;
> > > > + } else {
> > > > + tmp = (long)regs_get_register(regs, *regoff);
> > > > + }
> > >
> > > > + else
> > > > + indx = (long)regs_get_register(regs, indx_offset);
> > >
> > > This and subsequent patches include a disgustly insane amount of type casts - why?
> > >
> > > For example here 'tmp' is 'long', while regs_get_register() returns
> > > 'unsigned long', but no type cast is necessary for that.
> > >
> > > > + ret = get_eff_addr_modrm(insn, regs, &addr_offset,
> > > > + &eff_addr);
> >
> > One of the goals of this series is to have the ability to compute 16-bit, 32-bit
> > and 64-bit addresses. I put lost of casts, between signed and unsigned types,
> > between 64-bit and 32-bit and 16-bit casts. After seeing your comment I have gone
> > through the code and I have removed most of the casts. Instead I will use masks.
> > I will also inspect the resulting assembly code to make sure the arithmetic is
> > performed in the address size pertinent to each case.
>
> Well, casts are probably fine when the goal is to zero out high bits

I was able to remove the majority of casts and used masks. I see many other parts
of Linux doing similarly. For instance, in arch/x86/kernel/kexec-bzimage64.c I see

params->hdr.ramdisk_image = initrd_load_addr & 0xffffffffUL;

ramdisk_image is of type __u32 while initrd_load_addr is of type unsigned long.

I guess that in this example doing

params->hdr.ramdisk_image = (__u32)(initrd_load_addr & 0xffffffffUL);

would be redundant? The mask would indicate better what is going on.

> but the ones I quoted converted types of the same with.

I made sure I removed these.

>
> For register values it would also probably be cleaner to use the u8, u16, u32 and
> u64 types instead of char/short/int/long - this goes hand in hand with how the
> instructions are documented in the SDMs.

In the rest of the functions I have used char/short/int/long. Would it be OK to have
a mixture of type styles? Perhaps I can rewrite only the functions that deal with
variables of different width.

Plus, one more advantage of using char/short/int/long is that when building a 32-bit
kernel long will be a 32-bit type. Thus, all the aritmetic would be naturally done
with variables of the appropriate width. Perhaps I could use u8/u16/u32/long? It
looks white odd, though.

Thanks and BR,
Ricardo