Re: [PATCH] x86/msr: do not warn on writes to OC_MAILBOX

From: Srinivas Pandruvada
Date: Tue Oct 20 2020 - 13:22:07 EST


On Mon, 2020-10-19 at 19:15 +0200, Borislav Petkov wrote:
> On Tue, Sep 08, 2020 at 06:02:05PM -0700, Srinivas Pandruvada wrote:
> > The actual OC mailbox implementation itself is implemented in Linux
> > in
> > intel_turbo_max_3 driver. So that is public.
> > So someone can develop a driver and provide some sysfs to send
> > mailbox
> > commands, but kernel can't validate commands which can cause any
> > security or stability issues. Not sure if this is acceptable
> > standard.
> > I don't think there is any precedent of creating such blind sysfs
> > entries.
>
> So we don't need to validate those commands - we can issue a
> pr_warn_once() when something pokes at that to say that issuing those
> commands is dangerous.
>
> For example, from looking at
>
> drivers/platform/x86/intel_turbo_max_3.c::get_oc_core_priority()
>
> we should at least provide a well-defined interface to at least
> synchronize access to that MSR with the kernel. And then maybe allow
> a
> well-defined set of commands or better yet, we do them ourselves.
> Here's
> what I mean:
>
> Looking at the code in intel-undervolt:
>
> bool undervolt(struct config_t * config, bool * nl, bool write) {
> bool success = true;
> bool nll = false;
> int i;
>
> for (i = 0; config->undervolts && i < config->undervolts-
> >count; i++) {
> struct undervolt_t * undervolt = array_get(config-
> >undervolts, i);
>
> static const int mask = 0x800;
> uint64_t uvint = ((uint64_t) (mask - absf(undervolt-
> >value) * 1.024f +
> 0.5f) << 21) & 0xffffffff;
> uint64_t rdval = 0x8000001000000000 |
> ((uint64_t) undervolt->index << 40);
> uint64_t wrval = rdval | 0x100000000 | uvint;
>
> bool write_success = !write ||
> wr(config, MSR_ADDR_VOLTAGE, wrval);
> bool read_success = write_success &&
> wr(config, MSR_ADDR_VOLTAGE, rdval) &&
> rd(config, MSR_ADDR_VOLTAGE, rdval);
>
>
> That MSR_ADDR_VOLTAGE is 0x150, i.e., MSR_OC_MAILBOX.
>
> Trying to decipher the MSR accesses, it looks like it does the write
> with:
>
> 0x8000001000000000 | (0xf << 40) | (0x3 << 21) | 0x100000000
>
> and I've made the uvint 0x3 so that I can see the two 11s in the
> bitfield below.
>
> The undervolt index I made 0xffff for a similar reason:
>
> And the result is:
>
> Hex: 0x80000f1100600000 Dec: 9.223.388.602.549.927.936
> 31 27 23 19 15 11 7 3 31 27 23 19 15 11
> 7 3
> 1000_0000_0000_0000_0000_1111_0001_0001_0000_0000_0110_0000_0000_0000
> _0000_0000
> 63 59 55 51 47 43 39 35 31 27 23 19 15 11
> 7 3
>
> With
>
> - bit 63: MSR_OC_MAILBOX_BUSY_BIT
>
> - [47?:40]: that's some index, undervolting index, who knows. I'm
> assuming this is
> a byte, thus the 47?.
>
>
> - [39?:32]: cmd, in this case, 0x11, gonna assume that the command is
> bits [39:32]
> looking how this is a byte too:
>
> #define OC_MAILBOX_FC_CONTROL_CMD 0x1C
>
> and
>
> - [31:21]: the undervolt value
>
> The second write does:
>
> 0x8000001000000000 | (0xf << 40)
> Hex: 0x80000f1000000000 Dec: 9.223.388.598.248.669.184
> 31 27 23 19 15 11 7 3 31 27 23 19 15 11
> 7 3
> 1000_0000_0000_0000_0000_1111_0001_0000_0000_0000_0000_0000_0000_0000
> _0000_0000
> 63 59 55 51 47 43 39 35 31 27 23 19 15 11
> 7 3
>
> - bit 63: MSR_OC_MAILBOX_BUSY_BIT
> - [47:40] index
> - [39:32] cmd - 0x10
>
> All from only staring at this anyway - could very well be wrong.
>
These command id are model specific. There is no guarantee that even
meaning changes. So I don't think we should write any code in kernel
which can't stick.


> In any case, my point is that we could have a sysfs interface for
> those userspace-suppliable values like the undervolt value at
> [31:21],
> dunno if the index can be inferred by the kernel automatically or
> enumerated and the commands we should issue ourselves depending on
> the
> functionality, etc.
>
> And put all that in drivers/platform/x86/intel_turbo_max_3.c instead
> of
> leaving userspace to poke at it.
>
May be something like this:
- Separate mailbox stuff from intel_turbo_max_3.c
- Create a standalone module which creates a debugfs interface
- This debugs interface takes one 64 bit value from user space and use
protocol to avoid contention
- Warns users on writes via new interfaces you suggested above

> Thoughts?
>
> Btw, intel-undervolt pokes all in all at:
>
> #define MSR_ADDR_TEMPERATURE 0x1a2
Need to check use case for undervolt.

> #define MSR_ADDR_UNITS 0x606
Why not reuse powercap rapl interface. That interface will take care of
units.

> #define MSR_ADDR_VOLTAGE 0x150
This will not be needed once we expose the above debugfs interface.
This is the OC mailbox MSR.

Thanks,
Srinivas

>
> and those should probably be exposed too.
>
> The temperature target one is read at least by this too:
>
> https://py3status.readthedocs.io/en/latest/modules.html
>
> but at least that MSR is documented so exposing it is trivial.
>
> Thx.
>