Re: [RFC v2-fix-v1 3/3] x86/tdx: Handle port I/O

From: Kuppuswamy, Sathyanarayanan
Date: Sat Jun 05 2021 - 16:12:10 EST




On 6/5/21 11:52 AM, Dan Williams wrote:
On Wed, May 26, 2021 at 9:24 PM Kuppuswamy Sathyanarayanan
<sathyanarayanan.kuppuswamy@xxxxxxxxxxxxxxx> wrote:

From: "Kirill A. Shutemov" <kirill.shutemov@xxxxxxxxxxxxxxx>

TDX hypervisors cannot emulate instructions directly. This
includes port IO which is normally emulated in the hypervisor.
All port IO instructions inside TDX trigger the #VE exception
in the guest and would be normally emulated there.

For the really early code in the decompressor, #VE cannot be
used because the IDT needed for handling the exception is not
set-up, and some other infrastructure needed by the handler
is missing. So to support port IO in decompressor code, add
support for paravirt based I/O port virtualization.

Also string I/O is not supported in TDX guest. So, unroll the
string I/O operation into a loop operating on one element at
a time. This method is similar to AMD SEV, so just extend the
support for TDX guest platform.
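
For readers following along, the string I/O unrolling described above can be sketched roughly as follows. This is only an illustration, not the code from the patch: port_out8(), port_outs8() and the fake_port_log buffer are hypothetical names, and the single-element write just records the byte where a real guest would issue a hypercall.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical log standing in for the device behind the port. */
static uint8_t fake_port_log[64];
static size_t fake_port_len;

/*
 * Hypothetical single-element port write. A real TDX guest would
 * issue one hypercall here; this sketch only records the byte.
 */
static void port_out8(uint16_t port, uint8_t value)
{
	(void)port;
	if (fake_port_len < sizeof(fake_port_log))
		fake_port_log[fake_port_len++] = value;
}

/*
 * Unrolled replacement for a "rep outsb" style string write:
 * one element is written per loop iteration.
 */
static void port_outs8(uint16_t port, const void *buf, size_t count)
{
	const uint8_t *p = buf;

	assert(buf != NULL || count == 0);

	while (count--)
		port_out8(port, *p++);
}
```

Calling port_outs8(0x3f8, "abc", 3) in this sketch results in three separate port_out8() calls, one per byte.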

Given early port IO is broken out in its own previous patch, I think it makes
sense to break out the decompressor port IO enabling from final
runtime port IO support.

The patch titled "x86/tdx: Handle early IO operations" mainly adds
IO #VE support in the early exception handler. Decompression code IO
support does not depend on it. Do you still think it is
better to move it to that patch?


The argument in the previous patch about using #VE emulation in the
early code was collisions with trace and printk support in the "fully
featured" #VE handler later in the series. My interpretation of that
collision was due to the possibility of the #VE handler going into
infinite recursion if a printk in the handler triggered port IO. It

No. AFAIK, it has nothing to do with infinite recursion. We are just
highlighting the fact that when the kernel uses the early exception
handler, we cannot use a code path that enables tracing support. So we
use the simplest way to trigger IO hypercalls.

if (early #VE exception path)
    handle_io_ve()
        __tdx_hypercall

if (normal #VE path)
    handle_io_ve()
        __tdx_hypercall (current version)
        // Later on when adding tracing support, we will replace it
        // with trace hypercalls.
        __trace_tdx_hypercall

As you can see in the above design flow, later on when adding tracing
support we will have to split the early IO #VE handling code from the
normal IO handling code. So instead of using common code now and
refactoring it later, we just use a different code path for each
of them.
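
A rough sketch of that split, under stated assumptions: every name below (hypercall_plain(), hypercall_traced(), handle_io_early(), handle_io_runtime() and the counters) is hypothetical and only models the idea that the early path never touches the traced variant. It is not the kernel implementation.

```c
#include <assert.h>

/* Hypothetical counters so the two paths can be told apart. */
static int plain_calls;
static int traced_calls;

/* Stand-in for the plain hypercall: no tracing, safe to use early. */
static int hypercall_plain(int port)
{
	plain_calls++;
	return port;
}

/* Stand-in for a traced hypercall: trace hooks would wrap the call. */
static int hypercall_traced(int port)
{
	traced_calls++;
	return hypercall_plain(port);
}

/* Early #VE path: only ever uses the plain variant. */
static int handle_io_early(int port)
{
	assert(port >= 0 && port <= 0xffff);
	return hypercall_plain(port);
}

/* Normal #VE path: free to switch to the traced variant later. */
static int handle_io_runtime(int port)
{
	assert(port >= 0 && port <= 0xffff);
	return hypercall_traced(port);
}
```

Keeping the two handlers separate from the start means the early one does not need to change when tracing is added to the runtime one.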

seems I do not have the right picture of the constraints. Given the
runtime kernel can directly replace in/out macros I would expect a
statement of the tradeoff with #VE emulation and why the post
decompressor code is still using emulation.

Currently, decompression code cannot use #VE based IO emulation because it
does not know how to handle #VE exceptions. Also, it is much easier to
replace IO calls with TDX hypercalls in decompression code than to
teach the decompression code how to handle #VE exceptions.
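
To illustrate what replacing IO calls with hypercalls could look like, here is a minimal sketch. All sketch_* names are hypothetical, the hypercall and the native access are faked so the example is self-contained, and only the "return all 1s on failure" behaviour is taken from the tdg_in() helper in the patch.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Hypothetical knobs that drive the fake environment. */
static bool sketch_tdx_guest;		/* pretend is_tdx_guest() result */
static int sketch_hcall_err;		/* nonzero simulates a failed hypercall */
static unsigned int sketch_hcall_val;	/* value a successful hypercall returns */

/* Fake hypercall: returns 0 and fills *val, or returns the error. */
static int sketch_hypercall_in(int size, int port, unsigned int *val)
{
	(void)size;
	(void)port;
	if (sketch_hcall_err)
		return sketch_hcall_err;
	*val = sketch_hcall_val;
	return 0;
}

/* Fake native port read, used when not running as a TDX guest. */
static unsigned int sketch_native_in(int size, int port)
{
	(void)size;
	(void)port;
	return 0x5a;
}

/* Port read wrapper: hypercall in a TDX guest, native access otherwise. */
static unsigned int sketch_in(int size, int port)
{
	unsigned int val = 0;

	assert(size == 1 || size == 2 || size == 4);

	if (!sketch_tdx_guest)
		return sketch_native_in(size, port);

	/* Mirror the patch's convention: all 1s on failure. */
	if (sketch_hypercall_in(size, port, &val))
		return UINT_MAX;

	return val;
}
```

No exception handler is involved anywhere in this flow, which is the property the decompressor needs.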



Co-developed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@xxxxxxxxxxxxxxx>
Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@xxxxxxxxxxxxxxx>
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@xxxxxxxxxxxxxxx>
Reviewed-by: Andi Kleen <ak@xxxxxxxxxxxxxxx>
---
arch/x86/boot/compressed/Makefile | 1 +
arch/x86/boot/compressed/tdcall.S | 3 ++
arch/x86/boot/compressed/tdx.c | 28 ++++++++++++++++++
arch/x86/include/asm/io.h | 7 +++--
arch/x86/include/asm/tdx.h | 47 ++++++++++++++++++++++++++++++-
arch/x86/kernel/tdx.c | 39 +++++++++++++++++++++++++
6 files changed, 122 insertions(+), 3 deletions(-)
create mode 100644 arch/x86/boot/compressed/tdcall.S

diff --git a/arch/x86/boot/compressed/Makefile b/arch/x86/boot/compressed/Makefile
index a2554621cefe..a944a2038797 100644
--- a/arch/x86/boot/compressed/Makefile
+++ b/arch/x86/boot/compressed/Makefile
@@ -97,6 +97,7 @@ endif


static int __ro_after_init tdx_guest = -1;

@@ -30,3 +32,29 @@ bool is_tdx_guest(void)
return !!tdx_guest;
}

+/*
+ * Helper function used for making hypercall for "out"
+ * instruction. It will be called from __out IO
+ * macro (in tdx.h).
+ */
+void tdg_out(int size, int port, unsigned int value)
+{
+	__tdx_hypercall(EXIT_REASON_IO_INSTRUCTION, size, 1,
+			port, value, NULL);
+}
+
+/*
+ * Helper function used for making hypercall for "in"
+ * instruction. It will be called from __in IO macro
+ * (in tdx.h). If IO is failed, it will return all 1s.
+ */
+unsigned int tdg_in(int size, int port)
+{
+	struct tdx_hypercall_output out = {0};
+	int err;
+
+	err = __tdx_hypercall(EXIT_REASON_IO_INSTRUCTION, size, 0,
+			      port, 0, &out);
+
+	return err ? UINT_MAX : out.r11;
+}

The previous patch open coded tdg_{in,out} and this one provides
helpers. I think at a minimum they should be consistent and pick one
style.

As I have mentioned above, the early IO #VE handler is a special case. We
don't want to complicate its code path with debug or tracing support.
So it is not a good comparison target.

In this case, the reason for adding the helper functions is to make them
easier to call from tdx.h.


diff --git a/arch/x86/include/asm/io.h b/arch/x86/include/asm/io.h
index ef7a686a55a9..daa75c8eef5d 100644
--- a/arch/x86/include/asm/io.h
+++ b/arch/x86/include/asm/io.h
@@ -40,6 +40,7 @@


snip

+
+/* Helper function for converting {b,w,l} to byte size */
+static inline int tdx_get_iosize(char *str)
+{
+	if (str[0] == 'w')
+		return 2;
+	else if (str[0] == 'l')
+		return 4;
+
+	return 1;
+}

This seems like an unnecessary novelty. The BUILDIO() macro in
arch/x86/include/asm/io.h takes a type argument, why can't the size be
explicitly specified rather than inferred from string parsing?

I don't want to make changes to generic macros in io.h if it can be
avoided. The macro follows a similar argument/type convention in all
arch/* code. Also, it is easier to handle TDX as a special case here.
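
As a rough illustration of why the size is derived from the suffix: a macro that only receives the b/w/l token can stringify it and map the first character to a byte count. The sketch below uses hypothetical names (io_size_from_suffix(), DEFINE_OUT_SIZE) and is not the actual BUILDIO() macro from io.h.

```c
#include <assert.h>

/* Map an access-width suffix ("b", "w" or "l") to a size in bytes. */
static inline int io_size_from_suffix(const char *suffix)
{
	assert(suffix && suffix[0]);

	if (suffix[0] == 'w')
		return 2;
	if (suffix[0] == 'l')
		return 4;

	return 1;
}

/*
 * Hypothetical generator macro: it only gets the suffix token, so the
 * size is recovered by stringifying that token with #bwl.
 */
#define DEFINE_OUT_SIZE(bwl)					\
static inline int out##bwl##_size(void)				\
{								\
	return io_size_from_suffix(#bwl);			\
}

DEFINE_OUT_SIZE(b)
DEFINE_OUT_SIZE(w)
DEFINE_OUT_SIZE(l)
```

With this approach the generator macro keeps its existing argument list; passing the size explicitly would instead require touching every invocation of the macro.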


--
Sathyanarayanan Kuppuswamy
Linux Kernel Developer