Re: [PATCH v7 06/20] x86/virt/tdx: Shut down TDX module in case of error

From: Dave Hansen
Date: Tue Nov 22 2022 - 14:25:00 EST


On 11/22/22 11:13, Peter Zijlstra wrote:
> On Tue, Nov 22, 2022 at 07:14:14AM -0800, Dave Hansen wrote:
>> On 11/22/22 01:13, Peter Zijlstra wrote:
>>> On Mon, Nov 21, 2022 at 01:26:28PM +1300, Kai Huang wrote:
>>>> +/*
>>>> + * Call the SEAMCALL on all online CPUs concurrently. Caller to check
>>>> + * @sc->err to determine whether any SEAMCALL failed on any cpu.
>>>> + */
>>>> +static void seamcall_on_each_cpu(struct seamcall_ctx *sc)
>>>> +{
>>>> + on_each_cpu(seamcall_smp_call_function, sc, true);
>>>> +}
>>>
>>> Suppose the user has NOHZ_FULL configured, and is already running
>>> userspace that will terminate on interrupt (this is desired feature for
>>> NOHZ_FULL), guess how happy they'll be if someone, on another parition,
>>> manages to tickle this TDX gunk?
>>
>> Yeah, they'll be none too happy.
>>
>> But, what do we do?
>
> Not intialize TDX on busy NOHZ_FULL cpus and hard-limit the cpumask of
> all TDX using tasks.

I don't think that works. As I mentioned to Thomas elsewhere, you don't
just need to initialize TDX on the CPUs where it is used. Before the
module will start working you need to initialize it on *all* the CPUs it
knows about. The module itself has a little counter where it tracks
this and will refuse to start being useful until it gets called
thoroughly enough.

>> There are technical solutions like detecting if NOHZ_FULL is in play and
>> refusing to initialize TDX. There are also non-technical solutions like
>> telling folks in the documentation that they better modprobe kvm early
>> if they want to do TDX, or their NOHZ_FULL apps will pay.
>
> Surely modprobe kvm isn't the point where TDX gets loaded? Because
> that's on boot for everybody due to all the auto-probing nonsense.
>
> I was expecting TDX to not get initialized until the first TDX using KVM
> instance is created. Am I wrong?

I went looking for it in this series to prove you wrong. I failed. :)

tdx_enable() is buried in here somewhere:

> https://lore.kernel.org/lkml/CAAhR5DFrwP+5K8MOxz5YK7jYShhaK4A+2h1Pi31U_9+Z+cz-0A@xxxxxxxxxxxxxx/T/

I don't have the patience to dig it out today, so I guess we'll have Kai
tell us.

>> We could also force the TDX module to be loaded early in boot before
>> NOHZ_FULL is in play, but that would waste memory on TDX metadata even
>> if TDX is never used.
>
> I'm thikning it makes sense to have a tdx={off,on-demand,force} toggle
> anyway.

Yep, that makes total sense. Kai had one in an earlier version but I
made him throw it out because it wasn't *strictly* required and this set
is fat enough.

>> How do NOHZ_FULL folks deal with late microcode updates, for example?
>> Those are roughly equally disruptive to all CPUs.
>
> I imagine they don't do that -- in fact I would recommend we make the
> whole late loading thing mutually exclusive with nohz_full; can't have
> both.

So, if we just use schedule_on_cpu() for now and have the TDX code wait,
will a NOHZ_FULL task just block the schedule_on_cpu() indefinitely?

That doesn't seem like _horrible_ behavior to start off with for a
minimal series.