Re: [Question] int3_selftest() generates a #UD instead of a #BP when create a SEV VM

From: Sean Christopherson
Date: Wed Jul 26 2023 - 13:54:53 EST


On Wed, Jul 26, 2023, Tom Lendacky wrote:
> On 7/25/23 21:41, Wu Zongyong wrote:
> > Hi,
> >
> > I am trying to boot a SEV VM (just SEV, no SEV-ES and no SEV-SNP) with
> > firmware that I wrote myself.
> >
> > But when the Linux kernel executes int3_selftest(), a #UD is generated
> > instead of a #BP.
> >
> > The stack is as follows.
> >
> > [ 0.141804] invalid opcode: 0000 [#1] PREEMPT SMP
> > [ 0.141804] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 6.3.0+ #37
> > [ 0.141804] RIP: 0010:int3_selftest_ip+0x0/0x2a
> > [ 0.141804] Code: eb bc 66 90 0f 1f 44 00 00 48 83 ec 08 48 c7 c7 90 0d 78 83 c7 44 24 04 00 00 00 00 e8 23 fe ac fd 85 c0 75 22 48 8d 7c 24 04 <cc> 90 90 90 90 83 7c 24 04 01 75 13 48 c7 c7 90 0d 78 83 e8 42 fc
> > [ 0.141804] RSP: 0000:ffffffff82803f18 EFLAGS: 00010246
> > [ 0.141804] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 000000007ffffffe
> > [ 0.141804] RDX: ffffffff82fd4938 RSI: 0000000000000296 RDI: ffffffff82803f1c
> > [ 0.141804] RBP: 0000000000000000 R08: 0000000000000000 R09: 00000000fffeffff
> > [ 0.141804] R10: ffffffff82803e08 R11: ffffffff82f615a8 R12: 00000000ff062350
> > [ 0.141804] R13: 000000001fddc20a R14: 000000000090122c R15: 0000000002000000
> > [ 0.141804] FS: 0000000000000000(0000) GS:ffff88801f200000(0000) knlGS:0000000000000000
> > [ 0.141804] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [ 0.141804] CR2: ffff888004c00000 CR3: 000800000281f000 CR4: 00000000003506f0
> > [ 0.141804] Call Trace:
> > [ 0.141804] <TASK>
> > [ 0.141804] alternative_instructions+0xe/0x100
> > [ 0.141804] check_bugs+0xa7/0x110
> > [ 0.141804] start_kernel+0x320/0x430
> > [ 0.141804] secondary_startup_64_no_verify+0xd3/0xdb
> > [ 0.141804] </TASK>
> > [ 0.141804] Modules linked in:
> > [ 0.141804] ---[ end trace 0000000000000000 ]--
> >
> > I'm curious how this happened. I cannot find any condition in AMD's
> > spec that would cause the int3 instruction to generate a #UD.

One possibility is that the value from memory that gets executed diverges from the
value that is read out by the #UD handler, e.g. due to patching (doesn't seem to
be the case in this test), stale cache/TLB entries, etc.
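
To make that concrete, here is a small sketch that pulls the byte at RIP out of
the "Code:" line of the oops quoted above. In that line, the byte the handler
read at the faulting address is the one wrapped in <>. The helper name
faulting_byte() is made up for illustration. Note that this only shows what the
#UD handler read back from memory (0xcc, i.e. int3), not necessarily what the
CPU actually fetched and executed.

```python
import re

# "Code:" bytes copied from the oops quoted above.
CODE_LINE = (
    "eb bc 66 90 0f 1f 44 00 00 48 83 ec 08 48 c7 c7 90 0d 78 83 "
    "c7 44 24 04 00 00 00 00 e8 23 fe ac fd 85 c0 75 22 48 8d 7c 24 04 "
    "<cc> 90 90 90 90 83 7c 24 04 01 75 13 48 c7 c7 90 0d 78 83 e8 42 fc"
)

INT3_OPCODE = 0xCC  # one-byte int3 encoding


def faulting_byte(code_line):
    """Return the byte marked with <> in an oops "Code:" line (hypothetical helper)."""
    match = re.search(r"<([0-9a-fA-F]{2})>", code_line)
    if match is None:
        raise ValueError("no <xx> marker found in Code: line")
    return int(match.group(1), 16)


byte = faulting_byte(CODE_LINE)
print(f"byte at RIP: {byte:#04x}, is int3: {byte == INT3_OPCODE}")
```

If the handler sees 0xcc but the CPU still raised #UD, that is consistent with
the executed bytes differing from the bytes read back afterwards.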

> > BTW, it worked normally with qemu and ovmf.
>
> Does this happen every time you boot the guest with your firmware? What
> processor are you running on?

And have you ruled out KVM as the culprit? I.e. verified that KVM is NOT injecting
a #UD. That obviously shouldn't happen, but it should be easy to check via KVM
tracepoints.
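
A rough sketch of one way to do that check on the host, assuming tracefs is
mounted at /sys/kernel/tracing and that the kvm:kvm_inj_exception tracepoint is
available in the host kernel. The function names are made up for illustration,
and the assumption that an injected #UD shows up as the literal string "#UD" on
the event line should be checked against the actual trace output.

```python
import os

TRACEFS = "/sys/kernel/tracing"  # assumed tracefs mount point
EVENT_ENABLE = "events/kvm/kvm_inj_exception/enable"


def find_ud_injections(lines):
    """Return trace lines that look like KVM injecting a #UD (hypothetical helper).

    Assumes the event name and the "#UD" symbol both appear on the line.
    """
    return [
        line.rstrip("\n")
        for line in lines
        if "kvm_inj_exception" in line and "#UD" in line
    ]


def watch(max_lines=100000):
    """Enable the tracepoint and print any #UD injections seen (needs root)."""
    with open(os.path.join(TRACEFS, EVENT_ENABLE), "w") as enable:
        enable.write("1")
    with open(os.path.join(TRACEFS, "trace_pipe")) as pipe:
        # trace_pipe blocks until events arrive; stop after max_lines lines.
        for _, line in zip(range(max_lines), pipe):
            for hit in find_ud_injections([line]):
                print(hit)
```

Call watch() as root on the host, then boot the guest. If nothing is printed
around the time of the oops, KVM is most likely not the one injecting the #UD.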