RE: [PATCH 2/6] x86/tdx: Retry TDVMCALL_MAP_GPA() when needed

From: Dexuan Cui
Date: Sun Nov 27 2022 - 19:07:43 EST


> From: Michael Kelley (LINUX) <mikelley@xxxxxxxxxxxxx>
> Sent: Wednesday, November 23, 2022 5:30 AM
> > [...]
> > static bool tdx_map_gpa(phys_addr_t start, phys_addr_t end, bool enc)
> > {
> > int max_retry_cnt = 1000, retry_cnt = 0;
> > struct tdx_hypercall_args args;
> > u64 map_fail_paddr, ret;
> >
> > while (1) {
> > args.r10 = TDX_HYPERCALL_STANDARD;
> > args.r11 = TDVMCALL_MAP_GPA;
> > args.r12 = start;
> > args.r13 = end - start;
> > args.r14 = 0;
> > args.r15 = 0;
> >
> > ret = __tdx_hypercall(&args, TDX_HCALL_HAS_OUTPUT);
> > if (!ret)
> > break;
>
> The above test is redundant and can be removed. The "success" case is
> implicitly handled by the test below for != TDVMCALL_STATUS_RETRY.

Good point. Will remove the redundant test.

> > if (ret != TDVMCALL_STATUS_RETRY)
> > break;
> > /*
> > * The guest must retry the operation for the pages in the
> > * region starting at the GPA specified in R11. Make sure R11
> > * contains a sane value.
> > */
> > map_fail_paddr = args.r11;
> > if (map_fail_paddr < start || map_fail_paddr >= end)
> > return false;
> >
> > if (map_fail_paddr == start) {
> > retry_cnt++;
> > if (retry_cnt > max_retry_cnt)
> > return false;
> > } else {
> > retry_cnt = 0;
> > start = map_fail_paddr;
>
> Just summarizing the code, we increment the retry count if the hypercall
> returns STATUS_RETRY but did nothing (i.e., map_fail_paddr == start). But
> if the hypercall returns STATUS_RETRY after making at least some progress,
> then we reset the retry count. So in the worst case, for example, if the
> hypercall processed only one page on each invocation, the loop will continue
> until completion, without hitting any retry limits. That scenario seems
> plausible and within the spec.

Exactly.
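
To make the retry accounting concrete, here is a rough user-space sketch of the
loop with the redundant "if (!ret)" test already dropped. It is only an
illustration, not the kernel code: struct mock_host, mock_map_gpa(),
sim_map_gpa(), PAGE_SZ and the STATUS_* values are all made up for the example,
and the final "return ret == STATUS_OK" is an assumption about the part of the
function that is not quoted above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SZ       4096ULL
#define STATUS_OK     0ULL
#define STATUS_RETRY  1ULL	/* stand-in for TDVMCALL_STATUS_RETRY */
#define MAX_RETRY_CNT 1000

/* Hypothetical host model; none of this exists in the kernel. */
struct mock_host {
	uint64_t pages_per_call;	/* pages converted per successful call */
	int stalls_before_progress;	/* "RETRY but did nothing" replies per chunk */
	int stalls_left;		/* stalls remaining before the next chunk */
	unsigned long calls;		/* number of mock hypercalls issued */
};

/* Mock of the MapGPA hypercall: reports where the guest must resume. */
static uint64_t mock_map_gpa(struct mock_host *h, uint64_t start,
			     uint64_t len, uint64_t *fail_paddr)
{
	uint64_t chunk = h->pages_per_call * PAGE_SZ;

	h->calls++;
	if (h->stalls_left > 0) {
		/* RETRY without any progress: resume address == start */
		h->stalls_left--;
		*fail_paddr = start;
		return STATUS_RETRY;
	}
	h->stalls_left = h->stalls_before_progress;
	if (chunk >= len)
		return STATUS_OK;
	/* RETRY after partial progress (or none, if chunk == 0) */
	*fail_paddr = start + chunk;
	return STATUS_RETRY;
}

/* Same retry accounting as the quoted loop, run against the mock. */
static bool sim_map_gpa(struct mock_host *h, uint64_t start, uint64_t end)
{
	uint64_t fail_paddr, ret;
	int retry_cnt = 0;

	assert(start < end);
	while (1) {
		fail_paddr = 0;
		ret = mock_map_gpa(h, start, end - start, &fail_paddr);
		if (ret != STATUS_RETRY)
			break;		/* covers success and hard errors */
		if (fail_paddr < start || fail_paddr >= end)
			return false;	/* host returned a bogus address */
		if (fail_paddr == start) {
			/* no progress: count against the retry limit */
			if (++retry_cnt > MAX_RETRY_CNT)
				return false;
		} else {
			/* progress: reset the count and continue from there */
			retry_cnt = 0;
			start = fail_paddr;
		}
	}
	return ret == STATUS_OK;	/* assumed; the tail is not quoted above */
}
```

With this model, a host that converts one page per call finishes a 5000-page
range without ever touching the limit, while a host that never makes progress
is abandoned after MAX_RETRY_CNT + 1 calls.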

> Do we have any indication about the likelihood of the "RETRY but did
> nothing" case? The spec doesn't appear to disallow this case, but does
> Hyper-V actually do this? It seems like a weird case.
>
> Michael

Yes, according to my testing, Hyper-V does do this. It does not appear to be
because the operation is too time-consuming -- rather, there seems to be some
Hyper-V-specific activity going on.