Re: WARNING: CPU: 1 PID: 83 at arch/x86/kernel/cpu/sgx/main.c:446 ksgxd+0x1b7/0x1d0

From: Haitao Huang
Date: Wed Aug 24 2022 - 22:12:20 EST


Hi Paul

On Tue, 23 Aug 2022 08:48:52 -0500, Paul Menzel <pmenzel@xxxxxxxxxxxxx> wrote:

Dear Dave,


On 20.08.22 at 08:13, Paul Menzel wrote:

On 19.08.22 at 20:28, Dave Hansen wrote:
On 8/19/22 09:02, Paul Menzel wrote:
On the Dell XPS 13 9370, Linux 5.18.16 prints the warning below:

```
[ 0.000000] Linux version 5.18.0-4-amd64 (debian-kernel@xxxxxxxxxxxxxxxx) (gcc-11 (Debian 11.3.0-5) 11.3.0, GNU ld (GNU Binutils for Debian) 2.38.90.20220713) #1 SMP PREEMPT_DYNAMIC Debian 5.18.16-1 (2022-08-10)
[ 0.000000] Command line: BOOT_IMAGE=/vmlinuz-5.18.0-4-amd64 root=UUID=56f398e0-1e25-4fda-aa9f-611dece4b333 ro quiet
[…]
[ 0.000000] DMI: Dell Inc. XPS 13 9370/0RMYH9, BIOS 1.21.0 07/06/2022
[…]
[ 0.235418] sgx: EPC section 0x40200000-0x45f7ffff
```

Would you be able to send the entire dmesg, along with:
The log messages are attached to the first message, where I forgot to carbon-copy linux-sgx@ [1].

cat /proc/iomem # (as root)
and
cpuid -1 --raw
I am going to provide that next week. (Side note, Intel might have some Dell XPS 9370 test machines in some QA lab.)

Please find both outputs at the end of the file.


Could you also check the output of "sudo rdmsr -x 0x3a"?
Also, was CONFIG_X86_SGX_KVM set?
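
For reading the rdmsr output, here is a minimal sketch of how the relevant bits of MSR 0x3A (IA32_FEATURE_CONTROL) could be decoded. The struct and function names are made up for illustration; only the bit positions (0 = lock, 17 = SGX launch control, 18 = SGX enable) are taken from the architecture definition.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Bit positions in IA32_FEATURE_CONTROL (MSR 0x3A). */
#define FEAT_CTL_LOCKED         (1ULL << 0)
#define FEAT_CTL_SGX_LC_ENABLED (1ULL << 17)
#define FEAT_CTL_SGX_ENABLED    (1ULL << 18)

/* Hypothetical helper type, not a kernel structure. */
struct feat_ctl {
    bool locked;
    bool sgx_lc_enabled;
    bool sgx_enabled;
};

/* Parse the hex string printed by "rdmsr -x" (with or without "0x"). */
static uint64_t parse_rdmsr_hex(const char *text)
{
    assert(text != NULL);
    return strtoull(text, NULL, 16);
}

/* Split the raw MSR value into the three bits of interest here. */
static struct feat_ctl decode_feat_ctl(uint64_t msr)
{
    struct feat_ctl f = {
        .locked         = (msr & FEAT_CTL_LOCKED) != 0,
        .sgx_lc_enabled = (msr & FEAT_CTL_SGX_LC_ENABLED) != 0,
        .sgx_enabled    = (msr & FEAT_CTL_SGX_ENABLED) != 0,
    };
    return f;
}
```

For example, decode_feat_ctl(parse_rdmsr_hex("40005")) reports SGX enabled with launch control disabled. The value 40005 is an invented sample, not output from the XPS 13 9370.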

If CONFIG_X86_SGX_KVM is not set and bit 17 (SGX_LC) of MSR 0x3A is not set,
then I think the following sequence during sgx_init is possible:

sgx_page_cache_init -> sgx_setup_epc_section
    -> puts all physical EPC pages in sgx_dirty_page_list.
Kick off ksgxd.
Later, sgx_drv_init returns non-zero due to this check:
    if (!cpu_feature_enabled(X86_FEATURE_SGX_LC))
        return -ENODEV;
sgx_vepc_init also returns non-zero if CONFIG_X86_SGX_KVM was not set.

And sgx_init will call kthread_stop(ksgxd_tsk):
    ret = sgx_drv_init();

    if (sgx_vepc_init() && ret)
        goto err_provision;
    ...
err_provision:
    misc_deregister(&sgx_dev_provision);

err_kthread:
    kthread_stop(ksgxd_tsk);


That causes __sgx_sanitize_pages to return early because of these lines:
    /* dirty_page_list is thread-local, no need for a lock: */
    while (!list_empty(dirty_page_list)) {
        if (kthread_should_stop())
            return;

And that would trigger the warning in ksgxd (depending on timing?), because
sgx_dirty_page_list is still non-empty at that moment.
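
To make the timing dependence concrete, here is a toy user-space model of the suspected sequence. None of it is kernel code: the function names, the page counter standing in for sgx_dirty_page_list, and the stop_after parameter (how many pages get sanitized before kthread_stop() arrives) are all assumptions for illustration. It also simplifies ksgxd to a single sanitize pass followed by the non-empty check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Models the "if (sgx_vepc_init() && ret)" check: sgx_init takes the
 * error path only when both the driver and vEPC initialization fail.
 */
static bool model_init_fails(bool sgx_lc_enabled, bool kvm_configured)
{
    int drv_ret  = sgx_lc_enabled ? 0 : -1;  /* stands in for -ENODEV */
    int vepc_ret = kvm_configured ? 0 : -1;

    return vepc_ret && drv_ret;
}

/*
 * Returns true if the model would hit the warning, i.e. pages are still
 * on the dirty list when the sanitize loop exits.
 *
 * pages:      number of EPC pages initially on the dirty list
 * stop_after: pages sanitized before the stop request becomes visible
 */
static bool model_ksgxd_warns(bool sgx_lc_enabled, bool kvm_configured,
                              size_t pages, size_t stop_after)
{
    bool init_fails = model_init_fails(sgx_lc_enabled, kvm_configured);
    size_t dirty = pages;
    size_t done = 0;

    while (dirty > 0) {
        /* Stands in for kthread_should_stop() turning true. */
        if (init_fails && done >= stop_after)
            break;  /* the early return quoted above */
        dirty--;
        done++;
    }

    /* Stands in for the non-empty-list check that prints the warning. */
    return dirty > 0;
}
```

In this model, model_ksgxd_warns(false, false, 100, 10) returns true because 90 pages are left behind, while model_ksgxd_warns(false, false, 100, 100) returns false because the stop request arrives only after the list has drained. With either SGX_LC enabled or CONFIG_X86_SGX_KVM set, the error path is never taken and the model never warns.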

Thanks
Haitao