Re: [PATCH v2 2/6] x86/sgx: Do not consider unsanitized pages an error

From: Reinette Chatre
Date: Thu Sep 01 2022 - 18:34:27 EST


Hi Jarkko,

On 9/1/2022 2:53 PM, Jarkko Sakkinen wrote:
> On Wed, Aug 31, 2022 at 01:39:53PM -0700, Reinette Chatre wrote:
>> On 8/31/2022 10:38 AM, Jarkko Sakkinen wrote:

>> I think I am missing something here. A lot of logic is added here but I
>> do not see why it is necessary. ksgxd() knows via kthread_should_stop() if
>> the reclaimer was canceled. I am thus wondering, could the above not be
>> simplified to something similar to V1:
>>
>> @@ -388,6 +393,8 @@ void sgx_reclaim_direct(void)
>>
>> static int ksgxd(void *p)
>> {
>> + unsigned long left_dirty;
>> +
>> set_freezable();
>>
>> /*
>> @@ -395,10 +402,10 @@ static int ksgxd(void *p)
>> * required for SECS pages, whose child pages blocked EREMOVE.
>> */
>> __sgx_sanitize_pages(&sgx_dirty_page_list);
>
> IMHO, would make sense also to have here:
>
> if (!kthread_should_stop())
> return 0;
>

Would this not prematurely stop the thread when it should not be?

>> - __sgx_sanitize_pages(&sgx_dirty_page_list);
>>
>> - /* sanity check: */
>> - WARN_ON(!list_empty(&sgx_dirty_page_list));
>> + left_dirty = __sgx_sanitize_pages(&sgx_dirty_page_list);
>> + if (left_dirty && !kthread_should_stop())
>> + pr_err("%lu unsanitized pages\n", left_dirty);
>
> That would be incorrect, if the function returned
> because of kthread stopped.


I should have highlighted this but in my example I changed
left_dirty to be "unsigned long" with the intention that the
"return -ECANCELED" is replaced with "return 0".

__sgx_sanitize_pages() returns 0 when it exits because of
kthread stopped.

To elaborate I was thinking about:

+static unsigned long __sgx_sanitize_pages(struct list_head *dirty_page_list)
{
+ unsigned long left_dirty = 0;
struct sgx_epc_page *page;
LIST_HEAD(dirty);
int ret;

- /* dirty_page_list is thread-local, no need for a lock: */
while (!list_empty(dirty_page_list)) {
if (kthread_should_stop())
- return;
+ return 0;

page = list_first_entry(dirty_page_list, struct sgx_epc_page, list);

@@ -92,12 +95,14 @@ static void __sgx_sanitize_pages(struct list_head *dirty_page_list)
} else {
/* The page is not yet clean - move to the dirty list. */
list_move_tail(&page->list, &dirty);
+ left_dirty++;
}

cond_resched();
}

list_splice(&dirty, dirty_page_list);
+ return left_dirty;
}


and then with what I had in previous email the checks should work:

@@ -388,6 +393,8 @@ void sgx_reclaim_direct(void)

static int ksgxd(void *p)
{
+ unsigned long left_dirty;
+
set_freezable();

/*
@@ -395,10 +402,10 @@ static int ksgxd(void *p)
* required for SECS pages, whose child pages blocked EREMOVE.
*/
__sgx_sanitize_pages(&sgx_dirty_page_list);
- __sgx_sanitize_pages(&sgx_dirty_page_list);

- /* sanity check: */
- WARN_ON(!list_empty(&sgx_dirty_page_list));
+ left_dirty = __sgx_sanitize_pages(&sgx_dirty_page_list);
+ if (left_dirty && !kthread_should_stop())
+ pr_err("%lu unsanitized pages\n", left_dirty);

while (!kthread_should_stop()) {
if (try_to_freeze())


>
> If you do the check here you already have a window
> where kthread could have been stopped anyhow.
>
> So even this would be less correct:
>
> if (kthreas_should_stop()) {
> return 0;
> } else if (left_dirty) {
> pr_err("%lu unsanitized pages\n", left_dirty);
> }
>
> So in the end you end as complicated and less correct
> fix. This all is explained in the commit message.
>
> If you unconditionally print error, you don't have
> a meaning for the number of unsanitized pags.

Understood that the goal is to only print the
number of unsanitized pages if ksgxd has not been
stopped prematurely.

Reinette