Re: [PATCH v9] misc: sgi-gru: fix use-after-free error in gru_set_context_option, gru_fault and gru_handle_user_call_os

From: Zheng Hacker
Date: Wed Nov 09 2022 - 22:25:03 EST


Dimitri Sivanich <sivanich@xxxxxxx> 于2022年11月10日周四 06:33写道:
>

> > diff --git a/drivers/misc/sgi-gru/grufault.c b/drivers/misc/sgi-gru/grufault.c
> > index d7ef61e602ed..bdd515d33225 100644
> > --- a/drivers/misc/sgi-gru/grufault.c
> > +++ b/drivers/misc/sgi-gru/grufault.c
> > @@ -656,7 +656,9 @@ int gru_handle_user_call_os(unsigned long cb)
> > if (ucbnum >= gts->ts_cbr_au_count * GRU_CBR_AU_SIZE)
> > goto exit;
> >
> > - gru_check_context_placement(gts);
> > + ret = gru_check_context_placement(gts);
>
> Below we do not want to skip over the rest of the logic in either return case.
> You will have to do the gru_find_lock_gts again after unloading the gru context,
> then check the gts value, then return EINVAL if not set (same as earlier in the
> function).

Hello my friend! Thanks for your detailed review. I think you are right.
I'll see what I can do and make a revision.

>
> > + if (ret)
> > + goto err;
> >
> > /*
> > * CCH may contain stale data if ts_force_cch_reload is set.
> > @@ -677,6 +679,10 @@ int gru_handle_user_call_os(unsigned long cb)
> > exit:
> > gru_unlock_gts(gts);
> > return ret;
> > +err:
> > + gru_unlock_gts(gts);
> > + gru_unload_context(gts, 1);
> > + return -EINVAL;
> > }
> >
> > /*
> > @@ -874,7 +880,11 @@ int gru_set_context_option(unsigned long arg)
> > } else {
> > gts->ts_user_blade_id = req.val1;
> > gts->ts_user_chiplet_id = req.val0;
> > - gru_check_context_placement(gts);
> > + if (gru_check_context_placement(gts)) {
> > + gru_unlock_gts(gts);
> > + gru_unload_context(gts, 1);
>
> Looking at this again, I think we should return 0, as we originally would
> have done in this case anyway.

I think here we have finished the handling of the gts when it's illegal.
So return 0 is fine for me.

>
> > + return -EINVAL;
> > + }
> > }
> > break;
> > case sco_gseg_owner:

> > * If the current task is the context owner, verify that the
> > @@ -726,15 +727,23 @@ void gru_check_context_placement(struct gru_thread_state *gts)
> > * references. Pthread apps use non-owner references to the CBRs.
> > */
> > gru = gts->ts_gru;
> > + /*
> > + * If gru or gts->ts_tgid_owner isn't initialized properly, return
> > + * success is fine, for it's not a deadly error. The related variable
> > + * can be reconfigure in other function.The caller is responsible
> > + * for their inspection, and reinitialization if needed.
> > + */
>
> How about this instead?
> "If gru or gts->ts_tgid_owner isn't initialized properly, return
> success to indicate that the caller does not need to unload the gru
> context. The caller is responsible for their inspection and
> reinitialization if needed."
>

Yes, sounds much more clear for the developer, thanks for your suggestion :)

Best regards,
Zheng Wang