Re: [syzbot] BUG: sleeping function called from invalid context in __fdget_pos

From: Herbert Xu
Date: Wed Jun 30 2021 - 04:10:06 EST


Hi Ard:

On Wed, Jun 30, 2021 at 09:42:14AM +0200, Ard Biesheuvel wrote:
>
> > There's one suspect-looking site in xts_crypt():
> >
> > > 	kernel_fpu_begin();
> > >
> > > 	/* calculate first value of T */
> > > 	aesni_enc(aes_ctx(ctx->raw_tweak_ctx), walk.iv, walk.iv);
> > >
> > > 	while (walk.nbytes > 0) {
> > > 		int nbytes = walk.nbytes;
> > >
> > > 		...
> > >
> > > 		err = skcipher_walk_done(&walk, walk.nbytes - nbytes);
> > >
> > > 		kernel_fpu_end();
> > >
> > > 		if (walk.nbytes > 0)
> > > 			kernel_fpu_begin();
> > > 	}
> >
> > I wonder if a slab allocation failure could leave us with walk.nbytes==0.
>
> The code is actually the other way around: kernel_fpu_end() comes
> before the call to skcipher_walk_done().
>
> So IIUC, this code forces an allocation failure, and checks whether
> the code deals with this gracefully, right?
>
> The skcipher walk API guarantees that walk.nbytes == 0 if an error is
> returned, so the pairing of FPU begin/end looks correct to me. And
> skcipher_walk_next() should not invoke anything that might sleep from
> this particular context.
>
> Herbert, any ideas?

xts_crypt looks buggy to me. In particular, if the second
skcipher_walk_virt call (the one in the if clause) fails, then
we will return without calling kernel_fpu_end.
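
To illustrate the unbalanced FPU section, here is a rough userspace
model. It is not the actual xts_crypt code: every mock_* name and the
fpu_depth counter are made up for this sketch, and the only behaviour
borrowed from the real API is that a failed walk leaves nbytes at 0.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical userspace stand-ins; none of these are kernel APIs. */
static int fpu_depth;			/* +1 per begin, -1 per end */
static void mock_fpu_begin(void) { fpu_depth++; }
static void mock_fpu_end(void)   { fpu_depth--; }

struct mock_walk { unsigned int nbytes; };

/* Models the contract discussed above: on error, walk->nbytes is 0. */
static int mock_walk_virt(struct mock_walk *w, unsigned int len, bool fail)
{
	if (fail) {
		w->nbytes = 0;
		return -ENOMEM;
	}
	w->nbytes = len;
	return 0;
}

/* Consumes everything in one step, to keep the model small. */
static int mock_walk_done(struct mock_walk *w)
{
	w->nbytes = 0;
	return 0;
}

/*
 * check_err == false: begin the FPU section unconditionally.
 * check_err == true:  bail out on error before touching the FPU.
 */
static int crypt_model(unsigned int len, bool fail, bool check_err)
{
	struct mock_walk walk;
	int err;

	err = mock_walk_virt(&walk, len, fail);
	if (check_err && err)
		return err;

	mock_fpu_begin();
	while (walk.nbytes > 0) {
		/* ... process walk.nbytes bytes here ... */
		mock_fpu_end();
		err = mock_walk_done(&walk);
		if (walk.nbytes > 0)
			mock_fpu_begin();
	}
	return err;
}
```

With fail set and check_err clear, the loop body never runs, so the
begin has no matching end and fpu_depth stays at 1 on return. With
check_err set, the function returns before the begin and the counter
stays balanced.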

Another issue: we are not checking for errors on the first
skcipher_walk_virt call. This may cause a double-free with
the subsequent skcipher_walk_abort inside the if clause.

With skcipher_walk_virt, you must check for errors explicitly
*unless* you use it in a loop construct which exits on !walk->nbytes.
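
As a rough sketch of why the loop form is safe, here is a toy userspace
model. The toy_* names, TOY_CHUNK and struct toy_walk are hypothetical
and only mimic the "nbytes is 0 on error" behaviour; this is not the
kernel's struct skcipher_walk.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical userspace model of a walk; not a kernel structure. */
struct toy_walk {
	unsigned int nbytes;	/* bytes available in the current step */
	unsigned int total;	/* bytes still to be walked */
};

#define TOY_CHUNK 16u		/* arbitrary step size for the model */

static int toy_walk_virt(struct toy_walk *w, unsigned int len, bool fail)
{
	if (fail) {
		w->nbytes = 0;
		w->total = 0;
		return -ENOMEM;
	}
	w->total = len;
	w->nbytes = len < TOY_CHUNK ? len : TOY_CHUNK;
	return 0;
}

/* 'left' is how many bytes of the current step were not consumed. */
static int toy_walk_done(struct toy_walk *w, unsigned int left)
{
	w->total -= w->nbytes - left;
	w->nbytes = w->total < TOY_CHUNK ? w->total : TOY_CHUNK;
	return 0;
}

/* No explicit error check: the loop exits on !walk.nbytes. */
static int toy_process(unsigned int len, bool fail, unsigned int *processed)
{
	struct toy_walk walk;
	int err;

	*processed = 0;
	err = toy_walk_virt(&walk, len, fail);

	while (walk.nbytes) {
		*processed += walk.nbytes;	/* stand-in for real work */
		err = toy_walk_done(&walk, 0);
	}
	return err;
}
```

On failure the loop body is skipped and the error from the initial call
is returned unchanged. Anything placed between the initial call and the
loop, or after the loop, does not get that protection and needs its own
check.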

Thanks,
--
Email: Herbert Xu <herbert@xxxxxxxxxxxxxxxxxxx>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt