Re: [PATCH] afs: Fix ENOSPC, EDQUOT and other errors to fail a write rather than retrying

From: Matthew Wilcox
Date: Wed Nov 03 2021 - 23:25:37 EST


On Wed, Nov 03, 2021 at 11:43:20PM +0000, David Howells wrote:
> Currently, at the completion of a storage RPC from writepages, the errors
> ENOSPC, EDQUOT, ENOKEY, EACCES, EPERM, EKEYREJECTED and EKEYREVOKED cause
> the pages involved to be redirtied and the write to be retried by the VM at
> a future time.
>
> However, this is probably not the right thing to do, and, instead, the
> writes should be discarded so that the system doesn't get blocked (though
> unmounting will discard the uncommitted writes anyway).

umm. I'm not sure that throwing away the write is the best answer
for some of these errors. Our whole story around error handling in
filesystems, the page cache and the VFS is pretty sad, but I don't think
that this is the right approach.

Ideally, we'd hold onto the writes in the page cache until (eg for ENOSPC
/ EDQUOT) the user has deleted some files, then retry the writes.

We should definitely stop the user dirtying more pages on this mount,
or at least throttle processes which are dirtying new pages (eg in
folio_mark_dirty()), which implies a check of the superblock, until the
ENOSPC is cleared up, at which time writeback can resume. Of course,
the server won't necessarily notify us when it is cleared up (because
it might be due to a different client filling the storage), so we might
need to periodically re-attempt writeback so that we know whether ENOSPC
has been resolved.
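
To make that a bit more concrete, here's a rough userspace sketch of the
state machine I have in mind. It is not kernel code and not a proposed
patch: struct mount_state, writeback_done(), may_dirty(), tick() and the
backoff constants are all made-up names for illustration, and the
doubling backoff is just one plausible way to space out the re-attempts.
The only things taken from the discussion above are the ideas: keep the
data on ENOSPC / EDQUOT, refuse (or throttle) new dirtying while blocked,
and probe writeback periodically since the server may never tell us.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical per-mount state; stands in for something hung off the
 * superblock.  Zero-initialised means "not blocked". */
struct mount_state {
	bool wb_blocked;        /* last writeback hit ENOSPC/EDQUOT */
	unsigned int retry_in;  /* ticks until the next writeback probe */
	unsigned int backoff;   /* current probe interval, in ticks */
};

#define BACKOFF_MIN 1u   /* assumed values, purely illustrative */
#define BACKOFF_MAX 64u

/* Record the result of a writeback attempt (0 or a negative errno). */
static void writeback_done(struct mount_state *m, int err)
{
	assert(err <= 0);

	if (err == -ENOSPC || err == -EDQUOT) {
		/* Keep the pages dirty; double the probe interval up to a cap. */
		m->backoff = m->wb_blocked ? m->backoff * 2 : BACKOFF_MIN;
		if (m->backoff > BACKOFF_MAX)
			m->backoff = BACKOFF_MAX;
		m->wb_blocked = true;
		m->retry_in = m->backoff;
	} else if (err == 0) {
		/* Space came back: unblock dirtiers and reset the backoff. */
		m->wb_blocked = false;
		m->backoff = 0;
		m->retry_in = 0;
	}
	/* Other errors are out of scope here and leave the state alone. */
}

/* The check a dirtying path would make; false means "throttle or wait". */
static bool may_dirty(const struct mount_state *m)
{
	return !m->wb_blocked;
}

/* Called once per timer tick; true means "re-attempt writeback now". */
static bool tick(struct mount_state *m)
{
	if (!m->wb_blocked)
		return false;
	if (m->retry_in > 0)
		m->retry_in--;
	return m->retry_in == 0;
}
```

The intended flow is: a failed writeback calls writeback_done() with
-ENOSPC, dirtiers see may_dirty() return false, and whoever drives
tick() kicks off another writeback attempt each time it returns true,
feeding the result back into writeback_done().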