Re: [PATCH] bcachefs: Avoid a potential memory over-allocation in bch2_printbuf_make_room()

From: Kent Overstreet
Date: Tue Sep 19 2023 - 15:21:30 EST


On Tue, Sep 19, 2023 at 08:34:00PM +0200, Christophe JAILLET wrote:
> Le 19/09/2023 à 15:18, Brian Foster a écrit :
> > On Sat, Sep 16, 2023 at 10:45:23AM +0200, Christophe JAILLET wrote:
> > > kmalloc() and co. don't always allocate a power of 2 number of bytes.
> > > There are some special handling for 64<n<=96 and 128<n<=192 cases.
> > >
> >
> > It's not immediately clear to me what you mean by "special handling."
> > Taking a quick look at slabinfo, it looks like what you mean is that
> > slab rounding is a bit more granular than power of two, particularly in
> > these ranges. Is that right? If so, JFYI it would be helpful to describe
> > that more explicitly in the commit log.
>
> That's what I tried to do with my 2 phrases.
> Sound good and clear to the French speaking man I am :)
>
> Would you mind updating the phrasing yourself?
> A trial and error method about wording with a non native English speaking
> person can be somewhat a long and boring experience to me.
>
> All what I could propose, with the help of google translate, is:
>
> "
> kmalloc() does not necessarily allocate a number of bytes equal to a power
> of two. There are special cases for sizes between 65 and 96 and between 129
> and 192. In these cases, 96 and 192 bytes are allocated respectively.
>
> So, instead of forcing an allocation always equal to a power of two, it may
> be interesting to use the same rounding rules as kmalloc(). This helps avoid
> over-allocating some memory.
>
> Use kmalloc_size_roundup() instead of roundup_pow_of_two().

kmalloc_size_roundup() actually isn't correct in this situation.

Whenever doing a dynamically growable array (e.g. a vector), when
reallocating the new size has to be a constant factor multiple of the
old size. This gets you amortized constant time for vector insertion;
growing the array differently can easily get you O(n^2) time.

IOW, avoiding internal fragmentation isn't what we want; internal
fragmentation is already bounded by the current code.