Re: [PATCH v2 06/12] mm/execmem: introduce execmem_data_alloc()

From: Thomas Gleixner
Date: Sun Jun 18 2023 - 20:44:12 EST


Kent!

On Sun, Jun 18 2023 at 19:14, Kent Overstreet wrote:
> On Mon, Jun 19, 2023 at 12:32:55AM +0200, Thomas Gleixner wrote:
>
> Thomas, you're confusing an internal interface with external

No. I am not.

Whether that's an internal function or not does not make any difference
at all.

Having a function growing _eight_ parameters where _six_ of them are
derived from a well defined data structure is a clear sign of design
fail.

It's not rocket science to do:

struct generic_allocation_info {
....
};

struct execmem_info {
....
struct generic_allocation_info alloc_info;
;

struct execmem_param {
...
struct execmem_info[NTYPES];
};

and having a function which can either operate on execmem_param and type
or on generic_allocation_info itself. It does not matter as long as the
data structure which is handed into this internal function is
describing it completely or needs a supplementary argument, i.e. flags.

Having tons of wrappers which do:

a = generic_info.a;
b = generic_info.b;
....
n = generic_info.n;

internal_func(a, b, ....,, n);

is just hillarious and to repeat myself tasteless and therefore
disgusting.

That's CS course first semester hackery, but TBH, I can only tell from
imagination because I did not take CS courses - maybe that's the
problem...

Data structure driven design works not from the usage site down to the
internals. It's the other way round:

1) Define a data structure which describes what the internal function
needs to know

2) Implement use case specific variants which describe that

3) Hand the use case specific variant to the internal function
eventually with some minimal supplementary information.

Object oriented basics, right?

There is absolutely nothing wrong with having

internal_func(generic_info *, modifier);

but having

internal_func(a, b, ... n)

is fundamentally wrong in the context of an extensible and future proof
internal function.

Guess what. Today it's sufficient to have _eight_ arguments and we are
happy to have 10 nonsensical wrappers around this internal
function. Tomorrow there happens to be a use case which needs another
argument so you end up:

Changing 10 wrappers plus the function declaration and definition in
one go

instead of adding

One new naturally 0 initialized member to generic_info and be done
with it.

Look at the evolution of execmem_alloc() in this very pathset which I
pointed out. That very patchset covers _two_ of at least _six_ cases
Song and myself identified. It already had _three_ steps of evolution
from _one_ to _five_ to _eight_ parameters.

C is not like e.g. python where you can "solve" that problem by simply
doing:

- internal_func(a, b, c):
+ internal_func(a, b, c, d=None, ..., n=None):

But that's not a solution either. That's a horrible workaround even in
python once your parameter space gets sufficiently large. The way out in
python is to have **kwargs. But that's not an option in C, and not
necessarily the best option for python either.

Even in python or any other object oriented language you get to the
point where you have to rethink your approach, go back to the drawing
board and think about data representation.

But creating a new interface based on "let's see what we need over
time and add parameters as we see fit" is simply wrong to begin with
independent of the programming language.

Even if the _eight_ parameters are the end of the range, then they are
beyond justifyable because that's way beyond the natural register
argument space of any architecture and you are offloading your lazyness
to wrappers and the compiler to emit pointlessly horrible code.

There is a reason why new syscalls which need more than a few parameters
are based on 'struct DATA_WHICH_I_NEED_TO_KNOW' and 'flags'.

We've got burned on the non-extensibilty often enough. Why would a new
internal function have any different requirements especially as it is
neither implemented to the full extent nor a hotpath function?

Now you might argue that it _is_ a "hotpath" due to the BPF usage, but
then even more so as any intermediate wrapper which converts from one
data representation to another data representation is not going to
increase performance, right?

> ... I made the same mistake reviewing Song's patchset...

Songs series had rough edges, but was way more data structure driven
and palatable than this hackery.

The fact that you made a mistake while reviewing Songs series has
absolutely nothing to do with the above or my previous reply to Mike.

Thanks,

tglx