Re: [PATCH 09/10] Change table chaining layout

From: Linus Torvalds
Date: Tue Oct 23 2007 - 11:26:30 EST




On Tue, 23 Oct 2007, Boaz Harrosh wrote:
>
> A nice design is to have an struct like BIO. That holds a pointer to the
> array of scatterlists, size, ..., and a next and prev pointers to the next
> chunks. Than have all kernel code that now accepts scatterlist* and size
> accept a pointer to such structure. And all is clear and defined.

Yes, that would be one clean situation.

> But since we do not do that, and every single API in the kernel that
> receives a scatterlist pointer also receives an sg_count parameter,
> than I do not see what is so hacky about giving that sg_count parameter
> to the one that needs it the most. sg_next();

Well, I'd personally actually prefer to *not* have the count be passed
down explicitly, because it's just too error prone. So I'd much rather see
the count implicit in the list: whether it's in an explicit header
structure (that is the *only* thing passed down) or whether it's embedded
in the list itself is not important.

Since the list itself has to have the "next pointer" for chaining, and
thus already has "embedded information" in it, it actually does make sense
in my opinion to just embed the end-of-list information too. And the end
result right now is pretty simple, with "sg_next()" being really simple to
use, and there being no way to screw things up by getting the count and
the sg pointer out of sync.

My biggest complaint right now is that a lot of users of the sg *filling*
functions were mindlessly converted, so we have code like

cryptoloop.c: sg_set_page(&sg_in, in_page);
cryptoloop.c: sg_in.offset = in_offs;
cryptoloop.c: sg_in.length = sz;

which is just really stupid, and we should have a function for that. But
worse is code like this:

ub.c: sg_set_page(sg, virt_to_page(sc->top_sense));
ub.c: sg->offset = (unsigned long)sc->top_sense & (PAGE_SIZE-1);
ub.c: sg->length = UB_SENSE_SIZE;

which again was converted "line by line" and we actually *do* have a
function to do the above three lines as

sg_set_buf(sg, sc->top_sense, UB_SENSE_SIZE);

where that *single* line is just tons shorter but more importantly, more
readable, than the mess that is a brute-force conversion.

So I think the SG stuff looks ok now, but I think we have a lot of "fix up
the rough edges" to go!

(The above is not the only case. Just grep for "sg_set_page", and you'll
see several examples of this kind of hard-to-read code. Basically, I don't
think it's ever a good idea to initialize the SG entries one by one, and
even when we have a hard page/offset/size thing, we should not set them
one by one, and we should probably extend sg_set_page() to always take
offset and length too, since setting one without the other two is never
really sensible!)

Linus
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/