Re: [PATCH v7 1/2] tty: hvc: pass DMA capable memory to put_chars()

From: Jiri Slaby
Date: Tue Aug 17 2021 - 23:17:34 EST


Hi,

On 17. 08. 21, 15:22, Xianting Tian wrote:
As well known, hvc backend can register its opertions to hvc backend.
the opertions contain put_chars(), get_chars() and so on.

"operations". And there too:

Some hvc backend may do dma in its opertions. eg, put_chars() of
virtio-console. But in the code of hvc framework, it may pass DMA
incapable memory to put_chars() under a specific configuration, which
is explained in commit c4baad5029(virtio-console: avoid DMA from stack):
1, c[] is on stack,
hvc_console_print():
char c[N_OUTBUF] __ALIGNED__;
cons_ops[index]->put_chars(vtermnos[index], c, i);
2, ch is on stack,
static void hvc_poll_put_char(,,char ch)
{
struct tty_struct *tty = driver->ttys[0];
struct hvc_struct *hp = tty->driver_data;
int n;

do {
n = hp->ops->put_chars(hp->vtermno, &ch, 1);
} while (n <= 0);
}

Commit c4baad5029 is just the fix to avoid DMA from stack memory, which
is passed to virtio-console by hvc framework in above code. But I think
the fix is aggressive, it directly uses kmemdup() to alloc new buffer
from kmalloc area and do memcpy no matter the memory is in kmalloc area
or not. But most importantly, it should better be fixed in the hvc
framework, by changing it to never pass stack memory to the put_chars()
function in the first place. Otherwise, we still face the same issue if
a new hvc backend using dma added in the furture.

We make 'char c[N_OUTBUF]' part of 'struct hvc_struct', so hp->c is no
longer the stack memory. we can use it in above two cases.

In fact, you need only a single char for the poll case (hvc_poll_put_char), so hvc_struct needs to contain only c, not an array.

OTOH, you need c[N_OUTBUF] in the console case (hvc_console_print), but not whole hvc_struct. So cons_hvcs should be an array of structs composed of only the lock and the buffer.

Hum.

Or maybe rethink and take care of the console case by kmemdup and be done with that case? What problem do you have with allocating 16 bytes? It should be quite easy and really fast (lockless) in most cases. On the contrary, your solution has to take a spinlock to access the global buffer.

Other fix is use L1_CACHE_BYTES as the alignment, use 'sizeof(long)' as
dma alignment is wrong. And use struct_size() to calculate size of
hvc_struct.

This ought to be in separate patches.

thanks,
--
js
suse labs