Re: [PATCH printk v2 08/26] printk: nbcon: Implement processing in port->lock wrapper

From: John Ogness
Date: Fri Mar 15 2024 - 11:05:18 EST


On 2024-03-14, Petr Mladek <pmladek@xxxxxxxx> wrote:
> Well, it brings another question. Does this allow to have
> the following situation?
>
> CPU0 CPU1
>
> some_function()
> uart_port_lock()
> // locked just with up->lock
> // doing something with the port
>
> register_console()
> // add struct console using the same
> // port as CPU0
> printk()
> console_try_lock()
> console_unlock()
> console_flush_all()
> // acquire context for the newly
> // registered nbcon
> nbcon_context_try_acquire(ctxt)
> con->write()
>
> BANG: Both CPU0 and CPU1 are writing to the same port.
>
> Reason: CPU0 locked only via port->lock.
> CPU1 locked only by acquiring nbcon context.

Great catch! Yes, this is possible. :-/

When the kthread series part is introduced, there will be additional
callbacks that nbcon consoles must implement
(driver_enter()/driver_exit()). These provide driver-level
synchronization. In the case of serial uarts, the callbacks map to
locking/unlocking the port lock.

If I were to introduce those callbacks in _this_ series, they can be
used when adding a console to the list in register_console(). This
changes your example to:

CPU0 CPU1

some_function()
uart_port_lock()
// locked just with up->lock
// doing something with the port

register_console()
// add struct console using the same
// port as CPU0
newcon->driver_enter()
spin_lock(port_lock)
// spin on CPU0
uart_port_unlock()
// add new console to console list
newcon->driver_exit()
spin_unlock(port_lock)
...

If any other CPUs come in and call uart_port_lock(), they will see the
console as registered and will acquire the nbcon to avoid the BANG.

> Maybe, this is not possible because the console is registered when
> the struct uart_port is being initialized and nobody could
> use the same port in parallel, except for the early console.
> Where the early console is serialized using the console_lock().

Yes, it is possible. Just check out:

find /sys/ -name console -type f

If you echo 'Y' or 'N' into any of those files, you can dynamically
register and unregister those consoles, respectively.

I just ran some tests to verify this and was even able to trigger a
mainline bug because probe_baud() of the 8250 driver is not called under
the port lock. This is essentially the same scenario you
illustrated. But the 8250 probe_baud() issue is a driver bug and not
related to this series.

Getting back to this series, my proposal would change register_console()
like this:

diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index 68657d4d6649..25a0a81e8397 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -3733,6 +3733,7 @@ void register_console(struct console *newcon)
struct console *con;
bool bootcon_registered = false;
bool realcon_registered = false;
+ unsigned long flags;
int err;

console_list_lock();
@@ -3831,6 +3832,19 @@ void register_console(struct console *newcon)
if (newcon->flags & CON_BOOT)
have_boot_console = true;

+ /*
+ * If another context is actively using the hardware of this new
+ * console, it will not be aware of the nbcon synchronization. This
+ * is a risk that two contexts could access the hardware
+ * simultaneously if this new console is used for atomic printing
+ * and the other context is still using the hardware.
+ *
+ * Use the driver synchronization to ensure that the hardware is not
+ * in use while this new console transitions to being registered.
+ */
+ if ((newcon->flags & CON_NBCON) && newcon->write_atomic)
+ newcon->driver_enter(newcon, &flags);
+
/*
* Put this console in the list - keep the
* preferred driver at the head of the list.
@@ -3855,6 +3869,10 @@ void register_console(struct console *newcon)
* register_console() completes.
*/

+ /* This new console is now registered. */
+ if ((newcon->flags & CON_NBCON) && newcon->write_atomic)
+ newcon->driver_exit(newcon, flags);
+
console_sysfs_notify();

/*

> One solution would be to add nbcon consoles into the console_list
> under uart_port_lock().

This is what I have proposed and I think it is the most straight forward
solution.

> Another solution would be to make sure that any code serialized
> by uart_port_lock() will be already synchronized by nbcon context
> while the nbcon is added into the console_list.

I do not think this would be acceptable. It would mean that non-console
ports would need to lock the nbcon. Not only will that slow down the
non-console ports, but it will also cause serious contention between the
ports. (Remember, all the ports share the same struct console.)

> Maybe, we could do this in con->setup() callback. Something like:

This proposal would work, but IMHO it adds too much complexity by
requiring console drivers to implement the callbacks and do special
things in those callbacks.

>> The various fields have the following meaning (AFAICT):
>>
>> port->line: An identifier to represent a particular port supported by a
>> driver.
>>
>> port->cons: The struct console to use if this port is chosen to be a
>> console.
>>
>> port->console: Boolean, true if this port was chosen to be a
>> console. (Used only by the tty layer.)
>>
>> cons->index: The port chosen by printk to be a console.
>>
> This is a great description. It would be great to have it somewhere in
> the sources. Maybe, above the locking/acquire functions.

OK.

John