Re: [PATCHv2 17/29] x86/acpi, x86/boot: Add multiprocessor wake-up support

From: Thomas Gleixner
Date: Tue Feb 01 2022 - 18:27:52 EST


On Mon, Jan 24 2022 at 18:02, Kirill A. Shutemov wrote:
> +#ifdef CONFIG_X86_64
> +/* Physical address of the Multiprocessor Wakeup Structure mailbox */
> +static u64 acpi_mp_wake_mailbox_paddr;
> +/* Virtual address of the Multiprocessor Wakeup Structure mailbox */
> +static struct acpi_madt_multiproc_wakeup_mailbox *acpi_mp_wake_mailbox;
> +/* Lock to protect mailbox (acpi_mp_wake_mailbox) from parallel access */
> +static DEFINE_SPINLOCK(mailbox_lock);
> +#endif
> +
> #ifdef CONFIG_X86_IO_APIC
> /*
> * Locks related to IOAPIC hotplug
> @@ -336,6 +345,80 @@ acpi_parse_lapic_nmi(union acpi_subtable_headers * header, const unsigned long e
> return 0;
> }
>
> +#ifdef CONFIG_X86_64
> +/* Virtual address of the Multiprocessor Wakeup Structure mailbox */
> +static int acpi_wakeup_cpu(int apicid, unsigned long start_ip)
> +{
> + static physid_mask_t apic_id_wakemap = PHYSID_MASK_NONE;
> + unsigned long flags;
> + u8 timeout;
> +
> + /* Remap mailbox memory only for the first call to acpi_wakeup_cpu() */
> + if (physids_empty(apic_id_wakemap)) {
> + acpi_mp_wake_mailbox = memremap(acpi_mp_wake_mailbox_paddr,
> + sizeof(*acpi_mp_wake_mailbox),
> + MEMREMAP_WB);
> + }
> +
> + /*
> + * According to the ACPI specification r6.4, section titled
> + * "Multiprocessor Wakeup Structure" the mailbox-based wakeup
> + * mechanism cannot be used more than once for the same CPU.
> + * Skip wakeups if they are attempted more than once.
> + */
> + if (physid_isset(apicid, apic_id_wakemap)) {
> + pr_err("CPU already awake (APIC ID %x), skipping wakeup\n",
> + apicid);
> + return -EINVAL;
> + }
> +
> + spin_lock_irqsave(&mailbox_lock, flags);

What's the reason that interrupts need to be disabled here? The comment
above this invocation is not really informative ...

> + /*
> + * Mailbox memory is shared between firmware and OS. Firmware will
> + * listen on mailbox command address, and once it receives the wakeup
> + * command, CPU associated with the given apicid will be booted.
> + *
> + * The value of apic_id and wakeup_vector has to be set before updating
> + * the wakeup command. To let compiler preserve order of writes, use
> + * smp_store_release.

What? If the only purpose is to tell the compiler to preserve code
ordering then why are you using smp_store_release() here?
smp_store_release() is way more than that...

> + */
> + smp_store_release(&acpi_mp_wake_mailbox->apic_id, apicid);
> + smp_store_release(&acpi_mp_wake_mailbox->wakeup_vector, start_ip);
> + smp_store_release(&acpi_mp_wake_mailbox->command,
> + ACPI_MP_WAKE_COMMAND_WAKEUP);
> +
> + /*
> + * After writing the wakeup command, wait for maximum timeout of 0xFF
> + * for firmware to reset the command address back zero to indicate
> + * the successful reception of command.
> + * NOTE: 0xFF as timeout value is decided based on our experiments.
> + *
> + * XXX: Change the timeout once ACPI specification comes up with
> + * standard maximum timeout value.
> + */
> + timeout = 0xFF;
> + while (READ_ONCE(acpi_mp_wake_mailbox->command) && --timeout)
> + cpu_relax();
> +
> + /* If timed out (timeout == 0), return error */
> + if (!timeout) {

So this leaves a stale acpi_mp_wake_mailbox->command. What checks that
acpi_mp_wake_mailbox->command is 0 on the next invocation?

Aside of that assume timeout happens and the firmware acts after this
returned. Then you have inconsistent state as well. Error handling is
not trivial, but making it hope based is the worst kind.

> + spin_unlock_irqrestore(&mailbox_lock, flags);
> + return -EIO;
> + }
> +
> + /*
> + * If the CPU wakeup process is successful, store the
> + * status in apic_id_wakemap to prevent re-wakeup
> + * requests.
> + */
> + physid_set(apicid, apic_id_wakemap);
> +
> + spin_unlock_irqrestore(&mailbox_lock, flags);
> +
> + return 0;
> +}
> +#endif
> #endif /*CONFIG_X86_LOCAL_APIC */

Thanks,

tglx