Re: [patch 2/4] CPU Hotplug support for X86_64

From: Andi Kleen
Date: Tue May 24 2005 - 07:30:13 EST


On Tue, May 24, 2005 at 01:11:15AM -0700, Ashok Raj wrote:
> /*
> @@ -97,6 +97,26 @@ cpumask_t cpu_core_map[NR_CPUS] __cachel
> extern unsigned char trampoline_data[];
> extern unsigned char trampoline_end[];
>
> +/* State of each CPU */
> +DEFINE_PER_CPU(int, cpu_state) = { 0 };
> +
> +#ifdef CONFIG_HOTPLUG_CPU
> +/*
> + * Store all idle threads, this can be reused instead of creating
> + * a new thread. Also avoids complicated thread destroy functionality
> + * for idle threads.
> + */
> +struct task_struct *idle_thread_array[NR_CPUS];
> +
> +#define get_idle_for_cpu(x) (idle_thread_array[(x)])
> +#define set_idle_for_cpu(x,p) (idle_thread_array[(x)] = (p))

Why is this only enabled for HOTPLUG_CPU? It looks like it could
be used for the !HOTPLUG case too. That would be preferable
to have less ifdefs.

>
> -static __cpuinitdata DEFINE_SPINLOCK(tsc_sync_lock);
> -static volatile __cpuinitdata unsigned long go[SLAVE + 1];
> -static int notscsync __cpuinitdata;
> +static __devinitdata DEFINE_SPINLOCK(tsc_sync_lock);
> +static volatile __devinitdata unsigned long go[SLAVE + 1];
> +static int notscsync __devinitdata;

Should be __cpuinitdata

>
> #undef DEBUG_TSC_SYNC
>
> @@ -192,7 +212,7 @@ static int notscsync __cpuinitdata;
> #define NUM_ITERS 5 /* likewise */
>
> /* Callback on boot CPU */
> -static __cpuinit void sync_master(void *arg)
> +static __devinit void sync_master(void *arg)

Didnt we agree to not do these changes?

Lots more cases in this file. The patch would be a lot smaller without
them.

> @@ -410,6 +430,8 @@ void __cpuinit smp_callin(void)
> * Allow the master to continue.
> */
> cpu_set(cpuid, cpu_callin_map);
> + mb();
> + local_flush_tlb();

Why is this needed?

> +#ifndef CONFIG_HOTPLUG_CPU
> cpu_set(i, cpu_possible_map);
> +#endif
> }
> +#ifdef CONFIG_HOTPLUG_CPU
> + printk ("Setting possible cpus %d\n", i);
> + cpu_set(i, cpu_possible_map);
> +#endif

Why these two ifdefs? If possible remove them.


> @@ -1007,7 +1080,10 @@ int __cpuinit __cpu_up(unsigned int cpu)
>
> while (!cpu_isset(cpu, cpu_online_map))
> cpu_relax();
> - return 0;
> + err = 0;
> +ret:
> + flush_tlb_all();

Why this flush again?

How do you prevent the BP from being offlined? Currently
some stuff (NMIs, timer) rely on it being present :/ Longer
term they need to be fixed of course, but short term I would
refuse to offline it. Needs an audit probably.

> + return -EBUSY;
> +
> + disable_APIC_timer();
> +
> + /* Allow any queued timer interrupts to get serviced */
> + local_irq_enable();
> + mdelay(1);

This wont work with variable timer tick. Need some other way
to kick the timer. It looks unreliable anyways.

> +
> + /*
> + * Need this per zwane, but this uses IPI, so cannot be used
> + * in the machine down state. Need to find something else
> + *
> + * flush_tlb_all();
> + */
> + local_flush_tlb();
> + local_irq_disable();
> + remove_siblinginfo(cpu);

No idea what all these TLB flushes are good for. Can you describe
them?

> /* Only the BSP gets external NMIs from the system. */
> - if (!smp_processor_id())
> + if (!cpu)

e.g. here you have the first BP dependency... Ideally
it would shift if it is ever offlined.

> reason = get_nmi_reason();
>
> +#ifdef CONFIG_HOTPLUG_CPU
> + if (!cpu_online(cpu))
> + return;
> +#endif

Please remove the ifdef.

-Andi
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/