Re: [PATCH] x86: make generic arch support NUMAQ v5

From: Andy Whitcroft
Date: Mon Jun 09 2008 - 10:43:00 EST


On Sun, Jun 08, 2008 at 06:31:54PM -0700, Yinghai Lu wrote:
>
> so it could fallback to normal numa.
> NUMAQ depends on GENERICARCH
> also decouple genericarch numa with acpi.
> also make it fallback to bigsmp if apicid > 8.
>
> v3: return early if not found_numaq in pci_numa_init
> remove xquad_portio in misc.c
> v4: make summit, bigsmp and es7000 depend on GENERICARCH too
> v5: seperate apicid check for bigsmp to another patch
> [PATCH] x86: introduce max_physical_apicid for bigsmp switching

Do you have a NUMA-Q to test this on? Also, what is the baseline here
as I would like to test it?

>
> Signed-off-by: Yinghai Lu <yhlu.kernel@xxxxxxxxx>
>
> Index: linux-2.6/arch/x86/Kconfig
> ===================================================================
> --- linux-2.6.orig/arch/x86/Kconfig
> +++ linux-2.6/arch/x86/Kconfig
> @@ -264,36 +264,6 @@ config X86_VOYAGER
> If you do not specifically know you have a Voyager based machine,
> say N here, otherwise the kernel you build will not be bootable.
>
> -config X86_NUMAQ
> - bool "NUMAQ (IBM/Sequent)"
> - depends on SMP && X86_32 && PCI
> - select NUMA
> - help
> - This option is used for getting Linux to run on a (IBM/Sequent) NUMA
> - multiquad box. This changes the way that processors are bootstrapped,
> - and uses Clustered Logical APIC addressing mode instead of Flat Logical.
> - You will need a new lynxer.elf file to flash your firmware with - send
> - email to <Martin.Bligh@xxxxxxxxxx>.
> -
> -config X86_SUMMIT
> - bool "Summit/EXA (IBM x440)"
> - depends on X86_32 && SMP
> - help
> - This option is needed for IBM systems that use the Summit/EXA chipset.
> - In particular, it is needed for the x440.
> -
> - If you don't have one of these computers, you should say N here.
> - If you want to build a NUMA kernel, you must select ACPI.
> -
> -config X86_BIGSMP
> - bool "Support for other sub-arch SMP systems with more than 8 CPUs"
> - depends on X86_32 && SMP
> - help
> - This option is needed for the systems that have more than 8 CPUs
> - and if the system is not of any sub-arch type above.
> -
> - If you don't have such a system, you should say N here.
> -
> config X86_VISWS
> bool "SGI 320/540 (Visual Workstation)"
> depends on X86_32 && !PCI
> @@ -307,12 +277,33 @@ config X86_VISWS
> and vice versa. See <file:Documentation/sgi-visws.txt> for details.
>
> config X86_GENERICARCH
> - bool "Generic architecture (Summit, bigsmp, ES7000, default)"
> + bool "Generic architecture"
> depends on X86_32
> help
> - This option compiles in the Summit, bigsmp, ES7000, default subarchitectures.
> - It is intended for a generic binary kernel.
> - If you want a NUMA kernel, select ACPI. We need SRAT for NUMA.
> + This option compiles in the NUMAQ, Summit, bigsmp, ES7000, default
> + subarchitectures. It is intended for a generic binary kernel.
> + if you select them all, kernel will probe it one by one. and will
> + fallback to default.
> +
> +if X86_GENERICARCH
> +
> +config X86_NUMAQ
> + bool "NUMAQ (IBM/Sequent)"
> + depends on SMP && X86_32 && PCI

Can we not just add && X86_GENERICARCH to the depends line here instead
of wrapping them in that if block?

> + select NUMA
> + help
> + This option is used for getting Linux to run on a NUMAQ (IBM/Sequent)
> + NUMA multiquad box. This changes the way that processors are
> + bootstrapped, and uses Clustered Logical APIC addressing mode instead
> + of Flat Logical. You will need a new lynxer.elf file to flash your
> + firmware with - send email to <Martin.Bligh@xxxxxxxxxx>.
> +
> +config X86_SUMMIT
> + bool "Summit/EXA (IBM x440)"
> + depends on X86_32 && SMP
> + help
> + This option is needed for IBM systems that use the Summit/EXA chipset.
> + In particular, it is needed for the x440.
>
> config X86_ES7000
> bool "Support for Unisys ES7000 IA32 series"
> @@ -320,8 +311,15 @@ config X86_ES7000
> help
> Support for Unisys ES7000 systems. Say 'Y' here if this kernel is
> supposed to run on an IA32-based Unisys ES7000 system.
> - Only choose this option if you have such a system, otherwise you
> - should say N here.
> +
> +config X86_BIGSMP
> + bool "Support for big SMP systems with more than 8 CPUs"
> + depends on X86_32 && SMP
> + help
> + This option is needed for the systems that have more than 8 CPUs
> + and if the system is not of any sub-arch type above.
> +
> +endif
>
> config X86_RDC321X
> bool "RDC R-321x SoC"
> @@ -908,9 +906,9 @@ config X86_PAE
> config NUMA
> bool "Numa Memory Allocation and Scheduler Support (EXPERIMENTAL)"
> depends on SMP
> - depends on X86_64 || (X86_32 && HIGHMEM64G && (X86_NUMAQ || (X86_SUMMIT || X86_GENERICARCH) && ACPI) && EXPERIMENTAL)
> + depends on X86_64 || (X86_32 && HIGHMEM64G && (X86_NUMAQ || X86_GENERICARCH || X86_SUMMIT && ACPI) && EXPERIMENTAL)
> default n if X86_PC
> - default y if (X86_NUMAQ || X86_SUMMIT)
> + default y if (X86_NUMAQ || X86_SUMMIT || X86_GENERICARCH)

If I am reading this right, we are making all genericarch kernels NUMA,
which they were not before. Hmm, is that going to cause problems
elsewhere? Mind you, can you even get non-NUMA boxes any more?

If it's only NUMAQ which makes that requirement, it seems wrong to add
GENERICARCH here, i.e. it's NUMAQ or SUMMIT that brings the requirement.
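
To make the precedence concrete, here is a minimal sketch in plain C (not
kernel code) that evaluates the X86_32 part of the old and new NUMA
"depends on" expressions as booleans. Kconfig, like C, binds && tighter
than ||. The function names and flag parameters are hypothetical stand-ins
for the Kconfig symbols, and HIGHMEM64G and EXPERIMENTAL are left out
because both versions require them equally.

```c
#include <stdbool.h>

/* Old: X86_NUMAQ || (X86_SUMMIT || X86_GENERICARCH) && ACPI */
static bool numa_allowed_old(bool numaq, bool summit, bool generic, bool acpi)
{
	return numaq || ((summit || generic) && acpi);
}

/* New: X86_NUMAQ || X86_GENERICARCH || X86_SUMMIT && ACPI */
static bool numa_allowed_new(bool numaq, bool summit, bool generic, bool acpi)
{
	return numaq || generic || (summit && acpi);
}
```

With only GENERICARCH set and ACPI off, the old expression is false and
the new one is true, which is the widening I am asking about.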

> help
> Enable NUMA (Non Uniform Memory Access) support.
> The kernel will try to allocate memory used by a CPU on the
> Index: linux-2.6/arch/x86/kernel/io_apic_32.c
> ===================================================================
> --- linux-2.6.orig/arch/x86/kernel/io_apic_32.c
> +++ linux-2.6/arch/x86/kernel/io_apic_32.c
> @@ -1715,7 +1715,6 @@ void disable_IO_APIC(void)
> * by Matt Domsch <Matt_Domsch@xxxxxxxx> Tue Dec 21 12:25:05 CST 1999
> */
>
> -#ifndef CONFIG_X86_NUMAQ
> static void __init setup_ioapic_ids_from_mpc(void)
> {
> union IO_APIC_reg_00 reg_00;
> @@ -1725,6 +1724,11 @@ static void __init setup_ioapic_ids_from
> unsigned char old_id;
> unsigned long flags;
>
> +#ifdef CONFIG_X86_NUMAQ
> + if (found_numaq)
> + return;
> +#endif
> +

Could this not always be compiled in? As long as found_numaq is never 1
we should be ok.
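
Something along these lines is what I have in mind: a minimal sketch,
assuming found_numaq can be given a constant-zero fallback when NUMAQ is
not configured. Only CONFIG_X86_NUMAQ and found_numaq come from the patch;
setup_ioapic_ids_sketch() and its return values are made up for
illustration.

```c
/*
 * Hypothetical sketch: make found_numaq usable in every build so that
 * callers can test it without an #ifdef of their own.
 */
#ifdef CONFIG_X86_NUMAQ
extern int found_numaq;		/* set by the MP-table OEM check */
#else
#define found_numaq 0		/* constant, so the branch compiles away */
#endif

/* Returns 1 if the I/O APIC IDs would be set up, 0 if skipped. */
static int setup_ioapic_ids_sketch(void)
{
	if (found_numaq)
		return 0;
	/* ... the real ID setup would go here ... */
	return 1;
}
```

With the fallback in place the early return costs nothing on non-NUMAQ
kernels, and the function body reads the same in both configurations.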

> /*
> * Don't check I/O APIC IDs for xAPIC systems. They have
> * no meaning without the serial APIC bus.
> @@ -1821,9 +1825,6 @@ static void __init setup_ioapic_ids_from
> apic_printk(APIC_VERBOSE, " ok.\n");
> }
> }
> -#else
> -static void __init setup_ioapic_ids_from_mpc(void) { }
> -#endif
>
> int no_timer_check __initdata;
>
> Index: linux-2.6/arch/x86/kernel/mpparse.c
> ===================================================================
> --- linux-2.6.orig/arch/x86/kernel/mpparse.c
> +++ linux-2.6/arch/x86/kernel/mpparse.c
> @@ -49,15 +49,73 @@ static int __init mpf_checksum(unsigned
> }
>
> #ifdef CONFIG_X86_NUMAQ
> +int found_numaq;
> /*
> * Have to match translation table entries to main table entries by counter
> * hence the mpc_record variable .... can't see a less disgusting way of
> * doing this ....
> */
> +struct mpc_config_translation {
> + unsigned char mpc_type;
> + unsigned char trans_len;
> + unsigned char trans_type;
> + unsigned char trans_quad;
> + unsigned char trans_global;
> + unsigned char trans_local;
> + unsigned short trans_reserved;
> +};
> +
>
> static int mpc_record;
> static struct mpc_config_translation *translation_table[MAX_MPC_ENTRY]
> __cpuinitdata;
> +
> +static inline int generate_logical_apicid(int quad, int phys_apicid)
> +{
> + return (quad << 4) + (phys_apicid ? phys_apicid << 1 : 1);
> +}
> +
> +
> +static inline int mpc_apic_id(struct mpc_config_processor *m,
> + struct mpc_config_translation *translation_record)
> +{
> + int quad = translation_record->trans_quad;
> + int logical_apicid = generate_logical_apicid(quad, m->mpc_apicid);
> +
> + printk(KERN_DEBUG "Processor #%d %u:%u APIC version %d (quad %d, apic %d)\n",
> + m->mpc_apicid,
> + (m->mpc_cpufeature & CPU_FAMILY_MASK) >> 8,
> + (m->mpc_cpufeature & CPU_MODEL_MASK) >> 4,
> + m->mpc_apicver, quad, logical_apicid);
> + return logical_apicid;
> +}
> +
> +int mp_bus_id_to_node[MAX_MP_BUSSES];
> +
> +int mp_bus_id_to_local[MAX_MP_BUSSES];
> +
> +static void mpc_oem_bus_info(struct mpc_config_bus *m, char *name,
> + struct mpc_config_translation *translation)
> +{
> + int quad = translation->trans_quad;
> + int local = translation->trans_local;
> +
> + mp_bus_id_to_node[m->mpc_busid] = quad;
> + mp_bus_id_to_local[m->mpc_busid] = local;
> + printk(KERN_INFO "Bus #%d is %s (node %d)\n",
> + m->mpc_busid, name, quad);
> +}
> +
> +int quad_local_to_mp_bus_id [NR_CPUS/4][4];
> +static void mpc_oem_pci_bus(struct mpc_config_bus *m,
> + struct mpc_config_translation *translation)
> +{
> + int quad = translation->trans_quad;
> + int local = translation->trans_local;
> +
> + quad_local_to_mp_bus_id[quad][local] = m->mpc_busid;
> +}
> +
> #endif
>
> static void __cpuinit MP_processor_info(struct mpc_config_processor *m)
> @@ -321,11 +382,11 @@ static void __init smp_read_mpc_oem(stru
> }
> }
>
> -static inline void mps_oem_check(struct mp_config_table *mpc, char *oem,
> +void numaq_mps_oem_check(struct mp_config_table *mpc, char *oem,
> char *productid)
> {
> if (strncmp(oem, "IBM NUMA", 8))
> - printk("Warning! May not be a NUMA-Q system!\n");
> + printk("Warning! Not a NUMA-Q system!\n");
> else
> found_numaq = 1;
>
> @@ -388,7 +449,16 @@ static int __init smp_read_mpc(struct mp
> return 0;
>
> #ifdef CONFIG_X86_32
> - mps_oem_check(mpc, oem, str);
> + /*
> + * need to make sure summit and es7000's mps_oem_check is safe to be
> + * called early via genericarch 's mps_oem_check
> + */
> + if (early) {
> +#ifdef CONFIG_X86_NUMAQ
> + numaq_mps_oem_check(mpc, oem, str);
> +#endif

Is there any reason we cannot use:

if (found_numaq)
numaq_mps_oem_check(mpc, oem, str);

Also, why is this dependent on 'early'? There doesn't seem to be such
a check in the original path.
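
To spell out what that guard would need, here is a minimal sketch,
assuming (as the mpparse.c hunk above suggests) that found_numaq is only
ever set inside numaq_mps_oem_check() itself. The _sketch names are
hypothetical, and the functions take only the OEM string for brevity.

```c
#include <string.h>

static int found_numaq_sketch;

/* Mirrors the quoted check: the flag is set only on an OEM match. */
static void numaq_mps_oem_check_sketch(const char *oem)
{
	if (strncmp(oem, "IBM NUMA", 8) == 0)
		found_numaq_sketch = 1;
}

/* Guarded variant: calls the check only if the flag is already set. */
static void guarded_check_sketch(const char *oem)
{
	if (found_numaq_sketch)
		numaq_mps_oem_check_sketch(oem);
}
```

Under that assumption the guarded form only works if something earlier
has already set the flag; if nothing has, the check that sets it is never
reached.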


> + } else
> + mps_oem_check(mpc, oem, str);
> #endif
>
> /* save the local APIC address, it might be non-default */
> Index: linux-2.6/arch/x86/kernel/numaq_32.c
> ===================================================================
> --- linux-2.6.orig/arch/x86/kernel/numaq_32.c
> +++ linux-2.6/arch/x86/kernel/numaq_32.c
> @@ -36,8 +36,6 @@
>
> #define MB_TO_PAGES(addr) ((addr) << (20 - PAGE_SHIFT))
>
> -int found_numaq;
> -
> /*
> * Function: smp_dump_qct()
> *
> @@ -105,13 +103,3 @@ static int __init numaq_tsc_disable(void
> }
> arch_initcall(numaq_tsc_disable);
>
> -#ifdef CONFIG_ACPI
> -/*
> - * Dummy implementation:
> - */
> -struct pci_bus * __devinit
> -pci_acpi_scan_root(struct acpi_device *device, int domain, int busnum)
> -{
> - return NULL;
> -}
> -#endif
> Index: linux-2.6/arch/x86/mach-generic/Makefile
> ===================================================================
> --- linux-2.6.orig/arch/x86/mach-generic/Makefile
> +++ linux-2.6/arch/x86/mach-generic/Makefile
> @@ -2,7 +2,11 @@
> # Makefile for the generic architecture
> #
>
> -EXTRA_CFLAGS := -Iarch/x86/kernel
> +EXTRA_CFLAGS := -Iarch/x86/kernel
>
> -obj-y := probe.o summit.o bigsmp.o es7000.o default.o
> -obj-y += ../../x86/mach-es7000/
> +obj-y := probe.o default.o
> +obj-$(CONFIG_X86_NUMAQ) += numaq.o
> +obj-$(CONFIG_X86_SUMMIT) += summit.o
> +obj-$(CONFIG_X86_BIGSMP) += bigsmp.o
> +obj-$(CONFIG_X86_ES7000) += es7000.o
> +obj-$(CONFIG_X86_ES7000) += ../../x86/mach-es7000/
> Index: linux-2.6/arch/x86/mach-generic/probe.c
> ===================================================================
> --- linux-2.6.orig/arch/x86/mach-generic/probe.c
> +++ linux-2.6/arch/x86/mach-generic/probe.c
> @@ -16,6 +16,7 @@
> #include <asm/apicdef.h>
> #include <asm/genapic.h>
>
> +extern struct genapic apic_numaq;
> extern struct genapic apic_summit;
> extern struct genapic apic_bigsmp;
> extern struct genapic apic_es7000;
> @@ -24,9 +25,18 @@ extern struct genapic apic_default;
> struct genapic *genapic = &apic_default;
>
> static struct genapic *apic_probe[] __initdata = {
> +#ifdef CONFIG_X86_NUMAQ
> + &apic_numaq,
> +#endif
> +#ifdef CONFIG_X86_SUMMIT
> &apic_summit,
> +#endif
> +#ifdef CONFIG_X86_BIGSMP
> &apic_bigsmp,
> +#endif
> +#ifdef CONFIG_X86_ES7000
> &apic_es7000,
> +#endif
> &apic_default, /* must be last */
> NULL,
> };
> @@ -54,6 +64,7 @@ early_param("apic", parse_apic);
>
> void __init generic_bigsmp_probe(void)
> {
> +#if CONFIG_X86_BIGSMP
> /*
> * This routine is used to switch to bigsmp mode when
> * - There is no apic= option specified by the user
> @@ -67,6 +78,7 @@ void __init generic_bigsmp_probe(void)
> printk(KERN_INFO "Overriding APIC driver with %s\n",
> genapic->name);
> }
> +#endif
> }
>
> void __init generic_apic_probe(void)
> @@ -88,7 +100,8 @@ void __init generic_apic_probe(void)
>
> /* These functions can switch the APIC even after the initial ->probe() */
>
> -int __init mps_oem_check(struct mp_config_table *mpc, char *oem, char *productid)
> +int __init mps_oem_check(struct mp_config_table *mpc, char *oem,
> + char *productid)
> {
> int i;
> for (i = 0; apic_probe[i]; ++i) {

That looks like an unrelated cleanup?

> Index: linux-2.6/arch/x86/pci/Makefile_32
> ===================================================================
> --- linux-2.6.orig/arch/x86/pci/Makefile_32
> +++ linux-2.6/arch/x86/pci/Makefile_32
> @@ -13,10 +13,11 @@ pci-y := fixup.o
> pci-$(CONFIG_ACPI) += acpi.o
> pci-y += legacy.o irq.o
>
> -# Careful: VISWS and NUMAQ overrule the pci-y above. The colons are
> +# Careful: VISWS overrule the pci-y above. The colons are
> # therefor correct. This needs a proper fix by distangling the code.
> pci-$(CONFIG_X86_VISWS) := visws.o fixup.o
> -pci-$(CONFIG_X86_NUMAQ) := numa.o irq.o
> +
> +pci-$(CONFIG_X86_NUMAQ) += numa.o
>
> # Necessary for NUMAQ as well
> pci-$(CONFIG_NUMA) += mp_bus_to_node.o
> Index: linux-2.6/arch/x86/pci/numa.c
> ===================================================================
> --- linux-2.6.orig/arch/x86/pci/numa.c
> +++ linux-2.6/arch/x86/pci/numa.c
> @@ -6,45 +6,21 @@
> #include <linux/init.h>
> #include <linux/nodemask.h>
> #include <mach_apic.h>
> +#include <asm/mpspec.h>
> #include "pci.h"
>
> #define XQUAD_PORTIO_BASE 0xfe400000
> #define XQUAD_PORTIO_QUAD 0x40000 /* 256k per quad. */
>
> -int mp_bus_id_to_node[MAX_MP_BUSSES];
> #define BUS2QUAD(global) (mp_bus_id_to_node[global])
>
> -int mp_bus_id_to_local[MAX_MP_BUSSES];
> #define BUS2LOCAL(global) (mp_bus_id_to_local[global])
>
> -void mpc_oem_bus_info(struct mpc_config_bus *m, char *name,
> - struct mpc_config_translation *translation)
> -{
> - int quad = translation->trans_quad;
> - int local = translation->trans_local;
> -
> - mp_bus_id_to_node[m->mpc_busid] = quad;
> - mp_bus_id_to_local[m->mpc_busid] = local;
> - printk(KERN_INFO "Bus #%d is %s (node %d)\n",
> - m->mpc_busid, name, quad);
> -}
> -
> -int quad_local_to_mp_bus_id [NR_CPUS/4][4];
> #define QUADLOCAL2BUS(quad,local) (quad_local_to_mp_bus_id[quad][local])
> -void mpc_oem_pci_bus(struct mpc_config_bus *m,
> - struct mpc_config_translation *translation)
> -{
> - int quad = translation->trans_quad;
> - int local = translation->trans_local;
> -
> - quad_local_to_mp_bus_id[quad][local] = m->mpc_busid;
> -}
>
> /* Where the IO area was mapped on multiquad, always 0 otherwise */
> void *xquad_portio;
> -#ifdef CONFIG_X86_NUMAQ
> EXPORT_SYMBOL(xquad_portio);
> -#endif
>
> #define XQUAD_PORT_ADDR(port, quad) (xquad_portio + (XQUAD_PORTIO_QUAD*quad) + port)
>
> @@ -179,6 +155,9 @@ static int __init pci_numa_init(void)
> {
> int quad;
>
> + if (!found_numaq)
> + return 0;
> +
> raw_pci_ops = &pci_direct_conf1_mq;
>
> if (pcibios_scanned++)
> Index: linux-2.6/include/asm-x86/mach-generic/mach_mpparse.h
> ===================================================================
> --- linux-2.6.orig/include/asm-x86/mach-generic/mach_mpparse.h
> +++ linux-2.6/include/asm-x86/mach-generic/mach_mpparse.h
> @@ -1,7 +1,10 @@
> #ifndef _MACH_MPPARSE_H
> #define _MACH_MPPARSE_H 1
>
> -int mps_oem_check(struct mp_config_table *mpc, char *oem, char *productid);
> -int acpi_madt_oem_check(char *oem_id, char *oem_table_id);
> +
> +extern int mps_oem_check(struct mp_config_table *mpc, char *oem,
> + char *productid);
> +
> +extern int acpi_madt_oem_check(char *oem_id, char *oem_table_id);
>
> #endif
> Index: linux-2.6/include/asm-x86/mach-numaq/mach_apic.h
> ===================================================================
> --- linux-2.6.orig/include/asm-x86/mach-numaq/mach_apic.h
> +++ linux-2.6/include/asm-x86/mach-numaq/mach_apic.h
> @@ -20,8 +20,14 @@ static inline cpumask_t target_cpus(void
> #define INT_DELIVERY_MODE dest_LowestPrio
> #define INT_DEST_MODE 0 /* physical delivery on LOCAL quad */
>
> -#define check_apicid_used(bitmap, apicid) physid_isset(apicid, bitmap)
> -#define check_apicid_present(bit) physid_isset(bit, phys_cpu_present_map)
> +static inline unsigned long check_apicid_used(physid_mask_t bitmap, int apicid)
> +{
> + return physid_isset(apicid, bitmap);
> +}
> +static inline unsigned long check_apicid_present(int bit)
> +{
> + return physid_isset(bit, phys_cpu_present_map);
> +}
> #define apicid_cluster(apicid) (apicid & 0xF0)
>
> static inline int apic_id_registered(void)
> @@ -77,11 +83,6 @@ static inline int cpu_present_to_apicid(
> return BAD_APICID;
> }
>
> -static inline int generate_logical_apicid(int quad, int phys_apicid)
> -{
> - return (quad << 4) + (phys_apicid ? phys_apicid << 1 : 1);
> -}
> -
> static inline int apicid_to_node(int logical_apicid)
> {
> return logical_apicid >> 4;
> @@ -95,30 +96,6 @@ static inline physid_mask_t apicid_to_cp
> return physid_mask_of_physid(cpu + 4*node);
> }
>
> -struct mpc_config_translation {
> - unsigned char mpc_type;
> - unsigned char trans_len;
> - unsigned char trans_type;
> - unsigned char trans_quad;
> - unsigned char trans_global;
> - unsigned char trans_local;
> - unsigned short trans_reserved;
> -};
> -
> -static inline int mpc_apic_id(struct mpc_config_processor *m,
> - struct mpc_config_translation *translation_record)
> -{
> - int quad = translation_record->trans_quad;
> - int logical_apicid = generate_logical_apicid(quad, m->mpc_apicid);
> -
> - printk("Processor #%d %u:%u APIC version %d (quad %d, apic %d)\n",
> - m->mpc_apicid,
> - (m->mpc_cpufeature & CPU_FAMILY_MASK) >> 8,
> - (m->mpc_cpufeature & CPU_MODEL_MASK) >> 4,
> - m->mpc_apicver, quad, logical_apicid);
> - return logical_apicid;
> -}
> -
> extern void *xquad_portio;
>
> static inline void setup_portio_remap(void)
> Index: linux-2.6/include/asm-x86/mach-numaq/mach_mpparse.h
> ===================================================================
> --- linux-2.6.orig/include/asm-x86/mach-numaq/mach_mpparse.h
> +++ linux-2.6/include/asm-x86/mach-numaq/mach_mpparse.h
> @@ -1,14 +1,7 @@
> #ifndef __ASM_MACH_MPPARSE_H
> #define __ASM_MACH_MPPARSE_H
>
> -extern void mpc_oem_bus_info(struct mpc_config_bus *m, char *name,
> - struct mpc_config_translation *translation);
> -extern void mpc_oem_pci_bus(struct mpc_config_bus *m,
> - struct mpc_config_translation *translation);
> -
> -/* Hook from generic ACPI tables.c */
> -static inline void acpi_madt_oem_check(char *oem_id, char *oem_table_id)
> -{
> -}
> +extern void numaq_mps_oem_check(struct mp_config_table *mpc, char *oem,
> + char *productid);
>
> #endif /* __ASM_MACH_MPPARSE_H */
> Index: linux-2.6/include/asm-x86/mmzone_32.h
> ===================================================================
> --- linux-2.6.orig/include/asm-x86/mmzone_32.h
> +++ linux-2.6/include/asm-x86/mmzone_32.h
> @@ -12,11 +12,9 @@
> extern struct pglist_data *node_data[];
> #define NODE_DATA(nid) (node_data[nid])
>
> -#ifdef CONFIG_X86_NUMAQ
> - #include <asm/numaq.h>
> -#elif defined(CONFIG_ACPI_SRAT)/* summit or generic arch */
> - #include <asm/srat.h>
> -#endif
> +#include <asm/numaq.h>
> +/* summit or generic arch */
> +#include <asm/srat.h>
>
> extern int get_memcfg_numa_flat(void);
> /*
> @@ -26,14 +24,11 @@ extern int get_memcfg_numa_flat(void);
> */
> static inline void get_memcfg_numa(void)
> {
> -#ifdef CONFIG_X86_NUMAQ
> +
> if (get_memcfg_numaq())
> return;
> -#elif defined(CONFIG_ACPI_SRAT)
> if (get_memcfg_from_srat())
> return;
> -#endif
> -
> get_memcfg_numa_flat();
> }
>
> @@ -42,7 +37,6 @@ extern int early_pfn_to_nid(unsigned lon
> #else /* !CONFIG_NUMA */
>
> #define get_memcfg_numa get_memcfg_numa_flat
> -#define get_zholes_size(n) (0)
>
> #endif /* CONFIG_NUMA */
>
> @@ -83,9 +77,6 @@ static inline int pfn_to_nid(unsigned lo
> __pgdat->node_start_pfn + __pgdat->node_spanned_pages; \
> })
>
> -#ifdef CONFIG_X86_NUMAQ /* we have contiguous memory on NUMA-Q */
> -#define pfn_valid(pfn) ((pfn) < num_physpages)
> -#else
> static inline int pfn_valid(int pfn)
> {
> int nid = pfn_to_nid(pfn);
> @@ -94,7 +85,6 @@ static inline int pfn_valid(int pfn)
> return (pfn < node_end_pfn(nid));
> return 0;
> }
> -#endif /* CONFIG_X86_NUMAQ */

Ok, that is a small change in pfn_valid for numaq, but essentially it's a
little less efficient. We can probably live with that.
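
For comparison, a minimal sketch of the two forms in plain C. The struct
and the linear scan are hypothetical simplifications standing in for the
kernel's node data and its pfn_to_nid() lookup; only the single compare
against num_physpages is taken from the removed NUMA-Q macro.

```c
/* Hypothetical, simplified node table; the kernel uses pglist_data. */
struct node_span_sketch {
	unsigned long start_pfn;
	unsigned long end_pfn;	/* exclusive */
};

/* Old NUMA-Q form: memory is contiguous, so one compare suffices. */
static int pfn_valid_contig(unsigned long pfn, unsigned long num_physpages)
{
	return pfn < num_physpages;
}

/* Generic form: look for a node whose range contains the pfn. */
static int pfn_valid_nodes(unsigned long pfn,
			   const struct node_span_sketch *nodes, int nr_nodes)
{
	int nid;

	for (nid = 0; nid < nr_nodes; nid++) {
		if (pfn >= nodes[nid].start_pfn && pfn < nodes[nid].end_pfn)
			return 1;
	}
	return 0;
}
```

On a contiguous layout the two agree for every pfn; the generic form just
does a per-node lookup where the old one did a single compare.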

> #endif /* CONFIG_DISCONTIGMEM */
>
> Index: linux-2.6/include/asm-x86/numaq.h
> ===================================================================
> --- linux-2.6.orig/include/asm-x86/numaq.h
> +++ linux-2.6/include/asm-x86/numaq.h
> @@ -157,9 +157,10 @@ struct sys_cfg_data {
> struct eachquadmem eq[MAX_NUMNODES]; /* indexed by quad id */
> };
>
> -static inline unsigned long *get_zholes_size(int nid)
> +#else
> +static inline int get_memcfg_numaq(void)
> {
> - return NULL;
> + return 0;
> }
> #endif /* CONFIG_X86_NUMAQ */
> #endif /* NUMAQ_H */
> Index: linux-2.6/include/asm-x86/srat.h
> ===================================================================
> --- linux-2.6.orig/include/asm-x86/srat.h
> +++ linux-2.6/include/asm-x86/srat.h
> @@ -27,11 +27,13 @@
> #ifndef _ASM_SRAT_H_
> #define _ASM_SRAT_H_
>
> -#ifndef CONFIG_ACPI_SRAT
> -#error CONFIG_ACPI_SRAT not defined, and srat.h header has been included
> -#endif
> -
> +#ifdef CONFIG_ACPI_SRAT
> extern int get_memcfg_from_srat(void);
> -extern unsigned long *get_zholes_size(int);
> +#else
> +static inline int get_memcfg_from_srat(void)
> +{
> + return 0;
> +}
> +#endif
>
> #endif /* _ASM_SRAT_H_ */
> Index: linux-2.6/arch/x86/mach-generic/numaq.c
> ===================================================================
> --- /dev/null
> +++ linux-2.6/arch/x86/mach-generic/numaq.c
> @@ -0,0 +1,41 @@
> +/*
> + * APIC driver for the IBM NUMAQ chipset.
> + */
> +#define APIC_DEFINITION 1
> +#include <linux/threads.h>
> +#include <linux/cpumask.h>
> +#include <linux/smp.h>
> +#include <asm/mpspec.h>
> +#include <asm/genapic.h>
> +#include <asm/fixmap.h>
> +#include <asm/apicdef.h>
> +#include <linux/kernel.h>
> +#include <linux/string.h>
> +#include <linux/init.h>
> +#include <asm/mach-numaq/mach_apic.h>
> +#include <asm/mach-numaq/mach_apicdef.h>
> +#include <asm/mach-numaq/mach_ipi.h>
> +#include <asm/mach-numaq/mach_mpparse.h>
> +#include <asm/mach-numaq/mach_wakecpu.h>
> +#include <asm/numaq.h>
> +
> +static int mps_oem_check(struct mp_config_table *mpc, char *oem,
> + char *productid)
> +{
> + numaq_mps_oem_check(mpc, oem, productid);
> + return found_numaq;
> +}
> +
> +static int probe_numaq(void)
> +{
> + /* already know from get_memcfg_numaq() */
> + return found_numaq;
> +}
> +
> +/* Hook from generic ACPI tables.c */
> +static int acpi_madt_oem_check(char *oem_id, char *oem_table_id)
> +{
> + return 0;
> +}
> +
> +struct genapic apic_numaq = APIC_INIT("NUMAQ", probe_numaq);
> Index: linux-2.6/arch/x86/mach-generic/bigsmp.c
> ===================================================================
> --- linux-2.6.orig/arch/x86/mach-generic/bigsmp.c
> +++ linux-2.6/arch/x86/mach-generic/bigsmp.c
> @@ -23,10 +23,8 @@ static int dmi_bigsmp; /* can be set by
>
> static int hp_ht_bigsmp(const struct dmi_system_id *d)
> {
> -#ifdef CONFIG_X86_GENERICARCH
> printk(KERN_NOTICE "%s detected: force use of apic=bigsmp\n", d->ident);
> dmi_bigsmp = 1;
> -#endif
> return 0;
> }
>
> Index: linux-2.6/drivers/acpi/Kconfig
> ===================================================================
> --- linux-2.6.orig/drivers/acpi/Kconfig
> +++ linux-2.6/drivers/acpi/Kconfig
> @@ -4,7 +4,6 @@
>
> menuconfig ACPI
> bool "ACPI (Advanced Configuration and Power Interface) Support"
> - depends on !X86_NUMAQ
> depends on !X86_VISWS
> depends on !IA64_HP_SIM
> depends on IA64 || X86
> Index: linux-2.6/include/asm-x86/mpspec.h
> ===================================================================
> --- linux-2.6.orig/include/asm-x86/mpspec.h
> +++ linux-2.6/include/asm-x86/mpspec.h
> @@ -13,6 +13,12 @@ extern int apic_version[MAX_APICS];
> extern u8 apicid_2_node[];
> extern int pic_mode;
>
> +#ifdef CONFIG_X86_NUMAQ
> +extern int mp_bus_id_to_node[MAX_MP_BUSSES];
> +extern int mp_bus_id_to_local[MAX_MP_BUSSES];
> +extern int quad_local_to_mp_bus_id [NR_CPUS/4][4];
> +#endif
> +
> #define MAX_APICID 256
>
> #else
> Index: linux-2.6/arch/x86/kernel/summit_32.c
> ===================================================================
> --- linux-2.6.orig/arch/x86/kernel/summit_32.c
> +++ linux-2.6/arch/x86/kernel/summit_32.c
> @@ -36,7 +36,9 @@ static struct rio_table_hdr *rio_table_h
> static struct scal_detail *scal_devs[MAX_NUMNODES] __initdata;
> static struct rio_detail *rio_devs[MAX_NUMNODES*4] __initdata;
>
> +#ifndef CONFIG_X86_NUMAQ
> static int mp_bus_id_to_node[MAX_MP_BUSSES] __initdata;
> +#endif
>
> static int __init setup_pci_node_map_for_wpeg(int wpeg_num, int last_bus)
> {
> Index: linux-2.6/arch/x86/boot/compressed/misc.c
> ===================================================================
> --- linux-2.6.orig/arch/x86/boot/compressed/misc.c
> +++ linux-2.6/arch/x86/boot/compressed/misc.c
> @@ -217,10 +217,6 @@ static char *vidmem;
> static int vidport;
> static int lines, cols;
>
> -#ifdef CONFIG_X86_NUMAQ
> -void *xquad_portio;
> -#endif
> -
> #include "../../../../lib/inflate.c"
>
> static void *malloc(int size)
> Index: linux-2.6/arch/x86/Makefile
> ===================================================================
> --- linux-2.6.orig/arch/x86/Makefile
> +++ linux-2.6/arch/x86/Makefile
> @@ -117,29 +117,11 @@ mcore-$(CONFIG_X86_VOYAGER) := arch/x86/
> mflags-$(CONFIG_X86_VISWS) := -Iinclude/asm-x86/mach-visws
> mcore-$(CONFIG_X86_VISWS) := arch/x86/mach-visws/
>
> -# NUMAQ subarch support
> -mflags-$(CONFIG_X86_NUMAQ) := -Iinclude/asm-x86/mach-numaq
> -mcore-$(CONFIG_X86_NUMAQ) := arch/x86/mach-default/
> -
> -# BIGSMP subarch support
> -mflags-$(CONFIG_X86_BIGSMP) := -Iinclude/asm-x86/mach-bigsmp
> -mcore-$(CONFIG_X86_BIGSMP) := arch/x86/mach-default/
> -
> -#Summit subarch support
> -mflags-$(CONFIG_X86_SUMMIT) := -Iinclude/asm-x86/mach-summit
> -mcore-$(CONFIG_X86_SUMMIT) := arch/x86/mach-default/
> -
> # generic subarchitecture
> mflags-$(CONFIG_X86_GENERICARCH):= -Iinclude/asm-x86/mach-generic
> fcore-$(CONFIG_X86_GENERICARCH) += arch/x86/mach-generic/
> mcore-$(CONFIG_X86_GENERICARCH) := arch/x86/mach-default/
>
> -
> -# ES7000 subarch support
> -mflags-$(CONFIG_X86_ES7000) := -Iinclude/asm-x86/mach-es7000
> -fcore-$(CONFIG_X86_ES7000) := arch/x86/mach-es7000/
> -mcore-$(CONFIG_X86_ES7000) := arch/x86/mach-default/
> -
> # RDC R-321x subarch support
> mflags-$(CONFIG_X86_RDC321X) := -Iinclude/asm-x86/mach-rdc321x
> mcore-$(CONFIG_X86_RDC321X) := arch/x86/mach-default/
> Index: linux-2.6/arch/x86/kernel/acpi/boot.c
> ===================================================================
> --- linux-2.6.orig/arch/x86/kernel/acpi/boot.c
> +++ linux-2.6/arch/x86/kernel/acpi/boot.c
> @@ -858,7 +858,7 @@ static int __init acpi_parse_madt_lapic_
> #ifdef CONFIG_X86_IO_APIC
> #define MP_ISA_BUS 0
>
> -#if defined(CONFIG_X86_ES7000) || defined(CONFIG_X86_GENERICARCH)
> +#ifdef CONFIG_X86_ES7000
> extern int es7000_plat;
> #endif
>
> @@ -1007,7 +1007,7 @@ void __init mp_config_acpi_legacy_irqs(v
> set_bit(MP_ISA_BUS, mp_bus_not_pci);
> Dprintk("Bus #%d is ISA\n", MP_ISA_BUS);
>
> -#if defined(CONFIG_X86_ES7000) || defined(CONFIG_X86_GENERICARCH)
> +#ifdef CONFIG_X86_ES7000
> /*
> * Older generations of ES7000 have no legacy identity mappings
> */
> Index: linux-2.6/arch/x86/mach-es7000/Makefile
> ===================================================================
> --- linux-2.6.orig/arch/x86/mach-es7000/Makefile
> +++ linux-2.6/arch/x86/mach-es7000/Makefile
> @@ -3,4 +3,3 @@
> #
>
> obj-$(CONFIG_X86_ES7000) := es7000plat.o
> -obj-$(CONFIG_X86_GENERICARCH) := es7000plat.o
> Index: linux-2.6/arch/x86/mach-es7000/es7000plat.c
> ===================================================================
> --- linux-2.6.orig/arch/x86/mach-es7000/es7000plat.c
> +++ linux-2.6/arch/x86/mach-es7000/es7000plat.c
> @@ -177,53 +177,6 @@ find_unisys_acpi_oem_table(unsigned long
> }
> #endif
>
> -/*
> - * This file also gets compiled if CONFIG_X86_GENERICARCH is set. Generic
> - * arch already has got following function definitions (asm-generic/es7000.c)
> - * hence no need to define these for that case.
> - */
> -#ifndef CONFIG_X86_GENERICARCH
> -void es7000_sw_apic(void);
> -void __init enable_apic_mode(void)
> -{
> - es7000_sw_apic();
> - return;
> -}
> -
> -__init int mps_oem_check(struct mp_config_table *mpc, char *oem,
> - char *productid)
> -{
> - if (mpc->mpc_oemptr) {
> - struct mp_config_oemtable *oem_table =
> - (struct mp_config_oemtable *)mpc->mpc_oemptr;
> - if (!strncmp(oem, "UNISYS", 6))
> - return parse_unisys_oem((char *)oem_table);
> - }
> - return 0;
> -}
> -#ifdef CONFIG_ACPI
> -/* Hook from generic ACPI tables.c */
> -int __init acpi_madt_oem_check(char *oem_id, char *oem_table_id)
> -{
> - unsigned long oem_addr;
> - if (!find_unisys_acpi_oem_table(&oem_addr)) {
> - if (es7000_check_dsdt())
> - return parse_unisys_oem((char *)oem_addr);
> - else {
> - setup_unisys();
> - return 1;
> - }
> - }
> - return 0;
> -}
> -#else
> -int __init acpi_madt_oem_check(char *oem_id, char *oem_table_id)
> -{
> - return 0;
> -}
> -#endif
> -#endif /* COFIG_X86_GENERICARCH */
> -
> static void
> es7000_spin(int n)
> {
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/

On the face of it the idea seems sound. The NUMAQ changes look ok on a
quick scan. I will need to see this applied and tested to be sure it's
really sane.

-apw