"vmalloc", friends and GFP flags

From: Russell King (rmk@arm.linux.org.uk)
Date: Wed Jan 01 2003 - 16:49:33 EST


Hi,

This mail is being copied to the XFS and PARISC mailing lists. You should
probably trim the reply list as appropriate.

I've just been looking over the state of the ARM {dma,pci}_alloc_consistent
wrt interrupts etc in 2.5.53, and decided to have a look around this area
for other users. I stumbled across the following unsafe areas.

a) parisc. Looking at pa11_dma_alloc_consistent:

   pa11_dma_alloc_consistent
        -> map_uncached_pages
                -> (eventually) map_pmd_uncached
                        -> pte_alloc_kernel
                                -> pte_alloc_one_kernel

        static inline pte_t *
        pte_alloc_one_kernel(struct mm_struct *mm, unsigned long addr)
        {
                pte_t *pte = (pte_t *) __get_free_page(GFP_KERNEL);
                if (likely(pte != NULL))
                        clear_page(pte);
                return pte;
        }

   End result is that if pa11_dma_alloc_consistent() is called from
   IRQ context, parisc calls __get_free_page using GFP_KERNEL.
   Calling __get_free_page with GFP_KERNEL from IRQ context appears
   to be unsafe (always has been.)

b) xfs. XFS contains this bit of code:

        static __inline unsigned int flag_convert(int flags)
        {
                ...
                if (flags & KM_NOSLEEP)
                        return GFP_ATOMIC;
                /* If we're in a transaction, FS activity is not ok */
                else if ((current->flags & PF_FSTRANS) || (flags & KM_NOFS))
                        return GFP_NOFS;
                else
                        return GFP_KERNEL;
        }

        void *kmem_alloc(size_t size, int flags)
        {
                ...
                rval = __vmalloc(size, flag_convert(flags), PAGE_KERNEL);
                ...
        }

   Ok, so xfs can pass GFP_ATOMIC, GFP_NOFS and GFP_KERNEL to __vmalloc.
   However, do these flags actually ensure what they suggest they do?

   Lets look at __vmalloc:

        void *__vmalloc(unsigned long size, int gfp_mask, pgprot_t prot)
        {
                ...
                area = get_vm_area(size, VM_ALLOC);
                ...
        }

   and get_vm_area:

        struct vm_struct *get_vm_area(unsigned long size, unsigned long flags)
        {
                ...
                area = kmalloc(sizeof(*area), GFP_KERNEL);
                ...
        }

   Also, look at the PMD and PTE allocation functions, which also use
   GFP_KERNEL.

   Answer to the above question: no. Is this unsafe? Probably.

Now, what does this all have to do with ARM and pci_alloc_consistent.
It's the same problem, only I have a BUG_ON() in there to catch uses
from IRQ context, which needs to disappear eventually. (This is
mainly a requirement for USB to work.)

However, before this can happen, I believe I'd need the following:

1. allocation of vm areas (via get_vm_area and friends) needs to become
   IRQ-safe.

   - vmlist_lock needs to become IRQ safe (but its held during vread/vwrite)
   - kmalloc of vm_structs needs to have gfp flags passed in

2. allocation of page tables (via pmd_alloc_kernel / pte_alloc_kernel)
   needs to become IRQ-safe.

   - pmd_alloc_kernel / pte_alloc_kernel needs to have gfp flags passed in
   - map_vm_area would also need gfp flags

The same solution also fixes the two instances described above.

I suspect, however, that the vmlist_lock change will be unacceptable
to many people however. This doesn't affect XFS nor parisc, but would
mean that I'd need to completely rewrite the ARM consistent memory
allocation to use none of the above (which I think is the right
approach, but means _all_ of the above becomes S.E.P. 8))

-- 
Russell King (rmk@arm.linux.org.uk)                The developer of ARM Linux
             http://www.arm.linux.org.uk/personal/aboutme.html

- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Tue Jan 07 2003 - 22:00:15 EST