Re: [RFC][PATCH 2/2] vmalloc: vread vwrite avoid memory hole.

From: Amerigo Wang
Date: Fri Jul 31 2009 - 05:36:26 EST


On Wed, Jul 29, 2009 at 08:51:23PM +0900, KAMEZAWA Hiroyuki wrote:
>On Wed, 29 Jul 2009 17:59:32 +0800
>Amerigo Wang <xiyou.wangcong@xxxxxxxxx> wrote:
>
>> On Wed, Jul 29, 2009 at 05:48:10PM +0900, KAMEZAWA Hiroyuki wrote:
>> >Considering recent changes, I think this an answer to this bug.
>> >
>> >All get_vm_area() calls specified VM_IOREMAP before 2.6.30.
>> >Now, percpu calls get_vm_area() without IOREMAP (it means kcore doesn't skip this)
>> >And it seems pcpu does delayed map to pre-allocated vmalloc-area.
>> >Then, now, vmalloc() area has memory holes. (I'm sorry if I misunderstand.)
>>
>>
>> You mean the guard holes in vmalloc area?
>>
>no, not guard hole. memory hole in the middle of valid vmalloc area, like this
>
>start end
> XXXXX0XXXXXG XXXXXXXXG XXXXXXXXG
>X= pte is present
>O= pte is none
>G= pte is none as guard page.
>


Hmm, I get your meaning now... thanks


>> >+
>> >+static int aligned_vwrite(char *buf, char *addr, unsigned long count)
>> >+{
>> >+ struct page *p;
>> >+ int copied;
>> >+
>> >+ while (count) {
>> >+ unsigned long offset, length;
>> >+
>> >+ offset = (unsinged long)addr & PAGE_MASK;
>> >+ length = PAGE_SIZE - offset;
>> >+ if (length > count)
>> >+ length = count;
>> >+ /* confirm the page is present */
>> >+ p = vmalloc_to_page(addr);
>> >+ if (p)
>> >+ memcpy(addr, buf, length);
>>
>>
>> So when !p, you copy nothing, but still increase the length,
>> *silently*?
>Ah, thanks, it's careless bug...will fix.
>
>
>
>>
>> This looks a bit weird for me... I am thinking if we need
>> to return the number of bytes that is *actually* copied?
>>
>I think it makes the caller complex.
>
>Then, I have 2 options.
> a) skip memory hole
> b) if we find memory hole, return immediately.
>
>Do you like b) ?
>But in old behavior, vwrite()/vread() incremetns its "addr"
>(at read zero-fill).
>
>Then, I used a).


Ok, just want to make sure it's safe to do a).



>
>
>>
>> >+ addr += length;
>> >+ buf += length;
>> >+ copiled += length;
>> >+ count -= length;
>> >+ }
>> >+ return copied;
>> >+}
>> >+
>> >+long vread(char *buf, char *addr, unsigned long count, bool skip_ioremap)
>> > {
>> > struct vm_struct *tmp;
>> > char *vaddr, *buf_start = buf;
>> >@@ -1640,10 +1694,11 @@ long vread(char *buf, char *addr, unsign
>> > count = -(unsigned long) addr;
>> >
>> > read_lock(&vmlist_lock);
>> >- for (tmp = vmlist; tmp; tmp = tmp->next) {
>> >+ for (tmp = vmlist; count && tmp; tmp = tmp->next) {
>> > vaddr = (char *) tmp->addr;
>> > if (addr >= vaddr + tmp->size - PAGE_SIZE)
>> > continue;
>> >+
>> > while (addr < vaddr) {
>> > if (count == 0)
>> > goto finished;
>> >@@ -1653,14 +1708,15 @@ long vread(char *buf, char *addr, unsign
>> > count--;
>> > }
>> > n = vaddr + tmp->size - PAGE_SIZE - addr;
>> >- do {
>> >- if (count == 0)
>> >- goto finished;
>> >- *buf = *addr;
>> >- buf++;
>> >- addr++;
>> >- count--;
>> >- } while (--n > 0);
>> >+ if (n > count)
>> >+ n = count;
>> >+ if (!(tmp->flags & VM_IOREMAP) || !skip_ioremap)
>> >+ aligned_vread(buf, addr, n);
>> >+ else
>> >+ memset(buf, 0, n);
>> >+ buf += n;
>> >+ addr += n;
>> >+ count -= n;
>>
>>
>> If I set 'skip_ioremap' true, I want the return value can be
>> what it _actually_ copies, i.e. kick out the bytes counted
>> for VM_IOREMAP. What do you think?
>>
>
>One one od the reasons is The same reason to above.
>(adjusted to old behavior)
>
>IIUC, In following kind of codes,
>
> len = vread(buf,addr,xxx);
> bud += len;
> addr += len
>
>len should be the length that the caller should increment buffer address.
>So, I just skipped hole.


Well... old code doesn't have the parameter 'skip_ioremap'. :-)
For new code, we can do:

len = vread(buf, addr, count, true);
buf += len;
addr += count;

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/