Re: Thread implementations...

Eric W. Biederman (ebiederm+eric@npwt.net)
24 Jun 1998 23:45:52 -0500


>>>>> "RG" == Richard Gooch <Richard.Gooch@atnf.CSIRO.AU> writes:

RG> Eric W. Biederman writes:
>> >>>>> "RG" == Richard Gooch <Richard.Gooch@atnf.CSIRO.AU> writes:

>> With madvise(3) following the traditional format with only one
RG> ^
RG> Don't you mean 2?

My suggestion:
madvise(2)(struct madvise_struct *, int number_of_structs);
madvise(3)(caddr_t addr, size_t len, size_t strategy);

madvise(3) being in libc...
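
A rough sketch in C of how the two levels could fit together. The names madvise_batch and madvise_single and the field layout of struct madvise_struct are assumptions made for illustration only. No batched system call exists, so the batch function here just loops over the ordinary single-range madvise() call:

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical: one advisement record, mirroring the madvise(3) arguments. */
struct madvise_struct {
    void  *addr;      /* start of the range (page aligned) */
    size_t len;       /* length of the range in bytes */
    int    strategy;  /* MADV_NORMAL, MADV_WILLNEED, MADV_DONTNEED, ... */
};

/* Hypothetical stand-in for the proposed batched madvise(2).
 * Emulated in userspace by calling the existing single-range madvise()
 * once per record. Returns 0 on success, -1 on the first failure. */
int madvise_batch(const struct madvise_struct *v, int number_of_structs)
{
    assert(number_of_structs >= 0);
    for (int i = 0; i < number_of_structs; i++) {
        if (madvise(v[i].addr, v[i].len, v[i].strategy) != 0)
            return -1;
    }
    return 0;
}

/* The libc-level madvise(3): the traditional one-range call expressed
 * as a batch of one. */
int madvise_single(void *addr, size_t len, int strategy)
{
    struct madvise_struct one = { addr, len, strategy };
    return madvise_batch(&one, 1);
}
```

With a real batched call the loop would live in the kernel, so an application issuing many WILLNEED/DONTNEED hints would pay for one system call instead of one per range.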

>> advisement can be done easily. The reason I suggest multiple
>> arguments is that apps that have random but predictable access
>> patterns will want to use MADV_WILLNEED & MADV_DONTNEED to implement
>> an optimum swapping algorithm.

RG> I'm not aware of madvise() being a POSIX standard. I've appended the
RG> man page from alpha_OSF1, which looks reasonable. It would be nice to
RG> be compatible with something.

According to the kernel source it is available on alpha, mips, and
sparc, and the mips code thinks there is a POSIX version somewhere.

Does someone have the Sun/sparc man page? Besides what is in the
kernel source I mean.

> MADV_WILLNEED
This needs to start an asynchronous pagein if necessary.

> MADV_DONTNEED
> Do not need these pages

> The system will free any resident pages that are allocated
> to the region. All modifications will be lost
> and any swapped out pages will be discarded. Subsequent
> access to the region will result in a zero-fill-on-demand
> fault as though it is being accessed for the
> first time. Reserved swap space is not affected by
> this call.

This one is broken, for 3 reasons.
1) madvise should only give advice.
2) This can be done with mmap(start, len, PROT..., MAP_ANON, -1, 0)
3) There is a more reasonable interpretation from IRIX:

MADV_DONTNEED informs the system that the address range from addr to
addr + len will likely not be referenced in the near
future. The memory to which the indicated addresses are
mapped will be the first to be reclaimed when memory is
needed by the system.
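
As a sketch of point 2 in the list above: mapping fresh anonymous memory over a range gives the "discard and zero-fill on next touch" effect without involving madvise at all. The helper name discard_range is hypothetical, and MAP_FIXED and MAP_PRIVATE are assumptions added here; the one-line call in the list elides them, but the mapping has to land exactly on the old range for this to work.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical helper: replace [start, start + len) with fresh
 * zero-fill-on-demand anonymous pages. Whatever was mapped there before
 * is dropped, including any modifications.
 * Assumptions: start is page aligned, and MAP_FIXED | MAP_PRIVATE are
 * the intended flags alongside MAP_ANON.
 * Returns 0 on success, -1 on failure. */
int discard_range(void *start, size_t len)
{
    assert(start != NULL && len > 0);
    void *p = mmap(start, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANON | MAP_FIXED, -1, 0);
    return p == MAP_FAILED ? -1 : 0;
}
```

Because this already exists as an explicit operation, the argument goes, madvise does not need a destructive variant and can stay purely advisory.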

With the IRIX interpretation, a smart programmer can implement the
optimal swapping algorithm for a process using MADV_DONTNEED and
MADV_WILLNEED, and still be relatively portable.
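
A minimal sketch of that idea, assuming the advisory (IRIX-style) meaning of MADV_DONTNEED and a read-only file mapping as the typical target. The function name walk_with_hints and the fixed chunking are hypothetical. The process knows its own access order, so it asks for the next chunk before touching the current one and releases each chunk once it is finished with it:

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical helper: walk a mapping chunk by chunk, hinting ahead with
 * MADV_WILLNEED and behind with MADV_DONTNEED. Returns the byte sum of
 * the region so the walk does real work.
 * Assumptions: base is page aligned and chunk is a multiple of the page
 * size. The hints are advisory, so their return values are ignored. */
unsigned long walk_with_hints(const unsigned char *base, size_t len, size_t chunk)
{
    assert(chunk > 0);
    unsigned long sum = 0;

    for (size_t off = 0; off < len; off += chunk) {
        size_t cur = (len - off < chunk) ? len - off : chunk;
        size_t next = off + cur;

        /* Ask for the next chunk to be paged in while we work. */
        if (next < len) {
            size_t nlen = (len - next < chunk) ? len - next : chunk;
            (void)madvise((void *)(base + next), nlen, MADV_WILLNEED);
        }

        for (size_t i = 0; i < cur; i++)
            sum += base[off + i];

        /* Done with this chunk: make it the first candidate for reclaim. */
        (void)madvise((void *)(base + off), cur, MADV_DONTNEED);
    }
    return sum;
}
```

One caveat when trying this: on current Linux, MADV_DONTNEED applied to private anonymous memory discards the contents (the OSF/1 behaviour quoted above), so the sketch is only non-destructive on file-backed mappings or on data that is not needed again.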

Of course MADV_SEQUENTIAL should handle the case of sending a file out
a socket, for a userspace sendfile.
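
For illustration, a userspace sendfile along those lines might look like the sketch below. The function name send_file_mmap is hypothetical, and out_fd can be any writable descriptor (a connected socket in the case discussed here). The file is mapped read-only, marked MADV_SEQUENTIAL, and written out straight from the mapping:

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: send the whole file at `path` to out_fd through a
 * read-only mapping. Returns the number of bytes written, or -1 on error. */
ssize_t send_file_mmap(int out_fd, const char *path)
{
    assert(path != NULL);

    int in_fd = open(path, O_RDONLY);
    if (in_fd < 0)
        return -1;

    struct stat st;
    if (fstat(in_fd, &st) != 0) {
        close(in_fd);
        return -1;
    }
    if (st.st_size == 0) {          /* nothing to map or send */
        close(in_fd);
        return 0;
    }

    size_t len = (size_t)st.st_size;
    char *src = mmap(NULL, len, PROT_READ, MAP_SHARED, in_fd, 0);
    close(in_fd);                   /* the mapping stays valid after close */
    if (src == MAP_FAILED)
        return -1;

    /* Advisory only: we will read the mapping once, front to back. */
    (void)madvise(src, len, MADV_SEQUENTIAL);

    size_t done = 0;
    while (done < len) {
        ssize_t n = write(out_fd, src + done, len - done);
        if (n < 0) {
            if (errno == EINTR)
                continue;           /* interrupted, retry */
            munmap(src, len);
            return -1;
        }
        done += (size_t)n;
    }

    munmap(src, len);
    return (ssize_t)done;
}
```

How much the hint helps depends on what the kernel does with it; the intent is aggressive read-ahead and early reclaim of pages already sent.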

> MADV_SPACEAVAIL
> Ensure that resources are reserved

This one also does more than advise and for that reason I don't like it.

Anyhow this looks like something to keep in mind for 2.3.
Currently I have too many projects in the air to do more than think
the interface through. The mapping type could easily be stored in the
vma as a hint though. Perhaps it could be ready for 2.2, but I
couldn't do it myself.

Eric
