Re: 1.3.94 Ooops For Sale...

Andrew Tridgell (tridge@cs.anu.edu.au)
Wed, 1 May 1996 17:40:06 +1000


> Ick... why not have a separate kernel thread do all this stuff that
> can sleep. Take a look at arch/sparc/mm/asyncd.c for an easy and
> efficient method to get things done out of interrupt context in the
> kernel.

I thought I'd take this opportunity to explain what asyncd does and
why it was added. As David points out, it may provide a neat mechanism
for solving all sorts of "you can't do that in an interrupt" type
problems.

Asyncd was added to solve a problem we were having in the port of
Linux to the AP1000+ multicomputer (see
http://cap.anu.edu.au/cap/projects/linux for info on the port).

The AP+ is a distributed memory multicomputer with hardware support
for fast user level memory transfers between cells. These transfers
take the form of get() or put() operations which in their simplest
form take a local and remote virtual address and a transfer size.

This works fine under Linux except in the case where the remote memory
isn't actually mapped at the time of the operation. The memory might
be in a part of the bss that hasn't been instantiated yet, or might
have been swapped out, or might be a shared page which needs to be
copied before it can be written to. In any case the memory isn't
there so the hardware complains when it tries to use it.

When this happens the messaging hardware issues an interrupt 11 and
sets some registers to tell the OS the mmu context and address the
problem occurred for. The OS then needs to map in some memory at that
address before it tells the hardware to continue.

My original solution was to rewrite a large section of kernel/memory.c
to try to handle this within the interrupt handler. This solution was
doomed to failure as the page may actually be resident on a remote NFS
server and there can be arbitrary delays. It was an ugly hack.

David Miller's suggestion was to have a kernel thread which slept
waiting on a queue of such paging requests. The interrupt handler then
just calls the add_to_queue() routine, passing the relevant
information along with a pointer to a callback function. The asyncd
wakes up, handles the request as though it were a normal user task
(with sleeps etc), then calls the callback indicating success or
failure. The callback then tells the messaging hardware to continue
appropriately.
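
To make the flow above concrete, here is a rough user-space sketch of
the same pattern using POSIX threads. It is not the code in
arch/sparc/mm/asyncd.c. Only the add_to_queue() name comes from the
description above; the struct layout, map_page(), resume_hardware()
and asyncd_demo() are made up for illustration, and the mutex stands
in for whatever locking a real interrupt handler would use.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical callback type: told which address was handled and whether it worked. */
typedef void (*async_callback)(unsigned long address, int ok);

/* Hypothetical request record queued by the "interrupt handler". */
struct async_request {
    unsigned long address;        /* faulting address reported by the hardware */
    async_callback callback;      /* called once the request has been handled */
    struct async_request *next;
};

static struct async_request *queue_head, *queue_tail;
static pthread_mutex_t queue_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t queue_nonempty = PTHREAD_COND_INITIALIZER;
static int shutting_down;

/* Called from the "interrupt handler": queue the request, wake the worker, return. */
void add_to_queue(struct async_request *req)
{
    req->next = NULL;
    pthread_mutex_lock(&queue_lock);
    if (queue_tail)
        queue_tail->next = req;
    else
        queue_head = req;
    queue_tail = req;
    pthread_cond_signal(&queue_nonempty);
    pthread_mutex_unlock(&queue_lock);
}

/* Stand-in for the real fault handling, which is allowed to sleep.
 * Assumption for the demo: address 0 can never be mapped. */
static int map_page(unsigned long address)
{
    return address != 0;
}

/* The worker: sleeps until a request arrives, handles it, calls the callback. */
static void *asyncd(void *unused)
{
    (void)unused;
    for (;;) {
        pthread_mutex_lock(&queue_lock);
        while (!queue_head && !shutting_down)
            pthread_cond_wait(&queue_nonempty, &queue_lock);
        if (!queue_head) {                 /* shutting down and queue drained */
            pthread_mutex_unlock(&queue_lock);
            return NULL;
        }
        struct async_request *req = queue_head;
        queue_head = req->next;
        if (!queue_head)
            queue_tail = NULL;
        pthread_mutex_unlock(&queue_lock);

        assert(req->callback != NULL);
        req->callback(req->address, map_page(req->address));
    }
}

static int successes, failures;

/* Demo callback: a real one would tell the messaging hardware to continue or abort. */
static void resume_hardware(unsigned long address, int ok)
{
    (void)address;
    if (ok)
        successes++;
    else
        failures++;
}

/* Queue one good and one bad request; returns 0 if both callbacks ran as expected. */
int asyncd_demo(void)
{
    struct async_request reqs[2] = {
        { 0x1000, resume_hardware, NULL },
        { 0,      resume_hardware, NULL },
    };
    pthread_t thread;

    successes = failures = 0;
    shutting_down = 0;
    if (pthread_create(&thread, NULL, asyncd, NULL) != 0)
        return -1;
    add_to_queue(&reqs[0]);
    add_to_queue(&reqs[1]);

    pthread_mutex_lock(&queue_lock);
    shutting_down = 1;                     /* worker exits once the queue is empty */
    pthread_cond_broadcast(&queue_nonempty);
    pthread_mutex_unlock(&queue_lock);
    pthread_join(thread, NULL);

    return (successes == 1 && failures == 1) ? 0 : -1;
}
```

Calling asyncd_demo() from a main() (built with -pthread where the
toolchain needs it) exercises both the success and the failure path.
The point of the split is that add_to_queue() does only a short,
bounded amount of work, while everything that might block happens in
the worker thread.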

I implemented asyncd based on skeleton code from David and it works
very nicely. We can now safely access remote memory within a parallel
program.

We will probably use a similar mechanism to support remote paging (as
happens after a remote fork) and remote system calls when we get
around to it.

Andrew