Re: [RFC PATCH 2/3] docs: scheduler: Add scheduler overview documentation

From: Matthew Wilcox
Date: Wed Apr 01 2020 - 07:54:24 EST


On Wed, Apr 01, 2020 at 01:00:28PM +0300, John Mathew wrote:
> +====================
> +CFS Data Structures
> +====================
> +
> +Main parts of the Linux scheduler are:

The main parts ...

> +**Running Queue:** This is the central data structure of process scheduling. It

I've never heard anyone call this 'Running Queue'. It's always called
'run queue'.

> +the CPU core. The main members of the :c:type:`struct rq <rq>` are:
> +
> +:c:member:`nr_running`
> + Total number of tasks on the runqueue.

This seems like it should be kernel-doc so the documentation is with the
code ... meaning it might possibly get updated when the code changes as
opposed to languishing over here.

> +Each rq struct points to a cfs_rq struct which represents the rb tree. The main

How does a cfs_rq struct represent an rb tree? I suspect what you intended
to say is that the cfs_rq structs are stored in an rb tree (I'm not familiar
with the details of the scheduler implementation).

More generally, you're making a mistake that a lot of documentation
writers make which is to overdescribe the current implementation.
The important thing to document is that they're stored in a tree; the
type of tree used is an irrelevant detail. If that changes, we shouldn't
need to update this documentation.

> +Each scheduling entity may be run from its parents runqueue. Scheduler traverses

The scheduler ...

also, please format your line lengths to more like 75 characters; don't go
all the way to 80. Just use 'fmt'; its defaults work fine.

> +vruntime is the value by which tasks are ordered on the red-black tree. Tasks are
> +arranged in increasing order of vruntime which is the amount of time a task has
> +spent running on the cpu.vruntime of a task is updated periodically based on the
> +:c:func:`scheduler_tick` function.

This is a backwards explanation.

vruntime is the amount of time a task has spent running on the cpu. It is
updated periodically by scheduler_tick(). Tasks are stored in the
scheduler's tree sorted by vruntime.

> +History
> +-------
> +
> +Linux 2.6.23 introduced a modular scheduler core and a Completely Fair Scheduler
> +(CFS) implemented as a scheduling module. Scheduler has been improving since
> +kernel version 2.4. In kernel 2.4 there was one running queue for all processes.

I would drop 'Scheduler has been improving since kernel version 2.4'. I
just assume that Linux has been improving.

> +CFS uses a time ordered red-black tree for each CPU. The red-black tree is a type
> +of self-balancing binary search tree. Every running process, has a node in the

Don't explain what an rbtree is, just link to the rbtree documentation.
Which, admittedly, hasn't been converted to rst format yet, but link to
rbtree.txt and someone else can fix that up when they do the conversion.