[RFC v2 0/6] Track RCU dereferences in RCU read-side critical sections

From: Boqun Feng
Date: Tue Feb 16 2016 - 00:58:36 EST


Hi all,

This is v2 of RCU_LOCKED_ACCESS.

Link of v1: http://article.gmane.org/gmane.linux.kernel/2143674

Changes since v1:

* Define the newly introduced irq_context helpers as macros when
lockdep doesn't trace irq_context. The helpers are trivial in that
case, and defining them as macros avoids some "defined but not
used" warnings.

* Introduce a new macro to statically define a lock_class_key for RCU,
in order to work around a compiler "bug" with brace-enclosed
initializer lists. (Paul McKenney)


By the nature of RCU, read-side critical sections are only loosely
connected to rcu_dereference()s: you can be sure that an
rcu_dereference() is called in some read-side critical section, but
once the code gets complex, you may not be sure exactly which one.
This may also be a problem for other locking mechanisms, where the
critical sections protecting the data and the protected data accesses
are not clearly correlated.
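
To make the problem concrete, here is a minimal userspace sketch (not
kernel code). The rcu_read_lock(), rcu_read_unlock() and
rcu_dereference() macros below are stand-ins that only count nesting,
and struct item, read_value(), caller_a() and caller_b() are
hypothetical names made up for illustration:

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-ins, NOT the kernel primitives: they only count nesting. */
static int rcu_depth;
#define rcu_read_lock()    (rcu_depth++)
#define rcu_read_unlock()  (rcu_depth--)
/* Roughly mimics the "must be inside a read-side critical section" check. */
#define rcu_dereference(p) (assert(rcu_depth > 0), (p))

struct item { int value; };		/* hypothetical RCU-protected object */
static struct item the_item = { 42 };
static struct item *shared = &the_item;

/*
 * The helper dereferences but takes no lock itself: nothing here says
 * which read-side critical section is supposed to protect the access.
 */
static int read_value(void)
{
	struct item *it = rcu_dereference(shared);

	return it ? it->value : -1;
}

/* Caller A: the access happens under one rcu_read_lock(). */
static int caller_a(void)
{
	int v;

	rcu_read_lock();
	v = read_value();
	rcu_read_unlock();
	return v;
}

/* Caller B: the same access now happens under two nested rcu_read_lock()s. */
static int caller_b(void)
{
	int v;

	rcu_read_lock();
	v = caller_a();
	rcu_read_unlock();
	return v;
}
```

Reading read_value() alone tells you that some critical section must be
held, but not whether it is the one in caller_a(), the one in
caller_b(), or both.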

This series introduces the LOCKED_ACCESS framework and, based on it,
implements the RCU_LOCKED_ACCESS functionality, which gives us a
clear hint about which rcu_dereference() happens in which RCU
read-side critical section.

With this series applied and CONFIG_RCU_LOCKED_ACCESS=y, the proc
file /proc/locked_access/rcu shows all relationships collected so
far between rcu_read_lock() and friends and rcu_dereference*().

Snippets of /proc/locked_access/rcu are as follows:

...(this rcu_dereference() happens after one rcu_read_lock())
...
ACQCHAIN 0xfdbf0c6aeea, 1 locks, irq_context 0:
LOCK at [<ffffffff812b1115>] get_proc_task_net+0x5/0x140
ACCESS TYPE 1 at kernel/pid.c:441
...
...(this rcu_dereference() happens after three rcu_read_lock())
...
ACQCHAIN 0xfe042af3bbfb2605, 3 locks, irq_context 0:
LOCK at [<ffffffff81094b47>] SyS_kill+0x97/0x2a0
LOCK at [<ffffffff8109286f>] kill_pid_info+0x1f/0x140
LOCK at [<ffffffff81092605>] group_send_sig_info+0x5/0x130
ACCESS TYPE 1 at kernel/signal.c:695
...
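
The idea of tying an access to the chain of locks held around it can be
modelled in a few lines of userspace C. This is only a sketch of the
concept, not the implementation in this series: all names (model_lock(),
model_unlock(), model_access(), print_record(), struct record) are
hypothetical, and the chain key and irq_context shown in the real output
are left out:

```c
#include <stdio.h>

#define MAX_DEPTH 8	/* arbitrary limit for this model */

/* Hypothetical model state: sites of the read-side locks currently held. */
static const char *held[MAX_DEPTH];
static int depth;

/* One collected relationship: the chain of lock sites plus the access site. */
struct record {
	int nr_locks;
	const char *locks[MAX_DEPTH];
	const char *access;
};

/* Model of entering a read-side critical section at "site". */
static int model_lock(const char *site)
{
	if (depth >= MAX_DEPTH)
		return -1;	/* chain too deep for this model */
	held[depth++] = site;
	return 0;
}

/* Model of leaving the innermost read-side critical section. */
static void model_unlock(void)
{
	if (depth > 0)
		depth--;
}

/* Model of a tracked access: snapshot the chain held right now. */
static struct record model_access(const char *site)
{
	struct record r = { .nr_locks = depth, .access = site };
	int i;

	for (i = 0; i < depth; i++)
		r.locks[i] = held[i];
	return r;
}

/* Print a record in a layout loosely modelled on the proc file. */
static void print_record(const struct record *r)
{
	int i;

	printf("ACQCHAIN, %d locks:\n", r->nr_locks);
	for (i = 0; i < r->nr_locks; i++)
		printf("  LOCK at %s\n", r->locks[i]);
	printf("  ACCESS at %s\n", r->access);
}
```

Calling model_lock() three times with the lock sites from the second
snippet and then model_access("kernel/signal.c:695") yields a record
with three locks, which print_record() prints in a similar layout.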


This patchset is based on v4.5-rc2 and consists of 6 patches (patches
2-5 are the implementation of LOCKED_ACCESS):

1. Introduce some functions of irq_context.

2. Introduce locked access class and acqchain.

3. Maintain the keys of acqchains.

4. Introduce the entry point of LOCKED_ACCESS.

5. Add proc interface for locked access class.

6. Enable LOCKED_ACCESS for RCU.

Tested by 0day. I also did a simple test on x86: I built and booted a
kernel with CONFIG_RCU_LOCKED_ACCESS=y and CONFIG_PROVE_LOCKING=y and
ran several workloads (kernel building, git cloning, dbench), and
/proc/locked_access/rcu was able to collect the relationships between
~300 RCU read-side critical sections and ~500 rcu_dereference*()s.

Looking forward to any suggestions, comments and questions ;-)

Regards,
Boqun