Re: [RFC PATCH 1/2] lib/percpu-list: Per-cpu list with associated per-cpu locks

From: Waiman Long
Date: Wed Feb 17 2016 - 10:56:24 EST


On 02/17/2016 04:53 AM, Dave Chinner wrote:
On Tue, Feb 16, 2016 at 08:31:19PM -0500, Waiman Long wrote:
Linked list is used everywhere in the Linux kernel. However, if many
threads are trying to add or delete entries into the same linked list,
it can create a performance bottleneck.

This patch introduces a new per-cpu list subystem with associated
per-cpu locks for protecting each of the lists individually. This
allows list entries insertion and deletion operations to happen in
parallel instead of being serialized with a global list and lock.

List entry insertion is strictly per cpu. List deletion, however, can
happen in a cpu other than the one that did the insertion. So we still
need lock to protect the list. Because of that, there may still be
a small amount of contention when deletion is being done.

A new header file include/linux/percpu-list.h will be added with the
associated percpu_list structure. The following functions are used
to manage the per-cpu list:

1. int init_percpu_list_head(struct percpu_list **pclist_handle)
2. void percpu_list_add(struct percpu_list *new,
struct percpu_list *head)
3. void percpu_list_del(struct percpu_list *entry)
A few comments on the code

+ * A per-cpu list protected by a per-cpu spinlock.
+ *
+ * The list head percpu_list structure contains the spinlock, the other
+ * entries in the list contain the spinlock pointer.
+ */
+struct percpu_list {
+ struct list_head list;
+ union {
+ spinlock_t lock; /* For list head */
+ spinlock_t *lockptr; /* For other entries */
+ };
+};
This union is bad for kernels running spinlock debugging - the size
of the spinlock can blow out, and that increases the size of any
object that has a percpu_list in it. I've only got basic spinlock
debugging turned on, and the spinlock_t is 24 bytes with that
config. If I turn on lockdep, it gets much larger again....

So it might be best to separate the list head and list entry
structures, similar to a hash list?

Right. I will split it into 2 separate structure in the next iteration of the patch.

+static inline void INIT_PERCPU_LIST_HEAD(struct percpu_list *pcpu_list)
+{
+ INIT_LIST_HEAD(&pcpu_list->list);
+ pcpu_list->lock = __SPIN_LOCK_UNLOCKED(&pcpu_list->lock);
+}
+
+static inline void INIT_PERCPU_LIST_ENTRY(struct percpu_list *pcpu_list)
+{
+ INIT_LIST_HEAD(&pcpu_list->list);
+ pcpu_list->lockptr = NULL;
+}
These function names don't need to shout.

I was just following the convention used in list init functions. I can certainly change them to lowercase.


+/**
+ * for_all_percpu_list_entries - iterate over all the per-cpu list with locking
+ * @pos: the type * to use as a loop cursor for the current .
+ * @next: an internal type * variable pointing to the next entry
+ * @pchead: an internal struct list * of percpu list head
+ * @pclock: an internal variable for the current per-cpu spinlock
+ * @head: the head of the per-cpu list
+ * @member: the name of the per-cpu list within the struct
+ */
+#define for_all_percpu_list_entries(pos, next, pchead, pclock, head, member)\
+ { \
+ int cpu; \
+ for_each_possible_cpu (cpu) { \
+ typeof(*pos) *next; \
+ spinlock_t *pclock = per_cpu_ptr(&(head)->lock, cpu); \
+ struct list_head *pchead =&per_cpu_ptr(head, cpu)->list;\
+ spin_lock(pclock); \
+ list_for_each_entry_safe(pos, next, pchead, member.list)
+
+#define end_all_percpu_list_entries(pclock) spin_unlock(pclock); } }
This is a bit of a landmine - the code inside he iteration is under
a spinlock hidden in the macros. People are going to forget about
that, and it's needs documenting about how it needs to be treated
w.r.t. dropping and regaining the lock so sleeping operations can be
performed on objects on the list being iterated.

Can we also think up some shorter names - names that are 30-40
characters long are getting out out of hand given we're supposed
tobe sticking to 80 character widths and we lost 8 of them in the
first indent...

Cheers,

Dave.

I will try to shorten the name and better document the macro. This is probably the most tricky part of the whole part.

Cheers,
Longman