Re: [PATCH v11 08/28] tracing: Add lock-free tracing_map

From: Namhyung Kim
Date: Thu Oct 29 2015 - 04:31:56 EST


Hi Tom,

On Thu, Oct 22, 2015 at 01:14:12PM -0500, Tom Zanussi wrote:
> Add tracing_map, a special-purpose lock-free map for tracing.
>
> tracing_map is designed to aggregate or 'sum' one or more values
> associated with a specific object of type tracing_map_elt, which
> is associated by the map to a given key.
>
> It provides various hooks allowing per-tracer customization and is
> separated out into a separate file in order to allow it to be shared
> between multiple tracers, but isn't meant to be generally used outside
> of that context.
>
> The tracing_map implementation was inspired by lock-free map
> algorithms originated by Dr. Cliff Click:
>
> http://www.azulsystems.com/blog/cliff/2007-03-26-non-blocking-hashtable
> http://www.azulsystems.com/events/javaone_2007/2007_LockFreeHash.pdf
>
> Signed-off-by: Tom Zanussi <tom.zanussi@xxxxxxxxxxxxxxx>
> Tested-by: Masami Hiramatsu <masami.hiramatsu.pt@xxxxxxxxxxx>
> ---

[SNIP]
> +void tracing_map_array_free(struct tracing_map_array *a)
> +{
> + unsigned int i;
> +
> + if (!a->pages)
> + return;

You need to free 'a' here anyway. Or did you mean to check for 'a'
being NULL?


> +
> + for (i = 0; i < a->n_pages; i++)
> + free_page((unsigned long)a->pages[i]);
> +
> + kfree(a);
> +}
> +
> +struct tracing_map_array *tracing_map_array_alloc(unsigned int n_elts,
> + unsigned int entry_size)
> +{
> + struct tracing_map_array *a;
> + unsigned int i;
> +
> + a = kzalloc(sizeof(*a), GFP_KERNEL);
> + if (!a)
> + return NULL;
> +
> + a->entry_size_shift = fls(roundup_pow_of_two(entry_size) - 1);
> + a->entries_per_page = PAGE_SIZE / (1 << a->entry_size_shift);
> + a->n_pages = n_elts / a->entries_per_page;
> + if (!a->n_pages)
> + a->n_pages = 1;
> + a->entry_shift = fls(a->entries_per_page) - 1;
> + a->entry_mask = (1 << a->entry_shift) - 1;
> +
> + a->pages = kcalloc(a->n_pages, sizeof(void *), GFP_KERNEL);
> + if (!a->pages)
> + goto free;
> +
> + for (i = 0; i < a->n_pages; i++) {
> + a->pages[i] = (void *)get_zeroed_page(GFP_KERNEL);
> + if (!a->pages)

if (!a->pages[i])


> + goto free;
> + }
> + out:
> + return a;
> + free:
> + tracing_map_array_free(a);
> + a = NULL;
> +
> + goto out;
> +}
> +

[SNIP]
> +/**
> + * tracing_map_insert - Insert key and/or retrieve val from a tracing_map
> + * @map: The tracing_map to insert into
> + * @key: The key to insert
> + *
> + * Inserts a key into a tracing_map and creates and returns a new
> + * tracing_map_elt for it, or if the key has already been inserted by
> + * a previous call, returns the tracing_map_elt already associated
> + * with it. When the map was created, the number of elements to be
> + * allocated for the map was specified (internally maintained as
> + * 'max_elts' in struct tracing_map), and that number of
> + * tracing_map_elts was created by tracing_map_init(). This is the
> + * pre-allocated pool of tracing_map_elts that tracing_map_insert()
> + * will allocate from when adding new keys. Once that pool is
> + * exhausted, tracing_map_insert() is useless and will return NULL to
> + * signal that state.
> + *
> + * This is a lock-free tracing map insertion function implementing a
> + * modified form of Cliff Click's basic insertion algorithm. It
> + * requires the table size be a power of two. To prevent any
> + * possibility of an infinite loop we always make the internal table
> + * size double the size of the requested table size (max_elts * 2).
> + * Likewise, we never reuse a slot or resize or delete elements - when
> + * we've reached max_elts entries, we simply return NULL once we've
> + * run out of entries. Readers can at any point in time traverse the
> + * tracing map and safely access the key/val pairs.
> + *
> + * Return: the tracing_map_elt pointer val associated with the key.
> + * If this was a newly inserted key, the val will be a newly allocated
> + * and associated tracing_map_elt pointer val. If the key wasn't
> + * found and the pool of tracing_map_elts has been exhausted, NULL is
> + * returned and no further insertions will succeed.
> + */
> +struct tracing_map_elt *tracing_map_insert(struct tracing_map *map, void *key)
> +{
> + u32 idx, key_hash, test_key;
> + struct tracing_map_entry *entry;
> +
> + key_hash = jhash(key, map->key_size, 0);
> + if (key_hash == 0)
> + key_hash = 1;
> + idx = key_hash >> (32 - (map->map_bits + 1));
> +
> + while (1) {
> + idx &= (map->map_size - 1);
> + entry = TRACING_MAP_ENTRY(map->map, idx);
> + test_key = entry->key;
> +
> + if (test_key && test_key == key_hash && entry->val &&
> + keys_match(key, entry->val->key, map->key_size))
> + return entry->val;
> +
> + if (!test_key && !cmpxchg(&entry->key, 0, key_hash)) {
> + struct tracing_map_elt *elt;
> +
> + elt = get_free_elt(map);
> + if (!elt)
> + break;
> + memcpy(elt->key, key, map->key_size);
> + entry->val = elt;
> +
> + return entry->val;
> + }
> + idx++;
> + }
> +
> + return NULL;
> +}

IIUC this always inserts a new entry if no matching key is found. And
if the map is full, it only fails after walking through the entries to
find an empty one, marking that entry with the key, and then having
the call to get_free_elt() return NULL. As more keys are added, this
worsens the problem, since more and more entries will be marked with a
key but no value IMHO.

I can see you check hist_data->drops in the next patch to work around
this problem. But IMHO it's suboptimal, since it then cannot update
the existing entries either. I think it'd be better to have a
lookup-only version of this function and use it after drops have been
seen. The lookup function can bail out of the loop as soon as it hits
an empty entry, since the insert path no longer marks empty entries at
that point IMHO.

Thoughts?

Thanks,
Namhyung
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/