Re: [PATCH V2 2/8] perf tools: rewrite perf mmap read for overwrite

From: Wangnan (F)
Date: Mon Dec 18 2017 - 03:52:46 EST




On 2017/12/7 7:32, kan.liang@xxxxxxxxx wrote:
From: Kan Liang <kan.liang@xxxxxxxxx>

perf_mmap__read_catchup and perf_mmap__read_backward are used to read
events one by one from ring buffer under overwrite mode.
It always read the stale buffer which is already processed.
Because the previous location of processed ring buffer is discarded.

Introducing the perf_mmap__read_done, which update the map->prev to
indicate the position of processed buffer.

Refining perf_mmap__read_catchup to calculate the start and the end of
ring buffer. It only need to do the calculation once at the beginning,
because the ring buffer is paused in overwrite mode.
Doesn't need to do the calculation in perf_mmap__read_backward.

For now, the only user is backward-ring-buffer in perf test.

Signed-off-by: Kan Liang <kan.liang@xxxxxxxxx>
---
tools/perf/tests/backward-ring-buffer.c | 9 +++-
tools/perf/util/mmap.c | 88 +++++++++++++++++++--------------
tools/perf/util/mmap.h | 7 ++-
3 files changed, 63 insertions(+), 41 deletions(-)

diff --git a/tools/perf/tests/backward-ring-buffer.c b/tools/perf/tests/backward-ring-buffer.c
index 4035d43..66d9e54 100644
--- a/tools/perf/tests/backward-ring-buffer.c
+++ b/tools/perf/tests/backward-ring-buffer.c
@@ -31,10 +31,14 @@ static int count_samples(struct perf_evlist *evlist, int *sample_count,
int i;
for (i = 0; i < evlist->nr_mmaps; i++) {
+ struct perf_mmap *map = &evlist->overwrite_mmap[i];
union perf_event *event;
+ unsigned long size;
+ u64 start, end;
- perf_mmap__read_catchup(&evlist->overwrite_mmap[i]);
- while ((event = perf_mmap__read_backward(&evlist->overwrite_mmap[i])) != NULL) {
+ perf_mmap__read_catchup(map, true, &start, &end, &size);
+ while ((event = perf_mmap__read_backward(map, &start, end)) !=
+ NULL) {
const u32 type = event->header.type;
switch (type) {
@@ -49,6 +53,7 @@ static int count_samples(struct perf_evlist *evlist, int *sample_count,
return TEST_FAIL;
}
}
+ perf_mmap__read_done(map);
}
return TEST_OK;
}
diff --git a/tools/perf/util/mmap.c b/tools/perf/util/mmap.c
index 05076e6..bf67460 100644
--- a/tools/perf/util/mmap.c
+++ b/tools/perf/util/mmap.c
@@ -85,51 +85,65 @@ union perf_event *perf_mmap__read_forward(struct perf_mmap *map)
return perf_mmap__read(map, old, head, &map->prev);
}
-union perf_event *perf_mmap__read_backward(struct perf_mmap *map)
+union perf_event *perf_mmap__read_backward(struct perf_mmap *map,
+ u64 *start, u64 end)
{
- u64 head, end;
- u64 start = map->prev;
-
- /*
- * Check if event was unmapped due to a POLLHUP/POLLERR.
- */
- if (!refcount_read(&map->refcnt))
- return NULL;
-

Is removing this checking safe?

- head = perf_mmap__read_head(map);
- if (!head)
- return NULL;
+ union perf_event *event = NULL;
- /*
- * 'head' pointer starts from 0. Kernel minus sizeof(record) form
- * it each time when kernel writes to it, so in fact 'head' is
- * negative. 'end' pointer is made manually by adding the size of
- * the ring buffer to 'head' pointer, means the validate data can
- * read is the whole ring buffer. If 'end' is positive, the ring
- * buffer has not fully filled, so we must adjust 'end' to 0.
- *
- * However, since both 'head' and 'end' is unsigned, we can't
- * simply compare 'end' against 0. Here we compare '-head' and
- * the size of the ring buffer, where -head is the number of bytes
- * kernel write to the ring buffer.
- */
- if (-head < (u64)(map->mask + 1))
- end = 0;
- else
- end = head + map->mask + 1;
+ event = perf_mmap__read(map, *start, end, &map->prev);
+ *start = map->prev;
- return perf_mmap__read(map, start, end, &map->prev);
+ return event;
}

perf_mmap__read_backward() and perf_mmap__read_forward() become asymmetrical,
but their names are symmetrical.


-void perf_mmap__read_catchup(struct perf_mmap *map)
+static int overwrite_rb_find_range(void *buf, int mask, u64 head,
+ u64 *start, u64 *end);
+
+int perf_mmap__read_catchup(struct perf_mmap *map,
+ bool overwrite,
+ u64 *start, u64 *end,
+ unsigned long *size)
{
- u64 head;
+ u64 head = perf_mmap__read_head(map);
+ u64 old = map->prev;
+ unsigned char *data = map->base + page_size;
- if (!refcount_read(&map->refcnt))
- return;
+ *start = overwrite ? head : old;
+ *end = overwrite ? old : head;
- head = perf_mmap__read_head(map);
- map->prev = head;
+ if (*start == *end)
+ return 0;
+
+ *size = *end - *start;
+ if (*size > (unsigned long)(map->mask) + 1) {
+ if (!overwrite) {
+ WARN_ONCE(1, "failed to keep up with mmap data. (warn only once)\n");
+
+ map->prev = head;
+ perf_mmap__consume(map, overwrite);
+ return 0;
+ }
+
+ /*
+ * Backward ring buffer is full. We still have a chance to read
+ * most of data from it.
+ */
+ if (overwrite_rb_find_range(data, map->mask, head, start, end))
+ return -1;
+ }
+
+ return 1;

Coding suggestion: this function returns 2 different value (1 and -1) in
two fail path. Should return 0 for success and -1 for failure.

You totally redefine perf_mmap__read_catchup(). The original meaning of
this function is adjust md->prev so following perf_mmap__read() can read
events one by one safely. Only backward ring buffer needs catchup:
kernel's 'head' pointer keeps moving (backward), while md->prev keeps
unchanged. For forward ring buffer, md->prev will catchup each time when
reading from the ring buffer.

Your patch totally changes its meaning. Now its meaning is finding the
available data from a ring buffer (report start and end). At least we
need a better name. In addition, if I understand your code correctly, we
don't need to report 'size' to caller, because size is determined by
start and end.

In addition, since the concept of backward and overwrite are now
unified, we can avoid updating map->prev during one-by-one reading
(because backward reading don't need consume operation). I think we can
decouple the updating of map->prev and these two basic read functions.
This can makes the operations on map->prev clearer. Now the moving
direction of map->prev is confusing for backward ring buffer: it moves
forward during one by one reading, and moves backward when block reading
(in perf record) and when syncing.

Then the core of two reading become uniform. Let's get rid of
duplication.

Summarize:

1. Compute both start and end before reading for both forward and
backward (perf_mmap__read_catchup, but need a better name).

2. Don't update map->prev in perf_mmap__read. Let perf_mmap__read()
update start pointer, so it become stateless and suit for both backward
and forward reading.

3. Update md->prev for forward reading. Merge it into
perf_mmap__consume?

Thank you.