Re: [PATCH v1 2/2]: perf record: enable asynchronous trace writing

From: Alexey Budankov
Date: Mon Aug 27 2018 - 05:20:59 EST


Hi,

On 27.08.2018 0:48, Jiri Olsa wrote:
> On Tue, Aug 21, 2018 at 11:27:03AM +0300, Alexey Budankov wrote:
>
> SNIP
>
>> -static int record__pushfn(void *to, void *bf, size_t size)
>> +static int record__pushfn(void *to, void *bf, size_t size, off_t off)
>> {
>> struct record *rec = to;
>> + struct perf_mmap *map = bf;
>>
>> rec->samples++;
>> - return record__write(rec, bf, size);
>> + return record__aio_write(rec->session->data->file.fd, &map->cblock,
>> + map->data, size, off);
>> }
>>
>> static volatile int done;
>> @@ -528,13 +530,85 @@ static struct perf_event_header finished_round_event = {
>> .type = PERF_RECORD_FINISHED_ROUND,
>> };
>>
>> +static int record__mmap_read_sync(int trace_fd, struct aiocb **cblocks,
>> + int cblocks_size, struct record *rec)
>> +{
>> + size_t rem;
>> + ssize_t size;
>> + off_t rem_off;
>> + int i, aio_ret, aio_errno, do_suspend;
>> + struct perf_mmap *md;
>> + struct timespec timeout0 = { 0, 0 };
>> + struct timespec timeoutS = { 0, 1000 * 500 * 1 };
>> +
>> + if (!cblocks_size)
>> + return 0;
>> +
>> + do {
>> + do_suspend = 0;
>> + nanosleep(&timeoutS, NULL);
>
> why the extra sleep in here and not sleeping through aio_suspend call?

Yep. Good question. That requires explicit explanation:

+ /* aio_suspend() implementation inside glibc (as of v2.27) is
+ * intrusive and not just blocks waiting io requests completion
+ * but polls requests queue inducing context switches in perf
+ * tool process. When profiling in system wide mode with tracing
+ * context switches the trace may be polluted by context switches
+ * from the perf process and the trace size becomes about 3-5
+ * times bigger than that of when writing the trace serially.
+ * To limit the volume of context switches from perf tool
+ * process nonsleep() call limits aio_suspend()
+ * polling till every half of the kernel timer tick which is
+ * usually 1ms (depends on CONFIG_HZ value).
+ */

>
> jirka
>
>> + if (aio_suspend((const struct aiocb**)cblocks, cblocks_size, &timeout0)) {
>> + if (errno == EAGAIN || errno == EINTR) {
>> + do_suspend = 1;
>> + continue;
>> + } else {
>> + pr_err("failed to sync perf data, error: %m\n");
>
> SNIP
>