Re: [PATCH v4 04/19] selftests/resctrl: Close perf value read fd on errors

From: Ilpo Järvinen
Date: Mon Jul 17 2023 - 09:07:20 EST


On Fri, 14 Jul 2023, Reinette Chatre wrote:
> On 7/14/2023 3:35 AM, Ilpo Järvinen wrote:
> > On Thu, 13 Jul 2023, Reinette Chatre wrote:
> >> On 7/13/2023 6:19 AM, Ilpo Järvinen wrote:
> >>> Perf event fd (fd_lm) is not closed on some error paths.
> >>>
> >>> Always close fd_lm in get_llc_perf() and add close into an error
> >>> handling block in cat_val().
> >>>
> >>> Fixes: 790bf585b0ee ("selftests/resctrl: Add Cache Allocation Technology (CAT) selftest")
> >>> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@xxxxxxxxxxxxxxx>
> >>> ---
> >>> tools/testing/selftests/resctrl/cache.c | 10 +++++-----
> >>> 1 file changed, 5 insertions(+), 5 deletions(-)
> >>>
> >>> diff --git a/tools/testing/selftests/resctrl/cache.c b/tools/testing/selftests/resctrl/cache.c
> >>> index 8a4fe8693be6..ced47b445d1e 100644
> >>> --- a/tools/testing/selftests/resctrl/cache.c
> >>> +++ b/tools/testing/selftests/resctrl/cache.c
> >>> @@ -87,21 +87,20 @@ static int reset_enable_llc_perf(pid_t pid, int cpu_no)
> >>>  static int get_llc_perf(unsigned long *llc_perf_miss)
> >>>  {
> >>>  	__u64 total_misses;
> >>> +	int ret;
> >>>
> >>>  	/* Stop counters after one span to get miss rate */
> >>>
> >>>  	ioctl(fd_lm, PERF_EVENT_IOC_DISABLE, 0);
> >>>
> >>> -	if (read(fd_lm, &rf_cqm, sizeof(struct read_format)) == -1) {
> >>> +	ret = read(fd_lm, &rf_cqm, sizeof(struct read_format));
> >>> +	close(fd_lm);
> >>> +	if (ret == -1) {
> >>>  		perror("Could not get llc misses through perf");
> >>> -
> >>>  		return -1;
> >>>  	}
> >>>
> >>>  	total_misses = rf_cqm.values[0].value;
> >>> -
> >>> -	close(fd_lm);
> >>> -
> >>>  	*llc_perf_miss = total_misses;
> >>>
> >>>  	return 0;
> >>> @@ -253,6 +252,7 @@ int cat_val(struct resctrl_val_param *param)
> >>> memflush, operation, resctrl_val)) {
> >>> fprintf(stderr, "Error-running fill buffer\n");
> >>> ret = -1;
> >>> + close(fd_lm);
> >>> break;
> >>> }
> >>>
> >>
> >> Instead of fixing these existing patterns I think it would make the code
> >> easier to understand and maintain if it is made symmetrical.
> >> Having the perf event fd opened in one place but its close()
> >> scattered elsewhere has the potential for confusion and making later
> >> mistakes easy to miss.
> >>
> >> What if perf event fd is closed in a new "disable_llc_perf()" that
> >> is matched with "reset_enable_llc_perf()" and called
> >> from cat_val()?
> >>
> >> I think this raises another issue with the test trickery where
> >> measure_cache_vals() has some assumptions about state based on the
> >> test name.
> >
> > I very much agree on the principle here, and thus I already have created
> > patches which will do a major cleanup on this area. The cleaned-up code
> > has pe_fd local var to cat_val() and handles closing it in cat_val() with
> > the usual patterns.
> >
> > However, the patch currently resides after the L3 CAT test rewrite.
> > Backporting the cleanups/refactors into this series would require
> > considerable effort due to how convoluted all those n-step cleanup patches
> > and L3 CAT test rewrite are in this area. There's just a lot to
> > clean up here and the L3 rewrite will touch the same areas, so it's a net
> > full of conflicts.
> >
> > Do you want me to spend the effort to backport them into this series
> > (I expect it will take some time)?
>
> Considering the "Fixes" tag, having a smaller fix that can easily
> be backported would be ideal so I am ok with deferring a bigger
> rework.
>
> I do think this fix can be made more robust with a couple of small
> changes that should not introduce significant conflicts:
> * initialize fd_lm to -1

> * do not close() fd_lm in get_llc_perf() but instead move its
> close() to the exit of cat_val().

I changed the test to close the fd only in cat_val(), which is the
direction the later refactor/cleanup changes (not in this series) were
moving in anyway.
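
To illustrate the shape only (this is not the actual patch), here is a
minimal sketch of "the reader never closes, the caller closes on every
path". The names get_value() and run_once() are hypothetical stand-ins for
get_llc_perf() and cat_val(), and any readable fd (e.g. a pipe) is assumed
in place of the perf event fd:

```c
#include <stdio.h>
#include <unistd.h>

/* Stand-in for the global perf event fd; -1 means "not open". */
static int fd_lm = -1;

/* Hypothetical reader: it only reads and never closes fd_lm. */
static int get_value(unsigned long *out)
{
	unsigned long v;

	if (read(fd_lm, &v, sizeof(v)) != (ssize_t)sizeof(v)) {
		fprintf(stderr, "Could not read value from fd\n");
		return -1;
	}

	*out = v;
	return 0;
}

/*
 * Hypothetical caller: it takes ownership of the fd and closes it at a
 * single exit point, whether get_value() succeeded or failed.
 */
static int run_once(int fd, unsigned long *out)
{
	int ret;

	fd_lm = fd;
	ret = get_value(out);

	close(fd_lm);
	fd_lm = -1;

	return ret;
}
```

With this layout an error return from the reader cannot leak the fd,
because the only close() sits after the call in the caller.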

> * add check in get_llc_perf() that it does not attempt ioctl()
> on "fd_lm == -1" (later addition would be error checking of
> the ioctl())

The other two suggestions seem unnecessary and I've not implemented
them: I don't think fd_lm can be -1 at the ioctl(). Given that this code is
going to be replaced soonish, putting any extra "safety" effort into it now
seems like a waste of time.
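
For reference, the kind of guard being skipped would amount to roughly the
following sketch. The helper name perf_ioctl_checked() is hypothetical and
nothing like it exists in the selftest; it only shows the "fd == -1" check
plus propagating the ioctl() return value:

```c
#include <errno.h>
#include <sys/ioctl.h>

/*
 * Hypothetical helper: refuse to issue the ioctl() on an fd that was
 * never opened, and hand the ioctl() result back to the caller.
 */
static int perf_ioctl_checked(int fd, unsigned long request)
{
	if (fd == -1) {
		errno = EBADF;
		return -1;
	}

	return ioctl(fd, request, 0);
}
```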

--
i.