[GIT PULL] perf event updates for v6.1

From: Ingo Molnar
Date: Fri Oct 07 2022 - 04:01:04 EST


Linus,

Please pull the latest perf/core git tree from:

git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git perf-core-2022-10-07

# HEAD: 82aad7ff7ac25c8cf09d491ae23b9823f1901486 perf/hw_breakpoint: Annotate tsk->perf_event_mutex vs ctx->mutex

Perf events updates for v6.1:

- PMU driver updates:

  - Add AMD Last Branch Record Extension Version 2 (LbrExtV2)
    feature support for Zen 4 processors.

  - Extend the perf ABI to provide branch speculation information,
    if available, and use this on CPUs that have it (e.g. LbrExtV2).

  - Improve Intel PEBS TSC timestamp handling & integration.

  - Add Intel Raptor Lake S CPU support.

  - Add 'perf mem' and 'perf c2c' memory profiling support on
    AMD CPUs by utilizing IBS tagged load/store samples.

  - Clean up & optimize various x86 PMU details.

- HW breakpoints:

  - Big rework to optimize the code for systems with hundreds of CPUs and
    thousands of breakpoints:

    - Replace the nr_bp_mutex global mutex with the bp_cpuinfo_sem
      per-CPU rwsem that is read-locked during most of the key operations.

    - Improve the O(#cpus * #tasks) logic in toggle_bp_slot()
      and fetch_bp_busy_slots().

    - Apply micro-optimizations & cleanups.

- Misc cleanups & enhancements.

- NOTE: When merged with your latest tree, there will be a new
  conflict in lib/Kconfig.debug - it's just a context conflict due
  to the HW_BREAKPOINT_KUNIT_TEST addition from the perf tree
  clashing with the recent addition of FORTIFY_KUNIT_TEST:

    upstream:  875bfd5276f3 ("fortify: Add KUnit test for FORTIFY_SOURCE internals")
    perf tree: 724c299c6a0e ("perf/hw_breakpoint: Add KUnit test for constraints accounting")
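
For readers who want a feel for why the O(#cpus * #tasks) walk mentioned above can be avoided, here is a minimal userspace sketch of a slot-usage histogram. It is only loosely inspired by the idea behind the bp_slots_histogram commit in the shortlog; the names (NR_SLOTS, struct slots_histogram, hist_move(), hist_max()) and the layout are illustrative assumptions, not the kernel code, and the sketch ignores atomics and locking entirely. The point is that the maximum number of slots used by any task can be read in O(NR_SLOTS) steps instead of iterating over tasks.

```c
#include <assert.h>

/* Hypothetical number of breakpoint slots per CPU (assumption for this sketch). */
#define NR_SLOTS 4

/*
 * Illustrative histogram: count[i] is the number of tasks that currently
 * use exactly i + 1 slots. Tasks using zero slots are not tracked.
 */
struct slots_histogram {
	int count[NR_SLOTS];
};

/*
 * Account for one task whose usage changes from 'old' slots to
 * 'old + delta' slots: remove it from its old bucket (if any) and
 * add it to its new bucket (if any).
 */
static void hist_move(struct slots_histogram *h, int old, int delta)
{
	int updated = old + delta;

	assert(old >= 0 && old <= NR_SLOTS);
	assert(updated >= 0 && updated <= NR_SLOTS);

	if (old > 0) {
		assert(h->count[old - 1] > 0);
		h->count[old - 1]--;
	}
	if (updated > 0)
		h->count[updated - 1]++;
}

/*
 * Largest number of slots used by any tracked task. Scans the buckets
 * from the top, so the cost depends on NR_SLOTS, not on the task count.
 */
static int hist_max(const struct slots_histogram *h)
{
	for (int i = NR_SLOTS - 1; i >= 0; i--) {
		if (h->count[i] > 0)
			return i + 1;
	}
	return 0;
}
```

A caller would invoke hist_move(&h, old, +1) when a task gains a breakpoint and hist_move(&h, old, -1) when it releases one, then query hist_max(&h) when checking slot constraints. Whether this matches the details of the in-tree implementation should be checked against kernel/events/hw_breakpoint.c.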

Thanks,

Ingo

------------------>
Anshuman Khandual (9):
perf: Add system error and not in transaction branch types
perf: Extend branch type classification
perf: Capture branch privilege information
perf: Add PERF_BR_NEW_ARCH_[N] map for BRBE on arm64 platform
perf: Consolidate branch sample filter helpers
perf/core: Expand PERF_EVENT_FLAG_ARCH
perf/core: Assert PERF_EVENT_FLAG_ARCH does not overlap with generic flags
arm64/perf: Assert all platform event flags are within PERF_EVENT_FLAG_ARCH
x86/perf: Assert all platform event flags are within PERF_EVENT_FLAG_ARCH

Jiri Olsa (1):
bpf: Check flags for branch stack in bpf_read_branch_records helper

Jules Irenge (1):
perf/core: Convert snprintf() to scnprintf()

Kan Liang (11):
perf: Add sample_flags to indicate the PMU-filled sample data
perf/x86/intel/pebs: Fix PEBS timestamps overwritten
perf: Use sample_flags for branch stack
perf: Use sample_flags for weight
perf: Use sample_flags for data_src
perf: Use sample_flags for txn
perf/x86/intel: Optimize FIXED_CTR_CTRL access
perf/x86: Add new Raptor Lake S support
perf/x86/msr: Add new Raptor Lake S support
perf/x86/cstate: Add new Raptor Lake S support
perf/x86/uncore: Add new Raptor Lake S support

Marco Elver (15):
perf/hw_breakpoint: Add KUnit test for constraints accounting
perf/hw_breakpoint: Provide hw_breakpoint_is_used() and use in test
perf/hw_breakpoint: Clean up headers
perf/hw_breakpoint: Optimize list of per-task breakpoints
perf/hw_breakpoint: Mark data __ro_after_init
perf/hw_breakpoint: Optimize constant number of breakpoint slots
perf/hw_breakpoint: Make hw_breakpoint_weight() inlinable
perf/hw_breakpoint: Remove useless code related to flexible breakpoints
powerpc/hw_breakpoint: Avoid relying on caller synchronization
locking/percpu-rwsem: Add percpu_is_write_locked() and percpu_is_read_locked()
perf/hw_breakpoint: Reduce contention with large number of tasks
perf/hw_breakpoint: Introduce bp_slots_histogram
perf/hw_breakpoint: Optimize max_bp_pinned_slots() for CPU-independent task targets
perf/hw_breakpoint: Optimize toggle_bp_slot() for CPU-independent task targets
perf, hw_breakpoint: Fix use-after-free if perf_event_open() fails

Namhyung Kim (5):
perf: Use sample_flags for callchain
perf/bpf: Always use perf callchains if exist
perf: Kill __PERF_SAMPLE_CALLCHAIN_EARLY
perf: Use sample_flags for addr
perf: Use sample_flags for raw_data

Peter Zijlstra (11):
perf: Add a few assertions
perf/x86: Add two more x86_pmu methods
perf/x86/intel: Move the topdown stuff into the intel driver
perf/x86: Change x86_pmu::limit_period signature
perf/x86: Add a x86_pmu::limit_period static_call
perf/x86/intel: Remove x86_pmu::set_topdown_event_period
perf/x86/intel: Remove x86_pmu::update_topdown_event
perf/x86/p4: Remove perfctr_second_write quirk
perf: Fix lockdep_assert_event_ctx()
perf: Fix pmu_filter_match()
perf/hw_breakpoint: Annotate tsk->perf_event_mutex vs ctx->mutex

Ravi Bangoria (7):
perf/mem: Introduce PERF_MEM_LVLNUM_{EXTN_MEM|IO}
perf/x86/amd: Add IBS OP_DATA2 DataSrc bit definitions
perf/x86/amd: Support PERF_SAMPLE_DATA_SRC
perf/x86/amd: Support PERF_SAMPLE_{WEIGHT|WEIGHT_STRUCT}
perf/x86/amd: Support PERF_SAMPLE_ADDR
perf/x86/amd: Support PERF_SAMPLE_PHY_ADDR
perf/uapi: Define PERF_MEM_SNOOPX_PEER in kernel header file

Sandipan Das (13):
perf/x86/amd/brs: Move feature-specific functions
perf/x86/amd/core: Refactor branch attributes
perf/x86/amd/core: Add generic branch record interfaces
x86/cpufeatures: Add LbrExtV2 feature bit
perf/x86/amd/lbr: Detect LbrExtV2 support
perf/x86/amd/lbr: Add LbrExtV2 branch record support
perf/x86/amd/lbr: Add LbrExtV2 hardware branch filter support
perf/x86: Move branch classifier
perf/x86/amd/lbr: Add LbrExtV2 software branch filter support
perf/x86: Make branch classifier fusion-aware
perf/x86/amd/lbr: Use fusion-aware branch classifier
perf/core: Add speculation info to branch entries
perf/x86/amd/lbr: Add LbrExtV2 branch speculation info support

Stephane Eranian (2):
perf/x86/utils: Fix uninitialized var in get_branch_type()
perf/x86/amd/lbr: Adjust LBR regardless of filtering


arch/powerpc/kernel/hw_breakpoint.c | 53 ++-
arch/powerpc/perf/core-book3s.c | 10 +-
arch/s390/kernel/perf_cpum_cf.c | 1 +
arch/s390/kernel/perf_pai_crypto.c | 1 +
arch/sh/include/asm/hw_breakpoint.h | 5 +-
arch/x86/events/Makefile | 2 +-
arch/x86/events/amd/Makefile | 2 +-
arch/x86/events/amd/brs.c | 69 +++-
arch/x86/events/amd/core.c | 210 ++++++------
arch/x86/events/amd/ibs.c | 360 ++++++++++++++++++-
arch/x86/events/amd/lbr.c | 439 ++++++++++++++++++++++++
arch/x86/events/core.c | 61 ++--
arch/x86/events/intel/core.c | 101 ++++--
arch/x86/events/intel/cstate.c | 1 +
arch/x86/events/intel/ds.c | 55 ++-
arch/x86/events/intel/lbr.c | 273 ---------------
arch/x86/events/intel/p4.c | 37 +-
arch/x86/events/intel/uncore.c | 1 +
arch/x86/events/msr.c | 1 +
arch/x86/events/perf_event.h | 130 +++++--
arch/x86/events/perf_event_flags.h | 22 ++
arch/x86/events/utils.c | 251 ++++++++++++++
arch/x86/include/asm/amd-ibs.h | 16 +
arch/x86/include/asm/cpufeatures.h | 2 +-
arch/x86/include/asm/hw_breakpoint.h | 5 +-
arch/x86/include/asm/msr-index.h | 5 +
arch/x86/include/asm/perf_event.h | 3 +-
arch/x86/kernel/cpu/scattered.c | 1 +
drivers/perf/arm_spe_pmu.c | 4 +-
include/linux/hw_breakpoint.h | 4 +-
include/linux/percpu-rwsem.h | 6 +
include/linux/perf/arm_pmu.h | 9 +-
include/linux/perf_event.h | 77 ++++-
include/uapi/linux/perf_event.h | 57 ++-
kernel/bpf/stackmap.c | 4 +-
kernel/events/Makefile | 1 +
kernel/events/core.c | 88 +++--
kernel/events/hw_breakpoint.c | 648 ++++++++++++++++++++++++++---------
kernel/events/hw_breakpoint_test.c | 333 ++++++++++++++++++
kernel/locking/percpu-rwsem.c | 6 +
kernel/trace/bpf_trace.c | 3 +
lib/Kconfig.debug | 10 +
42 files changed, 2613 insertions(+), 754 deletions(-)
create mode 100644 arch/x86/events/amd/lbr.c
create mode 100644 arch/x86/events/perf_event_flags.h
create mode 100644 arch/x86/events/utils.c
create mode 100644 kernel/events/hw_breakpoint_test.c