[PATCH V5 02/14] perf/x86: Add perf_get_page_size support

From: kan . liang
Date: Fri Feb 08 2019 - 12:57:36 EST


From: Kan Liang <kan.liang@xxxxxxxxxxxxxxx>

Implement an x86-specific version of perf_get_page_size(), which does a
full page-table walk of a given virtual address to retrieve the page size.
For x86, disabling IRQs over the walk is sufficient to prevent any
teardown of the page tables.

The new sample type (PERF_SAMPLE_DATA_PAGE_SIZE) requires collecting the
virtual address. The virtual address itself is not output unless
PERF_SAMPLE_ADDR is also set.

Large PEBS is disabled with this sample type, because large PEBS needs
munmap tracking in order to flush the PEBS buffer, and perf doesn't
support munmap tracking yet. Large PEBS can be enabled separately later,
once munmap tracking is supported.

Signed-off-by: Kan Liang <kan.liang@xxxxxxxxxxxxxxx>
---

Changes since V4
- Split patch 1 of V4 into two patches.
  This patch adds the x86 implementation.
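
For readers unfamiliar with how a page-table level maps to a page size,
here is a minimal userspace sketch of the arithmetic. It is not the
kernel code: every name with a sketch_/SKETCH_ prefix is hypothetical,
and the constants assume x86-64 with 4 KiB base pages and 9 index bits
per level. Unlike the function in this patch, the sketch also returns 0
for the "none" level; that is a choice made here for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed constants: x86-64, 4 KiB base pages, 9 index bits per level. */
#define SKETCH_PAGE_SHIFT		12
#define SKETCH_PTRS_PER_PTE_SHIFT	9

/* Same ordering as the kernel's enum pg_level, redefined for userspace. */
enum sketch_pg_level {
	SKETCH_PG_LEVEL_NONE,
	SKETCH_PG_LEVEL_4K,
	SKETCH_PG_LEVEL_2M,
	SKETCH_PG_LEVEL_1G,
	SKETCH_PG_LEVEL_512G,
	SKETCH_PG_LEVEL_NUM
};

/* Return the page size in bytes for a level, or 0 if there is no mapping. */
uint64_t sketch_page_level_size(unsigned int level)
{
	unsigned int shift;

	if (level == SKETCH_PG_LEVEL_NONE || level >= SKETCH_PG_LEVEL_NUM)
		return 0;

	/* Each level above 4K adds another 9 bits to the page shift. */
	shift = (SKETCH_PAGE_SHIFT - SKETCH_PTRS_PER_PTE_SHIFT) +
		level * SKETCH_PTRS_PER_PTE_SHIFT;
	return (uint64_t)1 << shift;
}

/* Small self-check that a test harness may call. */
void sketch_page_level_self_check(void)
{
	assert(sketch_page_level_size(SKETCH_PG_LEVEL_4K) == 4096);
	assert(sketch_page_level_size(SKETCH_PG_LEVEL_NUM) == 0);
}
```

Under these assumptions a 2M-level entry yields 1 << 21 bytes and a
1G-level entry yields 1 << 30 bytes.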

arch/x86/events/core.c | 31 +++++++++++++++++++++++++++++++
arch/x86/events/intel/ds.c | 3 ++-
2 files changed, 33 insertions(+), 1 deletion(-)

diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
index 374a197..229a73b 100644
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -2578,3 +2578,34 @@ void perf_get_x86_pmu_capability(struct x86_pmu_capability *cap)
 	cap->events_mask_len = x86_pmu.events_mask_len;
 }
 EXPORT_SYMBOL_GPL(perf_get_x86_pmu_capability);
+
+u64 perf_get_page_size(u64 virt)
+{
+	unsigned long flags;
+	unsigned int level;
+	pte_t *pte;
+
+	if (!virt)
+		return 0;
+
+	/*
+	 * Interrupts are disabled, so it prevents any tear down
+	 * of the page tables.
+	 * See the comment near struct mmu_table_batch.
+	 */
+	local_irq_save(flags);
+	if (virt >= TASK_SIZE)
+		pte = lookup_address(virt, &level);
+	else {
+		if (current->mm) {
+			pte = lookup_address_in_pgd(pgd_offset(current->mm, virt),
+						    virt, &level);
+		} else
+			level = PG_LEVEL_NUM;
+	}
+	local_irq_restore(flags);
+	if (level >= PG_LEVEL_NUM)
+		return 0;
+
+	return (u64)page_level_size(level);
+}
diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
index e9acf1d..720dc9e 100644
--- a/arch/x86/events/intel/ds.c
+++ b/arch/x86/events/intel/ds.c
@@ -1274,7 +1274,8 @@ static void setup_pebs_sample_data(struct perf_event *event,
 	}


-	if ((sample_type & (PERF_SAMPLE_ADDR | PERF_SAMPLE_PHYS_ADDR)) &&
+	if ((sample_type & (PERF_SAMPLE_ADDR | PERF_SAMPLE_PHYS_ADDR
+			    | PERF_SAMPLE_DATA_PAGE_SIZE)) &&
 	    x86_pmu.intel_cap.pebs_format >= 1)
 		data->addr = pebs->dla;

--
2.7.4
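
The condition changed in the ds.c hunk above can be illustrated with a
small userspace sketch: the data linear address is captured when any of
the address-dependent sample bits is set and the PEBS format is at least
1. This is only an illustration. The SKETCH_* bit positions are
placeholders and not the real perf_event_sample_format values, and
sketch_wants_data_addr() is a hypothetical helper, not a kernel function.

```c
#include <stdbool.h>
#include <stdint.h>

/* Placeholder bit positions, NOT the real perf ABI values. */
#define SKETCH_SAMPLE_ADDR		(1ULL << 0)
#define SKETCH_SAMPLE_PHYS_ADDR		(1ULL << 1)
#define SKETCH_SAMPLE_DATA_PAGE_SIZE	(1ULL << 2)

/* All sample types that need the data linear address to be recorded. */
#define SKETCH_NEEDS_ADDR_MASK \
	(SKETCH_SAMPLE_ADDR | SKETCH_SAMPLE_PHYS_ADDR | \
	 SKETCH_SAMPLE_DATA_PAGE_SIZE)

/*
 * Hypothetical helper: true if the sample needs the data address and the
 * PEBS record format (>= 1) is able to provide it.
 */
bool sketch_wants_data_addr(uint64_t sample_type, unsigned int pebs_format)
{
	return (sample_type & SKETCH_NEEDS_ADDR_MASK) && pebs_format >= 1;
}
```

With this mask, requesting only the page-size sample bit is enough to
have the address captured, which matches the commit message's point that
the new sample type requires collecting the virtual address.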