Commit 00e4db51 authored by Linus Torvalds's avatar Linus Torvalds
Browse files

Merge tag 'perf-tools-2020-08-10' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux

Pull perf tools updates from Arnaldo Carvalho de Melo:
 "New features:

   - Introduce controlling how 'perf stat' and 'perf record' works via a
     control file descriptor, allowing starting with events configured
     but disabled until commands are received via the control file
     descriptor. This allows, for instance for tools such as Intel VTune
     to make further use of perf as its Linux platform driver.

   - Improve 'perf record' to to register in a perf.data file header the
     clockid used to help later correlate things like syslog files and
     perf events recorded.

   - Add basic syscall and find_next_bit benchmarks to 'perf bench'.

   - Allow using computed metrics in calculating other metrics. For
     instance:

	  {
	    .metric_expr    = "l2_rqsts.demand_data_rd_hit + l2_rqsts.pf_hit + l2_rqsts.rfo_hit",
	    .metric_name    = "DCache_L2_All_Hits",
	  },
	  {
	    .metric_expr    = "max(l2_rqsts.all_demand_data_rd - l2_rqsts.demand_data_rd_hit, 0) + l2_rqsts.pf_miss + l2_rqsts.rfo_miss",
	    .metric_name    = "DCache_L2_All_Miss",
	  },
	  {
	     .metric_expr    = "dcache_l2_all_hits + dcache_l2_all_miss",
	     .metric_name    = "DCache_L2_All",
	  }

   - Add suport for 'd_ratio', '>' and '<' operators to the expression
     resolver used in calculating metrics in 'perf stat'.

  Support for new kernel features:

   - Support TEXT_POKE and KSYMBOL_TYPE_OOL perf metadata events to cope
     with things like ftrace, trampolines, i.e. changes in the kernel
     text that gets in the way of properly decoding Intel PT hardware
     traces, for instance.

  Intel PT:

   - Add various knobs to reduce the volume of Intel PT traces by
     reducing the level of details such as decoding just some types of
     packets (e.g., FUP/TIP, PSB+), also filtering by time range.

   - Add new itrace options (log flags to the 'd' option, error flags to
     the 'e' one, etc), controlling how Intel PT is transformed into
     perf events, document some missing options (e.g., how to synthesize
     callchains).

  BPF:

   - Properly report BPF errors when parsing events.

   - Do not setup side-band events if LIBBPF is not linked, fixing a
     segfault.

  Libraries:

   - Improvements to the libtraceevent plugin mechanism.

   - Improve libtracevent support for KVM trace events SVM exit reasons.

   - Add a libtracevent plugins for decoding syscalls/sys_enter_futex
     and for tlb_flush.

   - Ensure sample_period is set libpfm4 events in 'perf test'.

   - Fixup libperf namespacing, to make sure what is in libperf has the
     perf_ namespace while what is now only in tools/perf/ doesn't use
     that prefix.

  Arch specific:

   - Improve the testing of vendor events and metrics in 'perf test'.

   - Allow no ARM CoreSight hardware tracer sink to be specified on
     command line.

   - Fix arm_spe_x recording when mixed with other perf events.

   - Add s390 idle functions 'psw_idle' and 'psw_idle_exit' to list of
     idle symbols.

   - List kernel supplied event aliases for arm64 in 'perf list'.

   - Add support for extended register capability in PowerPC 9 and 10.

   - Added nest IMC power9 metric events.

  Miscellaneous:

   - No need to setup sample_regs_intr/sample_regs_user for dummy
     events.

   - Update various copies of kernel headers, some causing perf to
     handle new syscalls, MSRs, etc.

   - Improve usage of flex and yacc, enabling warnings and addressing
     the fallout.

   - Add missing '--output' option to 'perf kmem' so that it can pass it
     along to 'perf record'.

   - 'perf probe' fixes related to adding multiple probes on the same
     address for the same event.

   - Make 'perf probe' warn if the target function is a GNU indirect
     function.

   - Remove //anon mmap events from 'perf inject jit' to fix supporting
     both using ELF files for generated functions and the perf-PID.map
     approaches"

* tag 'perf-tools-2020-08-10' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux: (144 commits)
  perf record: Skip side-band event setup if HAVE_LIBBPF_SUPPORT is not set
  perf tools powerpc: Add support for extended regs in power10
  perf tools powerpc: Add support for extended register capability
  tools headers UAPI: Sync drm/i915_drm.h with the kernel sources
  tools arch x86: Sync asm/cpufeatures.h with the kernel sources
  tools arch x86: Sync the msr-index.h copy with the kernel sources
  tools headers UAPI: update linux/in.h copy
  tools headers API: Update close_range affected files
  perf script: Add 'tod' field to display time of day
  perf script: Change the 'enum perf_output_field' enumerators to be 64 bits
  perf data: Add support to store time of day in CTF data conversion
  perf tools: Move clockid_res_ns under clock struct
  perf header: Store clock references for -k/--clockid option
  perf tools: Add clockid_name function
  perf clockid: Move parse_clockid() to new clockid object
  tools lib traceevent: Handle possible strdup() error in tep_add_plugin_path() API
  libtraceevent: Fixed description of tep_add_plugin_path() API
  libtraceevent: Fixed type in PRINT_FMT_STING
  libtraceevent: Fixed broken indentation in parse_ip4_print_args()
  libtraceevent: Improve error handling of tep_plugin_add_option() API
  ...
parents ed3854ff 1101c872
Loading
Loading
Loading
Loading
+19 −1
Original line number Diff line number Diff line
@@ -48,6 +48,24 @@ enum perf_event_powerpc_regs {
	PERF_REG_POWERPC_DSISR,
	PERF_REG_POWERPC_SIER,
	PERF_REG_POWERPC_MMCRA,
	PERF_REG_POWERPC_MAX,
	/* Extended registers */
	PERF_REG_POWERPC_MMCR0,
	PERF_REG_POWERPC_MMCR1,
	PERF_REG_POWERPC_MMCR2,
	PERF_REG_POWERPC_MMCR3,
	PERF_REG_POWERPC_SIER2,
	PERF_REG_POWERPC_SIER3,
	/* Max regs without the extended regs */
	PERF_REG_POWERPC_MAX = PERF_REG_POWERPC_MMCRA + 1,
};

#define PERF_REG_PMU_MASK	((1ULL << PERF_REG_POWERPC_MAX) - 1)

/* PERF_REG_EXTENDED_MASK value for CPU_FTR_ARCH_300 */
#define PERF_REG_PMU_MASK_300   (((1ULL << (PERF_REG_POWERPC_MMCR2 + 1)) - 1) - PERF_REG_PMU_MASK)
/* PERF_REG_EXTENDED_MASK value for CPU_FTR_ARCH_31 */
#define PERF_REG_PMU_MASK_31   (((1ULL << (PERF_REG_POWERPC_SIER3 + 1)) - 1) - PERF_REG_PMU_MASK)

#define PERF_REG_MAX_ISA_300   (PERF_REG_POWERPC_MMCR2 + 1)
#define PERF_REG_MAX_ISA_31    (PERF_REG_POWERPC_SIER3 + 1)
#endif /* _UAPI_ASM_POWERPC_PERF_REGS_H */
+4 −0
Original line number Diff line number Diff line
@@ -96,6 +96,7 @@
#define X86_FEATURE_SYSCALL32		( 3*32+14) /* "" syscall in IA32 userspace */
#define X86_FEATURE_SYSENTER32		( 3*32+15) /* "" sysenter in IA32 userspace */
#define X86_FEATURE_REP_GOOD		( 3*32+16) /* REP microcode works well */
/* free					( 3*32+17) */
#define X86_FEATURE_LFENCE_RDTSC	( 3*32+18) /* "" LFENCE synchronizes RDTSC */
#define X86_FEATURE_ACC_POWER		( 3*32+19) /* AMD Accumulated Power Mechanism */
#define X86_FEATURE_NOPL		( 3*32+20) /* The NOPL (0F 1F) instructions */
@@ -107,6 +108,7 @@
#define X86_FEATURE_EXTD_APICID		( 3*32+26) /* Extended APICID (8 bits) */
#define X86_FEATURE_AMD_DCM		( 3*32+27) /* AMD multi-node processor */
#define X86_FEATURE_APERFMPERF		( 3*32+28) /* P-State hardware coordination feedback capability (APERF/MPERF MSRs) */
/* free					( 3*32+29) */
#define X86_FEATURE_NONSTOP_TSC_S3	( 3*32+30) /* TSC doesn't stop in S3 state */
#define X86_FEATURE_TSC_KNOWN_FREQ	( 3*32+31) /* TSC has known frequency */

@@ -365,7 +367,9 @@
#define X86_FEATURE_SRBDS_CTRL		(18*32+ 9) /* "" SRBDS mitigation MSR available */
#define X86_FEATURE_MD_CLEAR		(18*32+10) /* VERW clears CPU buffers */
#define X86_FEATURE_TSX_FORCE_ABORT	(18*32+13) /* "" TSX_FORCE_ABORT */
#define X86_FEATURE_SERIALIZE		(18*32+14) /* SERIALIZE instruction */
#define X86_FEATURE_PCONFIG		(18*32+18) /* Intel PCONFIG */
#define X86_FEATURE_ARCH_LBR		(18*32+19) /* Intel ARCH LBR */
#define X86_FEATURE_SPEC_CTRL		(18*32+26) /* "" Speculation Control (IBRS + IBPB) */
#define X86_FEATURE_INTEL_STIBP		(18*32+27) /* "" Single Thread Indirect Branch Predictors */
#define X86_FEATURE_FLUSH_L1D		(18*32+28) /* Flush L1D cache */
+23 −3
Original line number Diff line number Diff line
@@ -149,6 +149,10 @@

#define MSR_LBR_SELECT			0x000001c8
#define MSR_LBR_TOS			0x000001c9

#define MSR_IA32_POWER_CTL		0x000001fc
#define MSR_IA32_POWER_CTL_BIT_EE	19

#define MSR_LBR_NHM_FROM		0x00000680
#define MSR_LBR_NHM_TO			0x000006c0
#define MSR_LBR_CORE_FROM		0x00000040
@@ -158,7 +162,23 @@
#define LBR_INFO_MISPRED		BIT_ULL(63)
#define LBR_INFO_IN_TX			BIT_ULL(62)
#define LBR_INFO_ABORT			BIT_ULL(61)
#define LBR_INFO_CYC_CNT_VALID		BIT_ULL(60)
#define LBR_INFO_CYCLES			0xffff
#define LBR_INFO_BR_TYPE_OFFSET		56
#define LBR_INFO_BR_TYPE		(0xfull << LBR_INFO_BR_TYPE_OFFSET)

#define MSR_ARCH_LBR_CTL		0x000014ce
#define ARCH_LBR_CTL_LBREN		BIT(0)
#define ARCH_LBR_CTL_CPL_OFFSET		1
#define ARCH_LBR_CTL_CPL		(0x3ull << ARCH_LBR_CTL_CPL_OFFSET)
#define ARCH_LBR_CTL_STACK_OFFSET	3
#define ARCH_LBR_CTL_STACK		(0x1ull << ARCH_LBR_CTL_STACK_OFFSET)
#define ARCH_LBR_CTL_FILTER_OFFSET	16
#define ARCH_LBR_CTL_FILTER		(0x7full << ARCH_LBR_CTL_FILTER_OFFSET)
#define MSR_ARCH_LBR_DEPTH		0x000014cf
#define MSR_ARCH_LBR_FROM_0		0x00001500
#define MSR_ARCH_LBR_TO_0		0x00001600
#define MSR_ARCH_LBR_INFO_0		0x00001200

#define MSR_IA32_PEBS_ENABLE		0x000003f1
#define MSR_PEBS_DATA_CFG		0x000003f2
@@ -253,8 +273,6 @@

#define MSR_PEBS_FRONTEND		0x000003f7

#define MSR_IA32_POWER_CTL		0x000001fc

#define MSR_IA32_MC0_CTL		0x00000400
#define MSR_IA32_MC0_STATUS		0x00000401
#define MSR_IA32_MC0_ADDR		0x00000402
@@ -418,7 +436,6 @@
#define MSR_AMD64_PATCH_LEVEL		0x0000008b
#define MSR_AMD64_TSC_RATIO		0xc0000104
#define MSR_AMD64_NB_CFG		0xc001001f
#define MSR_AMD64_CPUID_FN_1		0xc0011004
#define MSR_AMD64_PATCH_LOADER		0xc0010020
#define MSR_AMD_PERF_CTL		0xc0010062
#define MSR_AMD_PERF_STATUS		0xc0010063
@@ -427,6 +444,7 @@
#define MSR_AMD64_OSVW_STATUS		0xc0010141
#define MSR_AMD_PPIN_CTL		0xc00102f0
#define MSR_AMD_PPIN			0xc00102f1
#define MSR_AMD64_CPUID_FN_1		0xc0011004
#define MSR_AMD64_LS_CFG		0xc0011020
#define MSR_AMD64_DC_CFG		0xc0011022
#define MSR_AMD64_BU_CFG2		0xc001102a
@@ -466,6 +484,8 @@
#define MSR_F16H_DR0_ADDR_MASK		0xc0011027

/* Fam 15h MSRs */
#define MSR_F15H_CU_PWR_ACCUMULATOR     0xc001007a
#define MSR_F15H_CU_MAX_PWR_ACCUMULATOR 0xc001007b
#define MSR_F15H_PERF_CTL		0xc0010200
#define MSR_F15H_PERF_CTL0		MSR_F15H_PERF_CTL
#define MSR_F15H_PERF_CTL1		(MSR_F15H_PERF_CTL + 2)
+1 −1
Original line number Diff line number Diff line
@@ -8,7 +8,7 @@ endif

feature_check = $(eval $(feature_check_code))
define feature_check_code
  feature-$(1) := $(shell $(MAKE) OUTPUT=$(OUTPUT_FEATURES) CFLAGS="$(EXTRA_CFLAGS) $(FEATURE_CHECK_CFLAGS-$(1))" CXXFLAGS="$(EXTRA_CXXFLAGS) $(FEATURE_CHECK_CXXFLAGS-$(1))" LDFLAGS="$(LDFLAGS) $(FEATURE_CHECK_LDFLAGS-$(1))" -C $(feature_dir) $(OUTPUT_FEATURES)test-$1.bin >/dev/null 2>/dev/null && echo 1 || echo 0)
  feature-$(1) := $(shell $(MAKE) OUTPUT=$(OUTPUT_FEATURES) CC=$(CC) CXX=$(CXX) CFLAGS="$(EXTRA_CFLAGS) $(FEATURE_CHECK_CFLAGS-$(1))" CXXFLAGS="$(EXTRA_CXXFLAGS) $(FEATURE_CHECK_CXXFLAGS-$(1))" LDFLAGS="$(LDFLAGS) $(FEATURE_CHECK_LDFLAGS-$(1))" -C $(feature_dir) $(OUTPUT_FEATURES)test-$1.bin >/dev/null 2>/dev/null && echo 1 || echo 0)
endef

feature_set = $(eval $(feature_set_code))
+0 −2
Original line number Diff line number Diff line
@@ -74,8 +74,6 @@ FILES= \

FILES := $(addprefix $(OUTPUT),$(FILES))

CC ?= $(CROSS_COMPILE)gcc
CXX ?= $(CROSS_COMPILE)g++
PKG_CONFIG ?= $(CROSS_COMPILE)pkg-config
LLVM_CONFIG ?= llvm-config
CLANG ?= clang
Loading