Commit 27a0a90d authored by Ingo Molnar's avatar Ingo Molnar
Browse files

Merge tag 'perf-core-for-mingo-5.5-20191021' of...

Merge tag 'perf-core-for-mingo-5.5-20191021' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux

 into perf/core

Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:

perf trace:

- Add syscall failure stats to -s/--summary and -S/--with-summary, works in
  combination with specifying just a set of syscalls, see below first with
  -s/--summary, then with -S/--with-summary just for the syscalls we saw failing
  with -s:

    # perf trace -s sleep 1

     Summary of events:

     sleep (16218), 80 events, 93.0%

       syscall     calls  errors  total      min      avg      max   stddev
                                  (msec)   (msec)   (msec)   (msec)    (%)
       ----------- -----  ------ -------- -------- -------- -------- ------
       nanosleep       1      0  1000.091 1000.091 1000.091 1000.091  0.00%
       mmap            8      0     0.045    0.005    0.006    0.008  7.09%
       mprotect        4      0     0.028    0.005    0.007    0.009 11.38%
       openat          3      0     0.021    0.005    0.007    0.009 14.07%
       munmap          1      0     0.017    0.017    0.017    0.017  0.00%
       brk             4      0     0.010    0.001    0.002    0.004 23.15%
       read            4      0     0.009    0.002    0.002    0.003  8.13%
       close           5      0     0.008    0.001    0.002    0.002 10.83%
       fstat           3      0     0.006    0.002    0.002    0.002  6.97%
       access          1      1     0.006    0.006    0.006    0.006  0.00%
       lseek           3      0     0.005    0.001    0.002    0.002  7.37%
       arch_prctl      2      1     0.004    0.001    0.002    0.002 17.64%
       execve          1      0     0.000    0.000    0.000    0.000  0.00%

    # perf trace -e access,arch_prctl -S sleep 1
         0.000 ( 0.006 ms): sleep/19503 arch_prctl(option: 0x3001, arg2: 0x7fff165996b0) = -1 EINVAL (Invalid argument)
         0.024 ( 0.006 ms): sleep/19503 access(filename: 0x2177e510, mode: R)            = -1 ENOENT (No such file or directory)
         0.136 ( 0.002 ms): sleep/19503 arch_prctl(option: SET_FS, arg2: 0x7f9421737580) = 0

     Summary of events:

     sleep (19503), 6 events, 50.0%

       syscall    calls  errors total    min    avg    max  stddev
                                (msec) (msec) (msec) (msec)    (%)
       ---------- -----  ------ ------ ------ ------ ------ ------
       arch_prctl   2       1    0.008  0.002  0.004  0.006 57.22%
       access       1       1    0.006  0.006  0.006  0.006  0.00%

    #

  - Introduce --errno-summary, to drill down a bit more in the errno stats:

    # perf trace --errno-summary -e access,arch_prctl -S sleep 1
         0.000 ( 0.006 ms): sleep/5587 arch_prctl(option: 0x3001, arg2: 0x7ffd6ba6aa00) = -1 EINVAL (Invalid argument)
         0.028 ( 0.007 ms): sleep/5587 access(filename: 0xb83d9510, mode: R)            = -1 ENOENT (No such file or directory)
         0.172 ( 0.003 ms): sleep/5587 arch_prctl(option: SET_FS, arg2: 0x7f45b8392580) = 0

     Summary of events:

     sleep (5587), 6 events, 50.0%

       syscall    calls  errors total    min    avg    max  stddev
                                (msec) (msec) (msec) (msec)   (%)
       ---------- -----  ------ ------ ------ ------ ------ ------
       arch_prctl     2     1    0.009  0.003  0.005  0.006 38.90%
			   EINVAL: 1
       access         1     1    0.007  0.007  0.007  0.007  0.00%
                           ENOENT: 1
    #

  - Filter own pid to avoid a feedback look in 'perf trace record -a'

  - Add the glue for the auto generated x86 IRQ vector array.

  - Show error message when not finding a field used in a filter expression

    # perf trace --max-events=4 -e syscalls:sys_enter_write --filter="cnt>32767"
    Failed to set filter "(cnt>32767) && (common_pid != 19938 && common_pid != 8922)" on event syscalls:sys_enter_write with 22 (Invalid argument)
    #
    # perf trace --max-events=4 -e syscalls:sys_enter_write --filter="count>32767"
         0.000 python3.5/17535 syscalls:sys_enter_write(fd: 3, buf: 0x564b0dc53600, count: 172086)
        12.641 python3.5.post/17535 syscalls:sys_enter_write(fd: 3, buf: 0x564b0db63660, count: 75994)
        27.738 python3.5.post/17535 syscalls:sys_enter_write(fd: 3, buf: 0x564b0db4b1e0, count: 41635)
       136.070 python3.5.post/17535 syscalls:sys_enter_write(fd: 3, buf: 0x564b0dbab510, count: 62232)
    #

  - Add a generator for x86's IRQ vectors -> strings

  - Introduce stroul() (string -> number) methods for the strarray and
    strarrays classes, also strtoul_flags, allowing to go from both strings
    and or-ed strings to numbers, allowing things like:

    # perf trace -e syscalls:sys_enter_mmap --filter="flags==DENYWRITE|PRIVATE|FIXED" sleep 1
         0.000 sleep/22588 syscalls:sys_enter_mmap(addr: 0x7f42d2aa5000, len: 1363968, prot: READ|EXEC, flags: PRIVATE|FIXED|DENYWRITE, fd: 3, off: 0x22000)
         0.011 sleep/22588 syscalls:sys_enter_mmap(addr: 0x7f42d2bf2000, len: 311296, prot: READ, flags: PRIVATE|FIXED|DENYWRITE, fd: 3, off: 0x16f000)
         0.015 sleep/22588 syscalls:sys_enter_mmap(addr: 0x7f42d2c3f000, len: 24576, prot: READ|WRITE, flags: PRIVATE|FIXED|DENYWRITE, fd: 3, off: 0x1bb000)
    #

  Allowing to narrow down from the complete set of mmap calls for that workload:

    # perf trace -e syscalls:sys_enter_mmap sleep 1
         0.000 sleep/22695 syscalls:sys_enter_mmap(len: 134773, prot: READ, flags: PRIVATE, fd: 3)
         0.041 sleep/22695 syscalls:sys_enter_mmap(len: 8192, prot: READ|WRITE, flags: PRIVATE|ANONYMOUS)
         0.053 sleep/22695 syscalls:sys_enter_mmap(len: 1857472, prot: READ, flags: PRIVATE|DENYWRITE, fd: 3)
         0.069 sleep/22695 syscalls:sys_enter_mmap(addr: 0x7fd23ffb6000, len: 1363968, prot: READ|EXEC, flags: PRIVATE|FIXED|DENYWRITE, fd: 3, off: 0x22000)
         0.077 sleep/22695 syscalls:sys_enter_mmap(addr: 0x7fd240103000, len: 311296, prot: READ, flags: PRIVATE|FIXED|DENYWRITE, fd: 3, off: 0x16f000)
         0.083 sleep/22695 syscalls:sys_enter_mmap(addr: 0x7fd240150000, len: 24576, prot: READ|WRITE, flags: PRIVATE|FIXED|DENYWRITE, fd: 3, off: 0x1bb000)
         0.095 sleep/22695 syscalls:sys_enter_mmap(addr: 0x7fd240156000, len: 14272, prot: READ|WRITE, flags: PRIVATE|FIXED|ANONYMOUS)
         0.339 sleep/22695 syscalls:sys_enter_mmap(len: 217750512, prot: READ, flags: PRIVATE, fd: 3)
    #

  Works with all targets, so, for system wide, looking at who calls mmap with flags set to just "PRIVATE":

    # perf trace --max-events=5 -e syscalls:sys_enter_mmap --filter="flags==PRIVATE"
         0.000 pool/2242 syscalls:sys_enter_mmap(len: 756, prot: READ, flags: PRIVATE, fd: 14)
         0.050 pool/2242 syscalls:sys_enter_mmap(len: 756, prot: READ, flags: PRIVATE, fd: 14)
         0.062 pool/2242 syscalls:sys_enter_mmap(len: 756, prot: READ, flags: PRIVATE, fd: 14)
         0.145 goa-identity-s/2240 syscalls:sys_enter_mmap(len: 756, prot: READ, flags: PRIVATE, fd: 18)
         0.183 goa-identity-s/2240 syscalls:sys_enter_mmap(len: 756, prot: READ, flags: PRIVATE, fd: 18)
    #

  # perf trace --max-events=2 -e syscalls:sys_enter_lseek --filter="whence==SET && offset != 0"
         0.000 Cache2 I/O/12047 syscalls:sys_enter_lseek(fd: 277, offset: 43, whence: SET)
      1142.070 mozStorage #5/12302 syscalls:sys_enter_lseek(fd: 44</home/acme/.mozilla/firefox/ina67tev.default/cookies.sqlite-wal>, offset: 393536, whence: SET)
  #

perf annotate:

  - Fix objdump --no-show-raw-insn flag to work with goth gcc and clang.

  - Streamline objdump execution, preserving the right error codes for better
    reporting to user.

perf report:

  - Add warning when libunwind not compiled in.

perf stat:

  Jin Yao:

  - Support --all-kernel/--all-user, to match options available in 'perf record',
    asking that all the events specified work just with kernel or user events.

perf list:

  Jin Yao:

  - Hide deprecated events by default, allow showing them with --deprecated.

libbperf:

  Jiri Olsa:

  - Allow to build with -ltcmalloc.

  - Finish mmap interface, getting more stuff from tools/perf while adding
    abstractions to avoid pulling too much stuff, to get libperf to grow as
    tools needs things like auxtrace, etc.

perf scripting engines:

  Steven Rostedt (VMware):

  - Iterate on tep event arrays directly, fixing script generation with
    '-g python' when having multiple tracepoints in a perf.data file.

core:

  - Allow to build with -ltcmalloc.

perf test:

  Leo Yan:

  - Report failure for mmap events.

  - Avoid infinite loop for task exit case.

  - Remove needless headers for bp_account test.

  - Add dedicated checking helper is_supported().

  - Disable bp_signal testing for arm64.

Vendor events:

arm64:

  John Garry:

  - Fix Hisi hip08 DDRC PMU eventname.

  - Add some missing events for Hisi hip08 DDRC, L3C and HHA PMUs.

Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
parents aa7a7b72 27198a89
Loading
Loading
Loading
Loading
+146 −0
Original line number Diff line number Diff line
/* SPDX-License-Identifier: GPL-2.0 */
#ifndef _ASM_X86_IRQ_VECTORS_H
#define _ASM_X86_IRQ_VECTORS_H

#include <linux/threads.h>
/*
 * Linux IRQ vector layout.
 *
 * There are 256 IDT entries (per CPU - each entry is 8 bytes) which can
 * be defined by Linux. They are used as a jump table by the CPU when a
 * given vector is triggered - by a CPU-external, CPU-internal or
 * software-triggered event.
 *
 * Linux sets the kernel code address each entry jumps to early during
 * bootup, and never changes them. This is the general layout of the
 * IDT entries:
 *
 *  Vectors   0 ...  31 : system traps and exceptions - hardcoded events
 *  Vectors  32 ... 127 : device interrupts
 *  Vector  128         : legacy int80 syscall interface
 *  Vectors 129 ... LOCAL_TIMER_VECTOR-1
 *  Vectors LOCAL_TIMER_VECTOR ... 255 : special interrupts
 *
 * 64-bit x86 has per CPU IDT tables, 32-bit has one shared IDT table.
 *
 * This file enumerates the exact layout of them:
 */

#define NMI_VECTOR			0x02
#define MCE_VECTOR			0x12

/*
 * IDT vectors usable for external interrupt sources start at 0x20.
 * (0x80 is the syscall vector, 0x30-0x3f are for ISA)
 */
#define FIRST_EXTERNAL_VECTOR		0x20

/*
 * Reserve the lowest usable vector (and hence lowest priority)  0x20 for
 * triggering cleanup after irq migration. 0x21-0x2f will still be used
 * for device interrupts.
 */
#define IRQ_MOVE_CLEANUP_VECTOR		FIRST_EXTERNAL_VECTOR

#define IA32_SYSCALL_VECTOR		0x80

/*
 * Vectors 0x30-0x3f are used for ISA interrupts.
 *   round up to the next 16-vector boundary
 */
#define ISA_IRQ_VECTOR(irq)		(((FIRST_EXTERNAL_VECTOR + 16) & ~15) + irq)

/*
 * Special IRQ vectors used by the SMP architecture, 0xf0-0xff
 *
 *  some of the following vectors are 'rare', they are merged
 *  into a single vector (CALL_FUNCTION_VECTOR) to save vector space.
 *  TLB, reschedule and local APIC vectors are performance-critical.
 */

#define SPURIOUS_APIC_VECTOR		0xff
/*
 * Sanity check
 */
#if ((SPURIOUS_APIC_VECTOR & 0x0F) != 0x0F)
# error SPURIOUS_APIC_VECTOR definition error
#endif

#define ERROR_APIC_VECTOR		0xfe
#define RESCHEDULE_VECTOR		0xfd
#define CALL_FUNCTION_VECTOR		0xfc
#define CALL_FUNCTION_SINGLE_VECTOR	0xfb
#define THERMAL_APIC_VECTOR		0xfa
#define THRESHOLD_APIC_VECTOR		0xf9
#define REBOOT_VECTOR			0xf8

/*
 * Generic system vector for platform specific use
 */
#define X86_PLATFORM_IPI_VECTOR		0xf7

/*
 * IRQ work vector:
 */
#define IRQ_WORK_VECTOR			0xf6

#define UV_BAU_MESSAGE			0xf5
#define DEFERRED_ERROR_VECTOR		0xf4

/* Vector on which hypervisor callbacks will be delivered */
#define HYPERVISOR_CALLBACK_VECTOR	0xf3

/* Vector for KVM to deliver posted interrupt IPI */
#ifdef CONFIG_HAVE_KVM
#define POSTED_INTR_VECTOR		0xf2
#define POSTED_INTR_WAKEUP_VECTOR	0xf1
#define POSTED_INTR_NESTED_VECTOR	0xf0
#endif

#define MANAGED_IRQ_SHUTDOWN_VECTOR	0xef

#if IS_ENABLED(CONFIG_HYPERV)
#define HYPERV_REENLIGHTENMENT_VECTOR	0xee
#define HYPERV_STIMER0_VECTOR		0xed
#endif

#define LOCAL_TIMER_VECTOR		0xec

#define NR_VECTORS			 256

#ifdef CONFIG_X86_LOCAL_APIC
#define FIRST_SYSTEM_VECTOR		LOCAL_TIMER_VECTOR
#else
#define FIRST_SYSTEM_VECTOR		NR_VECTORS
#endif

/*
 * Size the maximum number of interrupts.
 *
 * If the irq_desc[] array has a sparse layout, we can size things
 * generously - it scales up linearly with the maximum number of CPUs,
 * and the maximum number of IO-APICs, whichever is higher.
 *
 * In other cases we size more conservatively, to not create too large
 * static arrays.
 */

#define NR_IRQS_LEGACY			16

#define CPU_VECTOR_LIMIT		(64 * NR_CPUS)
#define IO_APIC_VECTOR_LIMIT		(32 * MAX_IO_APICS)

#if defined(CONFIG_X86_IO_APIC) && defined(CONFIG_PCI_MSI)
#define NR_IRQS						\
	(CPU_VECTOR_LIMIT > IO_APIC_VECTOR_LIMIT ?	\
		(NR_VECTORS + CPU_VECTOR_LIMIT)  :	\
		(NR_VECTORS + IO_APIC_VECTOR_LIMIT))
#elif defined(CONFIG_X86_IO_APIC)
#define	NR_IRQS				(NR_VECTORS + IO_APIC_VECTOR_LIMIT)
#elif defined(CONFIG_PCI_MSI)
#define NR_IRQS				(NR_VECTORS + CPU_VECTOR_LIMIT)
#else
#define NR_IRQS				NR_IRQS_LEGACY
#endif

#endif /* _ASM_X86_IRQ_VECTORS_H */
+3 −0
Original line number Diff line number Diff line
@@ -36,6 +36,9 @@ Enable debugging output.
Print how named events are resolved internally into perf events, and also
any extra expressions computed by perf stat.

--deprecated::
Print deprecated events. By default the deprecated events are hidden.

[[EVENT_MODIFIERS]]
EVENT MODIFIERS
---------------
+6 −0
Original line number Diff line number Diff line
@@ -323,6 +323,12 @@ The output is SMI cycles%, equals to (aperf - unhalted core cycles) / aperf

Users who wants to get the actual value can apply --no-metric-only.

--all-kernel::
Configure all used events to run in kernel space.

--all-user::
Configure all used events to run in user space.

EXAMPLES
--------

+4 −0
Original line number Diff line number Diff line
@@ -146,6 +146,10 @@ the thread executes on the designated CPUs. Default is to monitor all CPUs.
	Show all syscalls followed by a summary by thread with min, max, and
    average times (in msec) and relative stddev.

--errno-summary::
	To be used with -s or -S, to show stats for the errnos experienced by
	syscalls, using only this option will trigger --summary.

--tool_stats::
	Show tool stats such as number of times fd->pathname was discovered thru
	hooking the open syscall return + vfs_getname or via reading /proc/pid/fd, etc.
+5 −0
Original line number Diff line number Diff line
@@ -265,6 +265,11 @@ LDFLAGS += -Wl,-z,noexecstack

EXTLIBS = -lpthread -lrt -lm -ldl

ifneq ($(TCMALLOC),)
  CFLAGS += -fno-builtin-malloc -fno-builtin-calloc -fno-builtin-realloc -fno-builtin-free
  EXTLIBS += -ltcmalloc
endif

ifeq ($(FEATURES_DUMP),)
include $(srctree)/tools/build/Makefile.feature
else
Loading