Commit 48342fc0 authored by Linus Torvalds's avatar Linus Torvalds
Browse files

Merge tag 'perf-tools-2020-12-19' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux

Pull perf tools updates from Arnaldo Carvalho de Melo:
 "perf record:
   - Fix memory leak when using '--user-regs=?' to list registers

  aarch64 support:
   - Add aarch64 registers to 'perf record's' --user-regs command line
     option

  aarch64 hw tracing support:
   - Decode memory tagging properties
   - Improve ARM's auxtrace support
   - Add support for ARMv8.3-SPE

  perf kvm:
   - Add kvm-stat for arm64

  perf stat:
   - Add --quiet option

  Cleanups:
   - Fixup function names wrt what is in libperf and what is in
     tools/perf

  Build:
   - Allow building without libbpf in older systems

  New kernel features:
   - Initial support for data/code page size sample type, more to come

  perf annotate:
   - Support MIPS instruction extended support

  perf stack unwinding:
   - Fix separate debug info files when using elfutils' libdw's unwinder

  perf vendor events:
   - Update Intel's Skylake client events to v50
   - Add JSON metrics for ARM's imx8mm DDR Perf
   - Support printing metric groups for system PMUs

  perf build id:
   - Prep work for supporting having the build id provided by the kernel
     in PERF_RECORD_MMAP2 metadata events

  perf stat:
   - Support regex pattern in --for-each-cgroup

  pipe mode:
   - Allow to use stdio functions for pipe mode
   - Support 'perf report's' --header-only for pipe mode
   - Support pipe mode display in 'perf evlist'

  Documentation:
   - Update information about CAP_PERFMON"

* tag 'perf-tools-2020-12-19' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux: (134 commits)
  perf mem: Factor out a function to generate sort order
  perf sort: Add sort option for data page size
  perf script: Support data page size
  tools headers UAPI: Update asm-generic/unistd.h
  tools headers cpufeatures: Sync with the kernel sources
  tools headers UAPI: Sync linux/prctl.h with the kernel sources
  tools headers UAPI: Sync linux/fscrypt.h with the kernel sources
  tools headers UAPI: Sync linux/const.h with the kernel headers
  tools arch x86: Sync the msr-index.h copy with the kernel sources
  perf trace beauty: Update copy of linux/socket.h with the kernel sources
  tools headers: Update linux/ctype.h with the kernel sources
  tools headers: Add conditional __has_builtin()
  tools headers: Get tools's linux/compiler.h closer to the kernel's
  tools headers UAPI: Sync linux/stat.h with the kernel sources
  tools headers: Syncronize linux/build_bug.h with the kernel sources
  perf tools: Reformat record's control fd man text
  perf config: Fix example command in manpage to conform to syntax specified in the SYNOPSIS section.
  perf test: Make sample-parsing test aware of PERF_SAMPLE_{CODE,DATA}_PAGE_SIZE
  perf tools: Add support to read build id from compressed elf
  perf debug: Add debug_set_file function
  ...
parents 6a447b0e 2e7f5450
Loading
Loading
Loading
Loading
+70 −11
Original line number Diff line number Diff line
@@ -84,11 +84,14 @@ capabilities then providing the process with CAP_PERFMON capability singly
is recommended as the preferred secure approach to resolve double access
denial logging related to usage of performance monitoring and observability.

Unprivileged processes using perf_events system call are also subject
for PTRACE_MODE_READ_REALCREDS ptrace access mode check [7]_ , whose
outcome determines whether monitoring is permitted. So unprivileged
processes provided with CAP_SYS_PTRACE capability are effectively
permitted to pass the check.
Prior Linux v5.9 unprivileged processes using perf_events system call
are also subject for PTRACE_MODE_READ_REALCREDS ptrace access mode check
[7]_ , whose outcome determines whether monitoring is permitted.
So unprivileged processes provided with CAP_SYS_PTRACE capability are
effectively permitted to pass the check. Starting from Linux v5.9
CAP_SYS_PTRACE capability is not required and CAP_PERFMON is enough to
be provided for processes to make performance monitoring and observability
operations.

Other capabilities being granted to unprivileged processes can
effectively enable capturing of additional data required for later
@@ -99,11 +102,11 @@ CAP_SYSLOG capability permits reading kernel space memory addresses from
Privileged Perf users groups
---------------------------------

Mechanisms of capabilities, privileged capability-dumb files [6]_ and
file system ACLs [10]_ can be used to create dedicated groups of
privileged Perf users who are permitted to execute performance monitoring
and observability without scope limits. The following steps can be
taken to create such groups of privileged Perf users.
Mechanisms of capabilities, privileged capability-dumb files [6]_,
file system ACLs [10]_ and sudo [15]_ utility can be used to create
dedicated groups of privileged Perf users who are permitted to execute
performance monitoring and observability without limits. The following
steps can be taken to create such groups of privileged Perf users.

1. Create perf_users group of privileged Perf users, assign perf_users
   group to Perf tool executable and limit access to the executable for
@@ -133,7 +136,7 @@ taken to create such groups of privileged Perf users.
   # getcap perf
   perf = cap_sys_ptrace,cap_syslog,cap_perfmon+ep

If the libcap installed doesn't yet support "cap_perfmon", use "38" instead,
If the libcap [16]_ installed doesn't yet support "cap_perfmon", use "38" instead,
i.e.:

::
@@ -159,6 +162,60 @@ performance monitoring and observability by using functionality of the
configured Perf tool executable that, when executes, passes perf_events
subsystem scope checks.

In case Perf tool executable can't be assigned required capabilities (e.g.
file system is mounted with nosuid option or extended attributes are
not supported by the file system) then creation of the capabilities
privileged environment, naturally shell, is possible. The shell provides
inherent processes with CAP_PERFMON and other required capabilities so that
performance monitoring and observability operations are available in the
environment without limits. Access to the environment can be open via sudo
utility for members of perf_users group only. In order to create such
environment:

1. Create shell script that uses capsh utility [16]_ to assign CAP_PERFMON
   and other required capabilities into ambient capability set of the shell
   process, lock the process security bits after enabling SECBIT_NO_SETUID_FIXUP,
   SECBIT_NOROOT and SECBIT_NO_CAP_AMBIENT_RAISE bits and then change
   the process identity to sudo caller of the script who should essentially
   be a member of perf_users group:

::

   # ls -alh /usr/local/bin/perf.shell
   -rwxr-xr-x. 1 root root 83 Oct 13 23:57 /usr/local/bin/perf.shell
   # cat /usr/local/bin/perf.shell
   exec /usr/sbin/capsh --iab=^cap_perfmon --secbits=239 --user=$SUDO_USER -- -l

2. Extend sudo policy at /etc/sudoers file with a rule for perf_users group:

::

   # grep perf_users /etc/sudoers
   %perf_users    ALL=/usr/local/bin/perf.shell

3. Check that members of perf_users group have access to the privileged
   shell and have CAP_PERFMON and other required capabilities enabled
   in permitted, effective and ambient capability sets of an inherent process:

::

  $ id
  uid=1003(capsh_test) gid=1004(capsh_test) groups=1004(capsh_test),1000(perf_users) context=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023
  $ sudo perf.shell
  [sudo] password for capsh_test:
  $ grep Cap /proc/self/status
  CapInh:        0000004000000000
  CapPrm:        0000004000000000
  CapEff:        0000004000000000
  CapBnd:        000000ffffffffff
  CapAmb:        0000004000000000
  $ capsh --decode=0000004000000000
  0x0000004000000000=cap_perfmon

As a result, members of perf_users group have access to the privileged
environment where they can use tools employing performance monitoring APIs
governed by CAP_PERFMON Linux capability.

This specific access control management is only available to superuser
or root running processes with CAP_SETPCAP, CAP_SETFCAP [6]_
capabilities.
@@ -264,3 +321,5 @@ Bibliography
.. [12] `<http://man7.org/linux/man-pages/man5/limits.conf.5.html>`_
.. [13] `<https://sites.google.com/site/fullycapable>`_
.. [14] `<http://man7.org/linux/man-pages/man8/auditd.8.html>`_
.. [15] `<https://man7.org/linux/man-pages/man8/sudo.8.html>`_
.. [16] `<https://git.kernel.org/pub/scm/libs/libcap/libcap.git/>`_
+2 −0
Original line number Diff line number Diff line
@@ -241,6 +241,7 @@
/* Intel-defined CPU features, CPUID level 0x00000007:0 (EBX), word 9 */
#define X86_FEATURE_FSGSBASE		( 9*32+ 0) /* RDFSBASE, WRFSBASE, RDGSBASE, WRGSBASE instructions*/
#define X86_FEATURE_TSC_ADJUST		( 9*32+ 1) /* TSC adjustment MSR 0x3B */
#define X86_FEATURE_SGX			( 9*32+ 2) /* Software Guard Extensions */
#define X86_FEATURE_BMI1		( 9*32+ 3) /* 1st group bit manipulation extensions */
#define X86_FEATURE_HLE			( 9*32+ 4) /* Hardware Lock Elision */
#define X86_FEATURE_AVX2		( 9*32+ 5) /* AVX2 instructions */
@@ -356,6 +357,7 @@
#define X86_FEATURE_MOVDIRI		(16*32+27) /* MOVDIRI instruction */
#define X86_FEATURE_MOVDIR64B		(16*32+28) /* MOVDIR64B instruction */
#define X86_FEATURE_ENQCMD		(16*32+29) /* ENQCMD and ENQCMDS instructions */
#define X86_FEATURE_SGX_LC		(16*32+30) /* Software Guard Extensions Launch Control */

/* AMD-defined CPU features, CPUID level 0x80000007 (EBX), word 17 */
#define X86_FEATURE_OVERFLOW_RECOV	(17*32+ 0) /* MCA overflow recovery support */
+7 −1
Original line number Diff line number Diff line
@@ -62,6 +62,12 @@
# define DISABLE_ENQCMD (1 << (X86_FEATURE_ENQCMD & 31))
#endif

#ifdef CONFIG_X86_SGX
# define DISABLE_SGX	0
#else
# define DISABLE_SGX	(1 << (X86_FEATURE_SGX & 31))
#endif

/*
 * Make sure to add features to the correct mask
 */
@@ -74,7 +80,7 @@
#define DISABLED_MASK6	0
#define DISABLED_MASK7	(DISABLE_PTI)
#define DISABLED_MASK8	0
#define DISABLED_MASK9	(DISABLE_SMAP)
#define DISABLED_MASK9	(DISABLE_SMAP|DISABLE_SGX)
#define DISABLED_MASK10	0
#define DISABLED_MASK11	0
#define DISABLED_MASK12	0
+11 −1
Original line number Diff line number Diff line
@@ -139,6 +139,7 @@
#define MSR_IA32_MCG_CAP		0x00000179
#define MSR_IA32_MCG_STATUS		0x0000017a
#define MSR_IA32_MCG_CTL		0x0000017b
#define MSR_ERROR_CONTROL		0x0000017f
#define MSR_IA32_MCG_EXT_CTL		0x000004d0

#define MSR_OFFCORE_RSP_0		0x000001a6
@@ -326,8 +327,9 @@
#define MSR_PP1_ENERGY_STATUS		0x00000641
#define MSR_PP1_POLICY			0x00000642

#define MSR_AMD_PKG_ENERGY_STATUS	0xc001029b
#define MSR_AMD_RAPL_POWER_UNIT		0xc0010299
#define MSR_AMD_CORE_ENERGY_STATUS		0xc001029a
#define MSR_AMD_PKG_ENERGY_STATUS	0xc001029b

/* Config TDP MSRs */
#define MSR_CONFIG_TDP_NOMINAL		0x00000648
@@ -609,6 +611,8 @@
#define FEAT_CTL_LOCKED				BIT(0)
#define FEAT_CTL_VMX_ENABLED_INSIDE_SMX		BIT(1)
#define FEAT_CTL_VMX_ENABLED_OUTSIDE_SMX	BIT(2)
#define FEAT_CTL_SGX_LC_ENABLED			BIT(17)
#define FEAT_CTL_SGX_ENABLED			BIT(18)
#define FEAT_CTL_LMCE_ENABLED			BIT(20)

#define MSR_IA32_TSC_ADJUST             0x0000003b
@@ -628,6 +632,12 @@
#define MSR_IA32_UCODE_WRITE		0x00000079
#define MSR_IA32_UCODE_REV		0x0000008b

/* Intel SGX Launch Enclave Public Key Hash MSRs */
#define MSR_IA32_SGXLEPUBKEYHASH0	0x0000008C
#define MSR_IA32_SGXLEPUBKEYHASH1	0x0000008D
#define MSR_IA32_SGXLEPUBKEYHASH2	0x0000008E
#define MSR_IA32_SGXLEPUBKEYHASH3	0x0000008F

#define MSR_IA32_SMM_MONITOR_CTL	0x0000009b
#define MSR_IA32_SMBASE			0x0000009e

+1 −1
Original line number Diff line number Diff line
@@ -90,7 +90,7 @@ __BUILDXX = $(CXX) $(CXXFLAGS) -MD -Wall -Werror -o $@ $(patsubst %.bin,%.cpp,$(
###############################

$(OUTPUT)test-all.bin:
	$(BUILD) -fstack-protector-all -O2 -D_FORTIFY_SOURCE=2 -ldw -lelf -lnuma -lelf -I/usr/include/slang -lslang $(FLAGS_PERL_EMBED) $(FLAGS_PYTHON_EMBED) -DPACKAGE='"perf"' -lbfd -ldl -lz -llzma -lzstd
	$(BUILD) -fstack-protector-all -O2 -D_FORTIFY_SOURCE=2 -ldw -lelf -lnuma -lelf -I/usr/include/slang -lslang $(FLAGS_PERL_EMBED) $(FLAGS_PYTHON_EMBED) -DPACKAGE='"perf"' -lbfd -ldl -lz -llzma -lzstd -lcap

$(OUTPUT)test-hello.bin:
	$(BUILD)
Loading