Commit 6734e20e authored by Linus Torvalds's avatar Linus Torvalds
Browse files
Pull arm64 updates from Will Deacon:
 "There's quite a lot of code here, but much of it is due to the
  addition of a new PMU driver as well as some arm64-specific selftests
  which is an area where we've traditionally been lagging a bit.

  In terms of exciting features, this includes support for the Memory
  Tagging Extension which narrowly missed 5.9, hopefully allowing
  userspace to run with use-after-free detection in production on CPUs
  that support it. Work is ongoing to integrate the feature with KASAN
  for 5.11.

  Another change that I'm excited about (assuming they get the hardware
  right) is preparing the ASID allocator for sharing the CPU page-table
  with the SMMU. Those changes will also come in via Joerg with the
  IOMMU pull.

  We do stray outside of our usual directories in a few places, mostly
  due to core changes required by MTE. Although much of this has been
  Acked, there were a couple of places where we unfortunately didn't get
  any review feedback.

  Other than that, we ran into a handful of minor conflicts in -next,
  but nothing that should post any issues.

  Summary:

   - Userspace support for the Memory Tagging Extension introduced by
     Armv8.5. Kernel support (via KASAN) is likely to follow in 5.11.

   - Selftests for MTE, Pointer Authentication and FPSIMD/SVE context
     switching.

   - Fix and subsequent rewrite of our Spectre mitigations, including
     the addition of support for PR_SPEC_DISABLE_NOEXEC.

   - Support for the Armv8.3 Pointer Authentication enhancements.

   - Support for ASID pinning, which is required when sharing
     page-tables with the SMMU.

   - MM updates, including treating flush_tlb_fix_spurious_fault() as a
     no-op.

   - Perf/PMU driver updates, including addition of the ARM CMN PMU
     driver and also support to handle CPU PMU IRQs as NMIs.

   - Allow prefetchable PCI BARs to be exposed to userspace using normal
     non-cacheable mappings.

   - Implementation of ARCH_STACKWALK for unwinding.

   - Improve reporting of unexpected kernel traps due to BPF JIT
     failure.

   - Improve robustness of user-visible HWCAP strings and their
     corresponding numerical constants.

   - Removal of TEXT_OFFSET.

   - Removal of some unused functions, parameters and prototypes.

   - Removal of MPIDR-based topology detection in favour of firmware
     description.

   - Cleanups to handling of SVE and FPSIMD register state in
     preparation for potential future optimisation of handling across
     syscalls.

   - Cleanups to the SDEI driver in preparation for support in KVM.

   - Miscellaneous cleanups and refactoring work"

* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (148 commits)
  Revert "arm64: initialize per-cpu offsets earlier"
  arm64: random: Remove no longer needed prototypes
  arm64: initialize per-cpu offsets earlier
  kselftest/arm64: Check mte tagged user address in kernel
  kselftest/arm64: Verify KSM page merge for MTE pages
  kselftest/arm64: Verify all different mmap MTE options
  kselftest/arm64: Check forked child mte memory accessibility
  kselftest/arm64: Verify mte tag inclusion via prctl
  kselftest/arm64: Add utilities and a test to validate mte memory
  perf: arm-cmn: Fix conversion specifiers for node type
  perf: arm-cmn: Fix unsigned comparison to less than zero
  arm64: dbm: Invalidate local TLB when setting TCR_EL1.HD
  arm64: mm: Make flush_tlb_fix_spurious_fault() a no-op
  arm64: Add support for PR_SPEC_DISABLE_NOEXEC prctl() option
  arm64: Pull in task_stack_page() to Spectre-v4 mitigation code
  KVM: arm64: Allow patching EL2 vectors even with KASLR is not enabled
  arm64: Get rid of arm64_ssbd_state
  KVM: arm64: Convert ARCH_WORKAROUND_2 to arm64_get_spectre_v4_state()
  KVM: arm64: Get rid of kvm_arm_have_ssbd()
  KVM: arm64: Simplify handling of ARCH_WORKAROUND_2
  ...
parents d04a248f d13027bb
Loading
Loading
Loading
Loading
+65 −0
Original line number Diff line number Diff line
=============================
Arm Coherent Mesh Network PMU
=============================

CMN-600 is a configurable mesh interconnect consisting of a rectangular
grid of crosspoints (XPs), with each crosspoint supporting up to two
device ports to which various AMBA CHI agents are attached.

CMN implements a distributed PMU design as part of its debug and trace
functionality. This consists of a local monitor (DTM) at every XP, which
counts up to 4 event signals from the connected device nodes and/or the
XP itself. Overflow from these local counters is accumulated in up to 8
global counters implemented by the main controller (DTC), which provides
overall PMU control and interrupts for global counter overflow.

PMU events
----------

The PMU driver registers a single PMU device for the whole interconnect,
see /sys/bus/event_source/devices/arm_cmn. Multi-chip systems may link
more than one CMN together via external CCIX links - in this situation,
each mesh counts its own events entirely independently, and additional
PMU devices will be named arm_cmn_{1..n}.

Most events are specified in a format based directly on the TRM
definitions - "type" selects the respective node type, and "eventid" the
event number. Some events require an additional occupancy ID, which is
specified by "occupid".

* Since RN-D nodes do not have any distinct events from RN-I nodes, they
  are treated as the same type (0xa), and the common event templates are
  named "rnid_*".

* The cycle counter is treated as a synthetic event belonging to the DTC
  node ("type" == 0x3, "eventid" is ignored).

* XP events also encode the port and channel in the "eventid" field, to
  match the underlying pmu_event0_id encoding for the pmu_event_sel
  register. The event templates are named with prefixes to cover all
  permutations.

By default each event provides an aggregate count over all nodes of the
given type. To target a specific node, "bynodeid" must be set to 1 and
"nodeid" to the appropriate value derived from the CMN configuration
(as defined in the "Node ID Mapping" section of the TRM).

Watchpoints
-----------

The PMU can also count watchpoint events to monitor specific flit
traffic. Watchpoints are treated as a synthetic event type, and like PMU
events can be global or targeted with a particular XP's "nodeid" value.
Since the watchpoint direction is otherwise implicit in the underlying
register selection, separate events are provided for flit uploads and
downloads.

The flit match value and mask are passed in config1 and config2 ("val"
and "mask" respectively). "wp_dev_sel", "wp_chn_sel", "wp_grp" and
"wp_exclusive" are specified per the TRM definitions for dtm_wp_config0.
Where a watchpoint needs to match fields from both match groups on the
REQ or SNP channel, it can be specified as two events - one for each
group - with the same nonzero "combine" value. The count for such a
pair of combined events will be attributed to the primary match.
Watchpoint events with a "combine" value of 0 are considered independent
and will count individually.
+1 −0
Original line number Diff line number Diff line
@@ -12,6 +12,7 @@ Performance monitor support
   qcom_l2_pmu
   qcom_l3_pmu
   arm-ccn
   arm-cmn
   xgene-pmu
   arm_dsu_pmu
   thunderx2-pmu
+2 −0
Original line number Diff line number Diff line
@@ -175,6 +175,8 @@ infrastructure:
     +------------------------------+---------+---------+
     | Name                         |  bits   | visible |
     +------------------------------+---------+---------+
     | MTE                          | [11-8]  |    y    |
     +------------------------------+---------+---------+
     | SSBS                         | [7-4]   |    y    |
     +------------------------------+---------+---------+
     | BT                           | [3-0]   |    y    |
+4 −0
Original line number Diff line number Diff line
@@ -240,6 +240,10 @@ HWCAP2_BTI

    Functionality implied by ID_AA64PFR0_EL1.BT == 0b0001.

HWCAP2_MTE

    Functionality implied by ID_AA64PFR1_EL1.MTE == 0b0010, as described
    by Documentation/arm64/memory-tagging-extension.rst.

4. Unused AT_HWCAP bits
-----------------------
+1 −0
Original line number Diff line number Diff line
@@ -14,6 +14,7 @@ ARM64 Architecture
    hugetlbpage
    legacy_instructions
    memory
    memory-tagging-extension
    perf
    pointer-authentication
    silicon-errata
Loading