Commit 39d7530d authored by Linus Torvalds
Pull KVM updates from Paolo Bonzini:
 "ARM:
   - support for chained PMU counters in guests
   - improved SError handling
   - handle Neoverse N1 erratum #1349291
   - allow side-channel mitigation status to be migrated
   - standardise most AArch64 system register accesses to msr_s/mrs_s
   - fix host MPIDR corruption on 32bit
   - selftests cleanups

  x86:
   - PMU event {white,black}listing
   - ability for the guest to disable host-side interrupt polling
   - fixes for enlightened VMCS (Hyper-V pv nested virtualization)
   - new hypercall to yield to IPI target
   - support for passing cstate MSRs through to the guest
   - lots of cleanups and optimizations

  Generic:
   - Some txt->rST conversions for the documentation"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (128 commits)
  Documentation: virtual: Add toctree hooks
  Documentation: kvm: Convert cpuid.txt to .rst
  Documentation: virtual: Convert paravirt_ops.txt to .rst
  KVM: x86: Unconditionally enable irqs in guest context
  KVM: x86: PMU Event Filter
  kvm: x86: Fix -Wmissing-prototypes warnings
  KVM: Properly check if "page" is valid in kvm_vcpu_unmap
  KVM: arm/arm64: Initialise host's MPIDRs by reading the actual register
  KVM: LAPIC: Retry tune per-vCPU timer_advance_ns if adaptive tuning goes insane
  kvm: LAPIC: write down valid APIC registers
  KVM: arm64: Migrate _elx sysreg accessors to msr_s/mrs_s
  KVM: doc: Add API documentation on the KVM_REG_ARM_WORKAROUNDS register
  KVM: arm/arm64: Add save/restore support for firmware workaround state
  arm64: KVM: Propagate full Spectre v2 workaround state to KVM guests
  KVM: arm/arm64: Support chained PMU counters
  KVM: arm/arm64: Remove pmc->bitmask
  KVM: arm/arm64: Re-create event when setting counter value
  KVM: arm/arm64: Extract duplicated code to own function
  KVM: arm/arm64: Rename kvm_pmu_{enable/disable}_counter functions
  KVM: LAPIC: ARBPRI is a reserved register for x2APIC
  ...
parents 16c97650 a45ff599
+2 −0
@@ -86,6 +86,8 @@ stable kernels.
+----------------+-----------------+-----------------+-----------------------------+
| ARM            | Neoverse-N1     | #1188873,1418040| ARM64_ERRATUM_1418040       |
+----------------+-----------------+-----------------+-----------------------------+
| ARM            | Neoverse-N1     | #1349291        | N/A                         |
+----------------+-----------------+-----------------+-----------------------------+
| ARM            | MMU-500         | #841119,826419  | N/A                         |
+----------------+-----------------+-----------------+-----------------------------+
+----------------+-----------------+-----------------+-----------------------------+
+18 −0
.. SPDX-License-Identifier: GPL-2.0

============================
Linux Virtualization Support
============================

.. toctree::
   :maxdepth: 2

   kvm/index
   paravirt_ops

.. only:: html and subproject

   Indices
   =======

   * :ref:`genindex`
+28 −0
@@ -4081,6 +4081,32 @@ KVM_ARM_VCPU_FINALIZE call.
See KVM_ARM_VCPU_INIT for details of vcpu features that require finalization
using this ioctl.

4.120 KVM_SET_PMU_EVENT_FILTER

Capability: KVM_CAP_PMU_EVENT_FILTER
Architectures: x86
Type: vm ioctl
Parameters: struct kvm_pmu_event_filter (in)
Returns: 0 on success, -1 on error

struct kvm_pmu_event_filter {
       __u32 action;
       __u32 nevents;
       __u64 events[0];
};

This ioctl restricts the set of PMU events that the guest can program.
The argument holds a list of events which will be allowed or denied.
The eventsel+umask of each event the guest attempts to program is compared
against the events field to determine whether the guest should have access.
This only affects general purpose counters; fixed purpose counters can
be disabled by changing the perfmon CPUID leaf.

Valid values for 'action':
#define KVM_PMU_EVENT_ALLOW 0
#define KVM_PMU_EVENT_DENY 1
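
The allow/deny semantics described above can be illustrated with a small
userspace model. This is only a sketch, not the kernel implementation: the
``sketch_`` names are hypothetical, the struct is a local mirror of the layout
shown above (real code would include <linux/kvm.h> and pass the filter to the
KVM_SET_PMU_EVENT_FILTER vm ioctl), and packing the event select into bits 0-7
with the unit mask in bits 8-15 is an assumption about how entries in
``events`` are encoded.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Local stand-ins for KVM_PMU_EVENT_ALLOW / KVM_PMU_EVENT_DENY. */
#define SKETCH_PMU_EVENT_ALLOW 0
#define SKETCH_PMU_EVENT_DENY  1

/* Hypothetical mirror of struct kvm_pmu_event_filter. */
struct sketch_pmu_event_filter {
	uint32_t action;
	uint32_t nevents;
	uint64_t events[];	/* flexible array, like events[0] above */
};

/* Assumed encoding: event select in bits 0-7, unit mask in bits 8-15. */
static uint64_t sketch_event_key(uint8_t eventsel, uint8_t umask)
{
	return (uint64_t)eventsel | ((uint64_t)umask << 8);
}

/* Allocate a filter and copy the event list into it; caller frees it. */
static struct sketch_pmu_event_filter *
sketch_filter_new(uint32_t action, const uint64_t *events, uint32_t nevents)
{
	struct sketch_pmu_event_filter *f;

	f = malloc(sizeof(*f) + (size_t)nevents * sizeof(uint64_t));
	if (!f)
		return NULL;
	f->action = action;
	f->nevents = nevents;
	for (uint32_t i = 0; i < nevents; i++)
		f->events[i] = events[i];
	return f;
}

/* Model of the check: may the guest program the event with this key? */
static bool sketch_filter_permits(const struct sketch_pmu_event_filter *f,
				  uint64_t key)
{
	bool listed = false;

	assert(f != NULL);
	for (uint32_t i = 0; i < f->nevents; i++) {
		if (f->events[i] == key) {
			listed = true;
			break;
		}
	}
	/* ALLOW: only listed events pass. DENY: only unlisted events pass. */
	return f->action == SKETCH_PMU_EVENT_ALLOW ? listed : !listed;
}
```

With an ALLOW filter the list acts as a whitelist, so an empty list blocks
every general purpose event in this model; with a DENY filter an empty list
blocks nothing.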


5. The kvm_run structure
------------------------

@@ -4909,6 +4935,8 @@ Valid bits in args[0] are

#define KVM_X86_DISABLE_EXITS_MWAIT            (1 << 0)
#define KVM_X86_DISABLE_EXITS_HLT              (1 << 1)
#define KVM_X86_DISABLE_EXITS_PAUSE            (1 << 2)
#define KVM_X86_DISABLE_EXITS_CSTATE           (1 << 3)

Enabling this capability on a VM provides userspace with a way to no
longer intercept some instructions for improved latency in some
+31 −0
@@ -28,3 +28,34 @@ The following register is defined:
  - Allows any PSCI version implemented by KVM and compatible with
    v0.2 to be set with SET_ONE_REG
  - Affects the whole VM (even if the register view is per-vcpu)

* KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1:
  Holds the state of the firmware support to mitigate CVE-2017-5715, as
  offered by KVM to the guest via an HVC call. The workaround is described
  under SMCCC_ARCH_WORKAROUND_1 in [1].
  Accepted values are:
    KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_NOT_AVAIL: KVM does not offer
      firmware support for the workaround. The mitigation status for the
      guest is unknown.
    KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_AVAIL: The workaround HVC call is
      available to the guest and required for the mitigation.
    KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_NOT_REQUIRED: The workaround HVC call
      is available to the guest, but it is not needed on this VCPU.

* KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2:
  Holds the state of the firmware support to mitigate CVE-2018-3639, as
  offered by KVM to the guest via an HVC call. The workaround is described
  under SMCCC_ARCH_WORKAROUND_2 in [1].
  Accepted values are:
    KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_AVAIL: A workaround is not
      available. KVM does not offer firmware support for the workaround.
    KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_UNKNOWN: The workaround state is
      unknown. KVM does not offer firmware support for the workaround.
    KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_AVAIL: The workaround is available,
      and can be disabled by a vCPU. If
      KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_ENABLED is set, it is active for
      this vCPU.
    KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_REQUIRED: The workaround is
      always active on this vCPU or it is not needed.

[1] https://developer.arm.com/-/media/developer/pdf/ARM_DEN_0070A_Firmware_interfaces_for_mitigating_CVE-2017-5715.pdf
+107 −0
.. SPDX-License-Identifier: GPL-2.0

==============
KVM CPUID bits
==============

:Author: Glauber Costa <glommer@gmail.com>

A guest running on a KVM host can check some of its features using
cpuid. This is not always guaranteed to work, since userspace can
mask-out some, or even all KVM-related cpuid features before launching
a guest.

KVM cpuid functions are:

function: KVM_CPUID_SIGNATURE (0x40000000)

returns::

   eax = 0x40000001
   ebx = 0x4b4d564b
   ecx = 0x564b4d56
   edx = 0x4d

Note that this value in ebx, ecx and edx corresponds to the string "KVMKVMKVM".
The value in eax corresponds to the maximum cpuid function present in this leaf,
and will be updated if more functions are added in the future.
Note also that old hosts set eax value to 0x0. This should
be interpreted as if the value was 0x40000001.
This function queries the presence of KVM cpuid leafs.
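
As a rough illustration of the signature leaf, the sketch below unpacks the
three register values into the string they encode and applies the "old hosts
report 0" rule to eax. The ``sketch_`` names are hypothetical helpers, not
kernel or KVM API; the register values are assumed to have been read already
by whatever cpuid wrapper the guest uses.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_KVM_CPUID_FEATURES 0x40000001u

/*
 * Unpack ebx, ecx and edx (least significant byte first) into a
 * NUL-terminated string. `out` must have room for 13 bytes.
 */
static void sketch_kvm_signature(uint32_t ebx, uint32_t ecx, uint32_t edx,
				 char out[13])
{
	const uint32_t regs[3] = { ebx, ecx, edx };

	assert(out != NULL);
	for (int r = 0; r < 3; r++)
		for (int b = 0; b < 4; b++)
			out[r * 4 + b] = (char)((regs[r] >> (8 * b)) & 0xff);
	out[12] = '\0';
}

/* Old hosts return eax == 0; treat that as 0x40000001. */
static uint32_t sketch_kvm_max_leaf(uint32_t eax)
{
	return eax ? eax : SKETCH_KVM_CPUID_FEATURES;
}
```

Because the upper three bytes of edx are zero, the unpacked buffer terminates
early and compares equal to "KVMKVMKVM" with ``strcmp``.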

function: KVM_CPUID_FEATURES (0x40000001)

returns::

          ebx, ecx
          eax = an OR'ed group of (1 << flag)

where ``flag`` is defined as below:

=================================== =========== ================================
flag                                value       meaning
=================================== =========== ================================
KVM_FEATURE_CLOCKSOURCE             0           kvmclock available at msrs
                                                0x11 and 0x12

KVM_FEATURE_NOP_IO_DELAY            1           not necessary to perform delays
                                                on PIO operations

KVM_FEATURE_MMU_OP                  2           deprecated

KVM_FEATURE_CLOCKSOURCE2            3           kvmclock available at msrs
                                                0x4b564d00 and 0x4b564d01

KVM_FEATURE_ASYNC_PF                4           async pf can be enabled by
                                                writing to msr 0x4b564d02

KVM_FEATURE_STEAL_TIME              5           steal time can be enabled by
                                                writing to msr 0x4b564d03

KVM_FEATURE_PV_EOI                  6           paravirtualized end of interrupt
                                                handler can be enabled by
                                                writing to msr 0x4b564d04

KVM_FEATURE_PV_UNHALT               7           guest checks this feature bit
                                                before enabling paravirtualized
                                                spinlock support

KVM_FEATURE_PV_TLB_FLUSH            9           guest checks this feature bit
                                                before enabling paravirtualized
                                                tlb flush

KVM_FEATURE_ASYNC_PF_VMEXIT         10          paravirtualized async PF VM EXIT
                                                can be enabled by setting bit 2
                                                when writing to msr 0x4b564d02

KVM_FEATURE_PV_SEND_IPI             11          guest checks this feature bit
                                                before enabling paravirtualized
                                                send IPIs

KVM_FEATURE_PV_POLL_CONTROL         12          host-side polling on HLT can
                                                be disabled by writing
                                                to msr 0x4b564d05.

KVM_FEATURE_PV_SCHED_YIELD          13          guest checks this feature bit
                                                before using paravirtualized
                                                sched yield.

KVM_FEATURE_CLOCKSOURCE_STABLE_BIT  24          host will warn if no guest-side
                                                per-cpu warps are expected in
                                                kvmclock
=================================== =========== ================================

::

      edx = an OR'ed group of (1 << flag)

Where ``flag`` here is defined as below:

================== ============ =================================
flag               value        meaning
================== ============ =================================
KVM_HINTS_REALTIME 0            guest checks this feature bit to
                                determine that vCPUs are never
                                preempted for an unlimited time
                                allowing optimizations
================== ============ =================================
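
The "OR'ed group of (1 << flag)" encoding used for eax and edx can be tested
one bit at a time. The following is a minimal sketch, assuming the register
value of leaf 0x40000001 has already been read; the ``sketch_`` names are
hypothetical, and only a few of the flag values from the tables are mirrored
here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* A subset of the eax flag values listed in the table, mirrored locally. */
enum sketch_kvm_feature {
	SKETCH_KVM_FEATURE_CLOCKSOURCE     = 0,
	SKETCH_KVM_FEATURE_CLOCKSOURCE2    = 3,
	SKETCH_KVM_FEATURE_PV_POLL_CONTROL = 12,
	SKETCH_KVM_FEATURE_PV_SCHED_YIELD  = 13,
};

/* The single edx hint listed in the table, mirrored locally. */
enum sketch_kvm_hint {
	SKETCH_KVM_HINTS_REALTIME = 0,
};

/* Return true if bit `flag` is set in the register value `reg`. */
static bool sketch_cpuid_has(uint32_t reg, unsigned int flag)
{
	assert(flag < 32);
	return (reg >> flag) & 1u;
}
```

A guest would typically call ``sketch_cpuid_has(eax, ...)`` before touching
the MSR tied to a feature, since userspace may have masked the bit out.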