Commit f5d5f5fa authored by Paolo Bonzini

Merge tag 'kvmarm-fixes-5.5-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into kvm-master

KVM/arm fixes for 5.5, take #1

- Fix uninitialised sysreg accessor
- Fix handling of demand-paged device mappings
- Stop spamming the console on IMPDEF sysregs
- Relax mappings of writable memslots
- Assorted cleanups
parents 8715f052 6d674e28
+3 −3
@@ -3110,9 +3110,9 @@
			[X86,PV_OPS] Disable paravirtualized VMware scheduler
			clock and use the default one.

-	no-steal-acc	[X86,KVM] Disable paravirtualized steal time accounting.
-			steal time is computed, but won't influence scheduler
-			behaviour
+	no-steal-acc	[X86,KVM,ARM64] Disable paravirtualized steal time
+			accounting. steal time is computed, but won't
+			influence scheduler behaviour

	nolapic		[X86-32,APIC] Do not enable or use the local APIC.

+54 −1
@@ -1002,12 +1002,18 @@ Specifying exception.has_esr on a system that does not support it will return
-EINVAL. Setting anything other than the lower 24bits of exception.serror_esr
will return -EINVAL.

It is not possible to read back a pending external abort (injected via
KVM_SET_VCPU_EVENTS or otherwise) because such an exception is always delivered
directly to the virtual CPU.


struct kvm_vcpu_events {
	struct {
		__u8 serror_pending;
		__u8 serror_has_esr;
		__u8 ext_dabt_pending;
		/* Align it to 8 bytes */
-		__u8 pad[6];
+		__u8 pad[5];
		__u64 serror_esr;
	} exception;
	__u32 reserved[12];
@@ -1051,9 +1057,23 @@ contain a valid state and shall be written into the VCPU.

ARM/ARM64:

User space may need to inject several types of events to the guest.

Set the pending SError exception state for this VCPU. It is not possible to
'cancel' an SError that has been made pending.

If the guest performed an access to I/O memory which could not be handled by
userspace, for example because of missing instruction syndrome decode
information or because there is no device mapped at the accessed IPA, then
userspace can ask the kernel to inject an external abort using the address
from the exiting fault on the VCPU. It is a programming error to set
ext_dabt_pending after an exit which was not either KVM_EXIT_MMIO or
KVM_EXIT_ARM_NISV. This feature is only available if the system supports
KVM_CAP_ARM_INJECT_EXT_DABT. This is a helper which provides commonality in
how userspace reports accesses for the above cases to guests, across different
userspace implementations. Nevertheless, userspace can still emulate all Arm
exceptions by manipulating individual registers using the KVM_SET_ONE_REG API.

See KVM_GET_VCPU_EVENTS for the data structure.
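
The padding change in the structure (pad[6] shrinking to pad[5] to make room
for ext_dabt_pending) can be checked with a small sketch. The struct below is a
local mirror of the layout shown in this hunk, written with <stdint.h> types
instead of the kernel's __u8/__u64; it is not the uapi header. The names
vcpu_events_sketch and make_ext_dabt_events are hypothetical. A real VMM would
pass the filled structure to KVM_SET_VCPU_EVENTS on the VCPU file descriptor,
which this sketch leaves out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Local mirror of the documented layout (assumption: not the uapi header). */
struct vcpu_events_sketch {
	struct {
		uint8_t serror_pending;
		uint8_t serror_has_esr;
		uint8_t ext_dabt_pending;
		uint8_t pad[5];		/* three flags + five pad bytes = 8 */
		uint64_t serror_esr;
	} exception;
	uint32_t reserved[12];
};

/* Adding the new flag must not move serror_esr or change the total size. */
static_assert(offsetof(struct vcpu_events_sketch, exception.serror_esr) == 8,
	      "serror_esr must stay 8-byte aligned at offset 8");
static_assert(sizeof(struct vcpu_events_sketch) == 64,
	      "overall size must be unchanged");

/* Build an event block that requests only an external data abort. */
struct vcpu_events_sketch make_ext_dabt_events(void)
{
	struct vcpu_events_sketch ev;

	memset(&ev, 0, sizeof(ev));	/* pad and reserved fields stay zero */
	ev.exception.ext_dabt_pending = 1;
	return ev;
}
```

Per the text above, such a request is only meaningful right after a
KVM_EXIT_MMIO or KVM_EXIT_ARM_NISV exit, and only when the system reports
KVM_CAP_ARM_INJECT_EXT_DABT.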


@@ -4468,6 +4488,39 @@ Hyper-V SynIC state change. Notification is used to remap SynIC
event/message pages and to enable/disable SynIC messages/events processing
in userspace.

		/* KVM_EXIT_ARM_NISV */
		struct {
			__u64 esr_iss;
			__u64 fault_ipa;
		} arm_nisv;

Used on arm and arm64 systems. If a guest accesses memory not in a memslot,
KVM will typically return to userspace and ask it to do MMIO emulation on its
behalf. However, for certain classes of instructions, no instruction decode
(direction, length of memory access) is provided, and fetching and decoding
the instruction from the VM is overly complicated to live in the kernel.

Historically, when this situation occurred, KVM would print a warning and kill
the VM. KVM assumed that if the guest accessed non-memslot memory, it was
trying to do I/O, which just couldn't be emulated, and the warning message was
phrased accordingly. However, what happened more often was that a guest bug
caused access outside the guest memory areas which should lead to a more
meaningful warning message and an external abort in the guest, if the access
did not fall within an I/O window.

Userspace implementations can query for KVM_CAP_ARM_NISV_TO_USER, and enable
this capability at VM creation. Once this is done, these types of errors will
instead return to userspace with KVM_EXIT_ARM_NISV, with the valid bits from
the HSR (arm) and ESR_EL2 (arm64) in the esr_iss field, and the faulting IPA
in the fault_ipa field. Userspace can either fix up the access if it's
actually an I/O access by decoding the instruction from guest memory (if it's
very brave) and continue executing the guest, or it can decide to suspend,
dump, or restart the guest.

Note that KVM does not skip the faulting instruction as it does for
KVM_EXIT_MMIO, but userspace has to emulate any change to the processing state
if it decides to decode and emulate the instruction.
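
As a rough illustration of the choice userspace faces on KVM_EXIT_ARM_NISV, the
sketch below classifies an exit by checking whether fault_ipa falls inside a
known device window. Everything except the two field names esr_iss and
fault_ipa is hypothetical (struct io_window, enum nisv_action, classify_nisv),
and the policy shown, decode for device addresses and inject an external abort
otherwise, is only one possible choice; a VMM could equally suspend, dump, or
restart the guest.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local mirror of the arm_nisv exit payload documented above. */
struct arm_nisv_sketch {
	uint64_t esr_iss;
	uint64_t fault_ipa;
};

static_assert(sizeof(struct arm_nisv_sketch) == 16, "two 64-bit fields");

/* Hypothetical description of one emulated-device window in IPA space. */
struct io_window {
	uint64_t base;
	uint64_t size;
};

enum nisv_action {
	NISV_DECODE_AND_EMULATE,	/* IPA hits a device: decode in userspace */
	NISV_INJECT_EXT_DABT,		/* nothing mapped there: abort the access */
};

enum nisv_action classify_nisv(const struct arm_nisv_sketch *nisv,
			       const struct io_window *windows, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		/* Subtraction form avoids overflow in base + size. */
		if (nisv->fault_ipa >= windows[i].base &&
		    nisv->fault_ipa - windows[i].base < windows[i].size)
			return NISV_DECODE_AND_EMULATE;
	}
	return NISV_INJECT_EXT_DABT;
}
```

If the decode path is taken, userspace must also advance the guest's program
counter itself, since KVM does not skip the instruction for this exit type.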

		/* Fix the size of the union. */
		char padding[256];
	};
+80 −0
.. SPDX-License-Identifier: GPL-2.0

Paravirtualized time support for arm64
======================================

Arm specification DEN0057/A defines a standard for paravirtualised time
support for AArch64 guests:

https://developer.arm.com/docs/den0057/a

KVM/arm64 implements the stolen time part of this specification by providing
some hypervisor service calls to support a paravirtualized guest obtaining a
view of the amount of time stolen from its execution.

Two new SMCCC compatible hypercalls are defined:

* PV_TIME_FEATURES: 0xC5000020
* PV_TIME_ST:       0xC5000021

These are only available in the SMC64/HVC64 calling convention as
paravirtualized time is not available to 32-bit Arm guests. The existence of
the PV_TIME_FEATURES hypercall should be probed using the SMCCC 1.1
ARCH_FEATURES mechanism before calling it.
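
The two function IDs can be read as SMCCC function identifiers. The sketch
below decodes them using the field layout from the SMC Calling Convention
specification (bit 31 fast call, bit 30 64-bit convention, bits 29:24 owning
entity, bits 15:0 function number); that layout comes from the SMCCC document,
not from this patch, and the names smccc_id and smccc_decode are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PV_TIME_FEATURES	0xC5000020u
#define PV_TIME_ST		0xC5000021u

static_assert(PV_TIME_ST == PV_TIME_FEATURES + 1, "IDs are consecutive");

/* Decoded view of a 32-bit SMCCC function identifier. */
struct smccc_id {
	bool fast_call;		/* bit 31 */
	bool is_64bit;		/* bit 30: SMC64/HVC64 convention */
	unsigned owner;		/* bits 29:24: owning entity number */
	unsigned function;	/* bits 15:0: function number */
};

struct smccc_id smccc_decode(uint32_t id)
{
	struct smccc_id d;

	d.fast_call = (id >> 31) & 1u;
	d.is_64bit = (id >> 30) & 1u;
	d.owner = (id >> 24) & 0x3fu;
	d.function = id & 0xffffu;
	return d;
}
```

Both IDs decode as fast calls in the 64-bit convention, which matches the
statement that they exist only for SMC64/HVC64.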

PV_TIME_FEATURES
    ============= ========    ==========
    Function ID:  (uint32)    0xC5000020
    PV_call_id:   (uint32)    The function to query for support.
                              Currently only PV_TIME_ST is supported.
    Return value: (int64)     NOT_SUPPORTED (-1) or SUCCESS (0) if the relevant
                              PV-time feature is supported by the hypervisor.
    ============= ========    ==========

PV_TIME_ST
    ============= ========    ==========
    Function ID:  (uint32)    0xC5000021
    Return value: (int64)     IPA of the stolen time data structure for this
                              VCPU. On failure:
                              NOT_SUPPORTED (-1)
    ============= ========    ==========

The IPA returned by PV_TIME_ST should be mapped by the guest as normal memory
with inner and outer write back caching attributes, in the inner shareable
domain. A total of 16 bytes from the IPA returned are guaranteed to be
meaningfully filled by the hypervisor (see structure below).

PV_TIME_ST returns the structure for the calling VCPU.

Stolen Time
-----------

The structure pointed to by the PV_TIME_ST hypercall is as follows:

+-------------+-------------+-------------+----------------------------+
| Field       | Byte Length | Byte Offset | Description                |
+=============+=============+=============+============================+
| Revision    |      4      |      0      | Must be 0 for version 1.0  |
+-------------+-------------+-------------+----------------------------+
| Attributes  |      4      |      4      | Must be 0                  |
+-------------+-------------+-------------+----------------------------+
| Stolen time |      8      |      8      | Stolen time in unsigned    |
|             |             |             | nanoseconds indicating how |
|             |             |             | much time this VCPU thread |
|             |             |             | was involuntarily not      |
|             |             |             | running on a physical CPU. |
+-------------+-------------+-------------+----------------------------+

All values in the structure are stored little-endian.
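
A minimal sketch of reading the 16-byte structure from a raw buffer, following
the offsets in the table above and decoding each field as little-endian. The
names pvtime_st_sketch, load_le and pvtime_parse are hypothetical. A real guest
would more likely read the 64-bit field with a single aligned load from the
mapped page; the byte-wise form here is only meant to make the layout explicit.

```c
#include <assert.h>
#include <stdint.h>

/* Decoded copy of the stolen time structure (field names are illustrative). */
struct pvtime_st_sketch {
	uint32_t revision;		/* offset 0, 4 bytes */
	uint32_t attributes;		/* offset 4, 4 bytes */
	uint64_t stolen_time_ns;	/* offset 8, 8 bytes */
};

static_assert(sizeof(struct pvtime_st_sketch) == 16, "matches documented size");

/* Assemble an nbytes-wide little-endian value (nbytes <= 8). */
static uint64_t load_le(const uint8_t *p, unsigned nbytes)
{
	uint64_t v = 0;

	for (unsigned i = 0; i < nbytes; i++)
		v |= (uint64_t)p[i] << (8 * i);
	return v;
}

struct pvtime_st_sketch pvtime_parse(const uint8_t buf[16])
{
	struct pvtime_st_sketch st;

	st.revision = (uint32_t)load_le(buf + 0, 4);
	st.attributes = (uint32_t)load_le(buf + 4, 4);
	st.stolen_time_ns = load_le(buf + 8, 8);
	return st;
}
```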

The structure will be updated by the hypervisor prior to scheduling a VCPU. It
will be present within a reserved region of the normal memory given to the
guest. The guest should not attempt to write into this memory. There is a
structure per VCPU of the guest.

It is advisable that one or more 64k pages are set aside for the purpose of
these structures and not used for other purposes; this enables the guest to map
the region using 64k pages and avoids conflicting attributes with other memory.

For the user space interface see Documentation/virt/kvm/devices/vcpu.txt
section "3. GROUP: KVM_ARM_VCPU_PVTIME_CTRL".
+14 −0
@@ -60,3 +60,17 @@ time to use the number provided for a given timer, overwriting any previously
configured values on other VCPUs.  Userspace should configure the interrupt
numbers on at least one VCPU after creating all VCPUs and before running any
VCPUs.

3. GROUP: KVM_ARM_VCPU_PVTIME_CTRL
Architectures: ARM64

3.1 ATTRIBUTE: KVM_ARM_VCPU_PVTIME_IPA
Parameters: 64-bit base address
Returns: -ENXIO:  Stolen time not implemented
         -EEXIST: Base address already set for this VCPU
         -EINVAL: Base address not 64 byte aligned

Specifies the base address of the stolen time structure for this VCPU. The
base address must be 64 byte aligned and exist within a valid guest memory
region. See Documentation/virt/kvm/arm/pvtime.txt for more information
including the layout of the stolen time structure.
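
The alignment rule can be sketched as follows. The helper names
pvtime_check_ipa and pvtime_vcpu_ipa are hypothetical, and giving each VCPU its
own 64-byte slot inside one reserved region is an assumption made for
illustration, not something this document requires. The resulting address
would then be supplied as the KVM_ARM_VCPU_PVTIME_IPA attribute for that VCPU,
which this sketch does not do.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define PVTIME_ALIGN	64u	/* alignment required for the base address */

static_assert((PVTIME_ALIGN & (PVTIME_ALIGN - 1)) == 0, "power of two");

/* Mirror of the documented -EINVAL condition: base not 64-byte aligned. */
int pvtime_check_ipa(uint64_t ipa)
{
	return (ipa & (PVTIME_ALIGN - 1)) ? -EINVAL : 0;
}

/* Assumed layout: one 64-byte slot per VCPU, packed from region_base. */
uint64_t pvtime_vcpu_ipa(uint64_t region_base, unsigned vcpu_index)
{
	return region_base + (uint64_t)vcpu_index * PVTIME_ALIGN;
}
```

With an aligned region base, every per-VCPU address produced this way also
passes the alignment check.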
+1 −0
@@ -162,6 +162,7 @@
#define HSR_ISV		(_AC(1, UL) << HSR_ISV_SHIFT)
#define HSR_SRT_SHIFT	(16)
#define HSR_SRT_MASK	(0xf << HSR_SRT_SHIFT)
#define HSR_CM		(1 << 8)
#define HSR_FSC		(0x3f)
#define HSR_FSC_TYPE	(0x3c)
#define HSR_SSE		(1 << 21)