Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm (6a447b0e)

Documentation/admin-guide/kernel-parameters.txt

+10 −0

		@@ -2254,6 +2254,16 @@
		for all guests.
		Default is 1 (enabled) if in 64-bit or 32-bit PAE mode.

		kvm-arm.mode=
		[KVM,ARM] Select one of KVM/arm64's modes of operation.

		protected: nVHE-based mode with support for guests whose
		state is kept private from the host.
		Not valid if the kernel is running in EL2.

		Defaults to VHE/nVHE based on hardware support and
		the value of CONFIG_ARM64_VHE.

		kvm-arm.vgic_v3_group0_trap=
		[KVM,ARM] Trap guest accesses to GICv3 group-0
		system registers

Documentation/arm64/memory.rst

+1 −1

		@@ -97,7 +97,7 @@ hypervisor maps kernel pages in EL2 at a fixed (and potentially
		random) offset from the linear mapping. See the kern_hyp_va macro and
		kvm_update_va_mask function for more details. MMIO devices such as
		GICv2 get mapped next to the HYP idmap page, as do vectors when
		ARM64_HARDEN_EL2_VECTORS is selected for particular CPUs.
		ARM64_SPECTRE_V3A is enabled for particular CPUs.

		When using KVM with the Virtualization Host Extensions, no additional
		mappings are created, since the host kernel runs directly in EL2.

Documentation/virt/kvm/api.rst

+111 −5

		@@ -262,6 +262,18 @@ The KVM_RUN ioctl (cf.) communicates with userspace via a shared
		memory region. This ioctl returns the size of that region. See the
		KVM_RUN documentation for details.

		Besides the size of the KVM_RUN communication region, other areas of
		the VCPU file descriptor can be mmap-ed, including:

		- if KVM_CAP_COALESCED_MMIO is available, a page at
		KVM_COALESCED_MMIO_PAGE_OFFSET * PAGE_SIZE; for historical reasons,
		this page is included in the result of KVM_GET_VCPU_MMAP_SIZE.
		KVM_CAP_COALESCED_MMIO is not documented yet.

		- if KVM_CAP_DIRTY_LOG_RING is available, a number of pages at
		KVM_DIRTY_LOG_PAGE_OFFSET * PAGE_SIZE. For more information on
		KVM_CAP_DIRTY_LOG_RING, see section 8.29.


		4.6 KVM_SET_MEMORY_REGION
		-------------------------
		@@ -4455,9 +4467,9 @@ that KVM_CAP_MANUAL_DIRTY_LOG_PROTECT2 is present.
		4.118 KVM_GET_SUPPORTED_HV_CPUID
		--------------------------------

		:Capability: KVM_CAP_HYPERV_CPUID
		:Capability: KVM_CAP_HYPERV_CPUID (vcpu), KVM_CAP_SYS_HYPERV_CPUID (system)
		:Architectures: x86
		:Type: vcpu ioctl
		:Type: system ioctl, vcpu ioctl
		:Parameters: struct kvm_cpuid2 (in/out)
		:Returns: 0 on success, -1 on error

		@@ -4502,9 +4514,6 @@ Currently, the following list of CPUID leaves are returned:
		- HYPERV_CPUID_SYNDBG_INTERFACE
		- HYPERV_CPUID_SYNDBG_PLATFORM_CAPABILITIES

		HYPERV_CPUID_NESTED_FEATURES leaf is only exposed when Enlightened VMCS was
		enabled on the corresponding vCPU (KVM_CAP_HYPERV_ENLIGHTENED_VMCS).

		Userspace invokes KVM_GET_SUPPORTED_HV_CPUID by passing a kvm_cpuid2 structure
		with the 'nent' field indicating the number of entries in the variable-size
		array 'entries'. If the number of entries is too low to describe all Hyper-V
		@@ -4515,6 +4524,15 @@ number of valid entries in the 'entries' array, which is then filled.
		'index' and 'flags' fields in 'struct kvm_cpuid_entry2' are currently reserved,
		userspace should not expect to get any particular value there.

		Note, the vcpu version of KVM_GET_SUPPORTED_HV_CPUID is currently
		deprecated. Unlike the system ioctl, which exposes all supported feature
		bits unconditionally, the vcpu version has the following quirks:

		- HYPERV_CPUID_NESTED_FEATURES leaf and HV_X64_ENLIGHTENED_VMCS_RECOMMENDED
		feature bit are only exposed when Enlightened VMCS was previously enabled
		on the corresponding vCPU (KVM_CAP_HYPERV_ENLIGHTENED_VMCS).
		- HV_STIMER_DIRECT_MODE_AVAILABLE bit is only exposed with in-kernel LAPIC
		(presumes KVM_CREATE_IRQCHIP has already been called).

		4.119 KVM_ARM_VCPU_FINALIZE
		---------------------------

		@@ -6390,3 +6408,91 @@ When enabled, KVM will disable paravirtual features provided to the
		guest according to the bits in the KVM_CPUID_FEATURES CPUID leaf
		(0x40000001). Otherwise, a guest may use the paravirtual features
		regardless of what has actually been exposed through the CPUID leaf.


		8.29 KVM_CAP_DIRTY_LOG_RING
		---------------------------

		:Architectures: x86
		:Parameters: args[0] - size of the dirty log ring

		KVM is capable of tracking dirty memory using ring buffers that are
		mmaped into userspace; there is one dirty ring per vcpu.

		The dirty ring is available to userspace as an array of
		``struct kvm_dirty_gfn``. Each dirty entry is defined as::

		  struct kvm_dirty_gfn {
			  __u32 flags;
			  __u32 slot; /* as_id | slot_id */
			  __u64 offset;
		  };

		The following values are defined for the flags field to define the
		current state of the entry::

		  #define KVM_DIRTY_GFN_F_DIRTY           BIT(0)
		  #define KVM_DIRTY_GFN_F_RESET           BIT(1)
		  #define KVM_DIRTY_GFN_F_MASK            0x3

		Userspace should call the KVM_ENABLE_CAP ioctl right after the
		KVM_CREATE_VM ioctl to enable this capability for the new guest and set
		the size of the rings. Enabling the capability is only allowed before
		creating any vCPU, and the size of the ring must be a power of two. The
		larger the ring buffer, the less likely it is to fill up and force the
		VM to exit to userspace. The optimal size depends on the workload, but
		it is recommended that it be at least 64 KiB (4096 entries).
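The sizing rule above can be checked before calling KVM_ENABLE_CAP. The following is a minimal sketch, not kernel or QEMU code: ``struct dirty_gfn`` is a local mirror of ``struct kvm_dirty_gfn`` so the snippet builds without kernel headers, ``dirty_ring_bytes()`` is a hypothetical helper name, and it assumes the ring size is expressed in bytes, as the "64 KiB (4096 entries)" figure suggests.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local mirror of struct kvm_dirty_gfn (16 bytes per entry). */
struct dirty_gfn {
	uint32_t flags;
	uint32_t slot;   /* as_id | slot_id */
	uint64_t offset;
};

/* True if v is a non-zero power of two. */
static bool is_power_of_two(uint64_t v)
{
	return v != 0 && (v & (v - 1)) == 0;
}

/*
 * Hypothetical helper: convert an entry count into a ring size in bytes.
 * Returns 0 if the count is not a power of two or the size would overflow.
 * Assumption: the capability takes the size in bytes.
 */
uint64_t dirty_ring_bytes(uint64_t entries)
{
	if (!is_power_of_two(entries))
		return 0;
	if (entries > UINT64_MAX / sizeof(struct dirty_gfn))
		return 0;
	return entries * sizeof(struct dirty_gfn);
}
```

Because each entry is 16 bytes, a power-of-two entry count always yields a power-of-two byte size, so either unit satisfies the power-of-two requirement.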

		Just like for dirty page bitmaps, the buffer tracks writes to
		all user memory regions for which the KVM_MEM_LOG_DIRTY_PAGES flag was
		set in KVM_SET_USER_MEMORY_REGION. Once a memory region is registered
		with the flag set, userspace can start harvesting dirty pages from the
		ring buffer.

		An entry in the ring buffer can be unused (flag bits ``00``),
		dirty (flag bits ``01``) or harvested (flag bits ``1X``). The
		state machine for the entry is as follows::

		       dirtied         harvested        reset
		  00 -----------> 01 -------------> 1X -------+
		   ^                                          |
		   |                                          |
		   +------------------------------------------+
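As an illustration of the three states above, the flag bits can be decoded as follows. This is only a sketch: the ``DIRTY_GFN_F_*`` macros are local stand-ins for the ``KVM_DIRTY_GFN_F_*`` values so the snippet compiles without kernel headers, and ``enum gfn_state`` and ``classify_gfn_flags()`` are hypothetical names.

```c
#include <stdint.h>

/* Local stand-ins for KVM_DIRTY_GFN_F_DIRTY / KVM_DIRTY_GFN_F_RESET. */
#define DIRTY_GFN_F_DIRTY (1u << 0)
#define DIRTY_GFN_F_RESET (1u << 1)

enum gfn_state {
	GFN_UNUSED,     /* flag bits 00 */
	GFN_DIRTY,      /* flag bits 01 */
	GFN_HARVESTED,  /* flag bits 1X */
};

/* Hypothetical helper: map an entry's flags to its state. */
enum gfn_state classify_gfn_flags(uint32_t flags)
{
	/* Bit 1 wins: in the harvested state bit 0 is a don't-care. */
	if (flags & DIRTY_GFN_F_RESET)
		return GFN_HARVESTED;
	if (flags & DIRTY_GFN_F_DIRTY)
		return GFN_DIRTY;
	return GFN_UNUSED;
}
```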

		To harvest the dirty pages, userspace accesses the mmaped ring buffer
		to read the dirty GFNs. If the flags field has the DIRTY bit set (at this
		stage the RESET bit must be clear), the GFN is dirty. Userspace should
		harvest this GFN and move the flags from state ``01b`` to ``1Xb`` (bit 0
		is ignored by KVM, but bit 1 must be set to show that this GFN has been
		harvested and is waiting for a reset), then move on to the next GFN.
		Userspace should continue until it finds a GFN whose flags have the DIRTY
		bit cleared, meaning that it has harvested all the dirty GFNs that were
		available.

		It's not necessary for userspace to harvest all the dirty GFNs at once.
		However, it must collect the dirty GFNs in sequence, i.e., the userspace
		program cannot skip one dirty GFN to collect the one next to it.
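The harvesting procedure can be sketched as a loop over the mapped ring. This is an illustrative sketch under several assumptions, not the reference implementation: ``struct dirty_gfn`` and the ``DIRTY_GFN_F_*`` macros are local mirrors of the kernel definitions, ``harvest_dirty_ring()`` and ``fetch_index`` (a userspace-side cursor) are hypothetical names, and the acquire/release ordering uses the GCC/Clang ``__atomic`` builtins as one plausible way to order the flag accesses against the other fields.

```c
#include <stddef.h>
#include <stdint.h>

/* Local mirrors of the kernel definitions. */
#define DIRTY_GFN_F_DIRTY (1u << 0)
#define DIRTY_GFN_F_RESET (1u << 1)
#define DIRTY_GFN_F_MASK  0x3u

struct dirty_gfn {
	uint32_t flags;
	uint32_t slot;   /* as_id | slot_id */
	uint64_t offset;
};

/*
 * Harvest dirty entries strictly in sequence, starting at *fetch_index.
 * ring_entries must be a power of two. Copies up to out_cap entries into
 * out[], marks each one harvested, and stops at the first entry that is
 * not in the dirty (01) state. Returns the number of entries harvested.
 */
size_t harvest_dirty_ring(struct dirty_gfn *ring, uint32_t ring_entries,
			  uint32_t *fetch_index,
			  struct dirty_gfn *out, size_t out_cap)
{
	size_t n = 0;

	while (n < out_cap && n < ring_entries) {
		struct dirty_gfn *e = &ring[*fetch_index & (ring_entries - 1)];
		/* Read flags first so slot/offset are observed after them. */
		uint32_t flags = __atomic_load_n(&e->flags, __ATOMIC_ACQUIRE);

		if ((flags & DIRTY_GFN_F_MASK) != DIRTY_GFN_F_DIRTY)
			break;  /* unused or already harvested: nothing more */

		out[n].flags = flags;
		out[n].slot = e->slot;
		out[n].offset = e->offset;
		n++;

		/* 01 -> 1X: mark as harvested and waiting for a reset. */
		__atomic_store_n(&e->flags, DIRTY_GFN_F_RESET, __ATOMIC_RELEASE);
		(*fetch_index)++;
	}
	return n;
}
```

The loop never skips an entry: it advances the cursor only after harvesting the entry it points to, which matches the in-sequence requirement stated above.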

		After processing one or more entries in the ring buffer, userspace
		calls the VM ioctl KVM_RESET_DIRTY_RINGS to notify the kernel about
		it, so that the kernel will reprotect those collected GFNs.
		Therefore, the ioctl must be called before reading the content of
		the dirty pages.

		The dirty ring can get full. When that happens, the KVM_RUN of the
		vcpu will return with exit reason KVM_EXIT_DIRTY_RING_FULL.

		The dirty ring interface differs from the KVM_GET_DIRTY_LOG interface
		in one major respect: when userspace reads the dirty ring, the kernel
		may not yet have flushed the processor's dirty page buffers into the
		kernel buffer (with dirty bitmaps, the flushing is done by the
		KVM_GET_DIRTY_LOG ioctl). To force the flush, one needs to kick the
		vcpu out of KVM_RUN using a signal. The resulting vmexit ensures that
		all dirty GFNs are flushed to the dirty rings.

		NOTE: the capability KVM_CAP_DIRTY_LOG_RING and the corresponding
		ioctl KVM_RESET_DIRTY_RINGS are mutually exclusive with the existing
		ioctls KVM_GET_DIRTY_LOG and KVM_CLEAR_DIRTY_LOG. After enabling
		KVM_CAP_DIRTY_LOG_RING with an acceptable dirty ring size, the virtual
		machine will switch to ring-buffer dirty page tracking and further
		KVM_GET_DIRTY_LOG or KVM_CLEAR_DIRTY_LOG ioctls will fail.

Documentation/virt/kvm/arm/pvtime.rst

+2 −2

		@@ -19,8 +19,8 @@ Two new SMCCC compatible hypercalls are defined:

		These are only available in the SMC64/HVC64 calling convention as
		paravirtualized time is not available to 32 bit Arm guests. The existence of
		the PV_FEATURES hypercall should be probed using the SMCCC 1.1 ARCH_FEATURES
		mechanism before calling it.
		the PV_TIME_FEATURES hypercall should be probed using the SMCCC 1.1
		ARCH_FEATURES mechanism before calling it.

		PV_TIME_FEATURES
		============= ======== ==========

arch/arm64/include/asm/cpucaps.h

+3 −2

		@@ -19,7 +19,7 @@
		#define ARM64_HAS_VIRT_HOST_EXTN 11
		#define ARM64_WORKAROUND_CAVIUM_27456 12
		#define ARM64_HAS_32BIT_EL0 13
		#define ARM64_HARDEN_EL2_VECTORS 14
		#define ARM64_SPECTRE_V3A 14
		#define ARM64_HAS_CNP 15
		#define ARM64_HAS_NO_FPSIMD 16
		#define ARM64_WORKAROUND_REPEAT_TLBI 17
		@@ -65,7 +65,8 @@
		#define ARM64_MTE 57
		#define ARM64_WORKAROUND_1508412 58
		#define ARM64_HAS_LDAPR 59
		#define ARM64_KVM_PROTECTED_MODE 60

		#define ARM64_NCAPS 60
		#define ARM64_NCAPS 61

		#endif /* __ASM_CPUCAPS_H */
