Commit 596cf45c authored by Linus Torvalds

Merge branch 'akpm' (patches from Andrew)

Merge updates from Andrew Morton:
 "Incoming:

   - a small number of updates to scripts/, ocfs2 and fs/buffer.c

   - most of MM

  I still have quite a lot of material (mostly not MM) staged after
  linux-next due to -next dependencies. I'll send those across next week
  as the prerequisites get merged up"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (135 commits)
  mm/page_io.c: annotate refault stalls from swap_readpage
  mm/Kconfig: fix trivial help text punctuation
  mm/Kconfig: fix indentation
  mm/memory_hotplug.c: remove __online_page_set_limits()
  mm: fix typos in comments when calling __SetPageUptodate()
  mm: fix struct member name in function comments
  mm/shmem.c: cast the type of unmap_start to u64
  mm: shmem: use proper gfp flags for shmem_writepage()
  mm/shmem.c: make array 'values' static const, makes object smaller
  userfaultfd: require CAP_SYS_PTRACE for UFFD_FEATURE_EVENT_FORK
  fs/userfaultfd.c: wp: clear VM_UFFD_MISSING or VM_UFFD_WP during userfaultfd_register()
  userfaultfd: wrap the common dst_vma check into an inlined function
  userfaultfd: remove unnecessary WARN_ON() in __mcopy_atomic_hugetlb()
  userfaultfd: use vma_pagesize for all huge page size calculation
  mm/madvise.c: use PAGE_ALIGN[ED] for range checking
  mm/madvise.c: replace with page_size() in madvise_inject_error()
  mm/mmap.c: make vma_merge() comment more easy to understand
  mm/hwpoison-inject: use DEFINE_DEBUGFS_ATTRIBUTE to define debugfs fops
  autonuma: reduce cache footprint when scanning page tables
  autonuma: fix watermark checking in migrate_balanced_pgdat()
  ...
parents c3bfc5dd 93779069
+6 −1
@@ -1288,7 +1288,12 @@ PAGE_SIZE multiple when read back.
	  inactive_anon, active_anon, inactive_file, active_file, unevictable
		Amount of memory, swap-backed and filesystem-backed,
		on the internal memory management lists used by the
-		page reclaim algorithm
+		page reclaim algorithm.
+
+		As these represent internal list state (eg. shmem pages are on anon
+		memory management lists), inactive_foo + active_foo may not be equal to
+		the value for the foo counter, since the foo counter is type-based, not
+		list-based.

	  slab_reclaimable
		Part of "slab" that might be reclaimed, such as
+63 −0
@@ -218,3 +218,66 @@ brk handler is used to print bug reports.
A potential expansion of this mode is a hardware tag-based mode, which would
use hardware memory tagging support instead of compiler instrumentation and
manual shadow memory manipulation.

What memory accesses are sanitised by KASAN?
--------------------------------------------

The kernel maps memory in a number of different parts of the address
space. This poses something of a problem for KASAN, which requires
that all addresses accessed by instrumented code have a valid shadow
region.

The range of kernel virtual addresses is large: there is not enough
real memory to support a real shadow region for every address that
could be accessed by the kernel.

By default
~~~~~~~~~~

By default, architectures only map real memory over the shadow region
for the linear mapping (and potentially other small areas). For all
other areas - such as vmalloc and vmemmap space - a single read-only
page is mapped over the shadow area. This read-only shadow page
declares all memory accesses as permitted.

This presents a problem for modules: they do not live in the linear
mapping, but in a dedicated module space. By hooking in to the module
allocator, KASAN can temporarily map real shadow memory to cover
them. This allows detection of invalid accesses to module globals, for
example.

This also creates an incompatibility with ``VMAP_STACK``: if the stack
lives in vmalloc space, it will be shadowed by the read-only page, and
the kernel will fault when trying to set up the shadow data for stack
variables.

CONFIG_KASAN_VMALLOC
~~~~~~~~~~~~~~~~~~~~

With ``CONFIG_KASAN_VMALLOC``, KASAN can cover vmalloc space at the
cost of greater memory usage. Currently this is only supported on x86.

This works by hooking into vmalloc and vmap, and dynamically
allocating real shadow memory to back the mappings.

Most mappings in vmalloc space are small, requiring less than a full
page of shadow space. Allocating a full shadow page per mapping would
therefore be wasteful. Furthermore, to ensure that different mappings
use different shadow pages, mappings would have to be aligned to
``KASAN_SHADOW_SCALE_SIZE * PAGE_SIZE``.

Instead, we share backing space across multiple mappings. We allocate
a backing page when a mapping in vmalloc space uses a particular page
of the shadow region. This page can be shared by other vmalloc
mappings later on.

We hook in to the vmap infrastructure to lazily clean up unused shadow
memory.

To avoid the difficulties around swapping mappings around, we expect
that the part of the shadow region that covers the vmalloc space will
not be covered by the early shadow page, but will be left
unmapped. This will require changes in arch-specific code.

This allows ``VMAP_STACK`` support on x86, and can simplify support of
architectures that do not have a fixed module region.
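The shadow-page arithmetic behind the alignment remark in the KASAN text above can be sketched in userspace C. This is a minimal model, not kernel code. It assumes generic KASAN's scale of one shadow byte per eight bytes of memory (`KASAN_SHADOW_SCALE_SHIFT` of 3), 4 KiB pages, and an x86-64-style `KASAN_SHADOW_OFFSET` chosen for illustration; none of these values appear in the patch text itself, and the helper function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumptions for illustration only; real values are arch-specific. */
#define PAGE_SIZE                4096ULL
#define KASAN_SHADOW_SCALE_SHIFT 3
#define KASAN_SHADOW_SCALE_SIZE  (1ULL << KASAN_SHADOW_SCALE_SHIFT)
#define KASAN_SHADOW_OFFSET      0xdffffc0000000000ULL

/* Map a (virtual) address to the address of its shadow byte. */
static uint64_t mem_to_shadow(uint64_t addr)
{
	return (addr >> KASAN_SHADOW_SCALE_SHIFT) + KASAN_SHADOW_OFFSET;
}

/* Bytes of address space described by one page of shadow memory. */
static uint64_t span_per_shadow_page(void)
{
	return KASAN_SHADOW_SCALE_SIZE * PAGE_SIZE;
}

/* Hypothetical helper: number of shadow pages touched by [start, start + size). */
static uint64_t shadow_pages_touched(uint64_t start, uint64_t size)
{
	uint64_t first, last;

	assert(size > 0);
	first = mem_to_shadow(start) / PAGE_SIZE;
	last = mem_to_shadow(start + size - 1) / PAGE_SIZE;
	return last - first + 1;
}

/* Hypothetical helper: do two addresses fall in the same shadow page? */
static int share_shadow_page(uint64_t a, uint64_t b)
{
	return mem_to_shadow(a) / PAGE_SIZE == mem_to_shadow(b) / PAGE_SIZE;
}
```

Under these assumptions one shadow page covers 32 KiB of address space, so two neighbouring 4 KiB mappings normally land in the same shadow page. That is the case the text handles by sharing backing pages instead of aligning every mapping to `KASAN_SHADOW_SCALE_SIZE * PAGE_SIZE`.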
+5 −4
@@ -836,16 +836,17 @@ config HAVE_ARCH_VMAP_STACK
config VMAP_STACK
	default y
	bool "Use a virtually-mapped stack"
-	depends on HAVE_ARCH_VMAP_STACK && !KASAN
+	depends on HAVE_ARCH_VMAP_STACK
+	depends on !KASAN || KASAN_VMALLOC
	---help---
	  Enable this if you want the use virtually-mapped kernel stacks
	  with guard pages.  This causes kernel stack overflows to be
	  caught immediately rather than causing difficult-to-diagnose
	  corruption.

-	  This is presently incompatible with KASAN because KASAN expects
-	  the stack to map directly to the KASAN shadow map using a formula
-	  that is incorrect if the stack is in vmalloc space.
+	  To use this with KASAN, the architecture must support backing
+	  virtual mappings with real shadow memory, and KASAN_VMALLOC must
+	  be enabled.

config ARCH_OPTIONAL_KERNEL_RWX
	def_bool n
+0 −1
@@ -33,7 +33,6 @@
#define _ASM_ARC_PGTABLE_H

#include <linux/bits.h>
-#define __ARCH_USE_5LEVEL_HACK
#include <asm-generic/pgtable-nopmd.h>
#include <asm/page.h>
#include <asm/mmu.h>	/* to propagate CONFIG_ARC_MMU_VER <n> */
+8 −2
@@ -30,6 +30,7 @@ noinline static int handle_kernel_vaddr_fault(unsigned long address)
	 * with the 'reference' page table.
	 */
	pgd_t *pgd, *pgd_k;
+	p4d_t *p4d, *p4d_k;
	pud_t *pud, *pud_k;
	pmd_t *pmd, *pmd_k;

@@ -39,8 +40,13 @@ noinline static int handle_kernel_vaddr_fault(unsigned long address)
	if (!pgd_present(*pgd_k))
		goto bad_area;

-	pud = pud_offset(pgd, address);
-	pud_k = pud_offset(pgd_k, address);
+	p4d = p4d_offset(pgd, address);
+	p4d_k = p4d_offset(pgd_k, address);
+	if (!p4d_present(*p4d_k))
+		goto bad_area;
+
+	pud = pud_offset(p4d, address);
+	pud_k = pud_offset(p4d_k, address);
	if (!pud_present(*pud_k))
		goto bad_area;
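
The extra p4d step in the hunk above costs little on an architecture without a real fifth page-table level, because the level can be folded onto the one above it. The following userspace toy model illustrates the idea. It is a sketch under stated assumptions, not the kernel's definitions: the types are simplified, the present bit (`TOY_PRESENT`) and `top_levels_present()` are hypothetical, and only the top two levels are modelled.

```c
#include <assert.h>

/* Toy entry types; the kernel's real types are arch-specific. */
typedef struct { unsigned long pgd; } pgd_t;
typedef struct { pgd_t pgd; } p4d_t;	/* folded: a p4d entry is the pgd entry */

#define TOY_PRESENT 0x1UL		/* hypothetical "present" bit */

/* The folded upper level always reports present in this model. */
static int pgd_present(pgd_t pgd)
{
	(void)pgd;
	return 1;
}

/* Folding: the p4d "table" is the pgd entry itself, reinterpreted. */
static p4d_t *p4d_offset(pgd_t *pgd, unsigned long address)
{
	(void)address;
	return (p4d_t *)pgd;
}

/* The real presence check happens on the level that holds the entry. */
static int p4d_present(p4d_t p4d)
{
	return (p4d.pgd.pgd & TOY_PRESENT) != 0;
}

/* Mirrors the shape of the updated walk: check pgd, then p4d. */
static int top_levels_present(pgd_t *pgd, unsigned long address)
{
	p4d_t *p4d;

	if (!pgd_present(*pgd))
		return 0;

	p4d = p4d_offset(pgd, address);
	assert((void *)p4d == (void *)pgd);	/* folding adds no indirection */
	return p4d_present(*p4d);
}
```

In this model the walk always names every level, as the patched code does, while the folded level adds no indirection.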
