Commit ac73e3dc authored by Linus Torvalds

Merge branch 'akpm' (patches from Andrew)

Merge misc updates from Andrew Morton:

 - a few random little subsystems

 - almost all of the MM patches which are staged ahead of linux-next
   material. I'll trickle to post-linux-next work in as the dependents
   get merged up.

Subsystems affected by this patch series: kthread, kbuild, ide, ntfs,
ocfs2, arch, and mm (slab-generic, slab, slub, dax, debug, pagecache,
gup, swap, shmem, memcg, pagemap, mremap, hmm, vmalloc, documentation,
kasan, pagealloc, memory-failure, hugetlb, vmscan, z3fold, compaction,
oom-kill, migration, cma, page-poison, userfaultfd, zswap, zsmalloc,
uaccess, zram, and cleanups).

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (200 commits)
  mm: cleanup kstrto*() usage
  mm: fix fall-through warnings for Clang
  mm: slub: convert sysfs sprintf family to sysfs_emit/sysfs_emit_at
  mm: shmem: convert shmem_enabled_show to use sysfs_emit_at
  mm:backing-dev: use sysfs_emit in macro defining functions
  mm: huge_memory: convert remaining use of sprintf to sysfs_emit and neatening
  mm: use sysfs_emit for struct kobject * uses
  mm: fix kernel-doc markups
  zram: break the strict dependency from lzo
  zram: add stat to gather incompressible pages since zram set up
  zram: support page writeback
  mm/process_vm_access: remove redundant initialization of iov_r
  mm/zsmalloc.c: rework the list_add code in insert_zspage()
  mm/zswap: move to use crypto_acomp API for hardware acceleration
  mm/zswap: fix passing zero to 'PTR_ERR' warning
  mm/zswap: make struct kernel_param_ops definitions const
  userfaultfd/selftests: hint the test runner on required privilege
  userfaultfd/selftests: fix retval check for userfaultfd_open()
  userfaultfd/selftests: always dump something in modes
  userfaultfd: selftests: make __{s,u}64 format specifiers portable
  ...
parents 148842c9 dfefd226
+6 −0
@@ -266,6 +266,7 @@ line of text and contains the following stats separated by whitespace:
                  No memory is allocated for such pages.
 pages_compacted  the number of pages freed during compaction
 huge_pages	  the number of incompressible pages
 huge_pages_since the number of incompressible pages since zram set up
 ================ =============================================================

File /sys/block/zram<id>/bd_stat
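
As a rough illustration of how the extended mm_stat line could be consumed, the Python sketch below splits the whitespace-separated values and pairs them with column names. Only `pages_compacted`, `huge_pages`, and `huge_pages_since` are visible in the hunk above; the names and order of the earlier columns are assumptions, and `parse_mm_stat` is a hypothetical helper rather than part of any kernel tooling.

```python
# Column names in the order mm_stat is assumed to print them.
# Only the last three appear in the hunk above; the first six are assumptions.
MM_STAT_FIELDS = (
    "orig_data_size",
    "compr_data_size",
    "mem_used_total",
    "mem_limit",
    "mem_used_max",
    "same_pages",
    "pages_compacted",
    "huge_pages",
    "huge_pages_since",
)


def parse_mm_stat(line):
    """Map one mm_stat line to {field_name: int}.

    Kernels without the new column print fewer values, so missing
    trailing fields are left out of the result instead of raising.
    """
    values = [int(tok) for tok in line.split()]
    if len(values) > len(MM_STAT_FIELDS):
        raise ValueError("more columns than known field names")
    return dict(zip(MM_STAT_FIELDS, values))


if __name__ == "__main__":
    # Made-up sample values, not read from a real device.
    sample = "4096 74 12288 0 12288 0 0 1 3"
    stats = parse_mm_stat(sample)
    print(stats["huge_pages"], stats["huge_pages_since"])
```

Returning a dict keyed by name lets a caller test for `huge_pages_since` with `in` and so cope with kernels that predate this patch.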
@@ -334,6 +335,11 @@ Admin can request writeback of those idle pages at right timing via::

With the command, zram writes back idle pages from memory to the storage.

If an admin wants to write a specific page in the zram device to the backing
device, they can write the page index into the interface.

	echo "page_index=1251" > /sys/block/zramX/writeback

Heavy write IO to a flash device can potentially cause a flash wearout
problem, so the admin needs to design a write limitation
to guarantee storage health for the entire product life.
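
The per-page writeback request added in this hunk is a plain `page_index=N` string written to the writeback file. The Python sketch below shows one way to build and send such a request. The helper names are hypothetical, and the demo writes to a temporary file; on a real system the target would be `/sys/block/zramX/writeback`, which is assumed here to require root privileges and a configured backing device.

```python
import os
import tempfile


def writeback_request(page_index):
    """Build the string written to the writeback file for one page."""
    if page_index < 0:
        raise ValueError("page index must be non-negative")
    return "page_index=%d" % page_index


def request_page_writeback(writeback_path, page_index):
    """Write a single-page writeback request to the given file."""
    with open(writeback_path, "w") as f:
        f.write(writeback_request(page_index))


if __name__ == "__main__":
    # Dry run against a temporary file instead of a real zram device.
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "writeback")
        request_page_writeback(path, 1251)
        with open(path) as f:
            print(f.read())
```

Keeping the string construction separate from the file write makes the request format easy to check without touching sysfs.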
+3 −5
@@ -219,13 +219,11 @@ Under below explanation, we assume CONFIG_MEM_RES_CTRL_SWAP=y.

	This is an easy way to test page migration, too.

9.5 mkdir/rmdir
---------------
9.5 nested cgroups
------------------

	When using hierarchy, mkdir/rmdir test should be done.
	Use tests like the following::
	Use tests like the following for testing nested cgroups::

		echo 1 >/opt/cgroup/01/memory/use_hierarchy
		mkdir /opt/cgroup/01/child_a
		mkdir /opt/cgroup/01/child_b

+13 −27
@@ -77,6 +77,8 @@ Brief summary of control files.
 memory.soft_limit_in_bytes	     set/show soft limit of memory usage
 memory.stat			     show various statistics
 memory.use_hierarchy		     set/show hierarchical account enabled
                                     This knob is deprecated and shouldn't be
                                     used.
 memory.force_empty		     trigger forced page reclaim
 memory.pressure_level		     set memory pressure notifications
 memory.swappiness		     set/show swappiness parameter of vmscan
@@ -495,16 +497,13 @@ cgroup might have some charge associated with it, even though all
tasks have migrated away from it. (because we charge against pages, not
against tasks.)

We move the stats to root (if use_hierarchy==0) or parent (if
use_hierarchy==1), and no change on the charge except uncharging
We move the stats to parent, and no change on the charge except uncharging
from the child.

Charges recorded in swap information is not updated at removal of cgroup.
Recorded information is discarded and a cgroup which uses swap (swapcache)
will be charged as a new owner of it.

About use_hierarchy, see Section 6.

5. Misc. interfaces
===================

@@ -527,8 +526,6 @@ About use_hierarchy, see Section 6.
  write will still return success. In this case, it is expected that
  memory.kmem.usage_in_bytes == memory.usage_in_bytes.

  About use_hierarchy, see Section 6.

5.2 stat file
-------------

@@ -675,31 +672,20 @@ hierarchy::
		      d   e

In the diagram above, with hierarchical accounting enabled, all memory
usage of e, is accounted to its ancestors up until the root (i.e, c and root),
that has memory.use_hierarchy enabled. If one of the ancestors goes over its
limit, the reclaim algorithm reclaims from the tasks in the ancestor and the
children of the ancestor.

6.1 Enabling hierarchical accounting and reclaim
------------------------------------------------

A memory cgroup by default disables the hierarchy feature. Support
can be enabled by writing 1 to memory.use_hierarchy file of the root cgroup::
usage of e is accounted to its ancestors up until the root (i.e., c and root).
If one of the ancestors goes over its limit, the reclaim algorithm reclaims
from the tasks in the ancestor and the children of the ancestor.

	# echo 1 > memory.use_hierarchy

The feature can be disabled by::
6.1 Hierarchical accounting and reclaim
---------------------------------------

	# echo 0 > memory.use_hierarchy
Hierarchical accounting is enabled by default. Disabling hierarchical
accounting is deprecated. An attempt to do so will fail
and print a warning to dmesg.

NOTE1:
       Enabling/disabling will fail if either the cgroup already has other
       cgroups created below it, or if the parent cgroup has use_hierarchy
       enabled.
For compatibility reasons, writing 1 to memory.use_hierarchy will always succeed::

NOTE2:
       When panic_on_oom is set to "2", the whole system will panic in
       case of an OOM event in any cgroup.
	# echo 1 > memory.use_hierarchy

7. Soft limits
==============
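
The hierarchical accounting described in Section 6 of this diff can be pictured with a small toy model: a charge against a cgroup is also added to every ancestor up to the root. The Python sketch below is only an illustration of that rule. The `Cgroup` class is hypothetical and bears no relation to the kernel's data structures, and placing `d` under `c` is an assumption, since the excerpt shows only part of the diagram.

```python
class Cgroup:
    """Toy model of a memory cgroup; not the kernel's data structure."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.usage = 0  # bytes charged to this group and its descendants

    def charge(self, nbytes):
        """Charge this group and every ancestor up to the root."""
        group = self
        while group is not None:
            group.usage += nbytes
            group = group.parent


if __name__ == "__main__":
    root = Cgroup("root")
    c = Cgroup("c", parent=root)
    d = Cgroup("d", parent=c)  # assumed sibling of e under c
    e = Cgroup("e", parent=c)

    e.charge(4096)
    d.charge(8192)
    for g in (root, c, d, e):
        print(g.name, g.usage)
```

In this model the usage of `e` shows up in `c` and `root` but not in `d`, which matches the statement that usage is accounted to ancestors only.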
+11 −0
@@ -1274,6 +1274,9 @@ PAGE_SIZE multiple when read back.
	  kernel_stack
		Amount of memory allocated to kernel stacks.

	  pagetables
                Amount of memory allocated for page tables.

	  percpu(npn)
		Amount of memory used for storing per-cpu kernel
		data structures.
@@ -1300,6 +1303,14 @@ PAGE_SIZE multiple when read back.
		Amount of memory used in anonymous mappings backed by
		transparent hugepages

	  file_thp
		Amount of cached filesystem data backed by transparent
		hugepages

	  shmem_thp
		Amount of shm, tmpfs, shared anonymous mmap()s backed by
		transparent hugepages

	  inactive_anon, active_anon, inactive_file, active_file, unevictable
		Amount of memory, swap-backed and filesystem-backed,
		on the internal memory management lists used by the
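
To show how the new `pagetables`, `file_thp`, and `shmem_thp` entries might be read, the Python sketch below parses memory.stat text into a dict and sums the transparent-hugepage counters. It assumes the file consists of `name value` lines, and the `anon_thp` key name is an assumption, since the excerpt shows only that entry's description. Both helpers are hypothetical.

```python
def parse_memory_stat(text):
    """Parse 'name value' lines of memory.stat text into {name: int}."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or unexpected lines
        name, value = parts
        stats[name] = int(value)
    return stats


def thp_backed_bytes(stats):
    """Sum the transparent-hugepage counters, treating absent keys as 0."""
    keys = ("anon_thp", "file_thp", "shmem_thp")  # anon_thp name is assumed
    return sum(stats.get(k, 0) for k in keys)


if __name__ == "__main__":
    # Made-up sample, not captured from a real cgroup.
    sample = "pagetables 49152\nfile_thp 2097152\nshmem_thp 0\n"
    stats = parse_memory_stat(sample)
    print(stats["pagetables"], thp_backed_bytes(stats))
```

Using `dict.get` with a default of 0 keeps the sum usable on kernels that do not export the newer counters.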
+0 −15
@@ -401,21 +401,6 @@ compact_fail
	is incremented if the system tries to compact memory
	but failed.

compact_pages_moved
	is incremented each time a page is moved. If
	this value is increasing rapidly, it implies that the system
	is copying a lot of data to satisfy the huge page allocation.
	It is possible that the cost of copying exceeds any savings
	from reduced TLB misses.

compact_pagemigrate_failed
	is incremented when the underlying mechanism
	for moving a page failed.

compact_blocks_moved
	is incremented each time memory compaction examines
	a huge page aligned range of pages.

It is possible to establish how long the stalls were using the function
tracer to record how long was spent in __alloc_pages_nodemask and
using the mm_page_alloc tracepoint to identify which allocations were