Merge branch 'akpm' (patches from Andrew) (ee01c4d7)

Documentation/admin-guide/cgroup-v1/memory.rst

+7 −12

		@@ -199,11 +199,11 @@ An RSS page is unaccounted when it's fully unmapped. A PageCache page is
		unaccounted when it's removed from radix-tree. Even if RSS pages are fully
		unmapped (by kswapd), they may exist as SwapCache in the system until they
		are really freed. Such SwapCaches are also accounted.
		A swapped-in page is not accounted until it's mapped.
		A swapped-in page is accounted after adding into swapcache.

		Note: The kernel does swapin-readahead and reads multiple swaps at once.
		This means swapped-in pages may contain pages for other tasks than a task
		causing page fault. So, we avoid accounting at swap-in I/O.
		Since the page's memcg is recorded in swap regardless of whether memsw is
		enabled, the page will be accounted after swap-in.

		At page migration, accounting information is kept.

		@@ -222,18 +222,13 @@ the cgroup that brought it in -- this will happen on memory pressure).
		But see section 8.2: when moving a task to another cgroup, its pages may
		be recharged to the new cgroup, if move_charge_at_immigrate has been chosen.

		Exception: If CONFIG_MEMCG_SWAP is not used.
		When you do swapoff and make swapped-out pages of shmem(tmpfs) to
		be backed into memory in force, charges for pages are accounted against the
		caller of swapoff rather than the users of shmem.

		2.4 Swap Extension (CONFIG_MEMCG_SWAP)
		2.4 Swap Extension
		--------------------------------------

		Swap Extension allows you to record charge for swap. A swapped-in page is
		charged back to original page allocator if possible.
		Swap usage is always recorded for each cgroup. Swap Extension allows you to
		read and limit it.

		When swap is accounted, following files are added.
		When CONFIG_SWAP is enabled, the following files are added.

		- memory.memsw.usage_in_bytes.
		- memory.memsw.limit_in_bytes.
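
As an illustration of how the two files listed above might be read, here is a minimal Python sketch. The helper name ``read_memsw`` is hypothetical, and the directory passed to it is an assumption: a cgroup v1 memory controller is commonly mounted at ``/sys/fs/cgroup/memory``, but the actual mount point depends on the system.

```python
from pathlib import Path

# File names taken from the list above.
MEMSW_FILES = ("memory.memsw.usage_in_bytes", "memory.memsw.limit_in_bytes")


def read_memsw(cgroup_dir):
    """Return {file name: integer value} for the memsw files that exist.

    cgroup_dir is the directory of one memory cgroup (hypothetical example:
    /sys/fs/cgroup/memory/mygroup). Files that are missing, for instance
    because swap accounting is unavailable, are simply skipped.
    """
    values = {}
    for name in MEMSW_FILES:
        path = Path(cgroup_dir) / name
        if path.is_file():
            values[name] = int(path.read_text().strip())
    return values
```

Calling ``read_memsw("/sys/fs/cgroup/memory")`` on a system without these files returns an empty dictionary rather than raising.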

Documentation/admin-guide/kernel-parameters.txt

+27 −13

		@@ -834,12 +834,15 @@
		See also Documentation/networking/decnet.rst.

		default_hugepagesz=
		[same as hugepagesz=] The size of the default
		HugeTLB page size. This is the size represented by
		the legacy /proc/ hugepages APIs, used for SHM, and
		default size when mounting hugetlbfs filesystems.
		Defaults to the default architecture's huge page size
		if not specified.
		[HW] The size of the default HugeTLB page. This is
		the size represented by the legacy /proc/ hugepages
		APIs. In addition, this is the default hugetlb size
		used for shmget(), mmap() and mounting hugetlbfs
		filesystems. If not specified, defaults to the
		architecture's default huge page size. Huge page
		sizes are architecture dependent. See also
		Documentation/admin-guide/mm/hugetlbpage.rst.
		Format: size[KMG]

		deferred_probe_timeout=
		[KNL] Debugging option to set a timeout in seconds for
		@@ -1484,13 +1487,24 @@
		hugepages using the cma allocator. If enabled, the
		boot-time allocation of gigantic hugepages is skipped.

		hugepages= [HW,X86-32,IA-64] HugeTLB pages to allocate at boot.
		hugepagesz= [HW,IA-64,PPC,X86-64] The size of the HugeTLB pages.
		On x86-64 and powerpc, this option can be specified
		multiple times interleaved with hugepages= to reserve
		huge pages of different sizes. Valid pages sizes on
		x86-64 are 2M (when the CPU supports "pse") and 1G
		(when the CPU supports the "pdpe1gb" cpuinfo flag).
		hugepages= [HW] Number of HugeTLB pages to allocate at boot.
		If this follows hugepagesz (below), it specifies
		the number of pages of hugepagesz to be allocated.
		If this is the first HugeTLB parameter on the command
		line, it specifies the number of pages to allocate for
		the default huge page size. See also
		Documentation/admin-guide/mm/hugetlbpage.rst.
		Format: <integer>

		hugepagesz=
		[HW] The size of the HugeTLB pages. This is used in
		conjunction with hugepages (above) to allocate huge
		pages of a specific size at boot. The pair
		hugepagesz=X hugepages=Y can be specified once for
		each supported huge page size. Huge page sizes are
		architecture dependent. See also
		Documentation/admin-guide/mm/hugetlbpage.rst.
		Format: size[KMG]

		hung_task_panic=
		[KNL] Should the hung task detector generate panics.

Documentation/admin-guide/mm/hugetlbpage.rst

+35 −0

		@@ -100,6 +100,41 @@ with a huge page size selection parameter "hugepagesz=<size>". <size> must
		be specified in bytes with optional scale suffix [kKmMgG]. The default huge
		page size may be selected with the "default_hugepagesz=<size>" boot parameter.

		Hugetlb boot command line parameter semantics
		hugepagesz - Specify a huge page size. Used in conjunction with hugepages
		parameter to preallocate a number of huge pages of the specified
		size. Hence, hugepagesz and hugepages are typically specified in
		pairs such as:
		hugepagesz=2M hugepages=512
		hugepagesz can only be specified once on the command line for a
		specific huge page size. Valid huge page sizes are architecture
		dependent.
		hugepages - Specify the number of huge pages to preallocate. This typically
		follows a valid hugepagesz or default_hugepagesz parameter. However,
		if hugepages is the first or only hugetlb command line parameter it
		implicitly specifies the number of huge pages of default size to
		allocate. If the number of huge pages of default size is implicitly
		specified, it can not be overwritten by a hugepagesz,hugepages
		parameter pair for the default size.
		For example, on an architecture with 2M default huge page size:
		hugepages=256 hugepagesz=2M hugepages=512
		will result in 256 2M huge pages being allocated and a warning message
		indicating that the hugepages=512 parameter is ignored. If a hugepages
		parameter is preceded by an invalid hugepagesz parameter, it will
		be ignored.
		default_hugepagesz - Specify the default huge page size. This parameter can
		only be specified once on the command line. default_hugepagesz can
		optionally be followed by the hugepages parameter to preallocate a
		specific number of huge pages of default size. The number of default
		sized huge pages to preallocate can also be implicitly specified as
		mentioned in the hugepages section above. Therefore, on an
		architecture with 2M default huge page size:
		hugepages=256
		default_hugepagesz=2M hugepages=256
		hugepages=256 default_hugepagesz=2M
		will all result in 256 2M huge pages being allocated. Valid default
		huge page size is architecture dependent.
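
The ordering rules described above can be sketched as a small Python model. This is not the kernel's parser; it only mirrors the documented behaviour. The function name, the ``arch_default`` and ``valid_sizes`` parameters, and the warning strings are all hypothetical. Two points are assumptions not stated in the text: a ``hugepages=`` that follows another ``hugepages=`` with no size in between is ignored, and the first request for any given size wins.

```python
DEFAULT = object()  # placeholder meaning "the default huge page size"
INVALID = object()  # placeholder meaning "the preceding size was invalid"


def parse_hugetlb_cmdline(cmdline, arch_default="2M", valid_sizes=("2M", "1G")):
    """Model the documented hugetlb parameter semantics.

    Returns (allocated, warnings) where allocated maps size -> page count.
    arch_default and valid_sizes stand in for architecture-specific values.
    """
    default_size = arch_default
    pending = None        # size that the next hugepages= applies to
    seen_hugetlb = False  # has any hugetlb parameter been seen yet?
    requests = []         # (size or DEFAULT, count) in command-line order
    warnings = []

    for token in cmdline.split():
        key, sep, value = token.partition("=")
        if not sep or key not in ("hugepagesz", "default_hugepagesz", "hugepages"):
            continue  # not a hugetlb parameter
        if key == "hugepagesz":
            pending = value if value in valid_sizes else INVALID
        elif key == "default_hugepagesz":
            if value in valid_sizes:
                default_size, pending = value, DEFAULT
            else:
                pending = INVALID
        else:  # hugepages=
            if not seen_hugetlb:
                # First hugetlb parameter: implicitly the default size.
                requests.append((DEFAULT, int(value)))
            elif pending is None or pending is INVALID:
                warnings.append(f"hugepages={value} ignored")
            else:
                requests.append((pending, int(value)))
            pending = None
        seen_hugetlb = True

    # Resolve DEFAULT only now, since default_hugepagesz may appear late.
    allocated = {}
    for size, count in requests:
        size = default_size if size is DEFAULT else size
        if size in allocated:
            warnings.append(f"hugepages={count} ignored")
        else:
            allocated[size] = count
    return allocated, warnings
```

With the default arguments, ``parse_hugetlb_cmdline("hugepages=256 hugepagesz=2M hugepages=512")`` yields 256 pages of size 2M plus one warning, matching the example in the text.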

		When multiple huge page sizes are supported, ``/proc/sys/vm/nr_hugepages``
		indicates the current number of pre-allocated huge pages of the default size.
		Thus, one can use the following command to dynamically allocate/deallocate

Documentation/admin-guide/mm/transhuge.rst

+7 −0

		@@ -220,6 +220,13 @@ memory. A lower value can prevent THPs from being
		collapsed, resulting in fewer pages being collapsed into
		THPs, and lower memory access performance.

		``max_ptes_shared`` specifies how many pages can be shared across multiple
		processes. Exceeding the number would block the collapse::

		/sys/kernel/mm/transparent_hugepage/khugepaged/max_ptes_shared

		A higher value may increase memory footprint for some workloads.
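
A minimal Python sketch for inspecting or changing this tunable follows. The helper names are hypothetical. It assumes the sysfs path shown above exists on the running kernel and that the file holds a single integer; writing normally requires root privileges.

```python
from pathlib import Path

# Directory taken from the path quoted above.
KHUGEPAGED_DIR = Path("/sys/kernel/mm/transparent_hugepage/khugepaged")


def read_tunable(name, base=KHUGEPAGED_DIR):
    """Return the integer stored in a khugepaged tunable such as max_ptes_shared."""
    return int((Path(base) / name).read_text().strip())


def write_tunable(name, value, base=KHUGEPAGED_DIR):
    """Write a new integer value to a khugepaged tunable (usually needs root)."""
    (Path(base) / name).write_text(f"{int(value)}\n")
```

For example, ``read_tunable("max_ptes_shared")`` returns the current limit. The ``base`` parameter exists only so the helpers can be pointed at a scratch directory for testing.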

		Boot parameter
		==============

Documentation/admin-guide/sysctl/vm.rst

+18 −5

		@@ -831,14 +831,27 @@ tooling to work, you can do::
		swappiness
		==========

		This control is used to define how aggressive the kernel will swap
		memory pages. Higher values will increase aggressiveness, lower values
		decrease the amount of swap. A value of 0 instructs the kernel not to
		initiate swap until the amount of free and file-backed pages is less
		than the high water mark in a zone.
		This control is used to define the rough relative IO cost of swapping
		and filesystem paging, as a value between 0 and 200. At 100, the VM
		assumes equal IO cost and will thus apply memory pressure to the page
		cache and swap-backed pages equally; lower values signify more
		expensive swap IO, higher values indicate cheaper swap IO.

		Keep in mind that filesystem IO patterns under memory pressure tend to
		be more efficient than swap's random IO. An optimal value will require
		experimentation and will also be workload-dependent.

		The default value is 60.

		For in-memory swap, like zram or zswap, as well as hybrid setups that
		have swap on faster devices than the filesystem, values beyond 100 can
		be considered. For example, if the random IO against the swap device
		is on average 2x faster than IO from the filesystem, swappiness should
		be 133 (x + 2x = 200, 2x = 133.33).
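
The arithmetic in the example above can be generalized as a short Python sketch. This is an extrapolation of the quoted formula, not something the text states: if swap IO is ``ratio`` times faster than filesystem IO, solve ``x + ratio * x = 200`` and use ``ratio * x`` as the swappiness. The function name is hypothetical.

```python
def swappiness_for_speed_ratio(ratio):
    """Suggest a swappiness value given how much faster swap IO is.

    ratio is (swap IO speed) / (filesystem IO speed); 1.0 means equal cost.
    Generalizes the documented example: x + ratio * x = 200, result = ratio * x.
    """
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    x = 200 / (1 + ratio)
    return round(ratio * x)


print(swappiness_for_speed_ratio(2))  # swap 2x faster than the filesystem
print(swappiness_for_speed_ratio(1))  # equal IO cost
```

The result is only a starting point; as the text notes, an optimal value requires experimentation and depends on the workload.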

		At 0, the kernel will not initiate swap until the amount of free and
		file-backed pages is less than the high watermark in a zone.


		unprivileged_userfaultfd
		========================
