Commit a5ad5742 authored by Linus Torvalds

Merge branch 'akpm' (patches from Andrew)

Merge even more updates from Andrew Morton:

 - a kernel-wide sweep of show_stack()

 - pagetable cleanups

 - abstract out accesses to mmap_sem - prep for mmap_sem scalability work

 - hch's user access work

Subsystems affected by this patch series: debug, mm/pagemap, mm/maccess,
mm/documentation.

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (93 commits)
  include/linux/cache.h: expand documentation over __read_mostly
  maccess: return -ERANGE when probe_kernel_read() fails
  x86: use non-set_fs based maccess routines
  maccess: allow architectures to provide kernel probing directly
  maccess: move user access routines together
  maccess: always use strict semantics for probe_kernel_read
  maccess: remove strncpy_from_unsafe
  tracing/kprobes: handle mixed kernel/userspace probes better
  bpf: rework the compat kernel probe handling
  bpf:bpf_seq_printf(): handle potentially unsafe format string better
  bpf: handle the compat string in bpf_trace_copy_string better
  bpf: factor out a bpf_trace_copy_string helper
  maccess: unify the probe kernel arch hooks
  maccess: remove probe_read_common and probe_write_common
  maccess: rename strnlen_unsafe_user to strnlen_user_nofault
  maccess: rename strncpy_from_unsafe_strict to strncpy_from_kernel_nofault
  maccess: rename strncpy_from_unsafe_user to strncpy_from_user_nofault
  maccess: update the top of file comment
  maccess: clarify kerneldoc comments
  maccess: remove duplicate kerneldoc comments
  ...
parents 013b2deb 4fa72523
+5 −5
@@ -364,19 +364,19 @@ follows:

2) for querying the policy, we do not need to take an extra reference on the
   target task's task policy nor vma policies because we always acquire the
-   task's mm's mmap_sem for read during the query.  The set_mempolicy() and
-   mbind() APIs [see below] always acquire the mmap_sem for write when
+   task's mm's mmap_lock for read during the query.  The set_mempolicy() and
+   mbind() APIs [see below] always acquire the mmap_lock for write when
   installing or replacing task or vma policies.  Thus, there is no possibility
   of a task or thread freeing a policy while another task or thread is
   querying it.

3) Page allocation usage of task or vma policy occurs in the fault path where
-   we hold them mmap_sem for read.  Again, because replacing the task or vma
-   policy requires that the mmap_sem be held for write, the policy can't be
+   we hold them mmap_lock for read.  Again, because replacing the task or vma
+   policy requires that the mmap_lock be held for write, the policy can't be
   freed out from under us while we're using it for page allocation.

4) Shared policies require special consideration.  One task can replace a
-   shared memory policy while another task, with a distinct mmap_sem, is
+   shared memory policy while another task, with a distinct mmap_lock, is
   querying or allocating a page based on the policy.  To resolve this
   potential race, the shared policy infrastructure adds an extra reference
   to the shared policy during lookup while holding a spin lock on the shared
+1 −1
@@ -33,7 +33,7 @@ memory ranges) provides two primary functionalities:
The real advantage of userfaults if compared to regular virtual memory
management of mremap/mprotect is that the userfaults in all their
operations never involve heavyweight structures like vmas (in fact the
-``userfaultfd`` runtime load never takes the mmap_sem for writing).
+``userfaultfd`` runtime load never takes the mmap_lock for writing).

Vmas are not suitable for page- (or hugepage) granular fault tracking
when dealing with virtual address spaces that could span
+1 −1
@@ -615,7 +615,7 @@ prototypes::
locking rules:

=============	========	===========================
-ops		mmap_sem	PageLocked(page)
+ops		mmap_lock	PageLocked(page)
=============	========	===========================
open:		yes
close:		yes
+3 −3
@@ -191,15 +191,15 @@ The usage pattern is::

 again:
      range.notifier_seq = mmu_interval_read_begin(&interval_sub);
-      down_read(&mm->mmap_sem);
+      mmap_read_lock(mm);
      ret = hmm_range_fault(&range);
      if (ret) {
-          up_read(&mm->mmap_sem);
+          mmap_read_unlock(mm);
          if (ret == -EBUSY)
                 goto again;
          return ret;
      }
-      up_read(&mm->mmap_sem);
+      mmap_read_unlock(mm);

      take_lock(driver->update);
      if (mmu_interval_read_retry(&ni, range.notifier_seq) {
+2 −2
@@ -98,9 +98,9 @@ split_huge_page() or split_huge_pmd() has a cost.

To make pagetable walks huge pmd aware, all you need to do is to call
pmd_trans_huge() on the pmd returned by pmd_offset. You must hold the
-mmap_sem in read (or write) mode to be sure a huge pmd cannot be
+mmap_lock in read (or write) mode to be sure a huge pmd cannot be
created from under you by khugepaged (khugepaged collapse_huge_page
-takes the mmap_sem in write mode in addition to the anon_vma lock). If
+takes the mmap_lock in write mode in addition to the anon_vma lock). If
pmd_trans_huge returns false, you just fallback in the old code
paths. If instead pmd_trans_huge returns true, you have to take the
page table lock (pmd_lock()) and re-run pmd_trans_huge. Taking the