Commit 387b1468 authored by Mauro Carvalho Chehab

docs: locking: convert docs to ReST and rename to *.rst



Convert the locking documents to ReST and add them to the
kernel development book where they belong.

Most of the changes here just make Sphinx parse the text files
properly. The files were already in good shape and did not
require massive changes in order to be parsed.

The conversion is actually:
  - add blank lines and indentation in order to identify paragraphs;
  - fix table markup;
  - add some list markup;
  - mark literal blocks;
  - adjust title markup.

In the new index.rst, add :orphan: for as long as it is not linked from
the main index.rst file, in order to avoid build warnings.

Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Acked-by: Federico Vaga <federico.vaga@vaga.pv.it>
parent fec88ab0
+1 −1
@@ -1364,7 +1364,7 @@ Futex API reference
Further reading
===============

-  ``Documentation/locking/spinlocks.txt``: Linus Torvalds' spinlocking
-  ``Documentation/locking/spinlocks.rst``: Linus Torvalds' spinlocking
   tutorial in the kernel sources.

-  Unix Systems for Modern Architectures: Symmetric Multiprocessing and
+24 −0
:orphan:

=======
locking
=======

.. toctree::
    :maxdepth: 1

    lockdep-design
    lockstat
    locktorture
    mutex-design
    rt-mutex-design
    rt-mutex
    spinlocks
    ww-mutex-design

.. only::  subproject and html

   Indices
   =======

   * :ref:`genindex`
+28 −23
@@ -2,6 +2,7 @@ Runtime locking correctness validator
=====================================

started by Ingo Molnar <mingo@redhat.com>

additions by Arjan van de Ven <arjan@linux.intel.com>

Lock-class
@@ -56,7 +57,7 @@ where the last 1 category is:

When locking rules are violated, these usage bits are presented in the
locking error messages, inside curlies, with a total of 2 * n STATEs bits.
A contrived example:
A contrived example::

   modprobe/2287 is trying to acquire lock:
    (&sio_locks[i].lock){-.-.}, at: [<c02867fd>] mutex_lock+0x21/0x24
@@ -70,12 +71,14 @@ of the lock and readlock (if exists), for each of the n STATEs listed
above respectively, and the character displayed at each bit position
indicates:

   ===  ===================================================
   '.'  acquired while irqs disabled and not in irq context
   '-'  acquired in irq context
   '+'  acquired with irqs enabled
   '?'  acquired in irq context with irqs enabled.
   ===  ===================================================

The bits are illustrated with an example:
The bits are illustrated with an example::

    (&sio_locks[i].lock){-.-.}, at: [<c02867fd>] mutex_lock+0x21/0x24
                         ||||
@@ -90,13 +93,13 @@ context and whether that STATE is enabled yields four possible cases as
shown in the table below. The bit character is able to indicate which
exact case is for the lock as of the reporting time.

   -------------------------------------------
  +--------------+-------------+--------------+
  |              | irq enabled | irq disabled |
  |-------------------------------------------|
  +--------------+-------------+--------------+
  | ever in irq  |      ?      |       -      |
  |-------------------------------------------|
  +--------------+-------------+--------------+
  | never in irq |      +      |       .      |
   -------------------------------------------
  +--------------+-------------+--------------+
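
The four cases in the table above can be sketched as a small lookup. This is an illustrative helper, not part of the kernel or of this commit; the names ``USAGE_BITS`` and ``decode_usage`` are hypothetical, and the sketch decodes each character on its own without naming the STATE it belongs to.

```python
# Hypothetical helper: maps each usage character from the table to
# (ever_in_irq, irq_enabled), exactly as the four table cells state.
USAGE_BITS = {
    "?": (True, True),    # ever in irq, irq enabled
    "-": (True, False),   # ever in irq, irq disabled
    "+": (False, True),   # never in irq, irq enabled
    ".": (False, False),  # never in irq, irq disabled
}


def decode_usage(bits):
    """Decode a usage string such as '-.-.' into (ever_in_irq, irq_enabled) pairs."""
    decoded = []
    for ch in bits:
        if ch not in USAGE_BITS:
            raise ValueError(f"unknown usage character: {ch!r}")
        decoded.append(USAGE_BITS[ch])
    return decoded


# The bits inside the curlies of the example report line.
for ch, (in_irq, enabled) in zip("-.-.", decode_usage("-.-.")):
    print(f"{ch!r}: ever in irq={in_irq}, irq enabled={enabled}")
```

Keeping the mapping in a plain dictionary makes it easy to compare against the table cell by cell.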

The character '-' suggests irq is disabled because if otherwise the
character '?' would have been shown instead. Similar deduction can be
@@ -113,7 +116,7 @@ is irq-unsafe means it was ever acquired with irq enabled.

A softirq-unsafe lock-class is automatically hardirq-unsafe as well. The
following states must be exclusive: only one of them is allowed to be set
for any lock-class based on its usage:
for any lock-class based on its usage::

 <hardirq-safe> or <hardirq-unsafe>
 <softirq-safe> or <softirq-unsafe>
@@ -134,7 +137,7 @@ Multi-lock dependency rules:
The same lock-class must not be acquired twice, because this could lead
to lock recursion deadlocks.

Furthermore, two locks can not be taken in inverse order:
Furthermore, two locks can not be taken in inverse order::

 <L1> -> <L2>
 <L2> -> <L1>
@@ -148,7 +151,7 @@ operations; the validator will still find whether these locks can be
acquired in a circular fashion.

Furthermore, the following usage based lock dependencies are not allowed
between any two lock-classes:
between any two lock-classes::

   <hardirq-safe>   ->  <hardirq-unsafe>
   <softirq-safe>   ->  <softirq-unsafe>
@@ -204,7 +207,7 @@ the ordering is not static.
In order to teach the validator about this correct usage model, new
versions of the various locking primitives were added that allow you to
specify a "nesting level". An example call, for the block device mutex,
looks like this:
looks like this::

  enum bdev_bd_mutex_lock_class
  {
@@ -234,7 +237,7 @@ must be held: lockdep_assert_held*(&lock) and lockdep_*pin_lock(&lock).
As the name suggests, lockdep_assert_held* family of macros assert that a
particular lock is held at a certain time (and generate a WARN() otherwise).
This annotation is largely used all over the kernel, e.g. kernel/sched/
core.c
core.c::

  void update_rq_clock(struct rq *rq)
  {
@@ -253,7 +256,7 @@ out to be especially helpful to debug code with callbacks, where an upper
layer assumes a lock remains taken, but a lower layer thinks it can maybe drop
and reacquire the lock ("unwittingly" introducing races). lockdep_pin_lock()
returns a 'struct pin_cookie' that is then used by lockdep_unpin_lock() to check
that nobody tampered with the lock, e.g. kernel/sched/sched.h
that nobody tampered with the lock, e.g. kernel/sched/sched.h::

  static inline void rq_pin_lock(struct rq *rq, struct rq_flags *rf)
  {
@@ -280,7 +283,7 @@ correctness) in the sense that for every simple, standalone single-task
locking sequence that occurred at least once during the lifetime of the
kernel, the validator proves it with a 100% certainty that no
combination and timing of these locking sequences can cause any class of
lock related deadlock. [*]
lock related deadlock. [1]_

I.e. complex multi-CPU and multi-task locking scenarios do not have to
occur in practice to prove a deadlock: only the simple 'component'
@@ -299,7 +302,9 @@ possible combination of locking interaction between CPUs, combined with
every possible hardirq and softirq nesting scenario (which is impossible
to do in practice).

[*] assuming that the validator itself is 100% correct, and no other
.. [1]

    assuming that the validator itself is 100% correct, and no other
    part of the system corrupts the state of the validator in any way.
    We also assume that all NMI/SMM paths [which could interrupt
    even hardirq-disabled codepaths] are correct and do not interfere
@@ -310,7 +315,7 @@ to do in practice).
Performance:
------------

The above rules require _massive_ amounts of runtime checking. If we did
The above rules require **massive** amounts of runtime checking. If we did
that for every lock taken and for every irqs-enable event, it would
render the system practically unusably slow. The complexity of checking
is O(N^2), so even with just a few hundred lock-classes we'd have to do
@@ -369,17 +374,17 @@ be harder to do than to say.

Of course, if you do run out of lock classes, the next thing to do is
to find the offending lock classes.  First, the following command gives
you the number of lock classes currently in use along with the maximum:
you the number of lock classes currently in use along with the maximum::

	grep "lock-classes" /proc/lockdep_stats

This command produces the following output on a modest system:
This command produces the following output on a modest system::

	lock-classes:                          748 [max: 8191]

If the number allocated (748 above) increases continually over time,
then there is likely a leak.  The following command can be used to
identify the leaking lock classes:
identify the leaking lock classes::

	grep "BD" /proc/lockdep
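
To watch for the kind of leak described above, the ``lock-classes`` line can be parsed and compared over time. The following is a minimal sketch that assumes the line has the shape shown in the sample output; the function name ``parse_lock_classes`` is hypothetical and the sample string is copied from the text rather than read from ``/proc/lockdep_stats``.

```python
import re


def parse_lock_classes(line):
    """Return (in_use, maximum) from a 'lock-classes:' line.

    Assumes the format shown in the sample output:
    'lock-classes:   748 [max: 8191]'.
    """
    match = re.search(r"lock-classes:\s+(\d+)\s+\[max:\s+(\d+)\]", line)
    if match is None:
        raise ValueError(f"not a lock-classes line: {line!r}")
    return int(match.group(1)), int(match.group(2))


# Sample line taken from the documentation, not from a live system.
sample = "lock-classes:                          748 [max: 8191]"
in_use, maximum = parse_lock_classes(sample)
print(f"{in_use} of {maximum} lock classes in use")
```

Calling the function on successive readings and checking whether the first value keeps growing gives a rough leak indicator.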

+204 −0
===============
Lock Statistics
===============

LOCK STATISTICS

- WHAT
What
====

As the name suggests, it provides statistics on locks.

- WHY

Why
===

Because things like lock contention can severely impact performance.

- HOW
How
===

Lockdep already has hooks in the lock functions and maps lock instances to
lock classes. We build on that (see Documentation/locking/lockdep-design.txt).
lock classes. We build on that (see Documentation/locking/lockdep-design.rst).
The graph below shows the relation between the lock functions and the various
hooks therein.
hooks therein::

        __acquire
            |
@@ -42,18 +47,32 @@ __* - the hooks

With these hooks we provide the following statistics:

 con-bounces       - number of lock contention that involved x-cpu data
 contentions       - number of lock acquisitions that had to wait
 wait time min     - shortest (non-0) time we ever had to wait for a lock
           max     - longest time we ever had to wait for a lock
	   total   - total time we spend waiting on this lock
	   avg     - average time spent waiting on this lock
 acq-bounces       - number of lock acquisitions that involved x-cpu data
 acquisitions      - number of times we took the lock
 hold time min     - shortest (non-0) time we ever held the lock
	   max     - longest time we ever held the lock
	   total   - total time this lock was held
	   avg     - average time this lock was held
 con-bounces
	- number of lock contention that involved x-cpu data
 contentions
	- number of lock acquisitions that had to wait
 wait time
     min
	- shortest (non-0) time we ever had to wait for a lock
     max
	- longest time we ever had to wait for a lock
     total
	- total time we spend waiting on this lock
     avg
	- average time spent waiting on this lock
 acq-bounces
	- number of lock acquisitions that involved x-cpu data
 acquisitions
	- number of times we took the lock
 hold time
     min
	- shortest (non-0) time we ever held the lock
     max
	- longest time we ever held the lock
     total
	- total time this lock was held
     avg
	- average time this lock was held

These numbers are gathered per lock class, per read/write state (when
applicable).
@@ -61,21 +80,23 @@ applicable).
It also tracks 4 contention points per class. A contention point is a call site
that had to wait on lock acquisition.

 - CONFIGURATION
Configuration
-------------

Lock statistics are enabled via CONFIG_LOCK_STAT.

 - USAGE
Usage
-----

Enable collection of statistics:
Enable collection of statistics::

	# echo 1 >/proc/sys/kernel/lock_stat

Disable collection of statistics:
Disable collection of statistics::

	# echo 0 >/proc/sys/kernel/lock_stat

Look at the current lock statistics:
Look at the current lock statistics::

  ( line numbers not part of actual output, done for clarity in the explanation
    below )
@@ -133,7 +154,7 @@ points are the points we're contending with.

The integer part of the time values is in us.

Dealing with nested locks, subclasses may appear:
Dealing with nested locks, subclasses may appear::

  32...........................................................................................................................................................................................................................
  33
@@ -164,7 +185,7 @@ Line 48 shows statistics for the second subclass (/1) of &rq->lock class
(subclass starts from 0), since in this case, as line 50 suggests,
double_rq_lock actually acquires a nested lock of two spinlocks.

View the top contending locks:
View the top contending locks::

  # grep : /proc/lock_stat | head
			clockevents_lock:       2926159        2947636           0.15       46882.81  1784540466.34         605.41        3381345        3879161           0.00        2260.97    53178395.68          13.71
@@ -178,6 +199,6 @@ View the top contending locks:
       &(&dentry->d_lockref.lock)->rlock:         39791          40179           0.15        1302.08       88851.96           2.21        2790851       12527025           0.10        1910.75     3379714.27           0.27
			      rcu_node_0:         29203          30064           0.16         786.55     1555573.00          51.74          88963         244254           0.00         398.87      428872.51           1.76

Clear the statistics:
Clear the statistics::

  # echo 0 > /proc/lock_stat
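
A row of the ``grep : /proc/lock_stat`` output can be split into named fields for further processing. The sketch below assumes that each row is a class name followed by a colon and twelve numeric columns, in the order of the statistics list given earlier in this file; the key names and the function ``parse_lock_stat_row`` are choices made for this example, not kernel identifiers.

```python
# Illustrative key names; the order follows the statistics list in the text.
COLUMNS = [
    "con-bounces", "contentions",
    "waittime-min", "waittime-max", "waittime-total", "waittime-avg",
    "acq-bounces", "acquisitions",
    "holdtime-min", "holdtime-max", "holdtime-total", "holdtime-avg",
]


def parse_lock_stat_row(line):
    """Split one 'name: v1 v2 ...' row into a dict keyed by COLUMNS."""
    # The numeric columns contain no colon, so the last colon ends the name.
    name, sep, rest = line.strip().rpartition(":")
    values = rest.split()
    if not sep or len(values) != len(COLUMNS):
        raise ValueError(f"unexpected lock_stat row: {line!r}")
    row = {"name": name.strip()}
    for key, text in zip(COLUMNS, values):
        # Counters are integers, time values carry a fractional part.
        row[key] = float(text) if "." in text else int(text)
    return row


# First row of the sample output shown above.
sample = (
    "clockevents_lock:       2926159        2947636           0.15"
    "       46882.81  1784540466.34         605.41        3381345"
    "        3879161           0.00        2260.97    53178395.68"
    "          13.71"
)
row = parse_lock_stat_row(sample)
print(row["name"], row["contentions"], row["waittime-avg"])
```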
+65 −40
==================================
Kernel Lock Torture Test Operation
==================================

CONFIG_LOCK_TORTURE_TEST
========================

The CONFIG_LOCK_TORTURE_TEST config option provides a kernel module
that runs torture tests on core kernel locking primitives. The kernel
@@ -18,61 +21,77 @@ can be simulated by either enlarging this critical region hold time and/or
creating more kthreads.


MODULE PARAMETERS
Module Parameters
=================

This module has the following parameters:


	    ** Locktorture-specific **
Locktorture-specific
--------------------

nwriters_stress   Number of kernel threads that will stress exclusive lock
nwriters_stress
		  Number of kernel threads that will stress exclusive lock
		  ownership (writers). The default value is twice the number
		  of online CPUs.

nreaders_stress   Number of kernel threads that will stress shared lock
nreaders_stress
		  Number of kernel threads that will stress shared lock
		  ownership (readers). The default is the same amount of writer
		  locks. If the user did not specify nwriters_stress, then
		  both readers and writers will be the amount of online CPUs.

torture_type	  Type of lock to torture. By default, only spinlocks will
torture_type
		  Type of lock to torture. By default, only spinlocks will
		  be tortured. This module can torture the following locks,
		  with string values as follows:

		     o "lock_busted": Simulates a buggy lock implementation.
		     - "lock_busted":
				Simulates a buggy lock implementation.

		     o "spin_lock": spin_lock() and spin_unlock() pairs.
		     - "spin_lock":
				spin_lock() and spin_unlock() pairs.

		     o "spin_lock_irq": spin_lock_irq() and spin_unlock_irq()
					pairs.
		     - "spin_lock_irq":
				spin_lock_irq() and spin_unlock_irq() pairs.

		     o "rw_lock": read/write lock() and unlock() rwlock pairs.
		     - "rw_lock":
				read/write lock() and unlock() rwlock pairs.

		     o "rw_lock_irq": read/write lock_irq() and unlock_irq()
		     - "rw_lock_irq":
				read/write lock_irq() and unlock_irq()
				rwlock pairs.

		     o "mutex_lock": mutex_lock() and mutex_unlock() pairs.
		     - "mutex_lock":
				mutex_lock() and mutex_unlock() pairs.

		     o "rtmutex_lock": rtmutex_lock() and rtmutex_unlock()
				       pairs. Kernel must have CONFIG_RT_MUTEX=y.
		     - "rtmutex_lock":
				rtmutex_lock() and rtmutex_unlock() pairs.
				Kernel must have CONFIG_RT_MUTEX=y.

		     o "rwsem_lock": read/write down() and up() semaphore pairs.
		     - "rwsem_lock":
				read/write down() and up() semaphore pairs.


	    ** Torture-framework (RCU + locking) **
Torture-framework (RCU + locking)
---------------------------------

shutdown_secs	  The number of seconds to run the test before terminating
shutdown_secs
		  The number of seconds to run the test before terminating
		  the test and powering off the system.  The default is
		  zero, which disables test termination and system shutdown.
		  This capability is useful for automated testing.

onoff_interval	  The number of seconds between each attempt to execute a
onoff_interval
		  The number of seconds between each attempt to execute a
		  randomly selected CPU-hotplug operation.  Defaults
		  to zero, which disables CPU hotplugging.  In
		  CONFIG_HOTPLUG_CPU=n kernels, locktorture will silently
		  refuse to do any CPU-hotplug operations regardless of
		  what value is specified for onoff_interval.

onoff_holdoff	  The number of seconds to wait until starting CPU-hotplug
onoff_holdoff
		  The number of seconds to wait until starting CPU-hotplug
		  operations.  This would normally only be used when
		  locktorture was built into the kernel and started
		  automatically at boot time, in which case it is useful
@@ -80,39 +99,44 @@ onoff_holdoff The number of seconds to wait until starting CPU-hotplug
		  coming and going. This parameter is only useful if
		  CONFIG_HOTPLUG_CPU is enabled.

stat_interval	  Number of seconds between statistics-related printk()s.
stat_interval
		  Number of seconds between statistics-related printk()s.
		  By default, locktorture will report stats every 60 seconds.
		  Setting the interval to zero causes the statistics to
		  be printed -only- when the module is unloaded, and this
		  is the default.

stutter		  The length of time to run the test before pausing for this
stutter
		  The length of time to run the test before pausing for this
		  same period of time.  Defaults to "stutter=5", so as
		  to run and pause for (roughly) five-second intervals.
		  Specifying "stutter=0" causes the test to run continuously
		  without pausing, which is the old default behavior.

shuffle_interval  The number of seconds to keep the test threads affinitied
shuffle_interval
		  The number of seconds to keep the test threads affinitied
		  to a particular subset of the CPUs, defaults to 3 seconds.
		  Used in conjunction with test_no_idle_hz.

verbose		  Enable verbose debugging printing, via printk(). Enabled
verbose
		  Enable verbose debugging printing, via printk(). Enabled
		  by default. This extra information is mostly related to
		  high-level errors and reports from the main 'torture'
		  framework.


STATISTICS
Statistics
==========

Statistics are printed in the following format:
Statistics are printed in the following format::

  spin_lock-torture: Writes:  Total: 93746064  Max/Min: 0/0   Fail: 0
     (A)		    (B)		   (C)		  (D)	       (E)

  (A): Lock type that is being tortured -- torture_type parameter.

(B): Number of writer lock acquisitions. If dealing with a read/write primitive
     a second "Reads" statistics line is printed.
  (B): Number of writer lock acquisitions. If dealing with a read/write
       primitive a second "Reads" statistics line is printed.

  (C): Number of times the lock was acquired.

@@ -124,9 +148,10 @@ spin_lock-torture: Writes: Total: 93746064 Max/Min: 0/0 Fail: 0
       Of course, the same applies for (C), above. A dummy example of this is
       the "lock_busted" type.
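
For automated runs it can help to pull the fields out of a statistics line. The following sketch assumes the line matches the sample format shown above exactly; the regular expression and the name ``parse_torture_stats`` are illustrative, and real kernel output may carry extra text that this pattern does not handle.

```python
import re

# Pattern written against the sample line in the text; an assumption,
# not a format guaranteed by the locktorture module.
STAT_RE = re.compile(
    r"^(?P<type>\S+)-torture:\s+(?P<kind>Writes|Reads):\s+"
    r"Total:\s+(?P<total>\d+)\s+"
    r"Max/Min:\s+(?P<max>\d+)/(?P<min>\d+)\s+"
    r"Fail:\s+(?P<fail>\d+)"
)


def parse_torture_stats(line):
    """Return the fields of one locktorture statistics line as a dict."""
    match = STAT_RE.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognized statistics line: {line!r}")
    fields = match.groupdict()
    # Convert the numeric fields; 'type' and 'kind' stay as strings.
    for key in ("total", "max", "min", "fail"):
        fields[key] = int(fields[key])
    return fields


sample = "spin_lock-torture: Writes:  Total: 93746064  Max/Min: 0/0   Fail: 0"
stats = parse_torture_stats(sample)
print(stats["type"], stats["kind"], stats["total"], stats["fail"])
```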

USAGE
Usage
=====

The following script may be used to torture locks:
The following script may be used to torture locks::

	#!/bin/sh
