Commit e350b60f authored by Rafael J. Wysocki

Merge branches 'pm-avs', 'pm-docs' and 'pm-tools'

* pm-avs:
  ARM: OMAP2+: SmartReflex: add omap_sr_pdata definition
  power: avs: smartreflex: Remove superfluous cast in debugfs_create_file() call

* pm-docs:
  PM: Wrap documentation to fit in 80 columns

* pm-tools:
  cpupower: ToDo: Update ToDo with ideas for per_cpu_schedule handling
  cpupower: mperf_monitor: Update cpupower to use the RDPRU instruction
  cpupower: mperf_monitor: Introduce per_cpu_schedule flag
  cpupower: Move needs_root variable into a sub-struct
  cpupower : Handle set and info subcommands correctly
  pm-graph info added to MAINTAINERS
  tools/power/cpupower: Fix initializer override in hsw_ext_cstates
+4 −3
@@ -39,9 +39,10 @@ c) Compile the driver directly into the kernel and try the test modes of
d) Attempt to hibernate with the driver compiled directly into the kernel
   in the "reboot", "shutdown" and "platform" modes.

-e) Try the test modes of suspend (see: Documentation/power/basic-pm-debugging.rst,
-   2).  [As far as the STR tests are concerned, it should not matter whether or
-   not the driver is built as a module.]
+e) Try the test modes of suspend (see:
+   Documentation/power/basic-pm-debugging.rst, 2).  [As far as the STR tests are
+   concerned, it should not matter whether or not the driver is built as a
+   module.]

f) Attempt to suspend to RAM using the s2ram tool with the driver loaded
   (see: Documentation/power/basic-pm-debugging.rst, 2).
+19 −18
@@ -215,30 +215,31 @@ VI. Are there any precautions to be taken to prevent freezing failures?

Yes, there are.

-First of all, grabbing the 'system_transition_mutex' lock to mutually exclude a piece of code
-from system-wide sleep such as suspend/hibernation is not encouraged.
-If possible, that piece of code must instead hook onto the suspend/hibernation
-notifiers to achieve mutual exclusion. Look at the CPU-Hotplug code
-(kernel/cpu.c) for an example.
-
-However, if that is not feasible, and grabbing 'system_transition_mutex' is deemed necessary,
-it is strongly discouraged to directly call mutex_[un]lock(&system_transition_mutex) since
-that could lead to freezing failures, because if the suspend/hibernate code
-successfully acquired the 'system_transition_mutex' lock, and hence that other entity failed
-to acquire the lock, then that task would get blocked in TASK_UNINTERRUPTIBLE
-state. As a consequence, the freezer would not be able to freeze that task,
-leading to freezing failure.
+First of all, grabbing the 'system_transition_mutex' lock to mutually exclude a
+piece of code from system-wide sleep such as suspend/hibernation is not
+encouraged.  If possible, that piece of code must instead hook onto the
+suspend/hibernation notifiers to achieve mutual exclusion. Look at the
+CPU-Hotplug code (kernel/cpu.c) for an example.
+
+However, if that is not feasible, and grabbing 'system_transition_mutex' is
+deemed necessary, it is strongly discouraged to directly call
+mutex_[un]lock(&system_transition_mutex) since that could lead to freezing
+failures, because if the suspend/hibernate code successfully acquired the
+'system_transition_mutex' lock, and hence that other entity failed to acquire
+the lock, then that task would get blocked in TASK_UNINTERRUPTIBLE state. As a
+consequence, the freezer would not be able to freeze that task, leading to
+freezing failure.

However, the [un]lock_system_sleep() APIs are safe to use in this scenario,
since they ask the freezer to skip freezing this task, since it is anyway
-"frozen enough" as it is blocked on 'system_transition_mutex', which will be released
-only after the entire suspend/hibernation sequence is complete.
-So, to summarize, use [un]lock_system_sleep() instead of directly using
+"frozen enough" as it is blocked on 'system_transition_mutex', which will be
+released only after the entire suspend/hibernation sequence is complete.  So, to
+summarize, use [un]lock_system_sleep() instead of directly using
mutex_[un]lock(&system_transition_mutex). That would prevent freezing failures.

V. Miscellaneous
================

/sys/power/pm_freeze_timeout controls how long it will cost at most to freeze
-all user space processes or all freezable kernel threads, in unit of millisecond.
-The default value is 20000, with range of unsigned integer.
+all user space processes or all freezable kernel threads, in unit of
+millisecond.  The default value is 20000, with range of unsigned integer.
+17 −15
@@ -73,19 +73,21 @@ factors. Example usage: Thermal management or other exceptional situations where
SoC framework might choose to disable a higher frequency OPP to safely continue
operations until that OPP could be re-enabled if possible.

-OPP library facilitates this concept in it's implementation. The following
+OPP library facilitates this concept in its implementation. The following
operational functions operate only on available opps:
-opp_find_freq_{ceil, floor}, dev_pm_opp_get_voltage, dev_pm_opp_get_freq, dev_pm_opp_get_opp_count
+opp_find_freq_{ceil, floor}, dev_pm_opp_get_voltage, dev_pm_opp_get_freq,
+dev_pm_opp_get_opp_count

-dev_pm_opp_find_freq_exact is meant to be used to find the opp pointer which can then
-be used for dev_pm_opp_enable/disable functions to make an opp available as required.
+dev_pm_opp_find_freq_exact is meant to be used to find the opp pointer
+which can then be used for dev_pm_opp_enable/disable functions to make an
+opp available as required.

WARNING: Users of OPP library should refresh their availability count using
-get_opp_count if dev_pm_opp_enable/disable functions are invoked for a device, the
-exact mechanism to trigger these or the notification mechanism to other
-dependent subsystems such as cpufreq are left to the discretion of the SoC
-specific framework which uses the OPP library. Similar care needs to be taken
-care to refresh the cpufreq table in cases of these operations.
+get_opp_count if dev_pm_opp_enable/disable functions are invoked for a
+device, the exact mechanism to trigger these or the notification mechanism
+to other dependent subsystems such as cpufreq are left to the discretion of
+the SoC specific framework which uses the OPP library. Similar care needs
+to be taken care to refresh the cpufreq table in cases of these operations.

2. Initial OPP List Registration
================================
@@ -99,11 +101,11 @@ OPPs dynamically using the dev_pm_opp_enable / disable functions.
dev_pm_opp_add
	Add a new OPP for a specific domain represented by the device pointer.
	The OPP is defined using the frequency and voltage. Once added, the OPP
-	is assumed to be available and control of it's availability can be done
-	with the dev_pm_opp_enable/disable functions. OPP library internally stores
-	and manages this information in the opp struct. This function may be
-	used by SoC framework to define a optimal list as per the demands of
-	SoC usage environment.
+	is assumed to be available and control of its availability can be done
+	with the dev_pm_opp_enable/disable functions. OPP library
+	internally stores and manages this information in the opp struct.
+	This function may be used by SoC framework to define a optimal list
+	as per the demands of SoC usage environment.

	WARNING:
		Do not use this function in interrupt context.
@@ -354,7 +356,7 @@ struct dev_pm_opp

struct device
	This is used to identify a domain to the OPP layer. The
-	nature of the device and it's implementation is left to the user of
+	nature of the device and its implementation is left to the user of
	OPP library such as the SoC framework.

Overall, in a simplistic view, the data structure operations is represented as
+14 −14
@@ -426,12 +426,12 @@ pm->runtime_idle() callback.
2.4. System-Wide Power Transitions
----------------------------------
There are a few different types of system-wide power transitions, described in
-Documentation/driver-api/pm/devices.rst.  Each of them requires devices to be handled
-in a specific way and the PM core executes subsystem-level power management
-callbacks for this purpose.  They are executed in phases such that each phase
-involves executing the same subsystem-level callback for every device belonging
-to the given subsystem before the next phase begins.  These phases always run
-after tasks have been frozen.
+Documentation/driver-api/pm/devices.rst.  Each of them requires devices to be
+handled in a specific way and the PM core executes subsystem-level power
+management callbacks for this purpose.  They are executed in phases such that
+each phase involves executing the same subsystem-level callback for every device
+belonging to the given subsystem before the next phase begins.  These phases
+always run after tasks have been frozen.

2.4.1. System Suspend
^^^^^^^^^^^^^^^^^^^^^
@@ -636,12 +636,12 @@ System restore requires a hibernation image to be loaded into memory and the
pre-hibernation memory contents to be restored before the pre-hibernation system
activity can be resumed.

-As described in Documentation/driver-api/pm/devices.rst, the hibernation image is loaded
-into memory by a fresh instance of the kernel, called the boot kernel, which in
-turn is loaded and run by a boot loader in the usual way.  After the boot kernel
-has loaded the image, it needs to replace its own code and data with the code
-and data of the "hibernated" kernel stored within the image, called the image
-kernel.  For this purpose all devices are frozen just like before creating
+As described in Documentation/driver-api/pm/devices.rst, the hibernation image
+is loaded into memory by a fresh instance of the kernel, called the boot kernel,
+which in turn is loaded and run by a boot loader in the usual way.  After the
+boot kernel has loaded the image, it needs to replace its own code and data with
+the code and data of the "hibernated" kernel stored within the image, called the
+image kernel.  For this purpose all devices are frozen just like before creating
the image during hibernation, in the

	prepare, freeze, freeze_noirq
@@ -691,8 +691,8 @@ controlling the runtime power management of their devices.

At the time of this writing there are two ways to define power management
callbacks for a PCI device driver, the recommended one, based on using a
-dev_pm_ops structure described in Documentation/driver-api/pm/devices.rst, and the
-"legacy" one, in which the .suspend(), .suspend_late(), .resume_early(), and
+dev_pm_ops structure described in Documentation/driver-api/pm/devices.rst, and
+the "legacy" one, in which the .suspend(), .suspend_late(), .resume_early(), and
.resume() callbacks from struct pci_driver are used.  The legacy approach,
however, doesn't allow one to define runtime power management callbacks and is
not really suitable for any new drivers.  Therefore it is not covered by this
+13 −13
@@ -8,8 +8,8 @@ one of the parameters.

Two different PM QoS frameworks are available:
1. PM QoS classes for cpu_dma_latency
-2. the per-device PM QoS framework provides the API to manage the per-device latency
-constraints and PM QoS flags.
+2. The per-device PM QoS framework provides the API to manage the
+   per-device latency constraints and PM QoS flags.

Each parameters have defined units:

@@ -47,14 +47,14 @@ void pm_qos_add_request(handle, param_class, target_value):
  pm_qos API functions.

void pm_qos_update_request(handle, new_target_value):
-  Will update the list element pointed to by the handle with the new target value
-  and recompute the new aggregated target, calling the notification tree if the
-  target is changed.
+  Will update the list element pointed to by the handle with the new target
+  value and recompute the new aggregated target, calling the notification tree
+  if the target is changed.

void pm_qos_remove_request(handle):
-  Will remove the element.  After removal it will update the aggregate target and
-  call the notification tree if the target was changed as a result of removing
-  the request.
+  Will remove the element.  After removal it will update the aggregate target
+  and call the notification tree if the target was changed as a result of
+  removing the request.

int pm_qos_request(param_class):
  Returns the aggregated value for a given PM QoS class.
@@ -167,9 +167,9 @@ int dev_pm_qos_expose_flags(device, value)
  change the value of the PM_QOS_FLAG_NO_POWER_OFF flag.

void dev_pm_qos_hide_flags(device)
-  Drop the request added by dev_pm_qos_expose_flags() from the device's PM QoS list
-  of flags and remove sysfs attribute pm_qos_no_power_off from the device's power
-  directory.
+  Drop the request added by dev_pm_qos_expose_flags() from the device's PM QoS
+  list of flags and remove sysfs attribute pm_qos_no_power_off from the device's
+  power directory.

Notification mechanisms:

@@ -179,8 +179,8 @@ int dev_pm_qos_add_notifier(device, notifier, type):
  Adds a notification callback function for the device for a particular request
  type.

-  The callback is called when the aggregated value of the device constraints list
-  is changed.
+  The callback is called when the aggregated value of the device constraints
+  list is changed.

int dev_pm_qos_remove_notifier(device, notifier, type):
  Removes the notification callback function for the device.