This repository has been archived by the owner on Aug 27, 2022. It is now read-only.

Merge branch 'pm-sleep'
* pm-sleep:
  PM / hibernate: fixed typo in comment
  PM / sleep: unregister wakeup source when disabling device wakeup
  PM / sleep: Introduce command line argument for sleep state enumeration
  PM / sleep: Use valid_state() for platform-dependent sleep states only
  PM / sleep: Add state field to pm_states[] entries
  PM / sleep: Update device PM documentation to cover direct_complete
  PM / sleep: Mechanism to avoid resuming runtime-suspended devices unnecessarily
  PM / hibernate: Fix memory corruption in resumedelay_setup()
  PM / hibernate: convert simple_strtoul to kstrtoul
  PM / hibernate: Documentation: Fix script for unswapping
  PM / hibernate: no kernel_power_off when pm_power_off NULL
  PM / hibernate: use unsigned local variables in swsusp_show_speed()
rafaeljw committed Jun 3, 2014
2 parents 97b80e6 + 057b0a7 commit ee7f9d7
Showing 16 changed files with 339 additions and 143 deletions.
29 changes: 20 additions & 9 deletions Documentation/ABI/testing/sysfs-power
@@ -7,19 +7,30 @@ Description:
 subsystem.

 What: /sys/power/state
-Date: August 2006
+Date: May 2014
 Contact: Rafael J. Wysocki <rjw@rjwysocki.net>
 Description:
-The /sys/power/state file controls the system power state.
-Reading from this file returns what states are supported,
-which is hard-coded to 'freeze' (Low-Power Idle), 'standby'
-(Power-On Suspend), 'mem' (Suspend-to-RAM), and 'disk'
-(Suspend-to-Disk).
+The /sys/power/state file controls system sleep states.
+Reading from this file returns the available sleep state
+labels, which may be "mem", "standby", "freeze" and "disk"
+(hibernation). The meanings of the first three labels depend on
+the relative_sleep_states command line argument as follows:
+1) relative_sleep_states = 1
+"mem", "standby", "freeze" represent non-hibernation sleep
+states from the deepest ("mem", always present) to the
+shallowest ("freeze"). "standby" and "freeze" may or may
+not be present depending on the capabilities of the
+platform. "freeze" can only be present if "standby" is
+present.
+2) relative_sleep_states = 0 (default)
+"mem" - "suspend-to-RAM", present if supported.
+"standby" - "power-on suspend", present if supported.
+"freeze" - "suspend-to-idle", always present.

 Writing to this file one of these strings causes the system to
-transition into that state. Please see the file
-Documentation/power/states.txt for a description of each of
-these states.
+transition into the corresponding state, if available. See
+Documentation/power/states.txt for a description of what
+"suspend-to-RAM", "power-on suspend" and "suspend-to-idle" mean.

 What: /sys/power/disk
 Date: September 2006
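As a rough user-space illustration of the /sys/power/state interface described in this file, the C sketch below checks that a label is listed in the file before writing it back. The helper names sleep_label_available() and request_sleep_state() are hypothetical and belong to no kernel or libc API; the only assumption taken from the documentation is that reading the file yields whitespace-separated labels.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: return true if `label` is one of the
 * whitespace-separated words in `contents` (text read from
 * /sys/power/state).
 */
bool sleep_label_available(const char *contents, const char *label)
{
	size_t len = strlen(label);
	const char *p = contents;

	if (len == 0)
		return false;

	while (*p) {
		const char *start;

		/* Skip separators between labels. */
		while (*p == ' ' || *p == '\t' || *p == '\n')
			p++;
		start = p;
		/* Advance to the end of the current word. */
		while (*p && *p != ' ' && *p != '\t' && *p != '\n')
			p++;
		if ((size_t)(p - start) == len && strncmp(start, label, len) == 0)
			return true;
	}
	return false;
}

/*
 * Hypothetical helper: read the labels from `path` and write `label`
 * back only if it is listed. Returns 0 on success and -1 on any error
 * or if the label is not available.
 */
int request_sleep_state(const char *path, const char *label)
{
	char buf[128];
	FILE *f = fopen(path, "r");

	if (!f)
		return -1;
	if (!fgets(buf, sizeof(buf), f)) {
		fclose(f);
		return -1;
	}
	fclose(f);

	if (!sleep_label_available(buf, label))
		return -1;

	f = fopen(path, "w");
	if (!f)
		return -1;
	if (fputs(label, f) == EOF) {
		fclose(f);
		return -1;
	}
	/* The write reaches the kernel when the stream is flushed. */
	return fclose(f) == 0 ? 0 : -1;
}
```

Calling request_sleep_state("/sys/power/state", "mem") on a real system would start a sleep transition, so a caller would normally need root privileges and should expect the call to return only after resume.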
7 changes: 7 additions & 0 deletions Documentation/kernel-parameters.txt
@@ -2889,6 +2889,13 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
 [KNL, SMP] Set scheduler's default relax_domain_level.
 See Documentation/cgroups/cpusets.txt.

+relative_sleep_states=
+[SUSPEND] Use sleep state labeling where the deepest
+state available other than hibernation is always "mem".
+Format: { "0" | "1" }
+0 -- Traditional sleep state labels.
+1 -- Relative sleep state labels.
+
 reserve= [KNL,BUGS] Force the kernel to ignore some iomem area

 reservetop= [X86-32]
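The difference between the two labeling schemes selected by relative_sleep_states can be modeled in a few lines of user-space C. This is only a sketch of the behavior the documentation describes, not the kernel's code; the enum, the label table and assign_sleep_labels() are names made up for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Non-hibernation sleep states, ordered from shallowest to deepest. */
enum sleep_state {
	SUSPEND_TO_IDLE,
	POWER_ON_SUSPEND,
	SUSPEND_TO_RAM,
	NR_SLEEP_STATES
};

/* Labels ordered from deepest to shallowest. */
static const char *const sleep_labels[NR_SLEEP_STATES] = {
	"mem", "standby", "freeze"
};

/*
 * Fill out[state] with the label under which `state` would appear in
 * /sys/power/state, or NULL if the state is not supported.
 * Suspend-to-idle is treated as always supported.
 */
void assign_sleep_labels(bool relative, bool has_standby, bool has_mem,
			 const char *out[NR_SLEEP_STATES])
{
	bool supported[NR_SLEEP_STATES] = {
		[SUSPEND_TO_IDLE] = true,
		[POWER_ON_SUSPEND] = has_standby,
		[SUSPEND_TO_RAM] = has_mem,
	};
	int next = 0;

	/* Walk from the deepest state to the shallowest one. */
	for (int s = SUSPEND_TO_RAM; s >= SUSPEND_TO_IDLE; s--) {
		if (supported[s]) {
			out[s] = sleep_labels[next++];
		} else {
			out[s] = NULL;
			/* Traditional labels keep their fixed meanings. */
			if (!relative)
				next++;
		}
	}
}
```

With relative labels, a platform that supports only suspend-to-idle exposes it as "mem"; with traditional labels the same state stays "freeze".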
34 changes: 30 additions & 4 deletions Documentation/power/devices.txt
@@ -2,6 +2,7 @@ Device Power Management

 Copyright (c) 2010-2011 Rafael J. Wysocki <rjw@sisk.pl>, Novell Inc.
 Copyright (c) 2010 Alan Stern <stern@rowland.harvard.edu>
+Copyright (c) 2014 Intel Corp., Rafael J. Wysocki <rafael.j.wysocki@intel.com>


 Most of the code in Linux is device drivers, so most of the Linux power
@@ -326,6 +327,20 @@ the phases are:
 driver in some way for the upcoming system power transition, but it
 should not put the device into a low-power state.

+For devices supporting runtime power management, the return value of the
+prepare callback can be used to indicate to the PM core that it may
+safely leave the device in runtime suspend (if runtime-suspended
+already), provided that all of the device's descendants are also left in
+runtime suspend. Namely, if the prepare callback returns a positive
+number and that happens for all of the descendants of the device too,
+and all of them (including the device itself) are runtime-suspended, the
+PM core will skip the suspend, suspend_late and suspend_noirq suspend
+phases as well as the resume_noirq, resume_early and resume phases of
+the following system resume for all of these devices. In that case,
+the complete callback will be called directly after the prepare callback
+and is entirely responsible for bringing the device back to the
+functional state as appropriate.
+
 2. The suspend methods should quiesce the device to stop it from performing
 I/O. They also may save the device registers and put it into the
 appropriate low-power state, depending on the bus type the device is on,
@@ -400,12 +415,23 @@ When resuming from freeze, standby or memory sleep, the phases are:
 the resume callbacks occur; it's not necessary to wait until the
 complete phase.

+Moreover, if the preceding prepare callback returned a positive number,
+the device may have been left in runtime suspend throughout the whole
+system suspend and resume (the suspend, suspend_late, suspend_noirq
+phases of system suspend and the resume_noirq, resume_early, resume
+phases of system resume may have been skipped for it). In that case,
+the complete callback is entirely responsible for bringing the device
+back to the functional state after system suspend if necessary. [For
+example, it may need to queue up a runtime resume request for the device
+for this purpose.] To check if that is the case, the complete callback
+can consult the device's power.direct_complete flag. Namely, if that
+flag is set when the complete callback is being run, it has been called
+directly after the preceding prepare and special action may be required
+to make the device work correctly afterward.
+
 At the end of these phases, drivers should be as functional as they were before
 suspending: I/O can be performed using DMA and IRQs, and the relevant clocks are
-gated on. Even if the device was in a low-power state before the system sleep
-because of runtime power management, afterwards it should be back in its
-full-power state. There are multiple reasons why it's best to do this; they are
-discussed in more detail in Documentation/power/runtime_pm.txt.
+gated on.

 However, the details here may again be platform-specific. For example,
 some systems support multiple "run" states, and the mode in effect at
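The condition under which the suspend and resume phases are skipped (a positive prepare return value and runtime-suspended status for a device and all of its descendants) can be illustrated with a small user-space model. This is a sketch under stated assumptions, not the PM core's implementation: struct model_dev and mark_direct_complete() are hypothetical, and the direct_complete field merely mirrors the role of the power.direct_complete flag mentioned in the text.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a device node in the device hierarchy. */
struct model_dev {
	int prepare_ret;		/* value returned by the prepare callback */
	bool runtime_suspended;		/* runtime PM status at suspend time */
	struct model_dev **children;	/* direct children, may be NULL */
	size_t nr_children;
	bool direct_complete;		/* result: phases can be skipped */
};

/*
 * Compute direct_complete for `dev` and every descendant. A device
 * qualifies only if its own prepare callback returned a positive
 * number, it is runtime-suspended, and all of its descendants qualify.
 * Returns the value stored in dev->direct_complete.
 */
bool mark_direct_complete(struct model_dev *dev)
{
	bool ok = dev->prepare_ret > 0 && dev->runtime_suspended;

	for (size_t i = 0; i < dev->nr_children; i++) {
		/* Always recurse so that each child gets its own result. */
		if (!mark_direct_complete(dev->children[i]))
			ok = false;
	}

	dev->direct_complete = ok;
	return ok;
}
```

In this model a single child whose prepare callback returns 0 forces its parent through the full suspend and resume sequence, while a sibling that satisfies the condition on its own can still be left in runtime suspend.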
17 changes: 17 additions & 0 deletions Documentation/power/runtime_pm.txt
@@ -2,6 +2,7 @@ Runtime Power Management Framework for I/O Devices

 (C) 2009-2011 Rafael J. Wysocki <rjw@sisk.pl>, Novell Inc.
 (C) 2010 Alan Stern <stern@rowland.harvard.edu>
+(C) 2014 Intel Corp., Rafael J. Wysocki <rafael.j.wysocki@intel.com>

 1. Introduction

@@ -444,6 +445,10 @@ drivers/base/power/runtime.c and include/linux/pm_runtime.h:
 bool pm_runtime_status_suspended(struct device *dev);
 - return true if the device's runtime PM status is 'suspended'

+bool pm_runtime_suspended_if_enabled(struct device *dev);
+- return true if the device's runtime PM status is 'suspended' and its
+'power.disable_depth' field is equal to 1
+
 void pm_runtime_allow(struct device *dev);
 - set the power.runtime_auto flag for the device and decrease its usage
 counter (used by the /sys/devices/.../power/control interface to
@@ -644,6 +649,18 @@ place (in particular, if the system is not waking up from hibernation), it may
 be more efficient to leave the devices that had been suspended before the system
 suspend began in the suspended state.

+To this end, the PM core provides a mechanism allowing some coordination between
+different levels of device hierarchy. Namely, if a system suspend .prepare()
+callback returns a positive number for a device, that indicates to the PM core
+that the device appears to be runtime-suspended and its state is fine, so it
+may be left in runtime suspend provided that all of its descendants are also
+left in runtime suspend. If that happens, the PM core will not execute any
+system suspend and resume callbacks for all of those devices, except for the
+complete callback, which is then entirely responsible for handling the device
+as appropriate. This only applies to system suspend transitions that are not
+related to hibernation (see Documentation/power/devices.txt for more
+information).
+
 The PM core does its best to reduce the probability of race conditions between
 the runtime PM and system suspend/resume (and hibernation) callbacks by carrying
 out the following operations:
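The distinction between the two status helpers documented in this file can be expressed as a tiny user-space model. The types and function names below are hypothetical stand-ins that only restate the documented semantics; they are not the kernel's definitions.

```c
#include <stdbool.h>

/* Hypothetical model of the runtime PM fields the helpers look at. */
enum model_rpm_status { MODEL_RPM_ACTIVE, MODEL_RPM_SUSPENDED };

struct model_dev_pm {
	enum model_rpm_status runtime_status;
	unsigned int disable_depth;
};

/* Models pm_runtime_status_suspended(): only the status is checked. */
bool model_status_suspended(const struct model_dev_pm *pm)
{
	return pm->runtime_status == MODEL_RPM_SUSPENDED;
}

/*
 * Models pm_runtime_suspended_if_enabled(): the status must be
 * 'suspended' and disable_depth must be exactly 1.
 */
bool model_suspended_if_enabled(const struct model_dev_pm *pm)
{
	return pm->runtime_status == MODEL_RPM_SUSPENDED &&
	       pm->disable_depth == 1;
}
```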
87 changes: 56 additions & 31 deletions Documentation/power/states.txt
@@ -1,62 +1,87 @@
+System Power Management Sleep States

-System Power Management States
+(C) 2014 Intel Corp., Rafael J. Wysocki <rafael.j.wysocki@intel.com>

+The kernel supports up to four system sleep states generically, although three
+of them depend on the platform support code to implement the low-level details
+for each state.

-The kernel supports four power management states generically, though
-one is generic and the other three are dependent on platform support
-code to implement the low-level details for each state.
-This file describes each state, what they are
-commonly called, what ACPI state they map to, and what string to write
-to /sys/power/state to enter that state
+The states are represented by strings that can be read or written to the
+/sys/power/state file. Those strings may be "mem", "standby", "freeze" and
+"disk", where the last one always represents hibernation (Suspend-To-Disk) and
+the meaning of the remaining ones depends on the relative_sleep_states command
+line argument.

-state: Freeze / Low-Power Idle
+For relative_sleep_states=1, the strings "mem", "standby" and "freeze" label the
+available non-hibernation sleep states from the deepest to the shallowest,
+respectively. In that case, "mem" is always present in /sys/power/state,
+because there is at least one non-hibernation sleep state in every system. If
+the given system supports two non-hibernation sleep states, "standby" is present
+in /sys/power/state in addition to "mem". If the system supports three
+non-hibernation sleep states, "freeze" will be present in /sys/power/state in
+addition to "mem" and "standby".
+
+For relative_sleep_states=0, which is the default, the following descriptions
+apply.
+
+state: Suspend-To-Idle
 ACPI state: S0
-String: "freeze"
+Label: "freeze"

-This state is a generic, pure software, light-weight, low-power state.
-It allows more energy to be saved relative to idle by freezing user
+This state is a generic, pure software, light-weight, system sleep state.
+It allows more energy to be saved relative to runtime idle by freezing user
 space and putting all I/O devices into low-power states (possibly
 lower-power than available at run time), such that the processors can
 spend more time in their idle states.
-This state can be used for platforms without Standby/Suspend-to-RAM
+
+This state can be used for platforms without Power-On Suspend/Suspend-to-RAM
 support, or it can be used in addition to Suspend-to-RAM (memory sleep)
-to provide reduced resume latency.
+to provide reduced resume latency. It is always supported.


 State: Standby / Power-On Suspend
 ACPI State: S1
-String: "standby"
+Label: "standby"

-This state offers minimal, though real, power savings, while providing
-a very low-latency transition back to a working system. No operating
-state is lost (the CPU retains power), so the system easily starts up
+This state, if supported, offers moderate, though real, power savings, while
+providing a relatively low-latency transition back to a working system. No
+operating state is lost (the CPU retains power), so the system easily starts up
 again where it left off.

-We try to put devices in a low-power state equivalent to D1, which
-also offers low power savings, but low resume latency. Not all devices
-support D1, and those that don't are left on.
+In addition to freezing user space and putting all I/O devices into low-power
+states, which is done for Suspend-To-Idle too, nonboot CPUs are taken offline
+and all low-level system functions are suspended during transitions into this
+state. For this reason, it should allow more energy to be saved relative to
+Suspend-To-Idle, but the resume latency will generally be greater than for that
+state.


 State: Suspend-to-RAM
 ACPI State: S3
-String: "mem"
+Label: "mem"

-This state offers significant power savings as everything in the
-system is put into a low-power state, except for memory, which is
-placed in self-refresh mode to retain its contents.
+This state, if supported, offers significant power savings as everything in the
+system is put into a low-power state, except for memory, which should be placed
+into the self-refresh mode to retain its contents. All of the steps carried out
+when entering Power-On Suspend are also carried out during transitions to STR.
+Additional operations may take place depending on the platform capabilities. In
+particular, on ACPI systems the kernel passes control to the BIOS (platform
+firmware) as the last step during STR transitions and that usually results in
+powering down some more low-level components that aren't directly controlled by
+the kernel.

-System and device state is saved and kept in memory. All devices are
-suspended and put into D3. In many cases, all peripheral buses lose
-power when entering STR, so devices must be able to handle the
-transition back to the On state.
+System and device state is saved and kept in memory. All devices are suspended
+and put into low-power states. In many cases, all peripheral buses lose power
+when entering STR, so devices must be able to handle the transition back to the
+"on" state.

-For at least ACPI, STR requires some minimal boot-strapping code to
-resume the system from STR. This may be true on other platforms.
+For at least ACPI, STR requires some minimal boot-strapping code to resume the
+system from it. This may be the case on other platforms too.


 State: Suspend-to-disk
 ACPI State: S4
-String: "disk"
+Label: "disk"

 This state offers the greatest power savings, and can be used even in
 the absence of low-level platform support for power management. This
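For quick reference, the label-to-ACPI-state pairs listed in this file for relative_sleep_states=0 can be captured in a small lookup table. The function name acpi_state_for_label() is hypothetical, and the mapping holds only for the traditional labels; with relative_sleep_states=1 the meanings of "mem", "standby" and "freeze" depend on what the platform supports.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical lookup: return the ACPI S-state number documented for a
 * traditional sleep state label, or -1 if the label is unknown.
 */
int acpi_state_for_label(const char *label)
{
	static const struct {
		const char *label;
		int acpi_state;
	} map[] = {
		{ "freeze",  0 },	/* Suspend-To-Idle */
		{ "standby", 1 },	/* Power-On Suspend */
		{ "mem",     3 },	/* Suspend-to-RAM */
		{ "disk",    4 },	/* Suspend-to-disk (hibernation) */
	};

	for (size_t i = 0; i < sizeof(map) / sizeof(map[0]); i++) {
		if (strcmp(label, map[i].label) == 0)
			return map[i].acpi_state;
	}
	return -1;
}
```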
5 changes: 4 additions & 1 deletion Documentation/power/swsusp.txt
@@ -220,7 +220,10 @@ Q: After resuming, system is paging heavily, leading to very bad interactivity.

 A: Try running

-cat `cat /proc/[0-9]*/maps | grep / | sed 's:.* /:/:' | sort -u` > /dev/null
+cat /proc/[0-9]*/maps | grep / | sed 's:.* /:/:' | sort -u | while read file
+do
+test -f "$file" && cat "$file" > /dev/null
+done

 after resume. swapoff -a; swapon -a may also be useful.
