Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Linux linaro lsk v3.14 mx6 thermal #24

Merged

Conversation

linux4kix
Copy link
Owner

No description provided.

bebarino and others added 20 commits November 26, 2014 18:51
It looks like this code is missing braces, otherwise the if
statement shouldn't have been indented. Fix it.

Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
When binding cooling devices to thermal zones created from the device
tree the minimum and maximum cooling states are in the wrong order
leading to failure to bind.

Fix the order of cooling states in the call to
thermal_zone_bind_cooling_device to fix this.

Cc:Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Punit Agrawal <punit.agrawal@arm.com>
Reviewed-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Thermal sensor used to need two calibration points which are
in fuse map to get a slope for converting thermal sensor's raw
data to real temperature in degree C. Due to the chip calibration
limitation, hardware team provides an universal formula to get
real temperature from internal thermal sensor raw data:

Slope = 0.4297157 - (0.0015976 * 25C fuse);

Update the formula, as there will be no hot point calibration
data in fuse map from now on.

Signed-off-by: Anson Huang <b20788@freescale.com>
Acked-by: Shawn Guo <shawn.guo@linaro.org>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
On latest i.MX6 SOC with thermal calibration data of 0x5A100000,
the critical trip temperature will be an invalid value and
cause system auto shutdown as below log:

thermal thermal_zone0: critical temperature reached(42 C),shutting down

So, with universal formula for thermal sensor, only room
temperature point is calibrated, which means the calibration
data read from fuse only has valid data of bit [31:20], others
are all 0, the critical trip point temperature can NOT depend
on the hot point calibration data, here we set it to 20 C higher
than default passive temperature.

Signed-off-by: Anson Huang <b20788@freescale.com>
Acked-by: Shawn Guo <shawn.guo@linaro.org>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Create a new event to trace the temperature of a thermal zone. Using
this event trace the temperature changes of the thermal zone every-time
it is updated.

Cc: Zhang Rui <rui.zhang@intel.com>
Cc: Eduardo Valentin <edubezval@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Signed-off-by: Punit Agrawal <punit.agrawal@arm.com>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Introduce and use an event to trace when a cooling device's state is
updated. This is useful to follow the effect of governor decisions on
cooling devices.

Cc: Zhang Rui <rui.zhang@intel.com>
Cc: Eduardo Valentin <edubezval@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Signed-off-by: Punit Agrawal <punit.agrawal@arm.com>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Create a new event to trace when the temperature is above a trip
point. Use the trace-point when handling non-critical and critical
trip pionts.

Cc: Zhang Rui <rui.zhang@intel.com>
Cc: Eduardo Valentin <edubezval@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Signed-off-by: Punit Agrawal <punit.agrawal@arm.com>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
result is always zero when comes here.

Signed-off-by: Yao Dongdong <yaodongdong@huawei.com>
Acked-by: Eduardo Valentin <edubezval@gmail.com>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
thermal driver should be regisetered after cpufreq driver has
been registered and probed. Doing so is to make sure that thermal
driver can get the max cpu cooling states correctly when calling
get_property.

Signed-off-by: Bai Ping <b51503@freescale.com>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
The second cooling device was being registered as another cpu
frequency device.  This is incorrect register it as a device
frequency device.
…terrupts

Initializing the thermal_device causes get_temp to get called which in turns
tries to enable interrupts.  Move clock code and the enable_irq setting
up in the probe function to get rid of the kernel warnings on boot.
The driver currently would only trip at 85C and then bring the
SOC and GPU down to the slowest speeds until it reached 75C.

Now we read the number of cpufreq states and spread out trip points
between 85C and critical temp.  At each trip point we drop the
top cpufrequency down and don't raise it again until the temp is
lower than the previous zones trip point.
Add some additional debug output to track the changes being
detected by the driver.  Also fix the typing of some variables
to verify that they print out properly in the debug output.
This allows max_state to be set at the time the devfreq cooling
device is instantiated.  This allows events to match the cpufreq
cooling device and gives tighter grained control over the
devices listening.

Currently I am using state 5 as the critical event number.  This
could be dynamic as well but since only the imx_thermal and galcore
use this driver I am okay hardcoding it for testing.
Tell the devfreq device how many cooling states it will have.
Rather than just scaling down the GPU's 3D core to the slowest
speed at the first passive cooling trip point, we slowly scale back
the core clock speed trying to achieve balance.  If we reach the
critical speed we scale all the way back to the lowest freq.

*TEST* the divider number is completely guestimated and needs to be
tested.
Write access to the fuses is very dangerous as they can only be
written once.  This is the first commit in locking this down a bit.
Now you must explicitely enable this functionality in the kernel.
Instead of having all drivers mapping the otp memory and checking
values provide a function so they can do it through the driver.
Instead of checking device-tree for the otp definition and mapping
memory ourselves, use the new helper function to read fuse values.
This should probably be broken up into multiple patches but I will
do that when I get them ready for upstreaming.  Functionality that
I have added is follows.

- read grade of chip from otp and set critical based on this value.
- base the default passive trip points based on the number of cpu
frequency points and the critical thermal temperature.
- allow the user to change the trip_point0 up to a maximum of 85c
and then scale the trip points based on that.
linux4kix added a commit that referenced this pull request Dec 17, 2014
@linux4kix linux4kix merged commit 1f8e124 into linux-linaro-lsk-v3.14-mx6 Dec 17, 2014
pepedog pushed a commit to pepedog/linux-imx6-3.14 that referenced this pull request Dec 31, 2014
[ Upstream commit 7da89a2a3776442a57e918ca0b8678d1b16a7072 ]

Meelis Roos reports crashes during bootup on a V480 that look like
this:

====================
[   61.300577] PCI: Scanning PBM /pci@9,600000
[   61.304867] schizo f009b070: PCI host bridge to bus 0003:00
[   61.310385] pci_bus 0003:00: root bus resource [io  0x7ffe9000000-0x7ffe9ffffff] (bus address [0x0000-0xffffff])
[   61.320515] pci_bus 0003:00: root bus resource [mem 0x7fb00000000-0x7fbffffffff] (bus address [0x00000000-0xffffffff])
[   61.331173] pci_bus 0003:00: root bus resource [bus 00]
[   61.385344] Unable to handle kernel NULL pointer dereference
[   61.390970] tsk->{mm,active_mm}->context = 0000000000000000
[   61.396515] tsk->{mm,active_mm}->pgd = fff000b000002000
[   61.401716]               \|/ ____ \|/
[   61.401716]               "@'/ .. \`@"
[   61.401716]               /_| \__/ |_\
[   61.401716]                  \__U_/
[   61.416362] swapper/0(0): Oops [#1]
[   61.419837] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.18.0-rc1-00422-g2cc9188-dirty linux4kix#24
[   61.427975] task: fff000b0fd8e9c40 ti: fff000b0fd928000 task.ti: fff000b0fd928000
[   61.435426] TSTATE: 0000004480e01602 TPC: 00000000004455e4 TNPC: 00000000004455e8 Y: 00000000    Not tainted
[   61.445230] TPC: <schizo_pcierr_intr+0x104/0x560>
[   61.449897] g0: 0000000000000000 g1: 0000000000000000 g2: 0000000000a10f78 g3: 000000000000000a
[   61.458563] g4: fff000b0fd8e9c40 g5: fff000b0fdd82000 g6: fff000b0fd928000 g7: 000000000000000a
[   61.467229] o0: 000000000000003d o1: 0000000000000000 o2: 0000000000000006 o3: fff000b0ffa5fc7e
[   61.475894] o4: 0000000000060000 o5: c000000000000000 sp: fff000b0ffa5f3c1 ret_pc: 00000000004455cc
[   61.484909] RPC: <schizo_pcierr_intr+0xec/0x560>
[   61.489500] l0: fff000b0fd8e9c40 l1: 0000000000a20800 l2: 0000000000000000 l3: 000000000119a430
[   61.498164] l4: 0000000001742400 l5: 00000000011cfbe0 l6: 00000000011319c0 l7: fff000b0fd8ea348
[   61.506830] i0: 0000000000000000 i1: fff000b0fdb34000 i2: 0000000320000000 i3: 0000000000000000
[   61.515497] i4: 00060002010b003f i5: 0000040004e02000 i6: fff000b0ffa5f481 i7: 00000000004a9920
[   61.524175] I7: <handle_irq_event_percpu+0x40/0x140>
[   61.529099] Call Trace:
[   61.531531]  [00000000004a9920] handle_irq_event_percpu+0x40/0x140
[   61.537681]  [00000000004a9a58] handle_irq_event+0x38/0x80
[   61.543145]  [00000000004ac77c] handle_fasteoi_irq+0xbc/0x200
[   61.548860]  [00000000004a9084] generic_handle_irq+0x24/0x40
[   61.554500]  [000000000042be0c] handler_irq+0xac/0x100
====================

The problem is that pbm->pci_bus->self is NULL.

This code is trying to go through the standard PCI config space
interfaces to read the PCI controller's PCI_STATUS register.

This doesn't work, because we more often than not do not enumerate
the PCI controller as a bonafide PCI device during the OF device
node scan.  Therefore bus->self remains NULL.

Existing common code for PSYCHO and PSYCHO-like PCI controllers
handles this properly, by doing the config space access directly.

Do the same here, pbm->pci_ops->{read,write}().

Reported-by: Meelis Roos <mroos@linux.ee>
Tested-by: Meelis Roos <mroos@linux.ee>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants