fancontrol deadlock #663

Open
3 of 6 tasks
cyoung opened this issue Aug 31, 2017 · 17 comments

@cyoung
Owner

cyoung commented Aug 31, 2017

  1. Stratux version: v1.4r2

  2. Stratux config:

    SDR

    • single
    • dual

    GPS

    • yes
    • no
      type:

    AHRS

    • yes
    • no

    power source: EasyAcc 6000 mAh

    usb cable: integrated cable

  3. EFB app and version: N/A

    EFB platform: N/A

    EFB hardware: N/A

  4. Description of your issue: fancontrol daemon deadlocks.

If possible, enable "Replay Logs", reproduce the problem, and provide a copy of the logs in http://192.168.10.1/logs/stratux/ and http://192.168.10.1/logs/stratux.log.

@cyoung cyoung added the bug label Aug 31, 2017
cyoung added a commit that referenced this issue Sep 12, 2017
@Snowflake6

FYI - I may be observing this problem in 1.4r3. My fan does briefly run at power-up, but since then it is not turning on. CPU up to 60C and no fan activity after an hour's running.

@cyoung
Owner Author

cyoung commented Oct 17, 2017

Thanks, I think it might be some deadlock on reading /sys/class/thermal/thermal_zone0/temp. Going to leave a unit running for a while to try and reproduce it, then maybe strace to figure out what the process is doing.
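One way to test that theory is to read the sensor file with a timeout, so a read that never returns surfaces as an error instead of a silently stalled daemon. The following is only a rough Python sketch, not the actual fancontrol code: `read_cpu_temp`, `parse_millidegrees`, and the 2-second timeout are hypothetical, and it assumes the file holds an integer in thousandths of a degree Celsius.

```python
import queue
import threading

THERMAL_PATH = "/sys/class/thermal/thermal_zone0/temp"  # path named in this thread


def parse_millidegrees(raw):
    """Convert the raw file content (assumed millidegrees C) to degrees C."""
    return int(raw.strip()) / 1000.0


def read_cpu_temp(path=THERMAL_PATH, timeout=2.0):
    """Read the temperature, raising TimeoutError if the read appears stuck."""
    result = queue.Queue(maxsize=1)

    def worker():
        # Runs in a daemon thread so a blocked read cannot hang the caller.
        try:
            with open(path) as f:
                result.put(parse_millidegrees(f.read()))
        except (OSError, ValueError) as exc:
            result.put(exc)

    threading.Thread(target=worker, daemon=True).start()
    try:
        value = result.get(timeout=timeout)
    except queue.Empty:
        raise TimeoutError("no answer from %s within %.1fs" % (path, timeout))
    if isinstance(value, Exception):
        raise value
    return value
```

If `read_cpu_temp()` raised `TimeoutError` on a unit whose fan had stopped, that would support the stuck-read theory; a normal reading would point elsewhere.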

@Snowflake6

Further update - I've rebooted a few times today since, and it seems to be running smoothly now. I didn't change anything on the hardware side. The software side has changed slightly as I installed the updated OLED drivers but I don't expect that should have any effect on the fan operation.

@cyoung
Owner Author

cyoung commented Oct 17, 2017

Keep looking for it. I got it to happen once. Seems rare.

@Snowflake6

Last night I left it running in my window (trying to get my VK172 to lock in). This morning the temp was up to 60C and the fan wasn't running. I unplugged it and brought it to the office, and when I powered it up here the fan spun up and down as usual then did nothing until the temp got up over 55C... After running for half an hour or more like that, the fan suddenly started working and has been ramping up and down with temperature for the last couple of hours.

So I guess there's something flaky in there but I'm not sure what yet. FWIW, this is running on my older "dead bug" circuit for controlling the fan. I have an AHRS/Fan CTL board on order and will swap it in when it arrives (along with a (hopefully) more stable GPYes).

@cyoung
Owner Author

cyoung commented Oct 23, 2017

Monitor with: cc4127d.

@cyoung
Owner Author

cyoung commented Jan 17, 2018

Not fixed. Seems to be a wiringPi bug.

@cyoung
Owner Author

cyoung commented May 11, 2018

@d-hoke - what does http://192.168.10.1:9977 look like when the fan is not running (but CPU temp is >50ºC)?

cyoung added a commit that referenced this issue May 11, 2018
@cyoung
Owner Author

cyoung commented May 11, 2018

@d-hoke -- another test, http://updates.stratux.me/builds/update-stratux-v1.4r5-c0127928af.sh

Added a failsafe temperature (for this test, once it reaches 65ºC it will give up using PWM). The theory is that wiringPi stops responding to PWM command at a certain point.
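Read literally, the failsafe behaves like a latch: below the threshold the daemon keeps using PWM, and once the threshold is reached it stops trusting PWM and drives the pin fully on. The Python sketch below is a guess at that behaviour from the description above, not the code in the test build. `FanFailsafe`, `decide`, and the mode strings are hypothetical, and the idea that the failsafe stays tripped after the temperature drops is an assumption.

```python
class FanFailsafe:
    """Latch that abandons PWM once a failsafe temperature is reached."""

    def __init__(self, failsafe_temp_c=65.0):
        self.failsafe_temp_c = failsafe_temp_c  # 65 C is the value used in this test
        self.tripped = False

    def decide(self, temp_c, pwm_duty):
        """Return ("pwm", duty) normally, or ("full_on", None) once tripped."""
        if temp_c >= self.failsafe_temp_c:
            self.tripped = True  # assumption: stays set until the daemon restarts
        if self.tripped:
            return ("full_on", None)
        return ("pwm", pwm_duty)
```

With this shape, a unit whose PWM output has silently stopped working would still get a running fan once it crosses the threshold, which is what the test build is meant to show.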

@d-hoke

d-hoke commented May 11, 2018

1of3) Will try to check :9977 when/if I can reproduce (not doing too much with the unit right now); it's mostly just benched, accruing runtime hours, but I can periodically restart to see if it re-occurs. Initial brief attempts show a 'delay' (see 2of3), but when duty reaches about 70 the fan begins to operate (and I think that probably happens within about 6-7 30-second intervals, if I recall the code correctly). (With the earlier failures I let it go way longer than that; the time length is probably documented in the other, now-closed issue.)

2of3) Have previously observed that on my unit(s) (I only have access to one at the moment), the fan does not seem to operate until PWM is at 60-70. On 'this' unit, it seems to hover between 70-80. (I think on one of the others it would start to operate at about 550 when I was playing with it via the gpio utility (range 1-1024).)

3of3) Don't know when/if I'll have a chance to try your failsafe test build.

@d-hoke

d-hoke commented May 11, 2018

obtained a failure, when found (fan not running) :9977 checked and showed:
{"TempTarget":50,"TempCurrent":65.528,"PWMDutyMin":1,"PWMDutyMax":100,"PWMDutyCurrent":100,"PWMPin":1}
(the above was manually typed (circumstances make direct copy difficult), but should be mostly correct)
(next check after entering above, temp was to 66.066)
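For anyone else collecting these readings, the status can be fetched and summarized with a few lines of Python rather than retyped by hand. This is a sketch only: `fetch_status`, `summarize`, and the 5 °C margin are hypothetical, and the field names are taken from the output quoted above.

```python
import json
import urllib.request

STATUS_URL = "http://192.168.10.1:9977"  # status address quoted in this thread


def fetch_status(url=STATUS_URL, timeout=5.0):
    """Fetch and decode the fancontrol status JSON."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def summarize(status, margin_c=5.0):
    """Report whether the CPU is well over target and how hard the fan is driven."""
    span = status["PWMDutyMax"] - status["PWMDutyMin"]
    if span:
        duty_fraction = (status["PWMDutyCurrent"] - status["PWMDutyMin"]) / span
    else:
        duty_fraction = 0.0
    return {
        "over_target": status["TempCurrent"] > status["TempTarget"] + margin_c,
        "duty_fraction": duty_fraction,  # 1.0 means the daemon requests full duty
    }


# The reading reported above, as typed in the comment.
SAMPLE = ('{"TempTarget":50,"TempCurrent":65.528,"PWMDutyMin":1,'
          '"PWMDutyMax":100,"PWMDutyCurrent":100,"PWMPin":1}')
```

`summarize(json.loads(SAMPLE))` reports the temperature as over target with the duty at its maximum. If the fan is still at that point, the daemon is requesting full speed and the output is not delivering it, which would suggest the fault lies below the control loop.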

(***might be important...
FWIW, this was after a soft 'reboot' via web interface, hadn't previously made any association, don't know if it could be related or not, all other (multiple) attempts prior this morning were power-off/power-on restarts.)

ssh'd in...
systemctl status fancontrol reports 'active (running)'
(other stuff)
systemctl stop fancontrol
ps -A | grep fancontrol
kill -9 thatpid
gpio pwm 1 500
gpio pwm 1 900
***fan did NOT start operating (in weeks earlier attempts it DID start)
gpio readall
-the 'V' column for pin 1 shows zero ('0') (I think that's not good, right?)
gpio mode 1 pwm
-the fan immediately started... - not sure what this would indicate...
gpio readall
-the 'V' column for pin1 shows one ('1')
(but the service is currently down, so this is just from gpio utility - important part I'd guess is that setting the PWM value did not work here until after I request mode be set to PWM - any possibility that PWM mode for that pin being somehow 'lost'?)
systemctl start fancontrol
-apparently started and took control of pin... watching...
-:9977 showed PWMDuty up to 60, at that point I could hear audible frequency (PWM at level for my ears?) but fan not running, went to 70, fan still not running, went to 90 while typing this, fan had started running before that check... went to 100 fan running hard (temp was still @ 51.1 but apparently dropping)
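The part of this session that brought the fan back was re-asserting PWM mode on the pin before setting a value. To make that repeatable, the two `gpio` commands can be wrapped as below. This is a sketch that only replays the commands from this comment; it assumes the wiringPi `gpio` utility is on the PATH and that the fancontrol service has been stopped first. `pwm_recovery_commands` and `run_recovery` are hypothetical names.

```python
import subprocess


def pwm_recovery_commands(pin=1, value=900):
    """Return the gpio invocations used above: set PWM mode, then a PWM value."""
    pin_s, value_s = str(pin), str(value)
    return [
        ["gpio", "mode", pin_s, "pwm"],   # the step that restarted the fan
        ["gpio", "pwm", pin_s, value_s],  # ignored in this session until mode was set
    ]


def run_recovery(pin=1, value=900, runner=subprocess.run):
    """Run the commands in order, stopping at the first failure."""
    for argv in pwm_recovery_commands(pin, value):
        runner(argv, check=True)
```

If `run_recovery()` reliably restarts a stalled fan while `gpio pwm` alone does not, that would be consistent with the pin losing its PWM mode rather than the daemon computing a wrong duty.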

@cyoung
Owner Author

cyoung commented May 11, 2018

Did you do this test with the updated version?

@d-hoke

d-hoke commented May 11, 2018

No. (sorry, not having done an update yet, that's unfamiliar territory to me and would require extra time, stop/starts/reboots I can fit in, as well as risk of 'bricking' unit currently generally functioning for my development purposes)

@d-hoke

d-hoke commented May 11, 2018

on a subsequent soft restart fancontrol was functioning

@CraXgt

CraXgt commented May 13, 2018

Same observation here on 1.4r5

{"TempTarget":50,"TempCurrent":69.294,"PWMDutyMin":1,"PWMDutyMax":10,"PWMDutyCurrent":4,"PWMPin":1}

No spinning fan, although PWMDutyCurrent reads out a value correctly.

Edit: After reboot, it seems to work correctly.

westphae pushed a commit to westphae/stratux that referenced this issue Aug 6, 2018
westphae pushed a commit to westphae/stratux that referenced this issue Aug 6, 2018
tobo07 added a commit to tobo07/stratux that referenced this issue Aug 17, 2018
* Remove old AHRS replay tools. Add protocol tools.

* Typo and path fix.

* Change IP Stratux IP address and DHCP range.

* Update install and auto-start scripts.

* Update make clean.

* Add missing files from update scripts.

* Update goflying.

* Update goflying.

* Add alias for 192.168.10.1 for normal webui access.

* Adapt `ffMonitor()` to ahrs_approx - only send the FF AHRS packets when FF is detected.

* Add dhcpd.conf and interfaces files to .sh update.

* Remove sqlite log and stratux.log on update.

* Install dhcpd.conf and interfaces file.

* stratux.log cleanup.

* Remove unused 'CPULoad' status variable.

* Typo fix.

* Move to cyoung/goflying, 'stratux_master' branch.

* Update goflying.

* Update goflying.

* Change goflying version.

* Merge branch 'master' into ahrs_dev_protocolfun

# Conflicts:
#	selfupdate/makeupdate.sh
#	selfupdate/update_footer.sh

* Cleanup.

* Add Merlin auto-detect.

* Clean up log output.

* Update oled package to new 'luma' package name.

Fixes cyoung#672.

* Change luma usage.

cyoung#672.

* Use estimate vAcc = 2*hAcc instead of reported vAcc.

Fixes cyoung#666.

* Turn off PPS output (green LED on GPYes).

cyoung#659.

* Add NightMode to turn off ACT LED.

cyoung#659.

* Remove socket listener from fancontrol. Replace with http server. Serve status in JSON format.

* Update system uptime warning format.

* update readme.md

added additional jet tests.

* Turn off wireless power management, from D. DeMartini.

* Add logrotate conf. Keep two days of logs. Run logrotate on boot.

* Add some AHRS and maintenance web calls to documents.

* Change deleted AvSquirrel/dump1090 submodule to mirror at stratux/dump1090

* Remove old libimu.so references in selfupdate.

* Create function that tracks critical system errors and issues them only once.

cyoung#692

* Use addSingleSystemErrorf() instead of tracking error prints individually.

cyoung#692.

* Change error prints to use addSingleSystemErrorf().

cyoung#692

* Clean up unused imports.

cyoung#692

* Upgrade golang version in CircleCI from 1.6 to 1.9.2.

* Clean gopath on each CircleCI build.

* Install mercurial on CircleCI.

* Remove "test" directory from go get in gen_gdl90 make.

* Cleanup.

cyoung#692

* Remove test build from CircleCI config.

* Change CircleCI to test to build master.

* Remove unnecessary debug print.

* Delete /var/log/stratux* on update.

* Simplify "Settings" page.

Only display "Hardware" and "Diagnostic" settings when in developer mode.

* Make PPM setting a developer option.

cyoung#79.

* Roll back changes.

* Use "N" or "C" regs derived from Mode-S identifier, or default to "Stratux".

* RPi3B+ setup.

* Add new ForeFlight AHRS and ID GDL90 messages.

* Change ForeFlight support note in README.

* Update ForeFlight version to 10+ for AHRS support.

* gen_gdl90 gracefulShutdown() on system shutdown or reboot request.

* Clean up obsolete dangerzone includes.

* Fan control failsafe temp.

cyoung#663

* Fix gdl90 AHRSGyroHeading reporting.

* Comment out ublox8 Glonass / Galileo code.

* Increase suggested wait time on updates.

* Add info to main status help page.

Fixes cyoung#728.

* Adds AU tail number decoding from ICAO addr.

Contribution by @armeniki. cyoung#736.
tobo07 added a commit to tobo07/stratux that referenced this issue Aug 19, 2018
@fast240z

fast240z commented Sep 1, 2018

I am experiencing issues with fancontrol deadlocking under the latest master build. The fan runs at boot up and will run for a short period if I kill the process and run fancontrol manually. The behavior occurs regardless of reboot.

EDIT - issue appears to exist in the v1.4r5 img, but does not exist if flashing a previous version of stratux and updating to v1.4r5 via the .sh file. I installed v1.4r3 and verified the fan worked properly (came on at 55C) and then updated to v1.4r5 and verified the fan still worked fine. If flashing v1.4r5 directly from img, the fan will consistently fail to work.

@craytron

First post here and I don't get github yet. I have had fan lock problems since day one on my system described below. I think this issue is still open with the last comment on Aug 31 I assume this year (2018).

Pi 3B, Version 1.4r4, AHRS, no GPS, single external antenna via Stratus ESG transponder, good power (steady red), Stratux mounted in glove box. The fan burst is normal on all boot-ups, but there is only a 50% chance that the fan controller actually works. A UI reboot will usually restore the fan controller. Sometimes 2 reboots are required. Looking to help if I can, or maybe this issue is closed and a fix is available.
