fan operation, fancontrol functionality (?), inconsistent #708
Comments
What are the make/models of the fans you're testing with? The parameters in fancontrol are tuned to the most commonly used fan, the SUNON GM0503PEV1-8. Some other fans might take longer to start, but the fan should come on eventually.
All units are sourced from CrewDogElectronics, in case you know what he uses. (I thought I read somewhere that he sourced parts from you, but could not quickly find that.) The fans are glued in place, and it appears any label is hidden under the outer case plastic, but I can keep trying to remove one if knowing where they're from doesn't tell you. And since I've seen them fairly regularly actuate, usually by about 128F, and pull down to around 118F, waiting several hours and seeing temps rise to 150F seems excessive, but I'm new to this product...
Yes, that is the right fan.
I have noted that cpuTempMonitor is used from both fancontrol.go and gen_gdl90.go. Any chance that collisions between the two of them accessing the temperature path are causing issues? (I will note that I am trying to hack fancontrol.go into a simple command utility to set/overwrite the pwmDuty cycle, and I'm 3 for 3 in causing the system to fail, apparently rebooting, when trying it, so apparently conflict is possible at some level :). It does appear to set the duty cycle, as the fan speeds up, then within seconds stops as, I think, the system reboots. I realize fancontrol is a bit more intrusive than cpuTempMonitor, and I may not have it stripped properly, and maybe this is a 'REALLY BAD IDEA', but I wasn't expecting a system reboot. Update: apparently it's not a clean reboot; it may simply be that some services fail/restart. While the wireless reappears, I don't seem to actually be able to (re-)connect after these failures.)
More data... Had a unit in the failing state, temp up to 62C. I restarted the service, and the 5-or-so second fan test period occurred, but there was no subsequent operation of the fan (within about 5 minutes) even though the temp was > 60C. Eventually I stop/killed/rm'd again. While stopped, performing a I started the fancontrol again, and checked that the process was running (ps -A | grep -i fancontrol). Also did journalctl -ln 2000 -u fancontrol, which seemed to show it was thought to have been started as well. This time, when the temp reached somewhere slightly above 54C, the fan activated, pulled down the temp, and eventually shut off. Watching for a while more, it seems to have now (appropriately) cycled on/off 5-10 times as the temp rises/falls.
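The thread mixes Fahrenheit (the earlier readings) and Celsius (this comment), so a small conversion helper makes the numbers comparable. This is only an arithmetic sketch; `cToF` and `fToC` are hypothetical helper names, not functions from the Stratux source.

```go
package main

import "fmt"

// cToF converts degrees Celsius to degrees Fahrenheit.
func cToF(c float64) float64 { return c*9/5 + 32 }

// fToC converts degrees Fahrenheit to degrees Celsius.
func fToC(f float64) float64 { return (f - 32) * 5 / 9 }

func main() {
	// Readings quoted in this thread.
	fmt.Printf("54C  = %.1fF\n", cToF(54))  // fan activation seen in this comment
	fmt.Printf("62C  = %.1fF\n", cToF(62))  // failing state in this comment
	fmt.Printf("150F = %.1fC\n", fToC(150)) // peak bench temperature in the report
}
```

By this arithmetic, 54C is about 129F, which is close to the 125F-128F activation range described in the original report.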
For what it's worth, others are seeing this intermittent fan behavior (Reddit thread). In my case, I have the SUNON GM0503PEV1-8 and have observed the intermittent fan behavior on 1.4r5 with two different AHRS boards (thought the first one was faulty, so bought a second and had the same results).
Known issue, consolidating to #663.
Stratux version: 1.4r4 (5af6d77)
Stratux config:
SDR
GPS
type:
AHRS
power source: wall/line
usb cable:
EFB app and version: (e.g., WingX Pro7 8.6.2)
EFB platform: (e.g., iOS 9.2)
EFB hardware: (e.g., iPad Mini 2)
Description of your issue:
Fan operation is not reliable. I have worked with 3 different stratux units, all with the same configuration, and each has failed to operate the fan consistently as expected.
On some boots the fan seemed to work OK, on other boots the fan did not work at all, and on a couple of boots it seemed slow to start functioning (temps would rise above target with no fan activation), but once it did start operating, it (mostly*) continued to. (There have been a couple of boots where it seemed a bit random, sometimes allowing temps to rise above the trigger point (above 130F) before beginning to operate, and other times coming on around the trigger point, which seems to be in the 125F-128F range.)
I was recently bench testing two of the units (simultaneously). Both operated the fan at boot, but then post-boot allowed the temperature to rise to approx 150F without ever activating the fan. At slightly different times, I rebooted the units. One of them had 'normal' fan operation shortly after boot (the temp was still high after reboot, over 130F, and rose a little; when the fan finally began to operate, it eventually pulled the temp down below 120F, and it seems to have continued to operate overnight). The 2nd unit was rebooted a bit later, and even after reboot it still allowed the temp to rise to around 150F (which may have been about its 'max' for the current ambient temperature in my bench test environment). A bit later, I rebooted the 2nd unit AGAIN, and after this boot the fan on this unit also began to operate 'normally', and also appears to have continued operating normally through the night.
I found comments in cputemp.go indicating there are (have been?) problems reading the current temp on the pi, with hangs occurring. On one of the units I ssh'd into it during a no-fan boot, found that the fancontrol service appeared to be running, and I was able to "cat /sys/class/thermal/thermal_zone0/temp" and promptly get output (with indicated temp matching what was being reported on the stratux main browser page.)
Examining cputemp.go, it does not seem to take any action (no logging, nothing corrective) if an error is returned in
temp, err := ioutil.ReadFile("/sys/class/thermal/thermal_zone0/temp")
reading the cpu temp. Some sort of brief logging (maybe at least once reporting error reading file occurred), or perhaps other corrective action might be in order.
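As a rough illustration of the "report the error at least once" idea, the sketch below logs when a read first fails and again when it recovers, rather than on every poll. It assumes the sysfs file holds millidegrees Celsius (the usual Linux convention); `tempReader` and `parseMilliC` are hypothetical names and are not taken from cputemp.go.

```go
package main

import (
	"fmt"
	"log"
	"os"
	"strconv"
	"strings"
)

// tempPath is the sysfs file quoted in the report.
const tempPath = "/sys/class/thermal/thermal_zone0/temp"

// parseMilliC converts raw sysfs contents such as "54000\n"
// (millidegrees Celsius) to degrees Celsius.
func parseMilliC(raw string) (float64, error) {
	n, err := strconv.Atoi(strings.TrimSpace(raw))
	if err != nil {
		return 0, fmt.Errorf("parse cpu temp %q: %w", raw, err)
	}
	return float64(n) / 1000.0, nil
}

// tempReader (hypothetical) remembers whether the previous read failed,
// so a persistent failure is logged once instead of on every poll.
type tempReader struct {
	path    string
	failing bool
}

// read returns the temperature in Celsius and whether the read succeeded.
func (r *tempReader) read() (float64, bool) {
	raw, err := os.ReadFile(r.path)
	var t float64
	if err == nil {
		t, err = parseMilliC(string(raw))
	}
	if err != nil {
		if !r.failing {
			log.Printf("cpu temp read failed: %v", err)
			r.failing = true
		}
		return 0, false
	}
	if r.failing {
		log.Printf("cpu temp read recovered")
		r.failing = false
	}
	return t, true
}

func main() {
	r := &tempReader{path: tempPath}
	if t, ok := r.read(); ok {
		fmt.Printf("cpu temp: %.1f C\n", t)
	}
}
```

A caller polling in a loop would keep one `tempReader` alive across polls so the failing state carries over between reads.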
I originally suspected hardware, but am now suspicious that it's more of a software issue, since 3 different units have exhibited similar behaviour: the fans run OK at boot, but then on different 'boots' may or may not run as temps rise within the units.
***My time is limited, but I will entertain any suggestions for how to go about diagnosing precisely what's going on with whatever time I can apply to this (and subject to being able to reproduce it). (Note that 'real soon now', I am only going to have one unit available to me.)