Repeated excessive clock drifts between MCU/SX1301#0 RAK833 #63

Closed
jawadiot opened this issue May 14, 2020 · 8 comments

@jawadiot

jawadiot commented May 14, 2020

Hi,

I'm running Basic Station on a Raspberry Pi Compute Module 3 with a RAKwireless 831 concentrator. I managed to run the live-s2.sm.tc example, but the concentrator reports these errors:

2020-05-14 22:19:31.086 [SYN:ERRO] Repeated excessive clock drifts between MCU/SX1301#0 (3 retries): -2978.4ppm (threshold 100.0ppm)
2020-05-14 22:19:34.237 [SYN:ERRO] Repeated excessive clock drifts between MCU/SX1301#0 (6 retries): -2751.8ppm (threshold 100.0ppm)
2020-05-14 22:19:37.387 [SYN:ERRO] Repeated excessive clock drifts between MCU/SX1301#0 (9 retries): -2552.3ppm (threshold 100.0ppm)
2020-05-14 22:19:40.538 [SYN:ERRO] Repeated excessive clock drifts between MCU/SX1301#0 (12 retries): -2377.3ppm (threshold 100.0ppm)
2020-05-14 22:19:43.688 [SYN:ERRO] Repeated excessive clock drifts between MCU/SX1301#0 (15 retries): -2223.0ppm (threshold 100.0ppm)
2020-05-14 22:19:46.839 [SYN:ERRO] Repeated excessive clock drifts between MCU/SX1301#0 (18 retries): -2086.5ppm (threshold 100.0ppm)
2020-05-14 22:19:48.939 [SYN:INFO] MCU/SX1301 drift stats: min: -2004.5ppm q50: -2491.4ppm q80: -2899.4ppm max: -3145.6ppm - threshold q90: -3060.2ppm
2020-05-14 22:19:48.939 [SYN:INFO] Avg MCU drift vs SX1301#0: 1.0ppm

I saw the following advice on the RAKwireless forum:

Need to change the spi rate in basicstation/deps/lgw/platform-xxx/libloragw/src/loragw_spi.native.c from 8000000 to 2000000.

I made this change in /opt/basicstation/deps/lgw/platform-rpi/libloragw/src/loragw_spi.native.c, but what do I need to do to apply it? I assume I have to recompile the software that uses the SPI driver?
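
On the recompile question, here is a minimal sketch, assuming the HAL fixes the SPI clock as a compile-time constant that is passed to the kernel's spidev driver when the device is opened. If loragw_spi.native.c works this way, an edited value only takes effect once the library and the station binary that links it are rebuilt. The names SPI_SPEED_HZ and spi_open_with_speed are hypothetical and not taken from libloragw; SPI_IOC_WR_MAX_SPEED_HZ is the standard Linux spidev ioctl.

```c
#include <fcntl.h>
#include <linux/spi/spidev.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <unistd.h>

/* Hypothetical constant: baked into the binary at compile time,
 * so changing it requires a rebuild. */
#define SPI_SPEED_HZ 2000000u

/* Open a spidev node and set its maximum clock speed.
 * Returns the file descriptor, or -1 on any failure. */
int spi_open_with_speed(const char *dev_path, uint32_t speed_hz) {
    int fd = open(dev_path, O_RDWR);
    if (fd < 0) {
        return -1;  /* device missing or not accessible */
    }
    if (ioctl(fd, SPI_IOC_WR_MAX_SPEED_HZ, &speed_hz) < 0) {
        close(fd);  /* the driver rejected the requested speed */
        return -1;
    }
    return fd;
}
```

A caller would use something like spi_open_with_speed("/dev/spidev0.0", SPI_SPEED_HZ); the device path here is only an example.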

@yucheng1993

Hi, have you solved your problem?

@craigpeacock

craigpeacock commented Feb 15, 2021

jawadiot makes reference to a RAKWireless forum. Here is the link:
https://forum.rakwireless.com/t/basicstation-on-diy-gateway/62/27

I'm not convinced the SPI clock speed is the source of the "Repeated excessive clock drifts...". RAKWireless Staff indicate the SPI clock speed was reduced as "There is a high probability that high spi rate will cause sx1301 to fail to start." I do, however, note jawadiot's drifts are quite excessive.

I'm getting a similar 'error' message and reducing the SPI clock has not mitigated it. I also note in the above forum, others have the same experience. However my drifts are only a couple ppm:
[SYN:ERRO] Repeated excessive clock drifts between MCU/SX130X#0 (3 retries): 9.5ppm (threshold 5.0ppm)
[SYN:ERRO] Repeated excessive clock drifts between MCU/SX130X#0 (6 retries): 5.3ppm (threshold 5.0ppm)
[SYN:ERRO] Repeated excessive clock drifts between MCU/SX130X#0 (3 retries): 7.7ppm (threshold 4.9ppm)
[SYN:ERRO] Repeated excessive clock drifts between MCU/SX130X#0 (6 retries): 6.1ppm (threshold 4.9ppm)

I don't know if it has anything to do with timesync quality (#41 (comment)) and if it is significant or not.
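
For scale, a ppm figure like the ones in these messages can be read as the relative difference between the time elapsed on two clocks over the same interval. The sketch below uses an assumed definition for illustration only: drift_ppm is a hypothetical helper, and the sign convention and measurement window that Basic Station actually uses may differ.

```c
#include <stdint.h>

/* Relative drift of the MCU clock against the concentrator clock, in ppm.
 * Both arguments are the time elapsed over the same interval, in
 * microseconds, as measured by each clock. Returns 0.0 when the
 * concentrator interval is zero, to avoid dividing by zero. */
double drift_ppm(int64_t mcu_delta_us, int64_t sx_delta_us) {
    if (sx_delta_us == 0) {
        return 0.0;
    }
    return (double)(mcu_delta_us - sx_delta_us) / (double)sx_delta_us * 1e6;
}
```

Under this definition, a drift of 9.5 ppm means the two clocks diverge by 9.5 microseconds for every second that passes.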

@fk0815

fk0815 commented May 6, 2021

I have the same issue with the RAK833. In my setup the problem is reproducible when I touch the metal part of the antenna connector or the PPS connector, so it could be some EMC sensitivity. Restarting basicstation clears the problem and a resync is done. As a workaround I patched:

diff --git a/src/timesync.c b/src/timesync.c
index 0000216..90d7c7f 100644
--- a/src/timesync.c
+++ b/src/timesync.c
@@ -231,6 +231,10 @@ ustime_t ts_updateTimesync (u1_t txunit, int quality, const timesync_t* curr) {
         }
         if( stats->excessive_drift_cnt >= 2*QUICK_RETRIES )
             stats->drift_thres = MAX_MCU_DRIFT_THRES;  // reset - we might be stuck on a very low value
+        if( stats->excessive_drift_cnt >= 20*QUICK_RETRIES ) {
+            LOG(MOD_SYN|CRITICAL, "excessive_drift_cnt too high! Concentrator hangup? Exit now.");
+            exit(EXIT_FAILURE);
+        }
         return TIMESYNC_RADIO_INTV/2;
     }
     stats->excessive_drift_cnt = 0;
-- 

This exits the application and it is then restarted by systemd automatically.
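
The counting logic behind the patch can be sketched on its own as follows. This is a standalone illustration, not code from timesync.c: drift_stats_t and record_drift_sample are hypothetical, and QUICK_RETRIES is assumed to be 3 because the log messages in this thread step in threes (3, 6, 9 retries).

```c
#include <stdbool.h>

#define QUICK_RETRIES 3                     /* assumed value, see above */
#define EXIT_AFTER    (20 * QUICK_RETRIES)  /* threshold used in the patch */

/* Hypothetical stand-in for the stats structure in timesync.c. */
typedef struct {
    int excessive_drift_cnt;
} drift_stats_t;

/* Record one drift measurement. A good sample resets the counter.
 * Returns true once enough consecutive excessive samples have been
 * seen that the caller should give up and exit. */
bool record_drift_sample(drift_stats_t *stats, bool excessive) {
    if (!excessive) {
        stats->excessive_drift_cnt = 0;
        return false;
    }
    stats->excessive_drift_cnt++;
    return stats->excessive_drift_cnt >= EXIT_AFTER;
}
```

With these assumptions the exit fires on the 60th consecutive excessive sample. Judging by the log timestamps earlier in this thread (roughly 3.15 s per three retries), that would be about a minute of continuous errors before the restart.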

@tonysmith55

You may want to try this. I locked the VPU frequency on a Pi 3 by adding the line "core_freq=250" to /boot/config.txt. I am assuming the Compute Module 3 has exactly the same issue. The explanation behind this can be found at https://www.thethingsnetwork.org/forum/t/rp3-x-ic880a-gateway-stopped-working/15469/12. The need to reduce the SPI speed from 8 MHz to 2 MHz is specific to the RAK2245 module, whose SPI drivers are frequency limited; I've not experienced it on other RAK devices. (Reference: https://www.thethingsnetwork.org/forum/t/do-you-need-the-loraserver-os-for-the-rak831/25860/9)

@orvio-craig

I'm also seeing this problem, usually resulting in a disconnection. I'm using an outdoor antenna with my RAK2287, which I don't believe needs any modification to the SPI speed. It connects fine, but disconnects after about 3 minutes. I've tried adding the line "core_freq=250" to /boot/config.txt but haven't seen any difference in behaviour...

2021-10-08 12:42:04.469 [S2E:VERB]   TX power: 0.0 dBm EIRP
2021-10-08 12:42:04.469 [S2E:VERB]   JoinEUI list: 0 entries
2021-10-08 12:42:04.469 [S2E:VERB]   NetID filter: FFFFFFFF-FFFFFFFF-FFFFFFFF-FFFFFFFF
2021-10-08 12:42:04.469 [S2E:VERB]   Dev/test settings: nocca=0 nodc=0 nodwell=0
2021-10-08 12:42:46.481 [SYN:INFO] MCU/SX130X drift stats: min: -3.3ppm  q50: +6.7ppm  q80: +8.1ppm  max: +10.5ppm - threshold q90: +9.5ppm
2021-10-08 12:42:46.481 [SYN:INFO] Mean MCU drift vs SX130X#0: 6.2ppm
2021-10-08 12:43:04.333 [SYN:INFO] Time sync qualities: min=184 q90=212 max=240 (previous q90=2147483647)
2021-10-08 12:43:21.138 [SYN:VERB] Time sync rejected: quality=213 threshold=212
2021-10-08 12:43:29.545 [SYN:VERB] Time sync rejected: quality=6321 threshold=212
2021-10-08 12:43:31.645 [SYN:INFO] MCU/SX130X drift stats: min: -2.4ppm  q50: +5.2ppm  q80: +7.6ppm  max: +8.6ppm - threshold q90: +8.1ppm
2021-10-08 12:43:31.645 [SYN:INFO] Mean MCU drift vs SX130X#0: 5.0ppm
2021-10-08 12:43:38.996 [SYN:VERB] Time sync rejected: quality=213 threshold=212
2021-10-08 12:43:41.097 [SYN:VERB] Time sync rejected: quality=213 threshold=212
2021-10-08 12:43:55.799 [SYN:VERB] Time sync rejected: quality=213 threshold=212
2021-10-08 12:44:06.301 [SYN:INFO] Time sync qualities: min=167 q90=213 max=6321 (previous q90=212)
2021-10-08 12:44:16.802 [SYN:INFO] MCU/SX130X drift stats: min: +1.9ppm  q50: +5.2ppm  q80: +9.5ppm  max: -95.7ppm - threshold q90: -90.1ppm
2021-10-08 12:44:16.802 [SYN:INFO] Mean MCU drift vs SX130X#0: -9.1ppm
2021-10-08 12:44:16.803 [SYN:ERRO] Repeated excessive clock drifts between MCU/SX130X#0 (3 retries): -95.7ppm (threshold 90.1ppm)
2021-10-08 12:44:19.954 [SYN:ERRO] Repeated excessive clock drifts between MCU/SX130X#0 (6 retries): -100.4ppm (threshold 90.1ppm)
2021-10-08 12:44:23.105 [SYN:ERRO] Repeated excessive clock drifts between MCU/SX130X#0 (9 retries): -100.6ppm (threshold 100.0ppm)
2021-10-08 12:44:26.260 [SYN:ERRO] Repeated excessive clock drifts between MCU/SX130X#0 (12 retries): -101.0ppm (threshold 100.0ppm)
2021-10-08 12:44:29.411 [SYN:ERRO] Repeated excessive clock drifts between MCU/SX130X#0 (15 retries): -100.5ppm (threshold 100.0ppm)
2021-10-08 12:44:35.713 [SYN:VERB] Time sync rejected: quality=239 threshold=213
2021-10-08 12:44:46.214 [SYN:VERB] Time sync rejected: quality=318 threshold=213
2021-10-08 12:44:48.315 [SYN:INFO] MCU/SX130X drift stats: min: -86.2ppm  q50: -100.4ppm  q80: -100.8ppm  max: -101.0ppm - threshold q90: -101.0ppm
2021-10-08 12:44:48.315 [SYN:INFO] Mean MCU drift vs SX130X#0: -97.4ppm
2021-10-08 12:44:52.520 [SYN:INFO] Time sync qualities: min=182 q90=212 max=318 (previous q90=213)
2021-10-08 12:45:01.323 [AIO:DEBU] [3] Connection closed unexpectedly
2021-10-08 12:45:01.323 [AIO:DEBU] [3] WS connection shutdown...
2021-10-08 12:45:01.323 [TCE:VERB] Connection to MUXS closed in state 4
2021-10-08 12:45:01.324 [TCE:INFO] MUXS reconnect backoff 1s (retry 0)
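
To follow when the drift crosses its threshold in a long log like this one, the [SYN:ERRO] lines can be picked apart with a small parser. This is a sketch that assumes the message format stays exactly as shown in this thread; parse_drift_error and drift_error_t are hypothetical names, not part of Basic Station.

```c
#include <stdio.h>
#include <string.h>

/* Fields extracted from one "[SYN:ERRO] Repeated excessive clock drifts" line. */
typedef struct {
    int    retries;
    double drift_ppm;
    double threshold_ppm;
} drift_error_t;

/* Parse one log line. Returns 1 and fills *out if the line is a drift
 * error in the format seen in this thread, otherwise returns 0. */
int parse_drift_error(const char *line, drift_error_t *out) {
    const char *p = strstr(line, "[SYN:ERRO] Repeated excessive clock drifts");
    if (p == NULL) {
        return 0;  /* some other log message */
    }
    p = strchr(p, '(');  /* start of "(N retries)" */
    if (p == NULL) {
        return 0;
    }
    int n = sscanf(p, "(%d retries): %lfppm (threshold %lfppm)",
                   &out->retries, &out->drift_ppm, &out->threshold_ppm);
    return n == 3;
}
```

Feeding it each line from the station log yields the retry count, the measured drift, and the threshold in force at that moment, which makes it easier to see the threshold move from 90.1 ppm to 100.0 ppm in the log above.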

@tonysmith55

I haven't checked the Basics Station code to be certain whether GPS is involved in this. I would check that the GPS serial connection is communicating; you may see this in the logs when Basics Station starts.
With the introduction of Bluetooth on the Pi, the serial port changed from /dev/ttyAMA0 to /dev/ttyS0. From memory, I think you can continue to use ttyAMA0 if you disable Bluetooth in /boot/config.txt by adding dtoverlay=disable-bt
For more detail, have a look at https://raspberrypi.stackexchange.com/questions/45570/how-do-i-make-serial-work-on-the-raspberry-pi3-pizerow-pi4-or-later-models/45571#45571
I'm not sure if this is the issue, but it's where I would start.

@orvio-craig

I've tried what you suggested, but I don't think it had much of an impact at all. I was able to view the output and see that it had got a GPS fix. Even without that, the PPS signal should still come through to the SX1302, so I'd expect it shouldn't make much difference to timing whether or not the serial port is active?

@orvio-craig

Turns out I just had to add a "pps": true to the station.conf file!
