Add join back-off #546

Open
wants to merge 3 commits into
base: master

Conversation

ngraziano

This is a simple proposal for join back-off, to resolve #2.
It uses globalDutyRate to limit the interval between transmissions, and adds a random value so that devices which start at the same time do not stay synchronized.
To keep it simple, the limit is applied without summing the time on air over the period.

  • Limit for hour 1 is 36s: start the join with a duty rate of 2^7, which limits airtime to 3600/2^7 = 28s.

  • Limit between hours 1 and 11 is 36s:
    halve the duty cycle every 4000s (one step up in duty rate), which limits airtime to 35s

rate    duration (s)    max transmission time (s)
7       400             3.1
8       4000            15.6
9       4000            7.8
10      4000            3.9
11      4000            1.9
12      4000            0.9
13      4000            0.5
14      11600           0.7
Total   36000           35
  • Limit after hour 11 is 8.7s/24h: a duty rate of 2^14 limits airtime to 86400/2^14 = 5.3s/24h.
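
The arithmetic behind these limits can be checked with a short sketch. It assumes, as the description above implies, that a duty rate of r allows at most duration / 2^r seconds on air. The names join_phase_t, join_phases, phase_airtime_ms and the two total helpers are hypothetical and are not part of LMIC or of this PR.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One phase of the schedule: hold the duty rate at `rate` for `duration_s`.
   Hypothetical type, used only for this illustration. */
typedef struct {
    uint8_t  rate;        /* duty rate exponent: duty cycle is 1 / 2^rate */
    uint32_t duration_s;  /* length of the phase in seconds */
} join_phase_t;

/* Phases between hour 1 and hour 11, copied from the table above. */
static const join_phase_t join_phases[] = {
    {7, 400},   {8, 4000},  {9, 4000},  {10, 4000},
    {11, 4000}, {12, 4000}, {13, 4000}, {14, 11600},
};
#define JOIN_PHASE_COUNT (sizeof(join_phases) / sizeof(join_phases[0]))

/* Upper bound on time on air for one phase, in milliseconds (rounded down). */
static uint32_t phase_airtime_ms(join_phase_t p) {
    assert(p.rate < 32);
    return (uint32_t)(((uint64_t)p.duration_s * 1000u) >> p.rate);
}

/* Sum of the per-phase airtime bounds, in milliseconds. */
static uint32_t total_airtime_ms(const join_phase_t *phases, size_t n) {
    uint32_t total = 0;
    for (size_t i = 0; i < n; ++i) {
        total += phase_airtime_ms(phases[i]);
    }
    return total;
}

/* Total wall-clock length of the schedule, in seconds. */
static uint32_t total_duration_s(const join_phase_t *phases, size_t n) {
    uint32_t total = 0;
    for (size_t i = 0; i < n; ++i) {
        total += phases[i].duration_s;
    }
    return total;
}
```

Calling total_airtime_ms(join_phases, JOIN_PHASE_COUNT) gives a bound just under 35 s for the 36000 s window, and phase_airtime_ms on a 3600 s phase at rate 7 gives about 28 s for the first hour, both below the 36 s limits quoted above.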

A next step should be to work on LMICbandplan_nextJoinState so that the device does not stay on SF12 once the join has tried every frequency once.

@terrillmoore
Member

@ngraziano thanks for this! Before we merge, we need to make sure the compliance script can disable this cleanly (ideally via the mechanism already in use). It turns out that all the compliance test suites assume that the device will "magically" not honor duty cycle limitations; otherwise the tests would take forever....

@ngraziano
Author

This back-off can be disabled on the application side in onEvent: on EV_JOINING and EV_JOIN_TXCOMPLETE, LMIC.globalDutyRate needs to be forced to 0 to remove the wait.
But I am not sure where to set it in the compliance script.
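
A minimal sketch of that application-side override might look like the following. To keep it compilable on its own, it declares stand-ins for the LMIC event type and the LMIC context; in a real application both come from the library header, and the enum values here are not the real ones. The flag disable_join_backoff is a hypothetical application setting, not something defined by LMIC or by this PR.

```c
#include <stdint.h>

/* Stand-ins for the library declarations, so this sketch builds alone.
   In a real application these come from the LMIC headers. */
typedef enum {
    EV_JOINING,
    EV_JOINED,
    EV_JOIN_TXCOMPLETE,
    EV_TXCOMPLETE
} ev_t;

struct lmic_t {
    uint8_t globalDutyRate;  /* duty rate exponent used for the join back-off */
};
static struct lmic_t LMIC;

/* Hypothetical application flag: non-zero removes the join back-off. */
static int disable_join_backoff = 1;

/* Application event handler, as called by the LMIC engine. */
void onEvent(ev_t ev) {
    switch (ev) {
    case EV_JOINING:
    case EV_JOIN_TXCOMPLETE:
        if (disable_join_backoff) {
            /* Force the rate to 0 so the next join attempt is not delayed. */
            LMIC.globalDutyRate = 0;
        }
        break;
    default:
        /* Other events leave the duty rate untouched. */
        break;
    }
}
```

The override has to be repeated on both events because, per the comment above, the waiting is removed by forcing the value at each of them.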

@terrillmoore
Member

@ngraziano sorry, was speaking loosely. I always forget that I divided the compliance code into a library (that can be incorporated into any application) and a sketch (that provides a null application and a bunch of debug support). We'd want to update the library to disable join timing. This is tricky because right now the library really only runs parasitically (if you call LMIC_complianceRxMessage()). We'd have to add LMIC_complianceJoinInit() to be called by the base application when doing a join (as you say, in response to the proper message from the LMIC engine).

The code that I was thinking of, that defeats duty cycle logic, is:

#if CFG_LMIC_EU_like
    band_t *b = LMIC.bands;
    lmic_compliance_band_t *b_save = LMIC_Compliance.saveBands;
    for (; b < &LMIC.bands[MAX_BANDS]; ++b, ++b_save) {
        b_save->txcap = b->txcap;
        b->txcap = 1;
        b->avail = os_getTime();
    }
#endif // CFG_LMIC_EU_like

I see that this is only post-join. I'll have to do some testing on this to see the best way to do this. Maybe at NYC hacking hours on Thursday.

@ngraziano
Author

I have made a proposal for disabling globalDutyRate, but I can't test it.
I added a callback in the compliance module and added it to the compliance example.
Is that what you had in mind?

@terrillmoore
Member

@ngraziano Yes, very much like that, thanks. I'm on deadline for something else; I'd like to try testing before merging, but it may be a few days. Thanks again for this contribution.

@terrillmoore
Member

I'm studying this for US-like bandplans; all need the same thing, but there are subtleties for US-like having to do with probing all the channel groups; and this interacts with a better randomization scheme. Probably can't get to testing this weekend, I'm afraid. Would encourage others with EU-like bandplans to try testing; @svelmurugan92 can you arrange to test with IN866, AS923 and KR920, all of which are EU-like, and see if the compliance mode on the RWC5020 works smoothly? (Symptom of a problem will be very long script processing time, compared to without the fix.)

@terrillmoore
Member

With all the current excitement, I've not heard back from anyone else with test results. I've marked this as on the list for the release after this one, which will be in a few weeks.

@terrillmoore
Member

@ngraziano can you rebase this PR on v3.2.0? That will make it easier for others to test with the latest fixes, which are important. Thanks!

Add a transmission limit for join requests using globalDutyRate.
To keep it simple, the limit is applied without summing time on air
during the period.

Limit for hour 1 is 36s, so select a duty rate of 2^7, which limits to 28s

Limit between hours 1 and 11 is 36s,
halve the duty cycle every 4000s, which limits to 35s

Limit after hour 11 is 8.7s/24h
duty rate of 2^14, which limits to 5.3s/24h.

Add an event handler for the compliance test to disable the duty rate
during the join when the compliance test is active.

@ngraziano
Author

@terrillmoore rebase done.

@terrillmoore
Member

Hello again... In writing #581, I discovered some fishy things around (in EU only) setting the join requests in the 0.1% group, rather than 1%. I suspect this is an overlacking hack, as there's nothing in the 1.0.3 regional spec about this. Evidence of fishy code surfaces here:

static CONST_TABLE(u4_t, iniChannelFreq)[6] = {
    // Join frequencies and duty cycle limit (0.1%)
    EU868_F1 | BAND_MILLI, EU868_F2 | BAND_MILLI, EU868_F3 | BAND_MILLI,
    // Default operational frequencies and duty cycle limit (1%)
    EU868_F1 | BAND_CENTI, EU868_F2 | BAND_CENTI, EU868_F3 | BAND_CENTI,
};

There is later code that manipulates the 0.1% join code:

u1_t su = join ? 0 : NUM_DEFAULT_CHANNELS;
for (u1_t fu = 0; fu < NUM_DEFAULT_CHANNELS; fu++, su++) {
    LMIC.channelFreq[fu] = TABLE_GET_U4(iniChannelFreq, su);
    // TODO(tmm@mcci.com): don't use EU DR directly, use something from the LMIC context or a static const
    LMIC.channelDrMap[fu] = DR_RANGE_MAP(EU868_DR_SF12, EU868_DR_SF7);
}

I don't think that this logic is needed any more with your changes, as you're managing according to the rules in LoRaWAN 1.0.3 chapter 7.

Also... I think we might want to centralize this logic (not limit it to Join) -- chapter 7 can apply to application-triggered messages as well. But perhaps the best should not be the enemy of the good.

@ngraziano
Author

Hello

I agree about the 0.1% group for the join frequencies; I also think it is a hack in the old code to limit the number of joins without proper back-off. I was thinking of removing it in a second step (along with the "join" parameter of initDefaultChannels, which is only used in EU868).

For back-off in the other cases of chapter 7, I do not see an easy way to implement it.
I see 3 cases:

  • Join (the case of this PR)
  • Retransmission due to a missing network ack:
    By default LMIC limits the number of retries to 8 (TXCONF_ATTEMPTS), the number advised in chapter 18.4. I don't know whether this number of retries takes us outside the limit.
  • Retransmission by the application layer:
    For this I don't know what to do: the application would have to signal whether a message is a retransmission, and we can't use globalDutyRate as in the join phase.
    Or we could track transmission time over 1h, 10h and 24h (which may be useful for TTN and its 30s per 24h limit) and let the application layer delay the send, but the problem is that ostime_t overflows at 9.5 hours.

If you have an idea to centralize this, let me know.
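
The 9.5 hour figure can be reproduced with a little arithmetic. The sketch below assumes that ostime_t is a signed 32-bit tick counter and that one tick is 16 µs (62500 ticks per second); both are assumptions about the build configuration rather than facts stated in this thread. The helper names ostime_max_span_s and ticks_elapsed are hypothetical.

```c
#include <stdint.h>

/* Assumption: the tick counter is a signed 32-bit integer. */
typedef int32_t ostime_t;

/* Assumption: 16 microseconds per tick, i.e. 62500 ticks per second. */
#define OSTICKS_PER_SEC 62500

/* Longest interval, in whole seconds, that a positive ostime_t difference
   can represent before the counter wraps. */
static uint32_t ostime_max_span_s(void) {
    return (uint32_t)(INT32_MAX / OSTICKS_PER_SEC);
}

/* Elapsed ticks from `earlier` to `later`, computed with unsigned
   subtraction so that a single wrap of the counter is handled.
   Only meaningful while the true interval is below ostime_max_span_s(). */
static ostime_t ticks_elapsed(ostime_t earlier, ostime_t later) {
    return (ostime_t)((uint32_t)later - (uint32_t)earlier);
}
```

Under these assumptions the span is 34359 s, about 9.5 hours, so a 10 h or 24 h airtime window could not be measured from two raw ostime_t stamps alone; it would need something like a coarser counter in seconds that is advanced well within each 9.5 hour span.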

Successfully merging this pull request may close these issues.

LoRaWAN Join backoff not implemented
3 participants