Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TX message guaranteed delivery #421

Merged
merged 10 commits into from
Apr 6, 2020
Merged

TX message guaranteed delivery #421

merged 10 commits into from
Apr 6, 2020

Conversation

gregjhogan
Copy link
Member

It is easy to overflow the TX buffer when sending a large number of messages rapidly. When sending UDS messages you don't want anything to get silently dropped. This change causes the client to pause (leave NAK active) until there is enough room in the TX buffer to process the next USB data packet. Not sure if this is desirable in all scenarios, or even if this is the right way to handle the situation, but it does seem to work.

@robbederks
Copy link
Contributor

robbederks commented Jan 23, 2020

Pretty sure this change also means that we need to add a timeout to the openpilot boardd code to prevent it from freezing if the CAN buffers are not cleared.
Another option would be to have disable this NAK'ing for openpilot and reverting back to the old way of just ignoring overflow after a command?
Any thoughts @pd0wm?

@pd0wm
Copy link
Contributor

pd0wm commented Mar 29, 2020

For openpilot the most recent messages are more important, so it makes sense to drop older ones if the buffer overflows. I would like the to detect and log when this happens though!

@gregjhogan
Copy link
Member Author

I was thinking we should set the timeout low (instead of 0) in boardd and then skip past messages that time out.

@pd0wm
Copy link
Contributor

pd0wm commented Mar 31, 2020

If I understand correctly, that will result in the same behavior as we have right now?

@gregjhogan
Copy link
Member Author

very close to the same since the ring buffer is pretty small (once a message is in the buffer it will always get sent); could also add an immediate NAK setting for OP but probably not necessary

@pd0wm pd0wm merged commit da8e00f into master Apr 6, 2020
@robbederks robbederks deleted the bulk-write-nak-fix branch April 7, 2020 02:16
theantihero pushed a commit to theantihero/panda that referenced this pull request Apr 13, 2020
* wait for tx slots before clearing nak

* fix bootstub

* Fixed misra

* Cleanup

* Added bulk write test to test USB NAK on bulk CAN messages

* Added automated bulk tx test

* Fixed linter

* Fixed latency test influence

* Added timeout to python API

* Disabled can write timeout in bulk write test

Co-authored-by: Robbe <robbe.derks@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants