Skip to content

turn on ticks during USB writes #4075

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from

Conversation

dhalbert
Copy link
Collaborator

@dhalbert dhalbert commented Jan 27, 2021

Fixes #3986.

Before 6.0.0, there was alway a 1ms tick mechanism that ran a background task. The task include a call to tud_task(). Starting in 6.0.0, we turned off 1ms ticks all the time, and now we currently call tud_task() only when there's an interrupt.

It appears we need to call it more often than that when USB writes happen, because some writes are not triggered by an interrupt.

So, this PR adds usb_background() back to the supervisor tick background task. And, when a write starts, we enable the 1ms ticks so tud_task() will be called on every tick. When the write completes, we turn off ticks for us. (If someone else needs ticks, they stay on, due to tick_enable_count.)

@hathach thinks there may be some other reason writes are not being scheduled properly, but this appears to solve the problem for now. This fix could be removed or modified when more analysis is done in TinyUSB about this issue.

@dhalbert dhalbert requested a review from tannewt January 27, 2021 01:30
Copy link

@jepler jepler left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The general outlines of this make sense, in retrospect. No testing performed.

Copy link
Member

@tannewt tannewt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not convinced this is the correct fix. It's not clear to me that we've gotten to the point of understanding what we aren't doing but should be.

This actually has two main changes and we should only need one.

  1. It always runs usb_background in supervisor_background_tasks. This circumvents the callback queue that @jepler added.
  2. It enables ticks during writes which limits sleep times to a tick period which then runs supervisor_background_tasks more frequently when sleeping. Is this problem only present when time.sleep is used? Is the problem fixed when wfi isn't used?

tud_task only needs to be called if an event has been queued up there is no magic "call once a ms" requirement. (It returns quickly when the queue is empty: https://github.com/hathach/tinyusb/blob/045674745afa59028fbeed6dac5cb5a9c4a6033e/src/device/usbd.c#L457 ) If we're not calling it when we should, that means something was queued without our detection. I assume that all events are queued by an interrupt (which should trigger our background task as-is) but they could be queued by tud_task itself as well.

Instead of calling usb_background in background_tasks try calling tud_task_event_ready to see if there is a task queued up and then also check the background task queue to see if usb is queued. The event that is queued in tinyusb but not circuitpython is your culprit.

If the event changes, maybe there is a usb interrupt between when the usb background task completes and when it's removed from the task queue.

@dhalbert
Copy link
Collaborator Author

dhalbert commented Jan 27, 2021

I'm not convinced this is the correct fix. It's not clear to me that we've gotten to the point of understanding what we aren't doing but should be.

I don't think we actually disagree about this. We don't understand the root cause. I am fixing the symptom, and expect we can fix it more properly later. My goal is to make things work better for now, because this seems like a big support headache and regression otherwise. Then we can debug the root cause at our leisure. If you think this is not as serious a problem as I think it is, I can turn this to draft or close it for now. It just seemed kind of bad to me.

This actually has two main changes and we should only need one.

  1. It always runs usb_background in supervisor_background_tasks. This circumvents the callback queue that @jepler added.

That was an easy addition so I could get usb_background() to be called on ticks. It could be conditionalized more with a call to tud_task_event_ready(). The call is cheap.

  1. It enables ticks during writes which limits sleep times to a tick period which then runs supervisor_background_tasks more frequently when sleeping. Is this problem only present when time.sleep is used? Is the problem fixed when wfi isn't used?

It can happen when absolutely nothing is going on: no code.py, and not in the REPL (no serial connection). I tried commenting out the WFI; it didn't help.

tud_task only needs to be called if an event has been queued up there is no magic "call once a ms" requirement. (It returns quickly when the queue is empty: https://github.com/hathach/tinyusb/blob/045674745afa59028fbeed6dac5cb5a9c4a6033e/src/device/usbd.c#L457 ) If we're not calling it when we should, that means something was queued without our detection. I assume that all events are queued by an interrupt (which should trigger our background task as-is) but they could be queued by tud_task itself as well.

I think that assumption may not be true. It appears that tud_task() is queuing something, and there is no corresponding interrupt. So the question is how do we check for that? If we turn off 1ms ticks, then we don't do anything until an interrupt happens or the sleep timer runs out. So there's no place where we can add polling.

Instead of calling usb_background in background_tasks try calling tud_task_event_ready to see if there is a task queued up and then also check the background task queue to see if usb is queued. The event that is queued in tinyusb but not circuitpython is your culprit.

Right, I agree, I think the "culprit" is some work that's queued that has no corresponding interrupt, so we have to poll for it. I don't know whether that's an actual bug or not.

tud_task_event_ready() just checks to see if the event queue is empty. tud_task() does essentially the same thing by trying to read from queue, and if there's nothing, it just returns. So it is only very slightly more expensive, as you mentioned.

If the event changes, maybe there is a usb interrupt between when the usb background task completes and when it's removed from the task queue.

I could just set a flag in the usb interrupt routine and check for this to detect this race condition.

The fundamental question is whether we're missing an interrupt, or whether the queued work has no relation to an interrupt. I am assuming the latter for now, and that's what this (probably temporary) fix is based on.

@dhalbert
Copy link
Collaborator Author

dhalbert commented Feb 4, 2021

Closed in favor of #4122.

@dhalbert dhalbert closed this Feb 4, 2021
@dhalbert dhalbert deleted the tud_task-write-ticks branch February 24, 2021 23:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Copying multiple larger files can hang the copy if autoreload is on
3 participants