-
Notifications
You must be signed in to change notification settings - Fork 515
Fix flashing problems with linux. #1136
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
earlephilhower
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR!
First of all, could you please remove the commented/quoted out code. git history will keep any old stuff for reference.
What exactly was the issue you were trying to solve here? Eyeballing it, it seems like this will make a list of all rootpaths which exist, not the 1st one encountered (i.e. iif you had a dummy /run/media dir the original would not even look a drive mounted in /media. Is that the base issue, or did you have some other reason?
|
Yes, the problem is a few different cases. Not sure how far behind the version is for the install from the IDE, but it seems to not always automatically reset when switching to 1200bauds. Not related, but I thought you should be aware of that. Unfortunately the fix that works for me is to re-plug the device and/or hold boot and powercycle the pico. My guess is that the best way to do this would be to to the udiskctl first, without sniffing stdout, and then pause and use the known suspect places to locate it, or simply to try twice, which is the "usually works" part, because sometimes it does the reset and role change, however the mount comes in much later. Another point is that udisks is going to not work on a headless or text-only setup, since udisks is related to GUI desktops, instead automount in fstab or udev rules are the usual alternatives. I'll remove the dead code, I left it in as an easier reference for myself. |
|
While now more reliable, it still seems to require a few tries. I will experiment with the delay at the beginning, which seems to help. My guess is that udisksctl is called while it is already trying to run, and hits mid-point. then it freaks out, since udisks interfered with it's self, and goes back to ACM device |
|
What distro are you running? I'm not really getting many reports of issues with things as they are and it's been very stable in my experience. And as long as the serial port is up, the 1200bps tickle is super stable since it's running code the RPi team wrote themselves. :) If you're really having trouble, why not use the Picotool upload method which doesn't use the FS at all... |
|
Basically, Linux mint, but with kubuntu shims for KDE. |
|
Seems like updating that delay that I put in did the trick for me. |
|
Even though it delays, I think it should attempt a retry at least twice, if it works the first time, then there's no loss. |
|
Actually, I believe I found another root cause... python isn't closing the ACM port after flipping to 1200 baud. This causes the USB device to hang and not enumerate into disk until python exits. |
|
Here's more testing data. I'll put up the final changes in a bit too. Try to upload, pico goes off line (as indicated by the USB hub LED) and fails, but doesn't indicate the disconnect in the Kernel log until much later. Then the the HUB's LED turns on.. and udisks mounts it. |
…eck with USB analyzer.
|
This is as good as I can get it. |
|
Part of the problem (as I see it, and have experienced in the past) is having the USB D +/- lines to HiZ, or into an illegal state for a few ms before the pico hits the bootloader. Otherwise a hub never sees a "re-connection" event. From this side, we can only do so much, and I will have to take a good look at the code that calls the baked-in ROM. I am betting that it doesn't disable the USB PHY or change the state to something before that point, and if it does, needs a delay so that it looks like a re-connection event. Basically, we need to trick the system into issuing a USB reset sequence. I'll have some time later tonight to put it on my USB analyzer and get the actual facts of what is or isn't happening on the wire. As far as this side of the code, this is as best as I could get it to do successful uploads without pressing bootsel and re-plugging. |
|
We do have full control over the USB peripheral in the chip. It would be possible to set it to reset (not sure what the pin states are at that point) and sleep for 500ms before rebooting into ROM mode. -edit- |
|
Made a pretty sloppy fix, that works consistently.
I'll put in better comments on what it's doing in the code, and push it fairly soon. |
|
All pretty, and working excellent. |
|
Last annoyance to fix, is to sleep in the python script after successful update to allow the USB to re-enumerate. While not always fatal, I have seen a few cases where everything collapses. Basically the IDE is expecting it to be ready upon return. So the "fix" is to fool it into thinking it is still being used for a little bit of time, perhaps 1000ms? |
|
@xxxajk did you want to stuff in a Also, needs a minor tweak on the formatting of the reset, check the astyle run. For sanity's sake we require |
earlephilhower
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! Still some small concerns and the astyle (you can run ./tests/restyle.sh to fix)...
|
All works great as-is. |
|
P.S. I'll be fixing SPI next, so that it is compliant with the prior fixes I assisted/did for the following cores: |
|
Don't worry, I didn't forget about this! I'm in the middle of some SDK 1.5 stuff, but hope to pull and try this out by the end of the week for inclusion in the next dot-release. |
earlephilhower
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've done some testing on my own box and this PR significantly increases upload times.
Using blink.ino, uploading 2-3 times before actually measuring (so all binary is in RAM and the Pico is running the updated/original USB code) I get the following average times using my stopwatch from the time upload starts to when the IDE returns on Linux:
master: 5.6s average of 3 runs- #1136 : 9.2s average of 3 runs
(TBD: I'll see how this affects Windows uploads later tonight)
Given the actual binary upload portion is the same, we've added about 4s of time to every upload which really makes it feel less usable overall.
That said, I did not see any warnings about the missing /dev/ttyACM port with the change. FWIW I was never able to see the errors your own box seemed to have on the USB toggle->reset->MSD mode switch so I can't comment other than that still seems to work fine.
I appreciate your on the belt and braces (and elastic waistband) additions, but making the IDE seem slower for everyone, all time, seems like a no go for me.
Let me mess around with a few ideas here and see if that 4s of wasted time per upload can be dropped down any simple way.
* Split the difference on sleeping between get_drive calls * Minimize the time spent at 9600bps->1200bps transition * Abort sleep after upload if serial port comes back sooner
|
I'm at about 7.3s per upload with the minor tweaks to |
|
Windows testing looks good, actually seems a little faster than Linux at ~4.7s average. And no warnings about the serial port not appearing in the IDE 2.0. FWIW, the drive detection and serial port reconnect are much faster, at least on my Windows 10 Intel i5 12th gen vs. my Linux AMD 5700X. So, after the |
If it really is necessary to process USB events whilst in the reset sequence, we'll need to find another way of handling this. For now, pretend the user completely unplugged the device which *should* (I hope) be handled normally by the host USB stack.
|
Thanks again for the work on this! |
|
AMD Ryzen 7 3700X 8-Core Processor and ASUS TUF Gaming Wifi here. This particular hardware has very particular bugs, and USB can be quite painful. It was bought for the purpose of finding faults like this in firmware and was used to do a similar fix with the Ultimate Hacking Keyboard https://github.com/UltimateHackingKeyboard/firmware . I'll check on the results of you removing the loop, but hangs without it because the USB protocol gets violated (See the trace I posted). Instead of eliminating the loop, it might be better to shorten it. The reason why it works, is because there are pending events waiting, and those need to be handled. I do not mind a few seconds of delay if it is reliable. The possible difference here is that I am on a powered USB3.x hub on a 3.x port, also since the different BIOS doing different tweaks to the chipset, that could also be a reason. |
|
The issue with the loop is that it's called from i.e. This is invalid Whereas you could do Also, while it's a "violation" of the USB spec, how is it any different from a surprise removal, which is legal and can happen any time, no? |
|
What I have found is the recursive call (while not intended to be used) works because the messages are already there in the queue, waiting to be processed (not requiring the RX IRQ). Messages arrive as multiple small packets on high-speed USB 2 and greater inside what amounts to a larger wrapper packet. 99.9% of the host hardware out there, only does a software or hardware check when: Now as far as the difference, and why a surprise removal works, depends on how the OS handles a disconnection, and there are 3 types.
What is happening with not processing the final packets is the first one... because it's not a power loss, and the PC simply thinks that it dropped into a sleep mode. Without simulating the D+/- line state of a physical disconnect, the only option is to process any left-over packets, which is what the loop does, even though you "are not supposed to do that". So until I get to fiddling with the D+/- lines, a bandage is to do the for() loop. Yeah, it probably could be shorter and the iteration count I simply made up in order to just get it working. |
I would suggest that anyone with a motherboard with such a messed up chipset/BIOS/etc. take it to the nearest electronics recycler and get a shiny Ryzen 5 series and mobo. :) I'm a little worried about performing extreme unction, breaking some TinyUSB usage rules, and adding additional delays to every upload, ever, for that specific case because I'm not getting reports like that here. If it was something everyone was complaining about, sure. But this seems like just a buggy USB implementation on the host we're bending over backwards to fix at the cost of everyone else's experience. Another option, not breaking TinyUSB contracts: make the USB message pump do the work for you: arduino-pico/cores/rp2040/RP2040USB.cpp Lines 345 to 353 in 2acc161
Make the "imreseting" flag global and in the CheckSerialReset() just set the flag and return. In the above function you can check for that global flag and, if set, do your add'l calls to The same setup in the FreeRTOS USB task is left as an exercise for the reader. :) |
Actually, most developers don't consider this board as something to develop on because of the implied "gaming" in the name. Hardly anyone uses it on Linux, which is Ryzen 7, and better than 5. This system was purchased and gifted to me specifically because it has this problem on purpose in order to make stuff work with it because I have the hardware tools to see what is happening on the actual physical wires (See posted trace!). I even have problems with monitoring the UPS unless I have a USB 2.0 hub on any of the ports, internal, external, all of them wedge, or simply mysteriously reset without it, going into sleep on the host. Basically the fix is to use a very dumb USB2.0 hub that totally lacks the ability to go into sleep. Thankfully, this is actually optional in the USB spec and the hub keeps everything alive.
Very few people usually report a problem, or are willing to or know how to or have the tools to fix bugs. If you go and look at the USB trace I provided, it CLEARLY shows what is happening on the WIRE, and that is the DTR/baud change is never replied back to the host, and the chipset I have, gets confused, since it is expecting one, and it sees a sleep state, and does not know how to properly see this as a disconnect error condition, or how to report it to the kernel. Kernel takes a long time (up to a a few minutes) to time out. 4sec v.s. several minutes, is an improvement. I agree it may be a little long, especially since the loop count was pulled out of the air.
Exactly, as I said, was a bandage until I know better. :-) |
🤣 If only! Believe me, enough folks let you know when stuff doesn't work! It's when things do work they're quiet like 🐭. Anyway, feel free to throw in a few hundred tud_task loops in USB pumps for a new PR and I'll be happy to give it a spin. |
Add VERBATIM to add_custom_command Fixes earlephilhower#1043

Fixes locating commonly mounted paths on Linux. There were too many assumptions made.
I left the original code in, but commented it out as a reference.
Note:
Since this currently "works for me" the TO-DO is only meaningful if someone else has a different issue and wants to scan something else than what is in the current common path coverage, caused by customization.