[BUG] SKR Pro and BLTouch not working properly with bugfixes since about 5/9/2020 #18372

Closed
viper93458 opened this issue Jun 20, 2020 · 30 comments

Comments

@viper93458

Bug Description

With the recent versions of Bugfix (my last version was from 5/9/2020) the BLTouch probe on my SKR Pro randomly deploys during prints despite the G29 probing working without a hitch.

With the most recent bugfix versions (and also the recent PlatformIO STM32 6.1.1 updates), the G29 completes and then one of two things happens. Either OctoPrint throws an error saying that something on the printer has failed and an M999 is required to reset, after which the printer cools down and sits waiting for another print attempt; or OctoPrint shows no visible error and the printer simply stops moving immediately after the G29 while OctoPrint acts like it's printing. Either way I have to reset the printer and try again (which never works, as the cycle continues).

My Configurations

Configs.zip

Steps to Reproduce

  1. Use the most recent bugfix 2.0 version
  2. Start a print that includes a G29 to start
  3. G29 completes and printer either throws error and cools down or simply stops moving.

Expected behavior:

Naturally I expect it to print. :)

Actual behavior:

G29 completes and printer either throws error and cools down or simply stops moving.

Additional Information

Thanks for all the hard work on this wonderful firmware!

@sjasonsmith
Contributor

So this worked for you with bugfix from May 9th, but fails with current bugfix?

I assumed this was going to be a deploy issue that I could debug on my desk. Unfortunately I don't have my Pro in a printer I can try actual printing with.

@viper93458
Author

viper93458 commented Jun 21, 2020

It worked for me with YOUR branch as of 5/9/20 as the bugfix from that timeframe was boot looping. :)

I have been running your branch from your PR that is on hold up until the last 2 days when I tried the most recent bugfix (which doesn't boot loop and seems to use the EEPROM just fine). Unfortunately it has some weird issue with BLTouch.

William

@boelle
Contributor

boelle commented Jun 29, 2020

what is the weird issue with BL touch?

@viper93458
Author

@boelle :)

As written in the description:

With the recent versions of Bugfix (my last version was from 5/9/2020) the BLTouch probe on my SKR Pro randomly deploys during prints despite the G29 probing working without a hitch.

With the most recent bugfix versions (and also the recent PlatformIO STM32 6.1.1 updates), the G29 completes and then one of two things happens. Either OctoPrint throws an error saying that something on the printer has failed and an M999 is required to reset, after which the printer cools down and sits waiting for another print attempt; or OctoPrint shows no visible error and the printer simply stops moving immediately after the G29 while OctoPrint acts like it's printing. Either way I have to reset the printer and try again (which never works, as the cycle continues).

@sjasonsmith
Contributor

@viper93458 does this happen on every print or only sometimes?
I'm hoping someone else using an SKR Pro (or other STM32F4 board, such as a FYSETC S6 or BTT GTR) can tell us whether they are seeing similar issues.

@viper93458
Author

Every print as of the date of opening this issue. I reverted back to your PR version from here: #17970 to resolve for the time being.

@sjasonsmith added the "help wanted" and "Needs: More Data" labels Jun 30, 2020
@sjasonsmith
Contributor

Maybe tagging this as needing help will draw some attention to it from other users with similar configurations.

@Minims
Contributor

Minims commented Jun 30, 2020

I have a homing problem for now, but I will test the SKR Pro with BLTouch if needed once my printer is back :-)

@GhostlyCrowd
Contributor

I haven't had these BLTouch issues on my SKR Pro, but I'm also using UBL. @sjasonsmith

@MoellerDi
Contributor

I used to have similar issues on my SKR Pro with BLTouch whenever NEOPIXEL_LED was enabled (and therefore PRINTER_EVENT_LEDS, which is enabled by default). See #17683 (comment) and #17683 (comment)
(btw, I fixed it for me by disabling PRINTER_EVENT_LEDS)

In your config I see you are using #define RGB_LED and therefore #define PRINTER_EVENT_LEDS is also enabled.

#if ANY(BLINKM, RGB_LED, RGBW_LED, PCA9632, PCA9533, NEOPIXEL_LED)
  #define PRINTER_EVENT_LEDS
#endif

Could you please test whether you still see the same issues with your BLTouch if you disable #define RGB_LED (//#define RGB_LED)? I assume it could be some kind of interaction between PRINTER_EVENT_LEDS and BLTouch.
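
As a sketch of what that test looks like in the config files: the fragment below shows the two variants mentioned in this comment, both in their disabled form. Which file each line lives in is an assumption; check your own Configuration.h and Configuration_adv.h.

```cpp
// Variant 1 (the test requested here): disable the RGB LED feature.
// With no LED feature enabled, the quoted #if ANY(...) condition no
// longer turns on PRINTER_EVENT_LEDS.
//#define RGB_LED

// Variant 2 (the fix reported in this comment): keep the LED feature
// but disable only the printer event LEDs.
//#define PRINTER_EVENT_LEDS
```

Only one of the two variants is needed for a given test build.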

@viper93458
Author

That's the same recommendation that was made a while back in another thread about BLTouch issues, which I tried without success. I will give it another go along with the latest bugfix to see if there is any change and report back. Thanks!

@Dracrius

Dracrius commented Jul 9, 2020

@viper93458 Try disabling (commenting out) #define ADAPTIVE_STEP_SMOOTHING. I spent the last day narrowing probe issues down to this setting in #18598.

I also had failures when I enabled #define MULTIPLE_PROBING, and after checking your configs I see you have both settings enabled. After all my testing, though, I am sure you can leave #define MULTIPLE_PROBING 2 enabled, although it may be unneeded once you disable #define ADAPTIVE_STEP_SMOOTHING, as my BLTouch meshes are fairly consistent with the default single-probe mode.
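
Put together, the suggested settings would look roughly like the fragment below. The file locations in the comments are assumptions (later in this thread ADAPTIVE_STEP_SMOOTHING is listed under Configuration_Adv.h; the location of MULTIPLE_PROBING is not stated anywhere here).

```cpp
// Configuration_adv.h
// Disabled as the suspected cause of the probing failures.
//#define ADAPTIVE_STEP_SMOOTHING

// Configuration.h (assumed location)
// Can stay enabled; may be unnecessary once step smoothing is off.
#define MULTIPLE_PROBING 2
```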

As far as I can tell, the outright failure with #define MULTIPLE_PROBING happens because, with multiple probing, Marlin must pick up the error in probing and throw a Probe Failure. Without it, Marlin just takes the odd readings from the ADAPTIVE_STEP_SMOOTHING bug as the bed depth, resulting in a mesh with one or more -1mm dips.
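
For anyone who wants to check a single-probe mesh for the kind of dip described above, here is a minimal sketch. It is not Marlin code: the function names and the 0.5mm threshold in the usage note are hypothetical, and it simply flags points that sit far from the median of the mesh.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Median of the mesh values. The vector is taken by value and sorted,
// so the caller's data is left untouched.
double medianOf(std::vector<double> values) {
  assert(!values.empty());
  std::sort(values.begin(), values.end());
  const std::size_t mid = values.size() / 2;
  return (values.size() % 2 == 1) ? values[mid]
                                  : (values[mid - 1] + values[mid]) / 2.0;
}

// Indices of mesh points that differ from the median by more than
// max_dev_mm. Hypothetical helper for spotting isolated bad readings.
std::vector<std::size_t> suspectPoints(const std::vector<double>& mesh_mm,
                                       double max_dev_mm) {
  std::vector<std::size_t> suspects;
  const double median = medianOf(mesh_mm);
  for (std::size_t i = 0; i < mesh_mm.size(); ++i) {
    if (std::fabs(mesh_mm[i] - median) > max_dev_mm) suspects.push_back(i);
  }
  return suspects;
}
```

Calling suspectPoints({0.05, 0.02, -1.0, 0.03}, 0.5) flags only the third point (index 2). A genuinely warped bed would need a smarter comparison, such as checking each point against its neighbours.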

@viper93458
Author

@Dracrius Thanks. I will compile and test it now!

@GhostlyCrowd
Contributor

Adaptive step smoothing is causing issues with the migration to the new STM32 libs as well. Maybe once PR #18496 comes to fruition it will solve this too.

@viper93458
Author

Disabling the adaptive smoothing and using the latest bugfix 2.0 from the last couple hours seems to work OK. My BLTouch isn't causing the printer to hang or crash at the start of a print and I haven't heard any random deployments where the pin hits the glass yet. :)

As there are other BLTouch-related tickets open, perhaps this one is no longer needed and can be closed in favor of some of the others where deeper troubleshooting is going on?

@Dracrius

@viper93458 happy to hear it solved your issues!

@sjasonsmith
Contributor

sjasonsmith commented Jul 10, 2020

@viper93458 let's leave it open for a bit. Keep using it like this for a while to be sure the problems really seem gone.

Right now it looks like multiple issues may all be related to ADAPTIVE_STEP_SMOOTHING causing too many high-priority step interrupts. We've only just started to solidify around this idea, so continuing to collect some more data will be helpful.

It is worth noting that on all STM32 devices servo control is done through software, not directly through a timer output. That interrupt is almost certainly lower priority than the step interrupt. If it is starved out by step interrupts the PWM output signal would be corrupted and could cause undefined behavior.
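
To make the starvation idea concrete, here is a toy model, not Marlin's servo code. It assumes the servo interrupt cannot preempt the step interrupt, so an edge that falls due while the step ISR is busy is delayed until the ISR finishes. All names, and the busy window used in the usage note, are hypothetical.

```cpp
#include <cassert>
#include <vector>

// A span of time (in microseconds) during which the higher-priority
// step ISR keeps the CPU busy. Hypothetical model, not Marlin code.
struct BusyWindow {
  long start_us;
  long end_us;
};

// Time at which a lower-priority ISR actually runs if it was scheduled
// for scheduled_us. Windows are assumed sorted and non-overlapping.
long actualFireTime(long scheduled_us, const std::vector<BusyWindow>& busy) {
  long t = scheduled_us;
  for (const BusyWindow& w : busy) {
    assert(w.start_us <= w.end_us);
    // If the edge falls due while the step ISR is running, it waits.
    if (t >= w.start_us && t < w.end_us) t = w.end_us;
  }
  return t;
}

// Width of a software-generated pulse whose rising edge is due at
// rise_us and whose falling edge is scheduled nominal_width_us later.
long measuredPulseWidth(long rise_us, long nominal_width_us,
                        const std::vector<BusyWindow>& busy) {
  const long rise = actualFireTime(rise_us, busy);
  const long fall = actualFireTime(rise + nominal_width_us, busy);
  return fall - rise;
}
```

With no step activity, measuredPulseWidth(0, 1473, {}) returns 1473. With the step ISR busy from 1400µs to 1520µs, measuredPulseWidth(0, 1473, {{1400, 1520}}) returns 1520, because the falling edge cannot be serviced until the window ends.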

@viper93458
Author

Sounds like a plan. Thanks to all for helping and working to hopefully find the root cause and squash the troubles.

William

@AnHardt
Member

AnHardt commented Jul 10, 2020

At least for the SKR PROs a hardware PWMd servo output on PA1 would be possible.

Up to now I could not find any board with a suitable pin for a hardware_tone(), but I have only checked the boards I own.

@DoughyInTheMiddle

DoughyInTheMiddle commented Jul 13, 2020

To add to this: After turning off ADAPTIVE_STEP_SMOOTHING, this also helped on my printer.

  • Ender 3 Pro
  • BLTouch 3.1
  • SKR Mini 1.2
  • Marlin-bugfix-2.0.x

Was having at least one probe failure on most mesh passes. Would occasionally get one clean, but usually I'd have passes with at least one if not two near-crashes into the bed.

To help with this, these settings seemed to fix so much (along with turning off step smoothing after the recommendations above):

Configuration.h:
E: #define BLTOUCH_SET_5V_MODE

Configuration_Adv.h:

E&C:  #define Z_PROBE_LOW_POINT          -3
E:  #define BLTOUCH_HS_MODE
D:  // #define ADAPTIVE_STEP_SMOOTHING

I do not have MULTIPLE_PROBING turned on, but I've seen it nearly error (flashes), but then it slowly raises Z, reprobes, gets a good sensor reading and moves on. I have now had about six passes in a row, seriously dialing in the leveling, and haven't hit a full-on "Kill Stop" error.

Also of note: VSCode/PlatformIO just pushed updates to STM32 builds. Not sure if that also might have aided or not.

@sjasonsmith
Contributor

@DoughyInTheMiddle a fix was merged about a week ago for boards such as your SKR Mini, which use HAL/STM32F1. I expect that will improve BLTouch reliability for you, even though ADAPTIVE_STEP_SMOOTHING still has some issues.

As for this issue, I'd like to keep it open to track the same BLTouch reliability issues on boards using HAL/STM32. This includes the SKR Pro, GTR, FYSETC S6, etc...

@sjasonsmith
Contributor

These problems aren't as easy to observe as they are on the STM32F1 controllers, probably because there is just more available CPU time on these boards. You would probably have probe accuracy issues if probing with just the right feed rate, but I haven't tried to find something to reproduce this.

Instead, I am watching the Servo PWM output while performing arc movements. Arcs are an easy way to generate a gradient of step frequencies with a single line of gcode.
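
A rough sketch of why an arc sweeps through a range of step frequencies. It assumes an ideal circle traversed at constant tangential speed and ignores acceleration and the firmware's splitting of arcs into short segments. The function names, and the 60mm/s and 80 steps/mm figures in the usage note, are illustrative and not taken from this thread.

```cpp
#include <cassert>
#include <cmath>

// For a point at (r*cos(a), r*sin(a)) moving around a circle at constant
// tangential speed F, the velocity is F * (-sin(a), cos(a)). Each axis's
// step rate is its speed multiplied by steps per mm.

// X-axis step rate in steps/s at angle angle_rad.
double xStepRate(double feed_mm_s, double steps_per_mm, double angle_rad) {
  assert(feed_mm_s >= 0.0 && steps_per_mm > 0.0);
  return steps_per_mm * feed_mm_s * std::fabs(std::sin(angle_rad));
}

// Y-axis step rate in steps/s at angle angle_rad.
double yStepRate(double feed_mm_s, double steps_per_mm, double angle_rad) {
  assert(feed_mm_s >= 0.0 && steps_per_mm > 0.0);
  return steps_per_mm * feed_mm_s * std::fabs(std::cos(angle_rad));
}
```

At 60mm/s and 80 steps/mm, the X rate is 0 steps/s at angle 0 and rises to about 4800 steps/s a quarter turn later, while the Y rate does the opposite, so one arc passes through every step rate in between.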

I'm triggering on pulses that are outside of range for the 1473µs "Stow" command. 1453-1493µs should be acceptable, but I am seeing all the way to 1420-1520µs in my simple tests. It looks like the STEP ISR is saturating the CPU for periods of time, completely blocking TEMP and SERVO ISRs from running. Under the right conditions it could be feasible to block the SERVO ISR for a very long time, enough to happen upon another command causing BLTouch misbehaviors.
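
The acceptance window described above can be written as a small check. This is only a sketch: the 1473µs nominal width and the ±20µs window come from this comment, while the function name and the inclusive bounds are assumptions.

```cpp
#include <cassert>
#include <cstdlib>

// Values taken from the comment above; treat them as illustrative.
constexpr long kStowPulseUs = 1473;  // nominal "Stow" pulse width
constexpr long kToleranceUs = 20;    // acceptable deviation either way

// True if a measured pulse width lies within tolerance of the nominal
// width (bounds inclusive, which is an assumption).
bool pulseWithinTolerance(long measured_us,
                          long nominal_us = kStowPulseUs,
                          long tolerance_us = kToleranceUs) {
  assert(tolerance_us >= 0);
  return std::labs(measured_us - nominal_us) <= tolerance_us;
}
```

With these values, 1453µs and 1493µs pass, while the observed extremes of 1420µs and 1520µs fail.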

@sjasonsmith
Contributor

@viper93458 can you make your system fail reliably enough to test the pull request linked above?

@sjasonsmith
Contributor

The PR I posted should resolve these issues with random BLTouch behaviors for all boards using HAL/STM32, which includes the SKR Pro, GTR, FYSETC S6, etc.

These problems were mostly exposed by ADAPTIVE_STEP_SMOOTHING, because it caused the stepper ISRs to saturate the CPU.
With this enabled my error was sometimes as high as 473µs, for what was supposed to be a 1473µs pulse!

If you have ADAPTIVE_STEP_SMOOTHING disabled, the improvement is smaller but still significant. In my tests the error decreased from +/-13µs to +/-3µs. Both of these should control a BLTouch fine, but some configurations might have crossed into the invalid +/-20µs range even without step smoothing.

@viper93458
Author

@sjasonsmith I have adaptive_step_smoothing turned off with current bugfix code. Am I to turn it back on and use current code with the PR or go back to a random broken version with adaptive_step_smoothing on and test? I will have to determine how to go back to an older version if I need to roll back first.

@sjasonsmith
Contributor

@viper93458 if you want to test, it needs to be from my branch in PR #18702. If your only problems with ADAPTIVE_STEP_SMOOTHING were BLTouch problems, then it would be good to test with it enabled.

Right now I would generally not recommend step smoothing due to it consuming all available CPU time. In this case it is helpful because it helps reproduce the BLTouch problems.

@sjasonsmith
Contributor

@viper93458 any update on this? If not I'm going to assume this is resolved.

@viper93458
Author

@sjasonsmith I am using current bugfix code and it works fine at this point. I still have adaptive step smoothing off and haven't retested with it on. So as far as I am concerned this can be closed at this time. I know there was discussion going on here regarding other things to check so I left it be, but we may as well close it and see what happens when your new PR for STM32 8 is merged. :)

@sjasonsmith
Contributor

@viper93458 as far as I know that PR is working fine on SKR Pro right now. Feel free to try it and report back if you encounter issues.

@github-actions

This issue has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@github-actions bot locked and limited conversation to collaborators Oct 16, 2020