Fix CliRunner error with logging message#3140
Conversation
|
I confirm that this approach does not regress on #2993 |
|
You're right thanks, I'll change it. |
|
Running |
|
Interesting, so there is a regression after all ? to be complete, I used a slightly different command to check this earlier. Here it is: Also note that I multiplied the number of tests by 10, as compared to my reprod in #2993 I also re-ran this just now, and still didn't hit the race. Maybe it's platform specific now ? I'm running macOS 15 on a 10-cores M2 arch for what it's worth (but I originally discovered this race in GHA). |
|
I'm running inside a Ubuntu 24.04-based docker container (GDAL 3.10.3) on a 16-core Ryzen 9950X3D running Ubuntu 24.04. I don't know anything about the race conditions, but I still believe the root cause is multiple objects believing they have ownership over these buffers. |
| if not self.copy_to.closed: | ||
| self.copy_to.flush() |
There was a problem hiding this comment.
unfortunately this LBYL (Look Before You Leap) approach is infamously inappropriate to guard against races; in a concurrent execution model, self.copy_to.closed could change between the moment the guard clause (if ...) is evaluated, and the rest of the block is, and then you're back to square one, but the code is more complex and the race is potentially harder to reproduce (which isn't necessarily a good thing when you're try to debug it).
I agree. That seems like an accurate description of the problem. |
|
I am in favor of introducing I am testing the effects of these two in: #3151 |
|
FYI, these changes does not fix the issues uncovered from #3151. It doesn't mean that this PR is not fixing something else! 😅 |
Tests using CliRunner run in parallel (with
pytest-xdist) sometimes raise an error due to logging messages not being processed correctly because a buffer has been closed. This bug is a due to a regression after #2991.fixes #3110
Difficult to write a test for it...