-
Notifications
You must be signed in to change notification settings - Fork 219
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Run tests with Julia 1.5 instead of Julia 1.6 #1568
Conversation
Do you understand why the tests here are failing on 1.5? Seems like it's still complaining about segmentation fault (Signal 11 = SIGSEGV)? |
No, I don't know why it segfaults. I reran the tests, so it seems to be reproducible. |
Have you tried running locally? Trying now myself. |
Yes, it segfaults. I try 1.5.3 now. Currently, I assume it is caused by the multithreading tests. |
1.5.3 seems to work, at least it passed the point were Julia 1.5.4 failed. |
Ah, yeah I'm running 1.5.3 locally 😕 Isn't this a bit strange though? If it's something with the multithreading, shouldn't this have been caught in DPPL tests? |
I don't know which of the changes in 1.5.4 causes the segfaults: JuliaLang/julia@v1.5.3...v1.5.4 |
I think this is the offending line btw: https://github.com/TuringLang/Turing.jl/blob/master/test/inference/Inference.jl#L106 At least the one where it errors. Basing it off the |
Just wanted to change this line 😄 I don't think we ever tested DynamicPPL on Julia 1.5.4. I assume we only checked 1.5.3 and 1.6.0 as soon as it became available. |
Haha 😄
Ah, that makes sense. |
It's the first multi-threaded testset in Inference.jl. As soon as I removed it, I could run the tests successfully on Julia 1.5.4. I wonder if it's a specific algorithm in this testset that is problematic or if it is a general problem. |
You're referring to the part where it iterates through the samplers? Do you know which algorithm it fails for? |
Yes, exactly, in the part where it loops through the samplers. I added print statements at the top of the for loop and it seems |
Gibbs with PG segfaults as well. So it seems only PG is problematic. I wonder if multithreading with Libtask is broken on Julia 1.5.4. @KDr2 Is there any change in JuliaLang/julia@v1.5.3...v1.5.4 that would explain problems with Libtask on Julia 1.5.4? |
Hmm maybe it's fine to disable these multithreading tests for PG? Standard sampling seems to work, and particle methods are somewhat broken anyway since they don't pass around the RNG (see e.g. TuringLang/AdvancedPS.jl#20 (comment)) - maybe that's the reason for the segfault? |
I mean, I'm fine with it. Feels like something @yebai should maybe comment on? For some context: tests are failiing due to PG methods, which means that there are other bugfixes that we can't get through. |
I just want to stress that in general particle methods are fine - it's just some problem related to multithreading. And I don't know what change in Julia 1.5.4 could have caused it - maybe it's something Libtask-specific? Maybe related to the RNG issues of particle methods? |
Let's just drop the test for now and create an issue. @yebai said he was also fine with it so don't think anyone is opposed to it. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
This is caused by a change of @@ -1751,8 +1751,6 @@ typedef struct _jl_task_t {
jl_gcframe_t *gcstack;
// saved exception stack
jl_excstack_t *excstack;
- // current world age
- size_t world_age;
// id of owning thread
// does not need to be defined until the task runs So the memory layout of We should build another jll package for [1.5.4, 1.6.0) to fix this, a PR against Yggdrasil should be made, like the one for v1.6.0 (https://github.com/JuliaPackaging/Yggdrasil/pull/2176), but no code need to be changed except the julia version. |
Status Quo:
If we file a new PR for 1.5.4, it will be:
Is it rational and feasible? How about filing two PR to make this:
|
IMO it is fine to not have a perfectly ordered versioning scheme in this case but maybe we could just ask in the PR for 1.5.4 what's commonly done? I am sure others had same/similar problems before. |
I found that Julia 1.5.4 is not available on Yggdrasil... |
Since Turing (or rather Libtask_jll) is not compatible with Julia 1.6 yet (see e.g. #1554 and the explanations there).
I guess it would be good to open an issue that reminds us to revert this change once a compatible version is available.