-
Notifications
You must be signed in to change notification settings - Fork 18
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allow specification of initial state for sample
#119
Conversation
Codecov ReportAttention:
Additional details and impacted files@@ Coverage Diff @@
## master #119 +/- ##
==========================================
- Coverage 97.37% 96.87% -0.51%
==========================================
Files 8 8
Lines 305 320 +15
==========================================
+ Hits 297 310 +13
- Misses 8 10 +2
☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Can you add tests of all new lines?
- The names
init_params
andinitial_state
seem inconsistent, can we either useinit
orinitial
in both cases?
src/sample.jl
Outdated
chains = Distributed.pmap( | ||
sample_chain, pool, seeds, _init_params, _initial_state | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this work? I don't think pmap
broadcasts its arguments?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It doesn't have to, right? _init_params
will either be fill(nothing, nchains)
or it will be _initial_state
which should also be a vector of the correct length (I should add a check for this though)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah yeah, you created these arrays. The motivation for the branch here was to avoid allocating such arrays.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, am aware 👍 But the branching becomes a bit more annoying if we also have to do this for init_state
, so I figured just allocating was a better. I can make it a conditional if you prefer 👍
Will do (but tomorrow; am about to sleep)!
Agreed. Hmm, I guess I'll make it |
It's officially supported: It's clearly documented and used in downstream packages such as EllipticalSliceSampling, DynamicPPL, and Turing. |
Yes, that I'm fully aware of! Just remembered seeing the following in the docs:
Which I took to mean "we haven't really made an explicit decision on how to support initial parameters, but because so many downstream packages use But nonetheless, I'll change it to |
Generally, I think I'd prefer Since #120 requires a breaking release anyway, we could also include more breaking changes. Otherwise we could deprecate the keyword argument, maybe something like the following could work: function f(...; init_params=nothing, initial_params=init_params)
if init_params !== nothing
if initial_params !== init_params
throw(ArgumentError("..."))
end
Base.deprecate("....", f)
end
...
end |
@devmotion I just came across this again; given that we decided to scrap #120 in the end, what do you think of at least merging this? Being able to specify the initial state would be quite useful. EDIT: Just remember you're on vacation! No need to rush this:) |
I decided to just rip the band-aid off and go with renaming If we don't, we end up in a somewhat awkward scenario where we have to pass along both |
Most certainly 👍 |
bf70831
to
00054ef
Compare
Rebased and change base for PR. I'll add some tests for the |
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
00054ef
to
ca4f4b9
Compare
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Seems the PR breaks EllipticalSliceSampling? 🤔 Interesting, even though it won't cause any breakage in practice if we tag a breaking release. |
Could it maybe be something related to But yes, I'll bump the major version so won't break anything in practice:) |
Yeah, I assume that's the reason: https://github.com/TuringLang/EllipticalSliceSampling.jl/blob/ca4babb2baba9008805bc8234a6fd182119e57dc/src/abstractmcmc.jl#L25 https://github.com/TuringLang/EllipticalSliceSampling.jl/blob/ca4babb2baba9008805bc8234a6fd182119e57dc/test/simple.jl#L40 I wonder though why other packages that currently use |
@@ -9,6 +9,7 @@ version = "4.5.0" | |||
BangBang = "198e06fe-97b7-11e9-32a5-e1d131e6ad66" | |||
ConsoleProgressMonitor = "88cd18e8-d9cc-4ea6-8889-5259c0d15c8b" | |||
Distributed = "8ba89e20-285c-5b6f-9357-94700520ee1b" | |||
FillArrays = "1a297f60-69ca-5386-bcde-b61e274b549b" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This means it could be removed from the [extras] section below. But do we actually have to depend on FillArrays? I think we should just forward the user-input or nothing
, but not build any arrays explicitly?
_initial_params = | ||
initial_params === nothing ? FillArrays.Fill(nothing, nchains) : initial_params | ||
_initial_state = | ||
initial_state === nothing ? FillArrays.Fill(nothing, nchains) : initial_state |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be nice to avoid these FillArray
s. Maybe we could
- move the function below to a callable struct (could be shared between all ensemble algorithms maybe?)
- pass
initial_params
andinitial_state
to the constructor as well but only use it to define type parameters that allow us to distinguish between the four possible cases (no initial params and no initial state, only initial state, only initial params, and both initial params and state) - define the function of the callable struct depending on the type parameters, forwarding the versions with only the seed or only the seed and one additional argument to the three-argument version
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is adding in callable structs here really an improvement? 😕 I agree it's more efficient, but it seems like this will be quite a bit more complex + the efficiency doesn't really matter here, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A callable struct should generally be better for the compiler than a closure, shouldn't it? Regardless of whether we change or add arguments as in this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep! But is this performance critical code? And it seems to be me that we'll need a callable struct for each scenario?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🤷 Are you sure? To me it seems one struct is sufficient - both the multithreaded and the multicore version seem to use the same inner structure, and in the serial case we could set channel = nothing
. If needed we could also dispatch on the type of the algorithm to handle minor differences in the function call.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To me it seems one struct is sufficient
To clarify, I don't question whether we can have a single callable struct with different call implementations; I meant more that it seems it won't be as simple as just doing
struct SampleFunc
# ...
end
function (f::SampleFunc)(args...)
# ...
end
multithreaded and the multicore version seem to use the same inner structure
But if we put initial_params
and initial_state
in the callable struct, then we'll need to pmap
, etc. over a range containing the corresponding indices, no? Which seems like it would lead to more allocations than the current impl using Fill(nothing, nchains)
?
Or am I misunderstanding what you mean here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just bumping this discussion:) Would be nice to get a version of this PR merged.
Very likely 😕 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's go with the FillArrays dependency for now.
…tial-state"" This reverts commit 420e588.
This seems convenient to have, e.g. for resuming sampling, running special warm-up procedures.
EDIT: Note that this is now dependent on #126