Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Function to optimize prior under constraints #5231

Merged
merged 54 commits into from
Jan 4, 2022
Merged
Show file tree
Hide file tree
Changes from 3 commits
Commits
Show all changes
54 commits
Select commit Hold shift + click to select a range
8ca3ded
Replace print statement by AttributeError
AlexAndorra Nov 30, 2021
9dc0096
pre-commit formatting
AlexAndorra Nov 30, 2021
9675e4f
Mention in release notes
AlexAndorra Nov 30, 2021
d132364
Handle 1-param and 3-param distributions
AlexAndorra Dec 1, 2021
6f9ccd4
Update tests
AlexAndorra Dec 1, 2021
fea6643
Fix some wording
AlexAndorra Dec 1, 2021
524a900
pre-commit formatting
AlexAndorra Dec 3, 2021
91174b9
Only raise UserWarning when mass_in_interval not optimal
AlexAndorra Dec 3, 2021
29741f1
Raise NotImplementedError for non-scalar params
AlexAndorra Dec 3, 2021
1ad4297
Remove pipe operator for old python versions
AlexAndorra Dec 3, 2021
a708e6d
Update tests
AlexAndorra Dec 3, 2021
e1c5125
Add test with discrete distrib & wrap in pytest.warns(None)
AlexAndorra Dec 7, 2021
bc9b543
Remove pipe operator for good
AlexAndorra Dec 7, 2021
18ad975
Fix TypeError in dist_params
AlexAndorra Dec 7, 2021
e92d6d8
Relax tolerance for tests
AlexAndorra Dec 7, 2021
94b406b
Force float64 config in find_optim_prior
AlexAndorra Dec 14, 2021
76dbb1f
Rename file name to func_utils.py
AlexAndorra Dec 14, 2021
53bfc00
Replace print statement by AttributeError
AlexAndorra Nov 30, 2021
77a0bb1
pre-commit formatting
AlexAndorra Nov 30, 2021
fd5f498
Mention in release notes
AlexAndorra Nov 30, 2021
171a4aa
Handle 1-param and 3-param distributions
AlexAndorra Dec 1, 2021
36b95cb
Update tests
AlexAndorra Dec 1, 2021
55138d9
Fix some wording
AlexAndorra Dec 1, 2021
4bed2cd
pre-commit formatting
AlexAndorra Dec 3, 2021
02d117b
Only raise UserWarning when mass_in_interval not optimal
AlexAndorra Dec 3, 2021
7742571
Raise NotImplementedError for non-scalar params
AlexAndorra Dec 3, 2021
8a6e0e7
Remove pipe operator for old python versions
AlexAndorra Dec 3, 2021
602391b
Update tests
AlexAndorra Dec 3, 2021
9bb14a3
Add test with discrete distrib & wrap in pytest.warns(None)
AlexAndorra Dec 7, 2021
ab0ef0f
Remove pipe operator for good
AlexAndorra Dec 7, 2021
58f5d56
Fix TypeError in dist_params
AlexAndorra Dec 7, 2021
a6c7f0d
Relax tolerance for tests
AlexAndorra Dec 7, 2021
c9c24d6
Force float64 config in find_optim_prior
AlexAndorra Dec 14, 2021
c75f8c9
Rename file name to func_utils.py
AlexAndorra Dec 14, 2021
3ffd7ff
Change optimization error function and refactor tests
ricardoV94 Dec 16, 2021
a1a6bdf
Use aesaraf.compile_pymc
ricardoV94 Dec 21, 2021
7cd0e55
Merge branch 'optim-prior' of https://github.com/pymc-devs/pymc into …
AlexAndorra Dec 22, 2021
1d868fa
Add and test AssertionError for mass value
AlexAndorra Dec 22, 2021
063bc96
Fix type error in warning message
AlexAndorra Dec 23, 2021
cb7908c
Split up Poisson test
AlexAndorra Dec 24, 2021
16ed438
Use scipy default for Exponential and reactivate tests
AlexAndorra Dec 24, 2021
1b84e18
Refactor Poisson tests
AlexAndorra Dec 24, 2021
6ea7861
Reduce Poisson test tol to 1% for float32
AlexAndorra Dec 25, 2021
d63b652
Remove Exponential logic
AlexAndorra Dec 27, 2021
37e6251
Rename function
AlexAndorra Dec 27, 2021
b912ac6
Refactor test functions names
AlexAndorra Dec 28, 2021
d4bce39
Use more precise exception for gradient
AlexAndorra Dec 30, 2021
9a51289
Don't catch TypeError
AlexAndorra Jan 3, 2022
90a88ff
Merge branch 'main' into optim-prior
AlexAndorra Jan 3, 2022
8b9ae6e
Remove specific Poisson test
AlexAndorra Jan 3, 2022
d53154a
Remove typo from old Poisson test
AlexAndorra Jan 3, 2022
1f42835
Put tests for constrained priors into their own file
AlexAndorra Jan 3, 2022
bad236c
Add code examples in docstrings
AlexAndorra Jan 4, 2022
d89e375
Merge branch 'main' into optim-prior
AlexAndorra Jan 4, 2022
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
3 changes: 2 additions & 1 deletion RELEASE-NOTES.md
Original file line number Diff line number Diff line change
Expand Up @@ -97,7 +97,8 @@ This includes API changes we did not warn about since at least `3.11.0` (2021-01
- Added partial dependence plots and individual conditional expectation plots [5091](https://github.com/pymc-devs/pymc3/pull/5091).
- Modify how particle weights are computed. This improves accuracy of the modeled function (see [5177](https://github.com/pymc-devs/pymc3/pull/5177)).
- `pm.Data` now passes additional kwargs to `aesara.shared`. [#5098](https://github.com/pymc-devs/pymc/pull/5098)
- ...
- The new `pm.find_optim_prior` function can be used to find the optimal prior parameters of a distribution under some
constraints (e.g lower and upper bound). See [#5231](https://github.com/pymc-devs/pymc/pull/5231).


### Internal changes
Expand Down
1 change: 1 addition & 0 deletions pymc/__init__.py
Original file line number Diff line number Diff line change
Expand Up @@ -84,6 +84,7 @@ def __set_compiler_flags():
from pymc.distributions import *
from pymc.distributions import transforms
from pymc.exceptions import *
from pymc.find_optim_prior import find_optim_prior
from pymc.math import (
expand_packed_triangular,
invlogit,
Expand Down
96 changes: 96 additions & 0 deletions pymc/find_optim_prior.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,96 @@
# Copyright 2021 The PyMC Developers
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

from typing import List

import aesara
import aesara.tensor as aet
import numpy as np

from scipy import optimize

import pymc as pm

__all__ = ["find_optim_prior"]


def find_optim_prior(
pm_dist: pm.Distribution,
lower: float,
upper: float,
init_guess: List[float],
ricardoV94 marked this conversation as resolved.
Show resolved Hide resolved
mass: float = 0.95,
) -> np.ndarray:
"""
Find optimal parameters to get `mass` % of probability
of `pm_dist` between `lower` and `upper`.
Note: only works for two-parameter distributions, as there
are exactly two constraints.

Parameters
----------
pm_dist : pm.Distribution
PyMC distribution you want to set a prior on.
Needs to have a ``logcdf`` method implemented in PyMC.
lower : float
Lower bound to get `mass` % of probability of `pm_dist`.
upper : float
Upper bound to get `mass` % of probability of `pm_dist`.
init_guess: List[float]
Initial guess for ``scipy.optimize.least_squares`` to find the
optimal parameters of `pm_dist` fitting the interval constraint.
mass: float, default to 0.95
Share of the probability mass we want between ``lower`` and ``upper``.
Defaults to 95%.

Returns
-------
The optimized distribution parameters as a numpy array.
"""
if len(pm_dist.rv_op.ndims_params) != 2:
ricardoV94 marked this conversation as resolved.
Show resolved Hide resolved
raise NotImplementedError(
"This function only works for two-parameter distributions, as there are exactly two "
"constraints. "
)

dist_params = aet.vector("dist_params")
lower_, upper_ = aet.scalars("lower", "upper")

dist_ = pm_dist.dist(*[dist_params[i] for i in range(len(init_guess))])
try:
logcdf_lower = pm.logcdf(dist_, lower_)
logcdf_upper = pm.logcdf(dist_, upper_)
except AttributeError:
raise AttributeError(
f"You cannot use `find_optim_params` with {pm_dist} -- it doesn't have a logcdf "
f"method yet. Open an issue or, even better, a PR on PyMC repo if you really need it."
)

alpha = 1 - mass
out = [logcdf_lower - np.log(alpha / 2), logcdf_upper - np.log(1 - alpha / 2)]
logcdf = aesara.function([dist_params, lower_, upper_], out)

try:
symb_grad = aet.as_tensor_variable([pm.gradient(o, [dist_params]) for o in out])
jac = aesara.function([dist_params, lower_, upper_], symb_grad)

# when PyMC cannot compute the gradient
except Exception:
jac = "2-point"

opt = optimize.least_squares(logcdf, init_guess, jac=jac, args=(lower, upper))
if not opt.success:
raise ValueError("Optimization of parameters failed.")

return opt.x
14 changes: 14 additions & 0 deletions pymc/tests/test_util.py
Original file line number Diff line number Diff line change
Expand Up @@ -142,3 +142,17 @@ def fn(a=UNSET):
help(fn)
captured = capsys.readouterr()
assert "a=UNSET" in captured.out


def test_find_optim_prior():
MASS = 0.95
opt_params = pm.find_optim_prior(pm.Gamma, lower=0.1, upper=0.4, mass=MASS, init_guess=[1, 10])
np.testing.assert_allclose(opt_params, np.array([8.47481597, 37.65435601]))

opt_params = pm.find_optim_prior(
pm.Normal, lower=155, upper=180, mass=MASS, init_guess=[170, 3]
)
np.testing.assert_allclose(opt_params, np.array([167.5000001, 6.37766828]))

with pytest.raises(NotImplementedError, match="only works for two-parameter distributions"):
pm.find_optim_prior(pm.Exponential, lower=0.1, upper=0.4, mass=MASS, init_guess=[1])