-
Notifications
You must be signed in to change notification settings - Fork 360
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conservative search for target epsilon in get_noise_multiplier #348
Conservative search for target epsilon in get_noise_multiplier #348
Conversation
This pull request was exported from Phabricator. Differential Revision: D34003401 |
opacus/accountants/utils.py
Outdated
while eps > target_epsilon: | ||
sigma_max = 2 * sigma_max | ||
accountant.steps = [(sigma_max, sample_rate, int(epochs / sample_rate))] | ||
sigma_high = 2 * sigma_high | ||
accountant.steps = [(sigma_high, sample_rate, steps)] | ||
eps = accountant.get_epsilon(delta=target_delta, **kwargs) | ||
if sigma_max > MAX_SIGMA: | ||
if sigma_high > sigma_max: | ||
raise ValueError("The privacy budget is too low.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here we try to automatically initialize sigma_high
for the binary search. This behaviour is not consistent with the docstring: I can set a specific bound but it can change it anyway. Also, having both sigma_high
and sigma_max
is counter-intuitive and it takes knowing the whole algorithm to understand what is the difference.
Proposal
Leave only sigma_low
, sigma_high
in parameters. I would also rename to min/max to emphasize that the result will be bounded:
def get_noise_multiplier(
...
sigma_min: float = 0.01,
sigma_max: Optional[float] = None,
)
If sigma_max is None
, run this part to get the upper bound starting from sigma_max+1
.
Leave MAX_SIGMA a global constant, maybe make it higher.
opacus/accountants/utils.py
Outdated
sigma_precision=0.01, | ||
sigma_low=0.01, | ||
sigma_high=10, | ||
sigma_max=2000, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add type definitions please
opacus/accountants/utils.py
Outdated
accountant: str = "rdp", | ||
sigma_precision=0.01, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's rename it to sigma_tolerance
, which is the common word used in optimization (see scipy.bisect).
Isn't this precision too low? I mean we should make this value lower, closer to typical float precision. Btw, this higher/lower confusion is another reason to rename the argument to tolerance.
opacus/accountants/utils.py
Outdated
epochs: int = None, | ||
steps: int = None, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the argument can be None, it should have type Optional[int]
|
||
return sigma | ||
return sigma_high |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A possible corner case when sigma_high < sigma_low
is not considered here.
opacus/accountants/utils.py
Outdated
while sigma_high - sigma_low > sigma_precision: | ||
sigma = (sigma_low + sigma_high) / 2 | ||
accountant.steps = [(sigma, sample_rate, steps)] | ||
eps = accountant.get_epsilon(delta=target_delta, **kwargs) | ||
|
||
if eps < target_epsilon: | ||
sigma_max = sigma | ||
sigma_high = sigma | ||
else: | ||
sigma_min = sigma | ||
sigma_low = sigma |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm thinking if we can use scipy.optimzie.bisect()
instead of implementing our own.
It is easy to implement binary search incorrectly so we need to make tests for all corner cases. If we use scipy, we don't need to add our own tests, plus it is much more stable than the rest of our dependencies.
opacus/accountants/utils.py
Outdated
accountant: str = "rdp", | ||
sigma_precision=0.01, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You need to add check that sigma_tolerance
is strictly positive. Alternative: pass is to scipy.bisect and let it do the check for you.
This pull request was exported from Phabricator. Differential Revision: D34003401 |
Summary: Pull Request resolved: pytorch#348 Noise multiplier now verifies that the actual epsilon is *lower* than the target epsilon. Bounds of the range of search for sigma, and precision can now be given as arguments to get_noise_multiplier. Differential Revision: D34003401 fbshipit-source-id: bed5aab383a867d1608e692c54c0832b8acef373
e43d06b
to
19d2981
Compare
Summary: Pull Request resolved: pytorch#348 Noise multiplier now verifies that the actual epsilon is *lower* than the target epsilon. Bounds of the range of search for sigma, and precision can now be given as arguments to get_noise_multiplier. Differential Revision: D34003401 fbshipit-source-id: ba00d6bc0a60bdba71564a81dd5e895590c7a0ea
19d2981
to
affd532
Compare
This pull request was exported from Phabricator. Differential Revision: D34003401 |
opacus/accountants/utils.py
Outdated
eps = accountant.get_epsilon(delta=target_delta, **kwargs) | ||
if eps < target_epsilon: | ||
warnings.warn( | ||
f"Noise multiplier of {sigma_low} is still too high for a privacy budget of {target_epsilon}, please use a lower sigma_low." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would replace "please use a lower sigma_low" with something like:
"you can benefit quality by allowing smaller sigma_low without exceeding target epsilon"
But the current version is also fine.
opacus/accountants/utils.py
Outdated
accountant: str = "rdp", | ||
epsilon_tolerance: float = 0.01, | ||
sigma_low: float = 0.01, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wanted to ask, why the default sigma_low
is so high. We will surprise the user with the warning and suboptimal result if the best sigma is 1e-3 for example.
But then I found that I don't understand what is the point of having this parameter at all? Why not starting with sigma_low=0 always?
This won't hurt this expression target_epsilon - eps_high > epsilon_tolerance
. We will never reach this value closer than tolerance/2.
opacus/accountants/utils.py
Outdated
accountant: str = "rdp", | ||
epsilon_tolerance: float = 0.01, | ||
sigma_low: float = 0.01, | ||
sigma_high: float = 200, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are we so sure that 200 is high enough?
Why not eliminate this constant by probing sigma_high = tol*2^k
using this removed algorithm:
sigma_high = tol
while eps > target_epsilon:
sigma_high = 2 * sigma_high
accountant.steps = [(sigma_high, sample_rate, int(epochs / sample_rate))]
eps = accountant.get_epsilon(delta=target_delta, **kwargs)
if sigma_high > MAX_SIGMA:
raise ValueError("The privacy budget is too low.")
Summary: Pull Request resolved: pytorch#348 Noise multiplier now verifies that the actual epsilon is *lower* than the target epsilon. Bounds of the range of search for sigma, and precision can now be given as arguments to get_noise_multiplier. Differential Revision: D34003401 fbshipit-source-id: 93cedb970a44e1cf406834ffa61f34307de43ae1
affd532
to
ca984ee
Compare
This pull request was exported from Phabricator. Differential Revision: D34003401 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sweet!
One nit to add a comment.
# Noise higher than MAX_SIGMA considered unreasonable | ||
MAX_SIGMA = 2000 | ||
|
||
MAX_SIGMA = 1e6 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a comment before which I recommend to leave because the constant looks a bit surprising.
# Noise level higher than MAX_SIGMA considered unreasonable
# leading to "not sufficient privacy budget" exception
MAX_SIGMA = 1e6
Problem
@tholop noticed in #346 that
get_noise_multiplier
can overshoot the target epsilon when searching for highest possible noise multiplier. This is because of the binary search implementation.Changes
get_noise_multiplier
.Differential Revision: D34003401