Deprecate and disable budget optimization method & Updating notebook #1832

Draft: wants to merge 15 commits into main
9 changes: 4 additions & 5 deletions data/config_files/multi_dimensional_example_model.yml
@@ -101,12 +101,11 @@ effects:
# ----------------------------------------------------------------------
# (optional) sampler options you plan to forward to pm.sample():
sampler_config:
-  tune: 1000
-  draws: 200
-  chains: 8
+  tune: 800
+  draws: 400
+  chains: 2
   random_seed: 42
-  target_accept: 0.90
-  nuts_sampler: "nutpie"
+  target_accept: 0.80

# ----------------------------------------------------------------------
# (optional) idata from a previous sample
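
To make the effect of the sampler change concrete, the sketch below compares the total warm-up iterations and retained posterior draws implied by the two groups of values in this hunk. It reads the hunk as replacing the first group (tune 1000, draws 200, chains 8) with the second (tune 800, draws 400, chains 2), which matches the "4 additions & 5 deletions" summary. The helper `sampling_cost` is hypothetical and exists only for this illustration.

```python
# Values copied from the hunk; dictionary names are illustrative only.
old_config = {
    "tune": 1000,
    "draws": 200,
    "chains": 8,
    "random_seed": 42,
    "target_accept": 0.90,
    "nuts_sampler": "nutpie",
}
new_config = {
    "tune": 800,
    "draws": 400,
    "chains": 2,
    "random_seed": 42,
    "target_accept": 0.80,
}


def sampling_cost(config):
    """Hypothetical helper: total iterations implied by a sampler config."""
    return {
        "warmup_iterations": config["tune"] * config["chains"],
        "posterior_draws": config["draws"] * config["chains"],
    }


old_cost = sampling_cost(old_config)
new_cost = sampling_cost(new_config)
print(old_cost, new_cost)

# Forwarding the options, assuming PyMC is installed and `model` is a built model:
#     with model:
#         idata = pm.sample(**new_config)
```

Under this reading, the new settings run fewer chains for more draws each, so the notebook samples noticeably less in total.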
359 changes: 359 additions & 0 deletions data/multidimensional_mock_data.csv

Large diffs are not rendered by default.

1,011 changes: 663 additions & 348 deletions docs/source/notebooks/mmm/mmm_allocation_assessment.ipynb

Large diffs are not rendered by default.

557 changes: 309 additions & 248 deletions docs/source/notebooks/mmm/mmm_budget_allocation_example.ipynb

Large diffs are not rendered by default.

2,256 changes: 1,173 additions & 1,083 deletions docs/source/notebooks/mmm/mmm_multidimensional_example.ipynb

Large diffs are not rendered by default.

Binary file modified docs/source/notebooks/mmm/multidimensional_model.nc
Binary file not shown.
47 changes: 18 additions & 29 deletions pymc_marketing/mmm/mmm.py
@@ -2338,19 +2338,21 @@ def optimize_budget(
):
"""Optimize the given budget based on the specified utility function over a specified time period.

-This function optimizes the allocation of a given budget across different channels
-to maximize the response, considering adstock and saturation effects. It scales the
-budget and budget bounds, performs the optimization, and generates a synthetic dataset
-for posterior predictive sampling.
-
-The function first scales the budget and budget bounds using the maximum scale
-of the channel transformer. It then uses the `BudgetOptimizer` to allocate the
-budget, and creates a synthetic dataset based on the optimal allocation. Finally,
-it performs posterior predictive sampling on the synthetic dataset.
-
-**Important**: When generating the posterior predictive distribution for the target with the optimized budget,
-we are setting the control variables to zero! This is done because in many situations we do not have all the
-control variables in the future (e.g. outlier control, special events).
+.. deprecated:: 0.0.3
+    This function optimizes the allocation of a given budget across different channels
+    to maximize the response, considering adstock and saturation effects. It scales the
+    budget and budget bounds, performs the optimization, and generates a synthetic dataset
+    for posterior predictive sampling.
+
+    The function first scales the budget and budget bounds using the maximum scale
+    of the channel transformer. It then uses the `BudgetOptimizer` to allocate the
+    budget, and creates a synthetic dataset based on the optimal allocation. Finally,
+    it performs posterior predictive sampling on the synthetic dataset.
+
+    **Important**: When generating the posterior predictive distribution for the target with the
+    optimized budget, we are setting the control variables to zero! This is done because in many
+    situations we do not have all the control variables in the future (e.g. outlier control,
+    special events).

Parameters
----------
@@ -2390,22 +2392,9 @@ def optimize_budget(
ValueError
If the noise level is not a float.
"""
-from pymc_marketing.mmm.budget_optimizer import BudgetOptimizer
-
-allocator = BudgetOptimizer(
-    num_periods=num_periods,
-    utility_function=utility_function,
-    response_variable=response_variable,
-    custom_constraints=constraints,
-    default_constraints=default_constraints,
-    model=self,
-)
-
-return allocator.allocate_budget(
-    total_budget=budget,
-    budget_bounds=budget_bounds,
-    callback=callback,
-    **minimize_kwargs,
+raise NotImplementedError(
+    "This method is deprecated and no longer available. "
+    "Please migrate to the `multidimensional.MMM` class."
 )

Review comments on `raise NotImplementedError(`:

Contributor: Can we warn instead of raise?

Contributor Author: I think it is better to raise, because the current method returns incorrect results even when it runs. There is no sense in keeping it, but what do you think?
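
The warn-versus-raise trade-off from the review thread can be sketched with the standard library alone. `LegacyModel` and both method names are hypothetical stand-ins, not part of pymc-marketing: the warning variant still returns a value (which is the concern when the legacy result is known to be wrong), while the raising variant returns nothing.

```python
import warnings


class LegacyModel:
    """Hypothetical stand-in for a class with a retired method."""

    def optimize_budget_warn(self, budget: float) -> float:
        # Soft deprecation: the caller is notified, but the old code path still runs.
        warnings.warn(
            "optimize_budget is deprecated; migrate to the multidimensional MMM class.",
            DeprecationWarning,
            stacklevel=2,
        )
        return budget  # placeholder for the legacy (possibly incorrect) result

    def optimize_budget_raise(self, budget: float) -> float:
        # Hard removal: the caller gets an error instead of a result.
        raise NotImplementedError("This method is deprecated and no longer available.")


model = LegacyModel()

# Capture the warning so the soft path can be inspected.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    soft_result = model.optimize_budget_warn(100.0)

try:
    model.optimize_budget_raise(100.0)
    hard_failed = False
except NotImplementedError:
    hard_failed = True

print(soft_result, len(caught), hard_failed)
```

With a warning, downstream code keeps consuming `soft_result` unless the user notices the message; with the exception, the failure cannot be ignored.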

def plot_budget_allocation(
4 changes: 2 additions & 2 deletions pymc_marketing/mmm/utility.py
@@ -169,7 +169,7 @@ def mean_tightness_score(
It is calculated as:

.. math::
-Mean\ Tightness\ Score = \mu - \alpha \cdot Tail\ Distance
+Mean\ Tightness\ Score = \frac{\mu - \alpha \cdot Tail\ Distance}{\mu}

where:
- :math:`\mu` is the mean of the sample returns.
@@ -202,7 +202,7 @@ def _mean_tightness_score(
samples = _check_samples_dimensionality(samples)
mean = pt.mean(samples)
tail_metric = tail_distance(confidence_level)
-return mean - alpha * tail_metric(samples, budgets)
+return (mean - alpha * tail_metric(samples, budgets)) / mean

return _mean_tightness_score
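
As a worked check of the changed return value, the following plain-Python sketch computes both the previous score and the normalized one. Several things here are assumptions rather than facts from the diff: the samples 1 to 10, `alpha=0.5`, `confidence_level=0.75`, and the definition of tail distance as the sum of the distances from the mean to the upper and lower quantiles. They were chosen because they reproduce the values 5.5, 4.5, 3.25 and 0.591 listed in `EXPECTED_RESULTS` in tests/mmm/test_utility.py.

```python
def quantile(sorted_values, q):
    """Linear-interpolation quantile of pre-sorted values."""
    pos = q * (len(sorted_values) - 1)
    lower = int(pos)
    upper = min(lower + 1, len(sorted_values) - 1)
    span = sorted_values[upper] - sorted_values[lower]
    return sorted_values[lower] + (pos - lower) * span


def tail_distance(samples, confidence_level):
    """Assumed definition: distance from the mean to both tail quantiles."""
    values = sorted(samples)
    mean = sum(values) / len(values)
    upper_q = quantile(values, confidence_level)
    lower_q = quantile(values, 1 - confidence_level)
    return abs(upper_q - mean) + abs(mean - lower_q)


def mean_tightness_scores(samples, alpha, confidence_level):
    """Return (previous score, normalized score) for comparison."""
    mean = sum(samples) / len(samples)
    tail = tail_distance(samples, confidence_level)
    previous = mean - alpha * tail
    normalized = previous / mean  # matches the new return statement
    return previous, normalized


samples = [float(i) for i in range(1, 11)]  # assumed test-like data
previous, normalized = mean_tightness_scores(samples, alpha=0.5, confidence_level=0.75)
print(previous, round(normalized, 3))
```

Because the whole numerator is divided by the mean, the normalized score equals 1 minus `alpha * tail / mean`, so it is unitless and approaches 1 as the tails tighten.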

53 changes: 10 additions & 43 deletions tests/mmm/test_budget_optimizer.py
@@ -576,7 +576,7 @@ def test_callback_functionality_parametrized(
],
)
 def test_mmm_optimize_budget_callback_parametrized(dummy_df, dummy_idata, callback):
-    """Test callback functionality through MMM.optimize_budget interface."""
+    """Test that MMM.optimize_budget properly raises deprecation error."""
df_kwargs, X_dummy, y_dummy = dummy_df

mmm = MMM(
@@ -588,48 +588,15 @@ def test_mmm_optimize_budget_callback_parametrized(dummy_df, dummy_idata, callba
mmm.build_model(X=X_dummy, y=y_dummy)
mmm.idata = dummy_idata

-# Test the MMM interface
-result = mmm.optimize_budget(
-    budget=100,
-    num_periods=10,
-    callback=callback,
-)
-
-# Check return value count
-if callback:
-    assert len(result) == 3
-    optimal_budgets, opt_result, callback_info = result
-
-    # Validate callback info
-    assert isinstance(callback_info, list)
-    assert len(callback_info) > 0
-
-    # Each iteration should have required keys
-    for iter_info in callback_info:
-        assert "x" in iter_info
-        assert "fun" in iter_info
-        assert "jac" in iter_info
-
-    # Check that objective values are finite
-    objectives = [iter_info["fun"] for iter_info in callback_info]
-    assert all(np.isfinite(obj) for obj in objectives)
-
-else:
-    assert len(result) == 2
-    optimal_budgets, opt_result = result
-
-# Common validations
-assert isinstance(optimal_budgets, xr.DataArray)
-assert optimal_budgets.dims == ("channel",)
-assert len(optimal_budgets) == len(mmm.channel_columns)
-
-# Budget should sum to total (within tolerance)
-assert np.abs(optimal_budgets.sum().item() - 100) < 1e-6
-
-# Check optimization result
-assert hasattr(opt_result, "success")
-assert hasattr(opt_result, "x")
-assert hasattr(opt_result, "fun")
+# Test that the deprecated MMM interface raises NotImplementedError
+with pytest.raises(
+    NotImplementedError, match="This method is deprecated and no longer available"
+):
+    mmm.optimize_budget(
+        budget=100,
+        num_periods=10,
+        callback=callback,
+    )


@pytest.mark.parametrize(
69 changes: 69 additions & 0 deletions tests/mmm/test_budget_optimizer_multidimensional.py
@@ -1004,3 +1004,72 @@ def test_budget_distribution_carryover_interaction_issue(dummy_df, fitted_mmm):
np.abs(channel_1_spend_with_carryover - channel_1_allocation * num_periods)
< 0.1
), "With carryover: total spend should still equal allocation * num_periods"


@pytest.mark.parametrize(
"callback",
[
False, # Default no callback
True, # With callback
],
ids=[
"no_callback",
"with_callback",
],
)
def test_multidimensional_optimize_budget_callback_parametrized(
dummy_df, fitted_mmm, callback
):
"""Test callback functionality through MultiDimensionalBudgetOptimizerWrapper.optimize_budget interface."""
df_kwargs, X_dummy, y_dummy = dummy_df

optimizable_model = MultiDimensionalBudgetOptimizerWrapper(
model=fitted_mmm,
start_date=X_dummy["date_week"].max() + pd.Timedelta(weeks=1),
end_date=X_dummy["date_week"].max() + pd.Timedelta(weeks=10),
)

# Test the MultiDimensionalBudgetOptimizerWrapper interface
result = optimizable_model.optimize_budget(
budget=100,
callback=callback,
)

# Check return value count
if callback:
assert len(result) == 3
optimal_budgets, opt_result, callback_info = result

# Validate callback info
assert isinstance(callback_info, list)
assert len(callback_info) > 0

# Each iteration should have required keys
for iter_info in callback_info:
assert "x" in iter_info
assert "fun" in iter_info
assert "jac" in iter_info

# Check that objective values are finite
objectives = [iter_info["fun"] for iter_info in callback_info]
assert all(np.isfinite(obj) for obj in objectives)

else:
assert len(result) == 2
optimal_budgets, opt_result = result

# Common validations
assert isinstance(optimal_budgets, xr.DataArray)
assert optimal_budgets.dims == (
"geo",
"channel",
) # Multidimensional has geo dimension
assert len(optimal_budgets.coords["channel"]) == len(fitted_mmm.channel_columns)

# Budget should sum to total (within tolerance)
assert np.abs(optimal_budgets.sum().item() - 100) < 1e-6

# Check optimization result
assert hasattr(opt_result, "success")
assert hasattr(opt_result, "x")
assert hasattr(opt_result, "fun")
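
The return contract exercised by this test (two values without a callback, three with one, and per-iteration records holding `x`, `fun` and `jac`) can be illustrated with a toy allocator. This is only a sketch under stated assumptions: `toy_allocate`, its log-shaped response, the weights, and the projected-gradient update are all hypothetical and are not the optimizer used by `MultiDimensionalBudgetOptimizerWrapper`.

```python
import math


def toy_allocate(total_budget, weights, callback=False, steps=500, lr=200.0):
    """Toy projected-gradient allocator (hypothetical, not the library optimizer)."""
    n = len(weights)
    x = [total_budget / n] * n  # start from an even split
    history = []

    def objective(alloc):
        # Negative of a concave response, so lower values are better.
        return -sum(w * math.log1p(a) for w, a in zip(weights, alloc))

    def gradient(alloc):
        return [-w / (1.0 + a) for w, a in zip(weights, alloc)]

    for _ in range(steps):
        jac = gradient(x)
        mean_jac = sum(jac) / n
        # Subtracting the mean gradient keeps sum(x) equal to total_budget.
        x = [a - lr * (g - mean_jac) for a, g in zip(x, jac)]
        if callback:
            history.append({"x": list(x), "fun": objective(x), "jac": gradient(x)})

    result = {"success": True, "x": list(x), "fun": objective(x)}
    return (x, result, history) if callback else (x, result)


with_info = toy_allocate(100.0, [1.0, 2.0, 3.0], callback=True)
without_info = toy_allocate(100.0, [1.0, 2.0, 3.0], callback=False)
print(len(with_info), len(without_info))
```

Returning a tuple whose length depends on a flag mirrors what the test asserts, although it forces callers to branch on `callback` when unpacking.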
15 changes: 9 additions & 6 deletions tests/mmm/test_utility.py
@@ -37,7 +37,7 @@
 EXPECTED_RESULTS = {
     "avg_response": 5.5,
     "tail_dist": 4.5,
-    "mean_tight_score": 3.25,
+    "mean_tight_score": 0.591,
"var_95": 1.45,
"cvar_95": 1.0,
"sharpe": 1.81327,
@@ -196,7 +196,7 @@ def test_tail_distance(mean1, std1, mean2, std2, expected_order):
60,
0.1,
"higher_mean",
-), # With low alpha, higher mean should dominate
+), # With low alpha, lower std still dominates due to normalization
],
)
def test_compare_mean_tightness_score(
@@ -215,14 +215,17 @@ def test_compare_mean_tightness_score(
score1 = mean_tightness_score_func(samples1, None).eval()
score2 = mean_tightness_score_func(samples2, None).eval()

-# Assertions based on observed behavior: higher mean should dominate in both cases
+# Assertions based on actual behavior of the normalized formula
+# With the normalized mean tightness score, lower std tends to dominate
+# because the score gets closer to 1 with less tail distance
 if expected_relation == "higher_mean":
-    assert score2 > score1, (
-        f"Expected score for mean={mean2} to be higher, but got {score2} <= {score1}"
+    # Even with low alpha, lower std distribution scores higher due to normalization
+    assert score1 > score2, (
+        f"Expected score for std={std1} to be higher due to normalization, but got {score1} <= {score2}"
     )
 elif expected_relation == "lower_std":
     assert score1 > score2, (
-        f"Expected score for std={std1} to be lower, but got {score1} <= {score2}"
+        f"Expected score for std={std1} to be higher, but got {score1} <= {score2}"
)
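
Why the expected ordering flips can be seen with two made-up summaries. The means, tail distances, and function names below are hypothetical illustration values, not the test's fixtures; only `alpha = 0.1` echoes the low-alpha case in the parametrization.

```python
def previous_score(mean, tail, alpha):
    """Score before the change: penalized mean, in the units of the response."""
    return mean - alpha * tail


def normalized_score(mean, tail, alpha):
    """Score after the change: the same quantity divided by the mean."""
    return (mean - alpha * tail) / mean


alpha = 0.1
narrow = {"mean": 50.0, "tail": 2.0}   # lower mean, tight distribution
wide = {"mean": 60.0, "tail": 40.0}    # higher mean, wide distribution

# Previous formula: the higher mean wins when alpha is small.
previous_prefers_wide = previous_score(alpha=alpha, **wide) > previous_score(alpha=alpha, **narrow)

# Normalized formula: the tighter distribution wins because its score is closer to 1.
normalized_prefers_narrow = normalized_score(alpha=alpha, **narrow) > normalized_score(alpha=alpha, **wide)

print(previous_prefers_wide, normalized_prefers_narrow)
```

After normalization the mean no longer contributes in absolute terms, so the ranking is driven by the ratio of tail distance to mean.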

