|
87 | 87 | "import pymc as pm\n", |
88 | 88 | "import seaborn as sns\n", |
89 | 89 | "\n", |
90 | | - "from pymc_marketing.mmm.delayed_saturated_mmm import MMM\n", |
| 90 | + "from pymc_marketing.mmm import MMM, GeometricAdstock, LogisticSaturation\n", |
91 | 91 | "from pymc_marketing.mmm.transformers import geometric_adstock, logistic_saturation\n", |
92 | 92 | "\n", |
93 | 93 | "warnings.filterwarnings(\"ignore\", category=FutureWarning)\n", |
|
979 | 979 | "cell_type": "markdown", |
980 | 980 | "metadata": {}, |
981 | 981 | "source": [ |
982 | | - "We can specify the model structure using the {class}`MMM <pymc_marketing.mmm.delayed_saturated_mmm.MMM>` class. This class, handles a lot of internal boilerplate code for us such us scaling the data (see details below) and handy diagnostics and reporting plots. One great feature is that we can specify the channel priors distributions ourselves, which fundamental component of the [bayesian workflow](https://arxiv.org/abs/2011.01808) as we can incorporate our prior knowledge into the model. This is one of the most important advantages of using a bayesian approach. Let's see how we can do it.\n", |
| 982 | + "We can specify the model structure using the {class}`MMM <pymc_marketing.mmm.mmm.MMM>` class. This class handles a lot of internal boilerplate code for us, such as scaling the data (see details below), and provides handy diagnostics and reporting plots. One great feature is that we can specify the channel prior distributions ourselves, which is a fundamental component of the [Bayesian workflow](https://arxiv.org/abs/2011.01808), as we can incorporate our prior knowledge into the model. This is one of the most important advantages of using a Bayesian approach. Let's see how we can do it.\n", |
983 | 983 | "\n", |
984 | 984 | "As we do not know much more about the channels, we start with a simple heuristic: \n", |
985 | 985 | "\n", |
986 | 986 | "1. The channel contributions should be positive, so we can, for example, use a {class}`HalfNormal <pymc.distributions.continuous.HalfNormal>` distribution as prior. We need to set the `sigma` parameter per channel. The higher the `sigma`, the more \"freedom\" it has to fit the data. To specify `sigma`, we can use the following point:\n", |
987 | 987 | "\n", |
988 | 988 | "2. Before seeing the data, we expect channels where we spend the most to have more attributed sales. This is a very reasonable assumption (note that we are not imposing anything at the level of efficiency!).\n", |
989 | 989 | "\n", |
990 | | - "How to incorporate this heuristic into the model? To begin with, it is important to note that the {class}`MMM <pymc_marketing.mmm.delayed_saturated_mmm.MMM>` class scales the target and input variables through an [`MaxAbsScaler`](https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.MaxAbsScaler.html) transformer from [`scikit-learn`](https://scikit-learn.org/stable/), its important to specify the priors in the scaled space (i.e. between 0 and 1). One way to do it is to use the spend share as the `sigma` parameter for the `HalfNormal` distribution. We can actually add a scaling factor to take into account the support of the distribution.\n", |
| 990 | + "How to incorporate this heuristic into the model? To begin with, it is important to note that the {class}`MMM <pymc_marketing.mmm.mmm.MMM>` class scales the target and input variables through a [`MaxAbsScaler`](https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.MaxAbsScaler.html) transformer from [`scikit-learn`](https://scikit-learn.org/stable/), so it is important to specify the priors in the scaled space (i.e. between 0 and 1). One way to do this is to use the spend share as the `sigma` parameter for the `HalfNormal` distribution. We can also add a scaling factor to take into account the support of the distribution.\n", |
991 | 991 | "\n", |
992 | 992 | "First, let's compute the share of spend per channel:" |
993 | 993 | ] |
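The spend-share heuristic above can be sketched in plain `pandas`. The spend totals and the channel-count scaling factor below are hypothetical illustrations, not values from this notebook:

```python
import pandas as pd

# Hypothetical total spend per channel -- not this notebook's data.
total_spend_per_channel = pd.Series({"x1": 1000.0, "x2": 500.0})

# Share of spend per channel; the shares sum to one.
spend_share = total_spend_per_channel / total_spend_per_channel.sum()

# One possible scaling choice: multiply by the number of channels so that an
# equal split would yield sigma = 1 in the scaled (MaxAbsScaler) space.
n_channels = len(total_spend_per_channel)
prior_sigma = (n_channels * spend_share).to_numpy()
```

The resulting `prior_sigma` values can then be used as the per-channel `sigma` of the `HalfNormal` priors.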
|
1072 | 1072 | "source": [ |
1073 | 1073 | "You can use the optional parameter `model_config` to apply your own priors to the model. Each key in `model_config` corresponds to a registered distribution name in our model, and its value is a dictionary that describes the input parameters of that specific distribution.\n", |
1074 | 1074 | "\n", |
1075 | | - "If you're unsure how to define your own priors, you can use the 'default_model_config' property of {class}`MMM <pymc_marketing.mmm.delayed_saturated_mmm.MMM>` to see the required structure." |
| 1075 | + "If you're unsure how to define your own priors, you can use the `default_model_config` property of {class}`MMM <pymc_marketing.mmm.mmm.MMM>` to see the required structure." |
1076 | 1076 | ] |
1077 | 1077 | }, |
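As an illustrative sketch of such a configuration: the key name `saturation_beta` and the `{"dist": ..., "kwargs": ...}` value format below are assumptions about the expected structure, which varies across pymc-marketing versions; always compare against `default_model_config`:

```python
# Sketch only: the key name and value format are assumptions -- inspect
# `dummy_model.default_model_config` for the authoritative layout in your
# installed pymc-marketing version.
n_channels = 2
spend_share = [0.65, 0.35]  # hypothetical spend shares per channel
prior_sigma = [n_channels * s for s in spend_share]

my_model_config = {
    "saturation_beta": {"dist": "HalfNormal", "kwargs": {"sigma": prior_sigma}},
}
```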
1078 | 1078 | { |
|
1101 | 1101 | "dummy_model = MMM(\n", |
1102 | 1102 | " date_column=\"\",\n", |
1103 | 1103 | " channel_columns=[\"\"],\n", |
1104 | | - " adstock=\"geometric\",\n", |
1105 | | - " saturation=\"logistic\",\n", |
1106 | | - " adstock_max_lag=4,\n", |
| 1104 | + " adstock=GeometricAdstock(l_max=4),\n", |
| 1105 | + " saturation=LogisticSaturation(),\n", |
1107 | 1106 | ")\n", |
1108 | 1107 | "dummy_model.default_model_config" |
1109 | 1108 | ] |
|
1150 | 1149 | "cell_type": "markdown", |
1151 | 1150 | "metadata": {}, |
1152 | 1151 | "source": [ |
1153 | | - "**Remark:** For the prior specification there is no right or wrong answer. It all depends on the data, the context and the assumptions you are willing to make. It is always recommended to do some prior predictive sampling and sensitivity analysis to check the impact of the priors on the posterior. We skip this here for the sake of simplicity. If you are not sure about specific priors, the {class}`MMM <pymc_marketing.mmm.delayed_saturated_mmm.MMM>` class has some default priors that you can use as a starting point." |
| 1152 | + "**Remark:** For the prior specification there is no right or wrong answer. It all depends on the data, the context and the assumptions you are willing to make. It is always recommended to do some prior predictive sampling and sensitivity analysis to check the impact of the priors on the posterior. We skip this here for the sake of simplicity. If you are not sure about specific priors, the {class}`MMM <pymc_marketing.mmm.mmm.MMM>` class has some default priors that you can use as a starting point." |
1154 | 1153 | ] |
1155 | 1154 | }, |
1156 | 1155 | { |
1157 | 1156 | "cell_type": "markdown", |
1158 | 1157 | "metadata": {}, |
1159 | 1158 | "source": [ |
1160 | | - "Model sampler allows specifying set of parameters that will be passed to fit the same way as the `kwargs` are getting passed so far. It doesn't disable the fit kwargs, but rather extend them, to enable customizable and preservable configuration. By default the sampler_config for {class}`MMM <pymc_marketing.mmm.delayed_saturated_mmm.MMM>` is empty. But if you'd like to use it, you can define it like showed below: " |
| 1159 | + "The model's `sampler_config` allows specifying a set of parameters that will be passed to `fit` in the same way as the `kwargs` have been passed so far. It doesn't disable the fit `kwargs`, but rather extends them, enabling a customizable and preservable configuration. By default, the `sampler_config` for {class}`MMM <pymc_marketing.mmm.mmm.MMM>` is empty, but if you'd like to use it, you can define it as shown below: " |
1161 | 1160 | ] |
1162 | 1161 | }, |
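For illustration, a sampler configuration can be a plain dictionary of sampler arguments. The keys below are standard `pymc.sample` parameters; which ones you set (and their values) is a modeling choice, not something this notebook prescribes:

```python
# Each entry is forwarded to the sampler at fit time, alongside any kwargs
# passed to `fit` directly.
my_sampler_config = {
    "progressbar": True,   # show the sampling progress bar
    "tune": 1000,          # warm-up steps per chain
    "draws": 1000,         # posterior draws per chain
    "target_accept": 0.9,  # NUTS target acceptance rate
}
```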
1163 | 1162 | { |
|
1173 | 1172 | "cell_type": "markdown", |
1174 | 1173 | "metadata": {}, |
1175 | 1174 | "source": [ |
1176 | | - "Now we are ready to use the {class}`MMM <pymc_marketing.mmm.delayed_saturated_mmm.MMM>` class to define the model." |
| 1175 | + "Now we are ready to use the {class}`MMM <pymc_marketing.mmm.mmm.MMM>` class to define the model." |
1177 | 1176 | ] |
1178 | 1177 | }, |
1179 | 1178 | { |
|
1186 | 1185 | " model_config=my_model_config,\n", |
1187 | 1186 | " sampler_config=my_sampler_config,\n", |
1188 | 1187 | " date_column=\"date_week\",\n", |
1189 | | - " adstock=\"geometric\",\n", |
1190 | | - " saturation=\"logistic\",\n", |
| 1188 | + " adstock=GeometricAdstock(l_max=8),\n", |
| 1189 | + " saturation=LogisticSaturation(),\n", |
1191 | 1190 | " channel_columns=[\"x1\", \"x2\"],\n", |
1192 | 1191 | " control_columns=[\n", |
1193 | 1192 | " \"event_1\",\n", |
1194 | 1193 | " \"event_2\",\n", |
1195 | 1194 | " \"t\",\n", |
1196 | 1195 | " ],\n", |
1197 | | - " adstock_max_lag=8,\n", |
1198 | 1196 | " yearly_seasonality=2,\n", |
1199 | 1197 | ")" |
1200 | 1198 | ] |
|
6348 | 6346 | "cell_type": "markdown", |
6349 | 6347 | "metadata": {}, |
6350 | 6348 | "source": [ |
6351 | | - "The {func}`fit_result <pymc_marketing.mmm.delayed_saturated_mmm.MMM.fit_result>` attribute contains the `pymc` trace object." |
| 6349 | + "The {attr}`fit_result <pymc_marketing.mmm.mmm.MMM.fit_result>` attribute contains the `pymc` trace object." |
6352 | 6350 | ] |
6353 | 6351 | }, |
6354 | 6352 | { |
|
9400 | 9398 | "cell_type": "markdown", |
9401 | 9399 | "metadata": {}, |
9402 | 9400 | "source": [ |
9403 | | - "The results look great! We therefore successfully recovered the true values from the data generation process. We have also seen how easy is to use the {class}`MMM <pymc_marketing.mmm.delayed_saturated_mmm.MMM>` class to fit media mix models! It takes over the model specification and the media transformations, while having all the flexibility of `pymc`!" |
| 9401 | + "The results look great! We therefore successfully recovered the true values from the data generation process. We have also seen how easy it is to use the {class}`MMM <pymc_marketing.mmm.mmm.MMM>` class to fit media mix models! It takes over the model specification and the media transformations, while having all the flexibility of `pymc`!" |
9404 | 9402 | ] |
9405 | 9403 | }, |
9406 | 9404 | { |
|
10443 | 10441 | "metadata": { |
10444 | 10442 | "hide_input": false, |
10445 | 10443 | "kernelspec": { |
10446 | | - "display_name": "Python 3", |
| 10444 | + "display_name": "Python 3 (ipykernel)", |
10447 | 10445 | "language": "python", |
10448 | 10446 | "name": "python3" |
10449 | 10447 | }, |
|