If you use these materials for teaching or research, please use the following citation:
Rhoads, S. A. (2023). pyEM: Expectation Maximization with MAP estimation in Python. Zenodo. https://doi.org/10.5281/zenodo.10415396
This is a Python implementation of the Hierarchical Expectation Maximization algorithm with MAP estimation for fitting models to behavioral data. See below for more information on the algorithm.
pyEM.fitting
: contains the main function (EMfit
) for fitting modelspyEM.math
: contains functions used for fittingpyEM.plotting
: contains functions for simple plottingpyEM.classes
: contains classes for models (under development)
Users should create new functions based on their modeling needs:
- Simulation function to simulate behavior (see
examples.rw_models.simulate
) - Fit function to fit the model to the behavior (see
examples.rw_models.fit
)
You will use EMfit()
to fit the model to the data. This function takes the following inputs:
- all behavioral data: a list of lists (level 1: subject, level 2: relevant
np.array
containing behavior for each subject) or a list of pd.DataFrames (level 1: subject, level 2: relevantpd.DataFrame
containing behavior for each subject) - objective function (e.g.,
examples.rw_models.fit
): a function that takes level 2 inputs from above: (e.g., singlenp.array
orpd.DataFrame
for a single subject) and outputs the negative posterior likelihood based on likelihood and prior probabilities (see explanation of implementation below) - parameter names: a list of parameter names in the correct order (e.g.,
['alpha', 'beta']
)
An example fit()
function is provided in examples.template_model.fit
. You can copy this file and modify it to fit your model.
Your objective function fit()
function is that main implementation of your custom model. It should take the following inputs:
params
(list): list of parameter estimatesbehavioral_data
: this can be anp.array
orpd.DataFrame
(choose whichever is more convenient for your model)prior=None
: this is for the EM algorithm, and should have a default value asNone
(see below for more information)output='npl'
(str): this is also for the EM algorithm, and should have a default value as'npl'
(see below for more information)
You can also add additional inputs as needed.
At the top of your function, please include code to convert relevant parameters from Gaussian space to parameter space (see Daw et al., 2011 below). For example, if you have a parameter lr
that is bounded between 0 and 1, you can convert it from Gaussian space to parameter space using the following code (you can add custom functions if needed):
lr = norm2alpha(params[0])
After your transformation, please also ensure that your parameters are in the correct range. For example, if you have parameters lr
(bounded between 0 and 1) and inv_tmp
(bounded between 0.00001 and 10), you can add the following code to ensure that the parameter is in the correct range. Here, we are returning a very large number if the parameter is out of range, which will effectively prevent the algorithm from using that parameter value:
this_alpha_bounds = [0, 1]
if lr < min(this_alpha_bounds) or lr > max(this_alpha_bounds):
return 10000000
this_beta_bounds = [0.00001, 10]
if inv_tmp < min(this_beta_bounds) or inv_tmp > max(this_beta_bounds):
return 10000000
At the bottom of your function, please return the negative posterior likelihood. This is the negative log-likelihood (e.g., from your choice likelihoods) multiplied by the prior probability. You can use the following code to return the negative posterior likelihood:
# you can compute your negative log likelihood first (this might change depending on your model)
# here, we are assuming you have a list or np.array of choice likelihoods
negll = -np.nansum(np.log(choice_likelihoods))
# then compute the negative posterior likelihood
# you can use the provided function `math.calc_fval` to compute the negative posterior likelihood
# e.g.: `fval = calc_fval(choice_nll, params, prior, output)`
# ...or you can copy and paste or modify the following code (assuming that your negative log likelihood is `negll`)
if (output == 'npl') or (output == 'nll'):
fval = calc_fval(choice_nll, params, prior=prior, output=output)
return fval
This algorithm requires Python 3.7.10 with the following packages:
numpy: 1.21.6
scipy: 1.6.2
joblib: 1.1.0
matplotlib: 3.5.3
seaborn: 0.12.2
pandas: 1.1.5
tqdm: 4.65.0
We also use:
copy
datetime
pickle
sys
To install the package, you can use pip install
in a new Anaconda environment (recommended), but you can also just pip install
it into your current environment:
conda create --name emfit pip python=3.7.10
conda activate emfit
python -m pip install git+https://github.com/shawnrhoads/pyEM.git
To update the package, you can use pip:
python -m pip install --upgrade git+https://github.com/shawnrhoads/pyEM.git
See examples/RW.ipynb
with an example notebook on implementing the algorithm using the Rescorla-Wagner model of reinforcement learning. This notebook simulates behavior and fits the model to the simulated data to demonstrate how hierarchical EM-MAP can be used for parameter recovery.
See examples/EMClass.ipynb
with an example notebook on implementing the algorithm using the EMClass()
, which can be imported from pyEM.classes
. This class is a more flexible implementation of the algorithm, and allows for the use of different models without having to change the code. This notebook simulates behavior and fits the model to the simulated data to demonstrate how to use the EMClass()
for parameter recovery. This feature is still under development, so please report any issues you encounter.
This is meant to be a basic implementation of hierarchical EM with MAP estimation, but I invite other researchers and educators to help improve and expand the code here!
Here are some ways you can help!
- If you spot an error (e.g., typo, bug, inaccurate descriptions, etc.), please open a new issue on GitHub by clicking on the GitHub Icon in the top right corner on any page and selecting "open issue". Alternatively, you can open a new issue directly through GitHub.
- If there is inadvertently omitted credit for any content that was generated by others, please also open a new issue directly through GitHub.
- If you have an idea for a new example tutorial or a new module to include, please either open a new issue and/or submit a pull request directly to the repository on GitHub.
Negative Log-Likelihood
The negative log-likelihood is a measure of how well the model fits the observed data. It is obtained by taking the negative natural logarithm of the likelihood function. The goal of MLE is to find the parameter values that minimize the negative log-likelihood, effectively maximizing the likelihood of the observed data given the model.
Prior Probability
The prior probability represents our knowledge or belief about the parameters before observing the data. It is typically based on some prior information or assumptions. In this case, we are using a normal distribution to represent our prior beliefs about the parameters, with mean
MAP Estimation
In MAP estimation, we are incorporating the prior probability into the estimation process. Instead of only maximizing the likelihood (as in MLE), we are maximizing the posterior probability, which combines the likelihood and the prior. Mathematically, MAP estimation can be expressed as:
where
We are effectively combining the likelihood and the prior in a way that biases the parameter estimation towards the prior beliefs. Since we are maximizing this combined term, we are seeking parameter values that not only fit the data well (as indicated by the likelihood) but also align with the prior probability distribution.
Code originally adapted for Python from:
Wittmann, M. K., Fouragnan, E., Folloni, D., Klein-Flügge, M. C., Chau, B. K., Khamassi, M., & Rushworth, M. F. (2020). Global reward state affects learning and activity in raphe nucleus and anterior insula in monkeys. Nature Communications, 11(1), 3771. https://doi.org/10.1038/s41467-020-17343-w
Cutler, J., Wittmann, M. K., Abdurahman, A., Hargitai, L. D., Drew, D., Husain, M., & Lockwood, P. L. (2021). Ageing is associated with disrupted reinforcement learning whilst learning to help others is preserved. Nature Communications, 12(1), 4440. https://doi.org/10.1038/s41467-021-24576-w
Rhoads, S. A., Gan, L., Berluti, K., OConnell, K., Cutler, J., Lockwood, P. L., & Marsh, A. A. (2023). Neurocomputational basis of learning when choices simultaneously affect both oneself and others. PsyArXiv. https://doi.org/10.31234/osf.io/rf4x9
See also:
Daw, N. D. (2011). Trial-by-trial data analysis using computational models. Decision making, affect, and learning: Attention and performance XXIII, 23(1). https://doi.org/10.1093/acprof:oso/9780199600434.003.0001 [pdf]
Huys, Q. J., Cools, R., Gölzer, M., Friedel, E., Heinz, A., Dolan, R. J., & Dayan, P. (2011). Disentangling the roles of approach, activation and valence in instrumental and pavlovian responding. PLoS computational biology, 7(4), e1002028. https://doi.org/10.1371/journal.pcbi.1002028
For MATLAB flavors of this algorithm: