Skip to content

Commit

Permalink
Deploy to GitHub pages
Browse files Browse the repository at this point in the history
  • Loading branch information
github-actions[bot] authored Apr 7, 2024
0 parents commit a5d66d7
Show file tree
Hide file tree
Showing 151 changed files with 40,517 additions and 0 deletions.
4 changes: 4 additions & 0 deletions .buildinfo
Original file line number Diff line number Diff line change
@@ -0,0 +1,4 @@
# Sphinx build info version 1
# This file hashes the configuration used when building these files. When it is not found, a full rebuild will be done.
config: 7b035083ce19390472c8abc4f4e52395
tags: 645f666f9bcd5a90fca523b33c5a78b7
Empty file added .nojekyll
Empty file.
Binary file not shown.
Original file line number Diff line number Diff line change
@@ -0,0 +1,346 @@
{
"cells": [
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"# Set the environment such that multiple R processes do not crash the kernel\nimport os\nos.environ['KMP_DUPLICATE_LIB_OK']='True'%matplotlib inline"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n# 1. Basic Usage Guide\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
":code:`pymer4` comes with sample data for testing purposes which we'll utilize for most of the tutorials.\nThis sample data has:\n\n- Two kinds of dependent variables: *DV* (continuous), *DV_l* (dichotomous)\n- Three kinds of independent variables: *IV1* (continuous), *IV2* (continuous), *IV3* (categorical)\n- One grouping variable for multi-level modeling: *Group*.\n\nLet's check it out below:\n\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"# import some basic libraries\nimport os\nimport pandas as pd\n\n# import utility function for sample data path\nfrom pymer4.utils import get_resource_path\n\n# Load and checkout sample data\ndf = pd.read_csv(os.path.join(get_resource_path(), \"sample_data.csv\"))\nprint(df.head())"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Standard regression models\nFitting a standard regression model is accomplished using the :code:`Lm` model class in :code:`pymer4`. All we need to do is initialize a model with a formula, some data, and call its :code:`.fit()` method.\n\nBy default the output of :code:`.fit()` has been formated to be a blend of :code:`summary()` in R and :code:`.summary()` from [statsmodels](http://www.statsmodels.org/dev/index.html/). This includes metadata about the model, data, and overall fit as well as estimates and inference results of model terms.\n\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"# Import the linear regression model class\nfrom pymer4.models import Lm\n\n# Initialize model using 2 predictors and sample data\nmodel = Lm(\"DV ~ IV1 + IV2\", data=df)\n\n# Fit it\nprint(model.fit())"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"All information about the model as well as data, residuals, estimated coefficients, etc are saved as attributes and can be accessed like this:\n\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"# Print model AIC\nprint(model.AIC)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"# Look at residuals (just the first 10)\nprint(model.residuals[:10])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"A copy of the dataframe used to estimate the model with added columns for residuals and fits are are available at :code:`model.data`. Residuals and fits can also be directly accessed using :code:`model.residuals` and :code:`model.fits` respectively\n\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"# Look at model data\nprint(model.data.head())"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This makes it easy to assess overall model fit visually, for example using seaborn\n\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"# import dataviz\nimport seaborn as sns\n\n# plot model predicted values against true values\nsns.regplot(x=\"fits\", y=\"DV\", data=model.data, fit_reg=True)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Robust and WLS estimation\n:code:`Lm` models can also perform inference using robust-standard errors or perform weight-least-squares (experimental feature) for models with categorical predictors (equivalent to Welch's t-test).\n\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"# Refit previous model using robust standard errors\nprint(model.fit(robust=\"hc1\"))"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"# Since WLS is only supported with 2 groups for now, filter the data first\ndf_two_groups = df.query(\"IV3 in [0.5, 1.0]\").reset_index(drop=True)\n\n# Fit new a model using a categorical predictor with unequal variances (WLS)\nmodel = Lm(\"DV ~ IV3\", data=df_two_groups)\nprint(model.fit(weights=\"IV3\"))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Multi-level models\nFitting a multi-level model works similarly and actually just calls :code:`lmer` or :code:`glmer` in R behind the scenes. The corresponding output is also formatted to be very similar to output of :code:`summary()` in R.\n\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"# Import the lmm model class\nfrom pymer4.models import Lmer\n\n# Initialize model instance using 1 predictor with random intercepts and slopes\nmodel = Lmer(\"DV ~ IV2 + (IV2|Group)\", data=df)\n\n# Fit it\nprint(model.fit())"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Similar to :code:`Lm` models, :code:`Lmer` models save details in model attributes and have additional methods that can be called using the same syntax as described above.\n\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"# Get population level coefficients\nprint(model.coefs)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"# Get group level coefficients (just the first 5)\n# Each row here is a unique intercept and slope\n# which vary because we parameterized our rfx that way above\nprint(model.fixef.head(5))"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"# Get group level deviates from population level coefficients (i.e. rfx)\nprint(model.ranef.head(5))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
":code:`Lmer` models also have some basic plotting abilities that :code:`Lm` models do not\n\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"# Visualize coefficients with group/cluster fits overlaid (\"forest plot\")\nmodel.plot_summary()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Plot coefficients for each group/cluster as separate regressions\n\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"model.plot(\"IV2\", plot_ci=True, ylabel=\"predicted DV\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Because :code:`Lmer` models rely on R, they have also some extra arguments to the :code:`.fit()` method for controlling things like optimizer behavior, as well as additional methods such for post-hoc tests and ANOVAs. See tutorial 2 for information about this functionality.\n\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Two-stage summary statistics models\nFitting :code:`Lm2` models are also very similar\n\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"# Import the lm2 model class\nfrom pymer4.models import Lm2\n\n# This time we use the 'group' argument when initializing the model\nmodel = Lm2(\"DV ~ IV2\", group=\"Group\", data=df)\n\n# Fit it\nprint(model.fit())"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Like :code:`Lmer` models, :code:`Lm2` models also store group/cluster level estimates and have some basic plotting functionality\n\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"# Get group level coefficients, just the first 5\nprint(model.fixef.head(5))"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"# Visualize coefficients with group/cluster fits overlaid (\"forest plot\")\nmodel.plot_summary()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Model Persistence\nAll pymer4 models can be saved and loaded from disk. Doing so will persist *all* model attributes and data i.e. anything accessible with the '.' syntax. Models are saved and loaded using [Joblib](https://joblib.readthedocs.io/en/latest/persistence.html#persistence) Therefore all filenames must end with :code:`.joblib`. For :code:`Lmer` models, an additional file ending in :code:`.rds` will be saved in the same directory as the HDF5 file. This is the R model object readable in R using :code:`readRDS`.\n\nPrior to version 0.8.1 models were saved to HDF5 files using [deepdish](https://github.com/uchicago-cs/deepdish/) but this library is no longer maintained. If you have old models saved as :code:`.h5` or :code:`.hdf5` files you should use the same version of pymer4 that you used to estimate those models.\n\nTo persist models you can use the dedicated :code:`save_model` and :code:`load_model` functions from the :code:`pymer4.io` module\n\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"# Import functions\nfrom pymer4.io import save_model, load_model\n\n# Save the Lm2 model above\nsave_model(model, \"mymodel.joblib\")\n# Load it back up\nmodel = load_model(\"mymodel.joblib\")\n# Check that it looks the same\nprint(model)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Wrap Up\nThis was a quick overview of the 3 major model classes in :code:`pymer4`. However, it's highly recommended to check out the API to see *all* the features and options that each model class has including things like permutation-based inference (:code:`Lm` and :code:`Lm2` models) and fine-grain control of optimizer and tolerance settings (:code:`Lmer` models).\n\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.19"
}
},
"nbformat": 4,
"nbformat_minor": 0
}
Loading

0 comments on commit a5d66d7

Please sign in to comment.