Abstract:
The MouseArmTransformer module provides a transformer-based solution for lifting 2D DeepLabCut (DLC) keypoints into a 3D mouse-arm space using minimal camera setups. Triangulation is first performed using two camera views, and the triangulated points are then corrected via a custom GUI to produce ground truth (GT) labels. The transformer model then predicts 3D coordinates from single-camera DLC keypoints by leveraging temporal smoothness, joint connectivity, and ground truth data. The model is trained using a combination of losses: triangulation, continuity, connectivity, and ground truth. This framework offers an efficient method for reconstructing 3D mouse-arm movements with minimal hardware, enabling accurate biomechanical analysis.
This directory contains the MouseArmTransformer module along with prototyping code and notebooks. The relevant files are contained in the MouseArmTransformer directory. Currently, a minimal install is supported for loading model weights and performing inference.
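As a rough illustration of what such a minimal inference workflow could look like, here is a hedged Python sketch. The helper names (`load_model`, `lift_keypoints`), the checkpoint path, and the input file are hypothetical placeholders, not the package's confirmed API; refer to the module itself for the real entry points.

```python
# Hypothetical minimal-inference sketch -- load_model and lift_keypoints
# are placeholder names, NOT the confirmed MouseArmTransformer API.
import numpy as np
import MouseArmTransformer as mat  # installed from the wheel built below

# 2D DLC keypoints from a single camera: (frames, markers, 2) -- assumed file
keypoints_2d = np.load("camera1_dlc_keypoints.npy")

model = mat.load_model("path/to/weights.pt")         # placeholder helper
coords_3d = mat.lift_keypoints(model, keypoints_2d)  # (frames, markers, 3)
print(coords_3d.shape)
```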
Figure caption: A: Example 2D images (as in DeWolf, Schneider et al. Figure 1) of DeepLabCut keypoints on both camera views. B: Using 3D DeepLabCut (Nath, Mathis et al. 2019) we triangulated data, then manually labeled frames for GT. This was then all passed to a transformer (see below), and video inference was done with a single camera (Camera 1) to directly predict 3D. C: Example lifted 3D pose in the 3D MuJoCo space, showing a reaching sequence; black is early, red is late (matching the view in Camera 1). D: Quantification of errors in 3 example sessions (vs. GT) in MuJoCo space. Note: the forearm is 1.33572 cm in our model, and 1 cm equals 1 MuJoCo unit.
For estimating 3D coordinates from the two available camera views, we first use simple triangulation to obtain 3D estimates, then correct outliers in a GUI to obtain the 3D ground truth. The GUI is located in mausspaun/visualization/gui. For usage, refer to the gui.md file within this folder.
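For intuition, below is a minimal two-view triangulation sketch using OpenCV. The projection matrices and keypoint arrays are assumed inputs from a prior camera calibration; this is illustrative only, not the exact pipeline used here (which relies on 3D DeepLabCut).

```python
import numpy as np
import cv2

# Assumed inputs: 3x4 camera projection matrices from calibration,
# and matching 2D keypoints (2 x N) from each camera view (file names hypothetical).
P1 = np.load("camera1_projection.npy")
P2 = np.load("camera2_projection.npy")
pts1 = np.load("camera1_points.npy").astype(np.float64)  # shape (2, N)
pts2 = np.load("camera2_points.npy").astype(np.float64)  # shape (2, N)

# cv2.triangulatePoints returns homogeneous coordinates (4 x N).
points_4d = cv2.triangulatePoints(P1, P2, pts1, pts2)
points_3d = points_4d[:3] / points_4d[3]  # dehomogenize -> (3, N)
```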
The transformer model takes as input the 2D DLC-extracted coordinates from one camera and predicts the 3D coordinates through a linear output layer. The input to the transformer consists of sequences (of length T=2) of 2D coordinates representing the marker positions (see the Dataloader classes within data.py). Each marker is represented by an x and y position. We encode the flattened joint coordinates using a transformer encoder layer and project the encoder output to a three-dimensional space using a fully connected layer. We train the model using four different losses and their corresponding weights (sketched in code after the list below):
- Triangulation loss: The mean squared error (MSE) between the model output and the noisy triangulated data (weight: 1)
- Continuity loss: The MSE between successive timesteps, which ensures temporal smoothness between consecutive output frames (weight: 25)
- Connectivity loss: The mean squared prediction error (MSPE) between markers, using a skeleton model of the mouse. This encourages the model to preserve the geometric relationships between joints during training (weight: 1)
- Ground truth loss: Additionally, we use the MSE between the model output and our 3D ground truth labels (see above) to provide the model with noise-free triangulation targets. This loss is only active within batches that contain ground truth frames (weight: 0.0001)
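To make the loss composition concrete, here is a hedged PyTorch sketch of how these four terms could be combined. The tensor shapes, the skeleton edge list, and the reduction details are assumptions, not a verbatim copy of the code in training.py.

```python
import torch
import torch.nn.functional as F

# Assumed shapes: pred, triangulated, and gt are (batch, T, markers, 3);
# gt_mask is a boolean (batch, T) tensor marking frames with GT labels;
# skeleton is a list of (i, j) marker index pairs with rest_lengths in cm.

def triangulation_loss(pred, triangulated):
    return F.mse_loss(pred, triangulated)

def continuity_loss(pred):
    # MSE between successive timesteps encourages temporal smoothness.
    return F.mse_loss(pred[:, 1:], pred[:, :-1])

def connectivity_loss(pred, skeleton, rest_lengths):
    # Penalize deviations of predicted segment lengths from the skeleton model.
    loss = 0.0
    for (i, j), length in zip(skeleton, rest_lengths):
        seg = torch.linalg.norm(pred[..., i, :] - pred[..., j, :], dim=-1)
        loss = loss + ((seg - length) ** 2).mean()
    return loss / len(skeleton)

def ground_truth_loss(pred, gt, gt_mask):
    # Only active for batches that actually contain GT frames.
    if not gt_mask.any():
        return pred.new_zeros(())
    return F.mse_loss(pred[gt_mask], gt[gt_mask])

def total_loss(pred, triangulated, gt, gt_mask, skeleton, rest_lengths):
    return (1.0    * triangulation_loss(pred, triangulated)
            + 25.0 * continuity_loss(pred)
            + 1.0  * connectivity_loss(pred, skeleton, rest_lengths)
            + 1e-4 * ground_truth_loss(pred, gt, gt_mask))
```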
We train the model using the above-described loss terms across all sessions for which two cameras are available. We then use the resulting weights to generate 3D predictions for sessions with only one camera. For training specifics, refer to the training.py file within this folder.
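For orientation, the following is a minimal, hedged sketch of the architecture described above (flattened 2D inputs, a transformer encoder, a linear 3D head) together with one training step reusing the `total_loss` helper sketched earlier. Layer sizes, the marker count, the optimizer, and the batch keys are assumptions; training.py is authoritative.

```python
import torch
import torch.nn as nn

class Lifter2Dto3D(nn.Module):
    """Sketch: lift short sequences (T=2) of 2D markers to 3D per frame."""
    def __init__(self, n_markers, d_model=128, n_heads=4, n_layers=2):
        super().__init__()
        self.embed = nn.Linear(n_markers * 2, d_model)  # flattened (x, y) per frame
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        self.head = nn.Linear(d_model, n_markers * 3)   # linear output layer

    def forward(self, x):                  # x: (batch, T, markers, 2)
        b, t, m, _ = x.shape
        h = self.embed(x.reshape(b, t, m * 2))
        h = self.encoder(h)                # (batch, T, d_model)
        return self.head(h).reshape(b, t, m, 3)

# One assumed training step (batch keys are hypothetical):
model = Lifter2Dto3D(n_markers=5)          # marker count is an assumption
opt = torch.optim.Adam(model.parameters(), lr=1e-4)
# for batch in dataloader:                 # see the Dataloader classes in data.py
#     pred = model(batch["keypoints_2d"])
#     loss = total_loss(pred, batch["triangulated"], batch["gt"],
#                       batch["gt_mask"], skeleton, rest_lengths)
#     opt.zero_grad(); loss.backward(); opt.step()
```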
You can build a python package by running:
make build
You can check the contents of that build by running:
make test_contents
The wheels and source distributions are located in the dist/ directory, which will look something like this:
dist/
dist/MouseArmTransformer-0.1.0-py3-none-any.whl
dist/MouseArmTransformer-0.1.0.tar.gz
If you want to run tests (in tests/) in a fully-configured Docker environment, simply run:
make test_docker
which will build the container, the package, and then run the tests.
If you have a suitable python environment set up, you can also run:
make test
to run the tests.
To update the version, bump the version number in MouseArmTransformer/__init__.py.
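Assuming the standard convention of a `__version__` attribute (not verified against the actual file), the change would look like:

```python
# MouseArmTransformer/__init__.py (assumed convention)
__version__ = "0.1.0"  # bump this to the new version
```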
You also need to edit the Makefile and change this part:
tests/contents.tgz.lst:
tar tf dist/MouseArmTransformer-0.0.1.tar.gz | sort > tests/contents.tgz.lst
to reflect the new version information for the tests to pass.
- Mackenzie (@MMathisLab) oversees the project and trained the 2D DeepLabCut models.
- Markus (@CYHSM) wrote the majority of the MouseArmTransformer contents and trained the 3D model. He also wrote the explorative notebooks and validated the model.
- Wesley (@wesleymth) contributed code inputs and 3D ground truth labels.
- Travis (@studywolf) contributed 3D labels.
- Steffen (@stes) packaged the code and provided code review.
email: mackenzie.mathis@epfl.ch
If you use this code or ideas, please cite our work 🤗
Frey, M., Monteith-Finas, W., DeWolf, T., Schneider, S., & Mathis, M. W. (2024). MouseArmTransformer: a transformer-based 3D lifting module for an adult mouse arm (0.1.0). Zenodo. https://doi.org/10.5281/zenodo.12673173