SerpRateAI/merrypopins

Merrypopins

merrypopins is a Python library that streamlines nanoindentation data processing, automated pop-in detection, and analysis. It provides five core modules:

  • load_datasets: Load and parse .txt measurement files and .tdm/.tdx metadata files into structured pandas DataFrames. Automatically detects headers, timestamps, and measurement channels.
  • preprocess: Clean and normalize indentation data with filtering, baseline correction, and contact point detection.
  • locate: Identify and extract pop-in events within indentation curves using advanced detection algorithms, including:
    • Isolation Forest anomaly detection
    • CNN Autoencoder reconstruction error
    • Fourier-based derivative outlier detection
    • Savitzky-Golay smoothed gradient thresholds
  • statistics: Perform statistical analysis and model fitting on located pop-in events (e.g., frequency, magnitude, distribution). The module computes detailed pop-in statistics, such as:
    • Pop-in statistics (e.g., load-depth and stress-strain metrics)
    • Stress-strain transformation using Kalidindi & Pathak (2008)
    • Curve-level summary statistics (e.g., total pop-in duration, average time between pop-ins)
    • Pop-in shape statistics like depth jump, average velocity, and curvature
  • make_dataset: Construct enriched datasets by running the full merrypopins pipeline and exporting annotated results and visualizations.
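
To make the idea behind gradient-based detection concrete, here is a minimal sketch in plain Python. It is not the merrypopins implementation: the helper name `flag_depth_jumps`, the median/MAD threshold rule, and the synthetic curve are all assumptions chosen for illustration. A pop-in appears as a sudden depth increase, so the sketch flags samples whose depth increment is an outlier relative to the rest of the curve.

```python
from statistics import median

def flag_depth_jumps(depth, k=5.0):
    """Return indices i where depth[i] - depth[i-1] is an outlier.

    Hypothetical helper: an increment is flagged when it exceeds
    median + k * MAD of all increments (a robust threshold).
    """
    steps = [b - a for a, b in zip(depth, depth[1:])]
    med = median(steps)
    mad = median(abs(s - med) for s in steps)
    threshold = med + k * max(mad, 1e-12)  # guard against MAD == 0
    # steps[i] is the jump that ends at sample i + 1
    return [i + 1 for i, s in enumerate(steps) if s > threshold]

# Synthetic curve: roughly 1 nm steps with one 12 nm burst ending at sample 6
depth_nm = [0.0, 1.0, 2.2, 3.1, 4.2, 5.2, 17.2, 18.2, 19.0, 20.1]
jumps = flag_depth_jumps(depth_nm)
print(jumps)
```

The library's detectors (Isolation Forest, CNN autoencoder, Fourier derivative, Savitzky-Golay) are more elaborate; this sketch only shows the shared principle of thresholding a derivative-like signal.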

Merrypopins is developed by Cahit Acar, Anna Marcelissen, Hugo van Schrojenstein Lantman, and John M. Aiken.


Try Merrypopins Library Online

🚀 Live demo: explore Merrypopins in your browser! Open in Streamlit

The hosted app lets you:

  • upload raw .txt indentation files (and optional .tdm/.tdx metadata),
  • tune preprocessing, detection & statistics parameters,
  • visualise pop-ins interactively,
  • download annotated CSVs + plots.

🛠 Source Instrumentation

Merrypopins was developed using datasets generated by the Bruker Hysitron TI 990 TriboIndenter, a high-precision nanoindentation platform. The library natively supports the .txt and .tdm/.tdx file formats exported by the Hysitron software suite.

Hysitron TI 990 Nanoindenter
Typical indentation experiments conducted with the TI 990 include:
  • Force-depth curve acquisition at nano/micro scale
  • High-resolution pop-in event detection
  • Automated test grid data export

The preprocessing and pop-in detection tools in Merrypopins are tuned to handle the structural patterns and noise profiles specific to these datasets.

Example: Nanoindentation Grain Selection and Deformation

Below are example visualizations from Electron Backscatter Diffraction (EBSD) maps used to select grain areas, followed by indentation marks after testing:

➤ Pre-indentation EBSD with Labeled Grains

Grain Selection Map

➤ Post-indentation Microstructure with Deformation (Area on Grain 5)

Grain 5 After Indentation

These images highlight the complex deformation behavior analyzed by the merrypopins toolset for robust pop-in detection.


Installation

# From PyPI
pip install merrypopins

# For development
git clone https://github.com/SerpRateAI/merrypopins.git
cd merrypopins
pip install -e .

merrypopins supports Python 3.10+ and depends on:

  • matplotlib
  • numpy
  • pandas
  • scipy
  • scikit-learn
  • tensorflow

These are installed automatically via pip.

All core and development dependencies are tested with Python 3.10 through 3.13.

Quickstart

Importing merrypopins Modules

from pathlib import Path
from merrypopins.load_datasets import load_txt, load_tdm
from merrypopins.preprocess import default_preprocess, remove_pre_min_load, rescale_data, finalise_contact_index
from merrypopins.locate import default_locate
from merrypopins.make_dataset import merrypopins_pipeline
from merrypopins.statistics import default_statistics, calculate_stress_strain, calculate_stress_strain_statistics, default_statistics_stress_strain

Load Indentation Data and Metadata

# 1) Load indentation data:
data_file = Path("data/experiment1.txt")
df = load_txt(data_file)
print(df.head())
print("Timestamp:", df.attrs['timestamp'])
print("Number of Points:", df.attrs['num_points'])

# 2) Load tdm metadata:
tdm_meta_file = Path("data/experiment1.tdm")
# load_tdm returns two DataFrames: root metadata and channel metadata
df_tdm_meta_root, df_tdm_meta_channels = load_tdm(tdm_meta_file)
# Root metadata is stored as a single row with one column per attribute
print(df_tdm_meta_root.head())
# Transpose it so that every attribute can be read as an (attribute, value) row
df_tdm_meta_root = df_tdm_meta_root.T.reset_index()
df_tdm_meta_root.columns = ['attribute', 'value']
print(df_tdm_meta_root.head(50))
# Channel metadata spans multiple rows, each with its own columns
print(df_tdm_meta_channels.head(50))

Preprocess Data

Option 1: Use default pipeline

# This applies:
# 1. Removes all rows before minimum Load
# 2. Detects contact point and shifts Depth so contact = 0
# 3. Removes Depth < 0 rows and adds a flag for the contact point

df_processed = default_preprocess(df)

print(df_processed.head())
print("Contact point index:", df_processed[df_processed["contact_point"]].index[0])

Option 2: Customize each step (with optional arguments)

# Step 1: Remove initial noise based on minimum Load
df_clean = remove_pre_min_load(df, load_col="Load (µN)")

# Step 2: Automatically detect contact point and zero the depth
df_rescaled = rescale_data(
    df_clean,
    depth_col="Depth (nm)",
    load_col="Load (µN)",
    N_baseline=30,     # number of points for baseline noise estimation
    k=5.0,             # noise threshold multiplier
    window_length=7,   # Savitzky-Golay smoothing window (must be odd)
    polyorder=2        # Polynomial order for smoothing
)

# Step 3: Trim rows before contact and/or flag the point
df_final = finalise_contact_index(
    df_rescaled,
    depth_col="Depth (nm)",
    remove_pre_contact=True,       # remove rows where depth < 0
    add_flag_column=True,          # add a boolean column marking the contact point
    flag_column="contact_point"    # customize the column name if needed
)

print(df_final[df_final["contact_point"]])  # display contact row
print("Contact point index:", df_final[df_final["contact_point"]].index[0])

🧪 Tip: You can omit or modify any step depending on your data:

  • Skip remove_pre_min_load() if your data is already clean.
  • Set remove_pre_contact=False if you want to retain all data.
  • Customize flag_column to integrate with your own schema.
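
For intuition about what `N_baseline` and `k` control, the following plain-Python sketch shows one common way to detect a contact point: estimate the noise of the pre-contact baseline, then take the first sample whose load rises above it. This is an illustration under stated assumptions, not the library's algorithm: the helper `find_contact_index` is hypothetical, the Savitzky-Golay smoothing step is omitted, and the data are synthetic.

```python
from statistics import mean, stdev

def find_contact_index(load, n_baseline=30, k=5.0):
    """First index where load rises above baseline mean + k * std.

    Hypothetical helper; returns None if the threshold is never crossed.
    """
    baseline = load[:n_baseline]
    threshold = mean(baseline) + k * stdev(baseline)
    for i in range(n_baseline, len(load)):
        if load[i] > threshold:
            return i
    return None

# Synthetic approach segment: 30 noisy baseline points, then a loading ramp
load_uN = [0.1 * (-1) ** i for i in range(30)] + [5.0 * j for j in range(1, 11)]
depth_nm = [float(i) for i in range(40)]

contact = find_contact_index(load_uN)
# Shift depth so the contact point sits at zero, then drop pre-contact rows
shifted = [d - depth_nm[contact] for d in depth_nm]
kept = [(d, p) for d, p in zip(shifted, load_uN) if d >= 0]
print(contact, kept[0])
```

A larger `k` makes the detector more tolerant of baseline noise at the cost of placing the contact point slightly later on the loading ramp.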

Locate Pop-in Events

Detect Pop-ins using Default Method

# Detect pop-ins using all methods
results = default_locate(df_processed)
print(results[results.popin])

Customize Detection Thresholds

results_tuned = default_locate(
    df_processed,
    iforest_contamination=0.002,
    cnn_threshold_multiplier=4.0,
    fd_threshold=2.5,
    savgol_threshold=2.0
)

Visualize Detections

import matplotlib.pyplot as plt

plt.figure(figsize=(8,6))
plt.plot(results_tuned["Depth (nm)"], results_tuned["Load (µN)"], label="Preprocessed", alpha=0.4, color='orange')

colors = {
    "popin_iforest": 'red',
    "popin_cnn": 'purple',
    "popin_fd": 'darkorange',
    "popin_savgol": 'green'
}
markers = {
    "popin_iforest": '^',
    "popin_cnn": 'v',
    "popin_fd": 'x',
    "popin_savgol": 'D'
}

for method, color in colors.items():
    mdf = results_tuned[results_tuned[method]]
    plt.scatter(mdf["Depth (nm)"], mdf["Load (µN)"],
                c=color, label=method.replace("popin_", "").capitalize(),
                marker=markers[method], alpha=0.7)

confident = results_tuned[results_tuned["popin_confident"]]
plt.scatter(confident["Depth (nm)"], confident["Load (µN)"],
            edgecolors='k', facecolors='none', label="Majority Vote (2+)", s=100, linewidths=1.5)

plt.xlabel("Depth (nm)"); plt.ylabel("Load (µN)")
plt.title("Pop-in Detections by All Methods")
plt.legend(); plt.grid(True); plt.tight_layout(); plt.show()
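
The plot legend labels `popin_confident` as "Majority Vote (2+)", which suggests a sample counts as confident when at least two methods agree. The sketch below illustrates that voting idea on plain Python lists. The helper `majority_vote` and its `min_votes` parameter are hypothetical, and how merrypopins actually derives the column is not shown in this README.

```python
def majority_vote(flags_by_method, min_votes=2):
    """Combine per-method boolean flags into one flag per sample."""
    n = len(next(iter(flags_by_method.values())))
    # Count, for every sample, how many methods flagged it
    votes = [sum(flags[i] for flags in flags_by_method.values()) for i in range(n)]
    return [v >= min_votes for v in votes]

# Toy flags for four samples, keyed by the column names used in the plot above
flags = {
    "popin_iforest": [False, True, False, True],
    "popin_cnn":     [False, True, False, False],
    "popin_fd":      [False, False, False, False],
    "popin_savgol":  [False, True, True, False],
}
confident = majority_vote(flags)
print(confident)  # [False, True, False, False]
```

Only the second sample is flagged by two or more methods, so it is the only one marked confident.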

Run Full Pipeline with merrypopins_pipeline

This function runs the entire merrypopins workflow, from loading data to locating pop-ins and generating visualizations.

Define Input and Output Paths

# Define the text file that will be processed and output directory that will contain the visualization
text_file = Path("datasets/6microntip_slowloading/grain9_6um_indent03_HL_QS_LC.txt")
output_dir = Path("visualisations/6microntip_slowloading/grain9_6um_indent03_HL_QS_LC")

# Make sure output directory exists
output_dir.mkdir(parents=True, exist_ok=True)
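
When processing many files, it can help to derive the output directory from the input path instead of typing both by hand. The sketch below mirrors the `datasets/...` to `visualisations/...` layout used in the example above; the helper `output_dir_for` and its default root names are assumptions for illustration and are not part of merrypopins.

```python
from pathlib import Path

def output_dir_for(text_file, datasets_root="datasets", plots_root="visualisations"):
    """Map datasets/<group>/<name>.txt to visualisations/<group>/<name>."""
    relative = Path(text_file).relative_to(datasets_root)
    # Drop the .txt suffix so each input file gets its own folder
    return Path(plots_root) / relative.with_suffix("")

out = output_dir_for("datasets/6microntip_slowloading/grain9_6um_indent03_HL_QS_LC.txt")
print(out.as_posix())
```

Call `out.mkdir(parents=True, exist_ok=True)` before passing it to the pipeline, as in the snippet above.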

Run The merrypopins Pipeline

df_pipeline = merrypopins_pipeline(
    text_file,
    save_plot_dir=output_dir,
    trim_margin=30
)

View Result DataFrame

df_pipeline.head()

View Result Visualizations

# The pipeline saves its plots to the specified output directory for the provided text file.
from PIL import Image
import matplotlib.pyplot as plt

# Load all PNGs from output folder
image_paths = sorted(output_dir.glob("*.png"))

# Only proceed if there are images
if image_paths:
    img = Image.open(image_paths[0])
    plt.figure(figsize=(12, 6))
    plt.imshow(img)
    plt.title(image_paths[0].stem)
    plt.axis('off')
    plt.show()
else:
    print("No plots found in output folder.")

Calculate Pop-in Statistics

Calculate Pop-in Statistics (Load-Depth)

df_statistics = default_statistics(df_pipeline)

# View the computed statistics for each pop-in
print(df_statistics.head())

Calculate Stress-Strain Statistics

Perform Stress-Strain Transformation and Statistics

# Perform stress-strain transformation
df_stress_strain = calculate_stress_strain(df_statistics)

# Calculate stress-strain statistics
df_stress_strain_statistics = calculate_stress_strain_statistics(df_stress_strain)

# View the calculated stress-strain statistics
print(df_stress_strain_statistics.head())
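
For readers unfamiliar with the transformation, the sketch below evaluates the Hertzian relations on which spherical indentation stress-strain analysis in the style of Kalidindi & Pathak (2008) is built: contact radius a = sqrt(R_eff · h), stress = P / (π a²), and strain = 4 / (3π) · h / a. It is a simplified illustration, not the library's code: the helper `hertz_stress_strain` is hypothetical, the depth is treated as purely elastic, and zero-point or unloading corrections are ignored. The effective radius reuses the `Reff_um=5.323` value that appears in this README; the load and depth are made up.

```python
import math

def hertz_stress_strain(load_uN, depth_nm, reff_um):
    """Indentation stress (GPa) and strain for a spherical tip.

    Assumes Hertzian contact: a = sqrt(R_eff * h),
    stress = P / (pi * a^2), strain = 4 / (3 * pi) * h / a.
    """
    reff_nm = reff_um * 1000.0
    a_nm = math.sqrt(reff_nm * depth_nm)                 # contact radius in nm
    stress_gpa = load_uN / (math.pi * a_nm**2) * 1000.0  # 1 uN/nm^2 = 1000 GPa
    strain = 4.0 / (3.0 * math.pi) * depth_nm / a_nm     # dimensionless
    return stress_gpa, strain

stress, strain = hertz_stress_strain(load_uN=5000.0, depth_nm=100.0, reff_um=5.323)
print(f"stress = {stress:.2f} GPa, strain = {strain:.4f}")
```

With these inputs the contact radius is about 730 nm, giving a stress of roughly 3 GPa at a strain just under 6%.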

Full Statistics Pipeline

Perform Default Full Statistics Pipeline for Stress-Strain

df_statistics_stress_strain = default_statistics_stress_strain(
    df_pipeline,
    popin_flag_column="popin",
    before_window=0.5,
    after_window=0.5,
    Reff_um=5.323,
    min_load_uN=2000,
    smooth_stress=True,
    stress_col="stress",
    strain_col="strain",
    time_col="Time (s)",
)

# View the final stress-strain statistics
print(df_statistics_stress_strain.head())

Development & Testing

  1. Install development requirements:

     # For development (includes dev tools like pytest, black, ruff, etc.)
     pip install -e '.[dev]'

    This installs the main package plus the development dependencies listed in pyproject.toml under [project.optional-dependencies].dev.

    Alternatively, you can install the development dependencies via:

    pip install -r requirements-dev.txt

🔧 Pre-commit Hooks

We rely on pre-commit to auto-run ruff (lint) and black (format) against every change before it is committed.
If these checks are not executed locally, your PR will fail in CI.

🚨 Important: You must have the pre-commit package installed in your environment
(pip install --user pre-commit, or via the project's dev extras) before making commits.

Setup (Run once per clone)

# 1) Install the tool (only needed if it's not already on your system)
pip install pre-commit          # or: pip install -e '.[dev]'

# 2) Install the Git hooks defined in .pre-commit-config.yaml
pre-commit install

This adds a Git hook that formats / lints the staged files automatically at each git commit.

Run Checks Manually

To run all checks on all files:

pre-commit run --all-files

What if the hook rejects my commit?

If pre-commit finds issues (usually formatting via black or lint via ruff),
the commit will abort and the affected files will be modified in-place to satisfy the rules.

  1. Open Source Control (e.g. the Git sidebar in VS Code).
  2. You will see the updated (but unstaged) files.
  3. Click the ➕ (stage) button next to each fixed file, or run git add <file>.
  4. Re-run git commit; it should now succeed.
  5. Finally, push your branch to the remote.

Tip: always run pre-commit run --all-files before making a commit to catch issues early.

Notes:

  • Hooks are defined in .pre-commit-config.yaml.
  • You can exclude specific files or directories (e.g., tutorials/) by modifying the config file .pre-commit-config.yaml.
  • CI will re-run the same hooks; commits that bypass them locally will be rejected.

🧪 Running Tests

  1. Run tests with coverage:

    pytest --cov=merrypopins --cov-report=term-missing

    This command runs all tests in the tests/ directory and prints a coverage report that lists the lines of code the tests did not execute. Tests and linting run automatically on each pull request via GitHub Actions. The CI uses Python 3.10 through 3.13 and runs pre-commit, pytest, and coverage checks.

  2. Generate HTML coverage report:

    pytest --cov=merrypopins --cov-report=html
    # open htmlcov/index.html in browser

📦 Run Merrypopins Streamlit App

Merrypopins includes an interactive Streamlit app for visualizing and detecting pop-ins in indentation data. This app allows you to upload your data files, run the detection algorithms, and visualize the results in a user-friendly interface.

Try It Online

🚀 Live demo: explore Merrypopins in your browser! Open in Streamlit

๐Ÿณ Using Docker

You can run the interactive Streamlit app for visualizing and detecting pop-ins directly using Docker.

🔧 Option 1: Build and Run Locally

# Clone the repo if you have not already
git clone https://github.com/SerpRateAI/merrypopins.git
cd merrypopins

# Build the Docker image
docker build -t merrypopins-app .

# Run the app on http://localhost:8501
docker run -p 8501:8501 merrypopins-app

Option 2: Pull and Run Pre-built Image from Docker Hub (Recommended)

# Pull the latest pre-built image from Docker Hub
docker pull cacarvuai/merrypopins-app:latest

# Run the container
docker run -p 8501:8501 cacarvuai/merrypopins-app:latest

🌟 Access the App

Once the app is running, you can access it in your web browser at http://localhost:8501.

🧼 Clean Up

To stop the app, press Ctrl+C in the terminal where it's running.

If you want to remove the Docker container, you can run:

docker rm -f $(docker ps -aq --filter "ancestor=cacarvuai/merrypopins-app:latest")

If you built the image locally, you can remove it with:

docker rmi merrypopins-app

Running the App Locally Without Docker

If you prefer to run the Streamlit app without Docker, you can do so by following these steps:

  1. Install the required dependencies for the app:

    pip install -r streamlit_app/requirements.txt
  2. Run the Streamlit app:

    streamlit run streamlit_app/app.py
  3. Open your web browser and go to http://localhost:8501 to access the app.

Publishing New App Versions to Docker Hub (Only for Maintainers)

To publish a new version of the Merrypopins Streamlit app, follow these steps:

  1. Update the version date in streamlit_app/app.py.
  2. Commit your changes to the Git repository.
  3. Push the changes to the remote repository.
  4. Build and push the updated Docker image to Docker Hub:
    docker login
    docker buildx build \
     --platform linux/amd64,linux/arm64 \
     -t cacarvuai/merrypopins-app:latest \
     --push .
  5. Update the documentation to reflect the new version.

Note: If you are not a maintainer, please do not attempt to publish new versions. Instead, contribute by submitting issues or pull requests with improvements or bug fixes. If you want to create your own version of the app, fork the repository and modify the code as needed, and change the Docker image name to avoid conflicts with the official version, e.g. docker build -t yourdockerhubusername/merrypopins-app:latest .


Contributing

Contributions are welcome! Please file issues and submit pull requests on GitHub.

Branching Model

main ← 📦 production releases
dev  ← 🛠 active development (default Pull Request target)

  • main holds only stable, version-tagged releases.
  • dev is the rolling integration branch where all feature / fix PRs land first.
    Maintainers periodically open an internal PR from dev → main when a new release is ready.

How to Open a Pull Request

  1. Fork the repository to your GitHub account.
  2. Clone your fork and set the upstream remote:
    git clone https://github.com/<your-user>/merrypopins.git
    cd merrypopins
    git remote add upstream https://github.com/SerpRateAI/merrypopins.git
  3. Sync & branch off dev:
    git fetch upstream
    git checkout -b feature/awesome upstream/dev
  4. Do your work → commit:
    git commit -m "feat: add awesome feature"
  5. Push to your fork:
    git push origin feature/awesome
  6. Open a pull request into dev (set the PR's base branch to dev).
  7. Address any review comments & keep your feature branch updated with the latest dev if needed.

Note: Once your PR is merged into dev, the maintainers will handle promoting dev to main when preparing a new release. Please don't open PRs directly against main.


License

This project is licensed under the MIT License. See LICENSE for details.