Feat/mlflow models from code migration vacation recommendation with bert #274

gabisponciano · 2025-09-02T11:55:30Z

MLflow 3.1.0 Models-from-Code Migration for Vacation Recommendation with BERT

Overview

This migration modernizes the "Vacation Recommendation with BERT" blueprint by transitioning from MLflow's legacy serialization-based model logging (python_model) to the models-from-code approach (loader_module + data_path). This update resolves MLflow 3.1.0 compatibility issues, improves code architecture, and maintains API compatibility for vacation recommendation workflows.

Key Changes

Primary Goal: Eliminate MLflow 3.1.0 serialization errors and adopt the models-from-code pattern.
Technical Approach: Introduce a clean separation of concerns with standalone BERT model classes.
Universal Structure: Align with the standardized src/mlflow/ structure established in PR #208 (feat: MLflow 3.1.0 Models-from-Code Migration for Vanilla RAG Blueprint).
Scope: Comprehensive migration affecting model loading, recommendation generation, and deployment workflows.

Universal Structure Standardization

The blueprint now follows the universal AI-Blueprints structure pattern:

Standardized Architecture

src/
├── utils.py               # Common utilities (e.g., get_model_path, load_config)
└── mlflow/
    ├── __init__.py        # Dynamic imports
    ├── model.py           # Business logic for BERT recommendation
    ├── loader.py          # Canonical loader
    └── logger.py          # Canonical logger

Universal Loader & Logger Synchronization

loader.py and logger.py: Exact copies from PR #208 (feat: MLflow 3.1.0 Models-from-Code Migration for Vanilla RAG Blueprint).
Signature Handling: Signature creation occurs in the notebook and is passed to Logger.log_model(signature, ...).
Loader Module: All references now use src.mlflow.loader.
Class Names: Generic Model and Logger (no blueprint-specific prefixes).

Technical Changes

New Architecture Components

`src/mlflow/model.py`

Purpose: Standalone BERT recommendation logic with zero MLflow dependencies.
Class Name: class Model (generic name, no blueprint prefixes).
Functionality:
- Loads pre-trained BERT models and tokenizers.
- Processes vacation-related input data.
- Generates vacation recommendations based on embeddings.
- Maintains the predict(model_input, params) API signature for backward compatibility.
Design Pattern: Clean separation between BERT logic and MLflow integration.
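
A minimal sketch of what an MLflow-free model class could look like. Only the predict(model_input, params) signature comes from the description above. The embed callable stands in for BERT tokenization and encoding, and the corpus, toy_embed, and the cosine-similarity ranking are illustrative assumptions rather than the blueprint's actual logic.

```python
import math
from typing import Callable, Dict, List, Optional, Sequence

Vector = Sequence[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 when either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class Model:
    """Recommendation logic with no MLflow imports (hypothetical sketch)."""

    def __init__(self, embed: Callable[[str], Vector], corpus: List[str]):
        # `embed` is a placeholder for BERT tokenization + encoding.
        self.embed = embed
        self.corpus = corpus
        self.corpus_vectors = [embed(text) for text in corpus]

    def predict(
        self, model_input: List[Dict[str, str]], params: Optional[dict] = None
    ) -> List[Dict[str, str]]:
        # `params` is accepted only to keep the documented signature; unused here.
        results = []
        for row in model_input:
            query = self.embed(row["preferences"])
            scores = [cosine(query, vec) for vec in self.corpus_vectors]
            best = max(range(len(scores)), key=scores.__getitem__)
            results.append({"recommendation": self.corpus[best]})
        return results


def toy_embed(text: str) -> List[float]:
    """Toy keyword-count embedding, used only to make the sketch runnable."""
    words = text.lower().split()
    return [float(words.count(k)) for k in ("beach", "mountain", "city")]


model = Model(toy_embed, ["Beach resort week", "Mountain hiking trip", "City museum tour"])
prediction = model.predict([{"user_id": "u1", "preferences": "quiet mountain cabin"}])
```

Because the class takes its encoder as an argument, it can be exercised in a unit test without MLflow or model weights being present.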

`src/mlflow/loader.py`

Purpose: MLflow models-from-code entry point implementing _load_pyfunc().
Source: Copied from PR #208 (feat: MLflow 3.1.0 Models-from-Code Migration for Vanilla RAG Blueprint), with adaptations for BERT.
Functionality:
- Loads BERT model artifacts (e.g., tokenizer, model weights).
- Validates artifact directory structure.
- Returns an initialized Model instance for prediction.

`src/mlflow/logger.py`

Purpose: MLflow registration layer for packaging BERT models.
Source: Copied from PR #208 (feat: MLflow 3.1.0 Models-from-Code Migration for Vanilla RAG Blueprint), with the added requirement of a signature parameter.
Key Changes:
- Requires signature as the first parameter: Logger.log_model(signature, ...).
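
One way such a logger might forward its arguments to MLflow is sketched below. The helper name build_log_model_kwargs, its defaults, and the example values are hypothetical; the keyword names (artifact_path, loader_module, data_path, code_paths, signature) are parameters of mlflow.pyfunc.log_model. The actual Logger.log_model in this PR also takes config_path, docs_path, and demo_folder, which are not modeled here.

```python
from typing import Any, Dict, Sequence


def build_log_model_kwargs(
    signature: Any,
    artifact_path: str,
    data_path: str,
    loader_module: str = "src.mlflow.loader",
    code_paths: Sequence[str] = ("src",),
) -> Dict[str, Any]:
    """Assemble keyword arguments for mlflow.pyfunc.log_model (sketch)."""
    if signature is None:
        raise ValueError("signature is required; create it in the notebook first")
    return {
        "artifact_path": artifact_path,
        "loader_module": loader_module,  # module that defines _load_pyfunc
        "data_path": data_path,          # directory handed to _load_pyfunc
        "code_paths": list(code_paths),  # source packaged with the model
        "signature": signature,
    }


kwargs = build_log_model_kwargs(
    signature="placeholder-signature",        # a ModelSignature in real use
    artifact_path="vacation_recommendation",  # hypothetical model name
    data_path="/artifacts/data",
)
# With MLflow installed and an active run, the call would then be:
#     mlflow.pyfunc.log_model(**kwargs)
```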

`src/mlflow/__init__.py`

Purpose: Standardized dynamic imports for universal structure.
Source: Exact copy from PR #208 (feat: MLflow 3.1.0 Models-from-Code Migration for Vanilla RAG Blueprint).
Exports: Model and Logger.

Notebook Updates

Enhanced Signature Handling

Signature creation now occurs in the notebook:

from mlflow.models.signature import ModelSignature
from mlflow.types.schema import Schema, ColSpec

# Define model input/output schema for BERT recommendation
input_schema = Schema([
    ColSpec("string", "user_id"),  # User identifier
    ColSpec("string", "preferences")  # User preferences
])
output_schema = Schema([
    ColSpec("string", "recommendation")  # Recommended vacation
])
signature = ModelSignature(inputs=input_schema, outputs=output_schema)

# Pass signature to logger
Logger.log_model(
    signature=signature,
    artifact_path=MODEL_NAME,
    config_path=CONFIG_PATH,
    docs_path=ARTIFACTS_PATH,  # Contains BERT artifacts
    demo_folder=DEMO_FOLDER
)

Artifact Management

BERT Model Artifacts Structure

The blueprint now handles BERT-specific artifacts:

/artifacts/data/
    ├── config.yaml          # Model configuration
    ├── tokenizer/           # Tokenizer files
    ├── model/               # Pre-trained BERT model weights
    └── demo/                # UI components

Testing Strategy

Manual Testing

Test Scenarios:
- BERT model registration with vacation recommendation datasets.
- Model loading and recommendation generation.
- Deployment validation through Streamlit UI with various user inputs.
- Notebook execution validation in both development and MLflow deployment contexts.
- API endpoint testing with different payload configurations.
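
For the API endpoint scenario, a request body could be assembled as in this sketch. It assumes the model is served by MLflow's scoring server, which accepts a dataframe_records JSON body, and that the input columns are the user_id and preferences fields from the signature above. The build_payload helper and the sample values are hypothetical.

```python
import json
from typing import Dict, List


def build_payload(records: List[Dict[str, str]]) -> str:
    """Serialize rows into MLflow's `dataframe_records` scoring format."""
    required = {"user_id", "preferences"}
    for row in records:
        missing = required - row.keys()
        if missing:
            raise ValueError(f"Missing fields: {sorted(missing)}")
    return json.dumps({"dataframe_records": records})


payload = build_payload(
    [{"user_id": "u42", "preferences": "warm beaches, snorkeling"}]
)
# The body can then be POSTed to the served model's /invocations endpoint
# with the header Content-Type: application/json.
```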

Quality Assurance

Code Quality

Code Style: Consistent with repository standards, comprehensive docstrings, proper type hints.
Universal Structure: Follows canonical pattern from PR #208 (feat: MLflow 3.1.0 Models-from-Code Migration for Vanilla RAG Blueprint).
Documentation: Clear architectural layer responsibilities, detailed BERT recommendation documentation.
Error Handling: Robust exception management with informative error messages and logging.

Performance Impact

Model Loading: Initialization is expected to be faster due to optimized artifact handling.
Memory Usage: Memory footprint is expected to be lower because unnecessary MLflow inheritance is removed.
Deployment Reliability: Deployment is expected to be more reliable with the models-from-code approach.

Review Guidelines

Critical Review Areas

Universal Structure: Verify alignment with canonical structure from PR #208 (feat: MLflow 3.1.0 Models-from-Code Migration for Vanilla RAG Blueprint).
Signature Handling: Confirm signature creation occurs in the notebook and is passed to the logger.
MLflow Integration: Verify loader.py correctly implements models-from-code pattern.
API Compatibility: Confirm Model.predict() maintains identical signature and behavior.
Artifact Handling: Validate proper organization and cleanup of BERT artifacts.
Configuration Management: Review model path resolution and environment variable handling.
ONNX Support: Review whether ONNX is supported.

Deployment Considerations

Rollback Procedure: Note that the previous python_model approach is incompatible with models-from-code.
Environment Setup: Ensure MODEL_ARTIFACTS_PATH environment variable is configured in deployment containers.
Dependencies: Verify MLflow 3.1.0 compatibility with transformers in target deployment environments.
Artifact Dependencies: Ensure all required BERT artifacts are present in deployment packages.
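
A sketch of how the MODEL_ARTIFACTS_PATH check could be enforced at startup. The get_model_path name appears in the src/utils.py listing above, but this body, its env_var parameter, and the error messages are assumptions, not the blueprint's implementation.

```python
import os
import tempfile
from pathlib import Path


def get_model_path(env_var: str = "MODEL_ARTIFACTS_PATH") -> Path:
    """Resolve the artifacts directory from the environment, failing early."""
    raw = os.environ.get(env_var)
    if not raw:
        raise RuntimeError(f"{env_var} is not set")
    path = Path(raw)
    if not path.is_dir():
        raise RuntimeError(f"{env_var} points to a missing directory: {path}")
    return path


# Usage: point the variable at a temporary directory and resolve it.
with tempfile.TemporaryDirectory() as tmp:
    os.environ["MODEL_ARTIFACTS_PATH"] = tmp
    resolved = get_model_path()
    resolved_matches = resolved == Path(tmp)
os.environ.pop("MODEL_ARTIFACTS_PATH", None)
```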

Migration Status: ✅ Complete and Ready for Review

Specialized Features:

✅ BERT Models: Full support for vacation recommendation generation.
✅ User Preferences Analysis: Specialized recommendation logic for user preferences.
✅ Artifact Management: Optimized handling of tokenizer and model weights.
✅ Streamlined Deployment: Universal structure alignment with other blueprints.

Printed page of the Streamlit web app showing evidence of successful local deployment and API testing:

Streamlit for Vacation Recommendation with BERT.pdf


ata-turhan

Looks great 🚀

ata-turhan and others added 5 commits August 29, 2025 10:34

Delete deep-learning/text-generation-with-rnn/notebooks/models/model.txt

9926d6f

Migrate vacation-recommendation-with-bert from MLflow legacy to models-from-code approach

f8369c2

[refacotr] adjustments on run-workflow

895cc76

[refactor] Adjustmens to follow pattern

adb13b6

[refactor] Adjusting logger and model, notebooks output

0c7b4e3

gabisponciano self-assigned this Sep 2, 2025

gabisponciano marked this pull request as draft September 2, 2025 11:55

github-actions bot added the enhancement, dependencies, and python labels Sep 2, 2025

gabisponciano added 2 commits September 4, 2025 09:27

[refactor] Using onnx log_model

9e26939

Merge branch 'feat/mlflow-models-from-code-migration-vacation-recommendation-with-bert' of https://github.com/HPInc/AI-Blueprints into feat/mlflow-models-from-code-migration-vacation-recommendation-with-bert

78502ac

ata-turhan force-pushed the main branch from f5a9249 to 162ff14 Compare September 4, 2025 12:54

gabisponciano added 3 commits September 4, 2025 14:32

[refactor] changing the logger pattern

d081579

[refactor] ONNX integration with mlflow new pattern

fbfb343

[test] notebook output and streamlit pdf

9c411a9

github-actions bot added the documentation label Sep 8, 2025

gabisponciano added 2 commits September 10, 2025 15:36

Merge branch 'main' of https://github.com/HPInc/AI-Blueprints into feat/mlflow-models-from-code-migration-vacation-recommendation-with-bert

847d146

Merge branch 'feat/mlflow-models-from-code-migration-vacation-recommendation-with-bert' of https://github.com/HPInc/AI-Blueprints into feat/mlflow-models-from-code-migration-vacation-recommendation-with-bert

10a794e

gabisponciano marked this pull request as ready for review September 10, 2025 18:38

gabisponciano requested a review from ata-turhan September 10, 2025 18:38

ata-turhan force-pushed the feat/mlflow-models-from-code-migration-vacation-recommendation-with-bert branch from 539b308 to 10a794e Compare September 25, 2025 15:25

all the merge conflicts are resolved

7cef091

github-actions bot removed the documentation label Oct 14, 2025

pre-commit-ci bot and others added 3 commits October 14, 2025 20:10

[pre-commit.ci] auto fixes from pre-commit.com hooks

efb60ec

for more information, see https://pre-commit.ci

more merge conflicts are resolved

fb0d708

[pre-commit.ci] auto fixes from pre-commit.com hooks

9bb6110

for more information, see https://pre-commit.ci

ata-turhan approved these changes Oct 14, 2025
