
feat: Integrate MLflow for metrics and artifacts #229


Draft
linusseelinger wants to merge 3 commits into main

Conversation

@linusseelinger (Contributor) commented Jun 17, 2025

Relevant issue or PR

N/A

Description of changes

  • The tesseract engine now wraps each evaluation in an MLflow run, so users can simply call mlflow.log_metric and friends in their code (see the sketch after this list).
  • tesseract serve spins up an MLflow server, ensures all tesseracts are on the same network (previously, a shared network only existed within each individual docker compose file), and ensures all tesseracts can find the MLflow server.
  • tesseract run (and the custom docker client under the hood) now passes through environment variables and the network setting. This can be used to point a tesseract to an external MLflow server, e.g. in P4D.
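As an illustration, here is a minimal sketch of what user code inside a tesseract could look like after this change. The schema and field names are assumptions loosely based on the helloworld example and are not part of this PR:

import mlflow
from pydantic import BaseModel

class InputSchema(BaseModel):
    name: str

class OutputSchema(BaseModel):
    greeting: str

def apply(inputs: InputSchema) -> OutputSchema:
    # The engine has already opened an MLflow run around this evaluation,
    # so metrics can be logged directly, without mlflow.start_run().
    mlflow.log_metric("name_length", len(inputs.name))
    return OutputSchema(greeting=f"Hello {inputs.name}!")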
HOW TO
Option A: Serve tesseract

Serving a tesseract as usual via

tesseract serve helloworld

automatically spins up an MLflow server (shared among all served tesseracts), which you can access at http://localhost:5000 to view metrics.
After requesting an apply (or jacobian, ...) from the tesseract, a new entry will show up in the MLflow server.
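As an optional alternative to the web UI, the same metrics can be queried from the host with the standard MLflow Python client. This is only a sketch; it assumes MLflow is installed on the host, the tracking server started by tesseract serve is reachable at http://localhost:5000, and an MLflow version recent enough to support search_all_experiments:

import mlflow

mlflow.set_tracking_uri("http://localhost:5000")

# One pandas row per run; logged metrics appear as "metrics.<name>" columns.
runs = mlflow.search_runs(search_all_experiments=True)
print(runs[[c for c in runs.columns if c.startswith("metrics.")]])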

Option B: Single tesseract execution
tesseract run helloworld apply --network=host --env=MLFLOW_TRACKING_URI="http://localhost:5000" '{"inputs": {"name": "Osborne"}}'

In this case, we can't spin up an MLflow server. Instead, you'll have to point the tesseract to an existing one (e.g. one running on your host system).
If you don't specify MLFLOW_TRACKING_URI, the MLflow client falls back to logging locally within the tesseract, making the logged metrics nearly inaccessible.
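For example, assuming MLflow is installed on the host, a host-side tracking server for the command above can be started with the standard MLflow CLI:

mlflow server --host 127.0.0.1 --port 5000

With --network=host (as in the command above), the container shares the host's network stack on Linux, so http://localhost:5000 inside the tesseract reaches this server.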

TODO
  • Clean up the helloworld example (it now logs a metric in MLflow; this should probably be removed later)
  • Add documentation
  • By default, the MLflow client simply blocks if it can't reach a tracking server. We should introduce a timeout and make MLflow fall back to local logging of metrics (see the sketch after this list).
  • Investigate whether tesseracts triggered thousands of times flood the MLflow server with a noticeable performance impact; maybe ensure that if the user code doesn't log a single metric, we don't submit a run to the tracking server at all.
  • Support and document a custom MLFLOW_TRACKING_URI in tesseract serve
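As a hedged sketch of the timeout/fallback item above (names and defaults are illustrative, not part of this PR), one way to keep the MLflow client from blocking on an unreachable tracking server is a cheap reachability probe before opening the run:

import os
import socket
from urllib.parse import urlparse

import mlflow

def resolve_tracking_uri(timeout_s: float = 1.0) -> str:
    uri = os.environ.get("MLFLOW_TRACKING_URI", "http://localhost:5000")
    parsed = urlparse(uri)
    if parsed.scheme not in ("http", "https"):
        # Already a local (e.g. file-based) URI; nothing to probe.
        return uri
    try:
        # Cheap connectivity check so the MLflow client never blocks indefinitely.
        with socket.create_connection((parsed.hostname, parsed.port or 80), timeout=timeout_s):
            return uri
    except OSError:
        # Tracking server unreachable: fall back to local file-based logging.
        return "file:./mlruns"

mlflow.set_tracking_uri(resolve_tracking_uri())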

Testing done

Local testing for calling via tesseract run and via tesseract serve.


codecov bot commented Jun 17, 2025

Codecov Report

Attention: Patch coverage is 29.41176% with 12 lines in your changes missing coverage. Please review.

Project coverage is 66.82%. Comparing base (5beaafa) to head (000bbfa).
Report is 10 commits behind head on main.

Files with missing lines | Patch % | Lines
tesseract_core/sdk/docker_client.py | 0.00% | 8 Missing and 2 partials ⚠️
tesseract_core/sdk/engine.py | 71.42% | 1 Missing and 1 partial ⚠️

❗ There is a different number of reports uploaded between BASE (5beaafa) and HEAD (000bbfa).

HEAD has 16 fewer uploads than BASE (BASE 5beaafa: 24 uploads; HEAD 000bbfa: 8 uploads).
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #229      +/-   ##
==========================================
- Coverage   75.98%   66.82%   -9.16%     
==========================================
  Files          28       28              
  Lines        3110     3171      +61     
  Branches      489      510      +21     
==========================================
- Hits         2363     2119     -244     
- Misses        529      857     +328     
+ Partials      218      195      -23     

☔ View full report in Codecov by Sentry.

@joglekara

very nice!

How is this exposed for tesseract use via tesseract-jax? Or is that out of scope, and does this implementation require the use of the CLI?

@linusseelinger (Contributor, Author)

how is this exposed for tesseract use via tesseract-jax?

We log metrics to an MLflow server outside of tesseract-core, so that's where you can view metrics (added a HOW TO section above).
This should all work through tesseract-jax as well. Would be great if you could give it a try :) Rebuilding a tesseract with this branch (and adding metrics to it) should do the trick!

@joglekara

Hmm, one more thing. I run a self-hosted mlflow server that I would like to push to rather than to a new one and then figure out how to export/import. Can we read in a MLFLOW_TRACKING_URI from somewhere?

@linusseelinger (Contributor, Author)

Hmm, one more thing. I run a self-hosted mlflow server that I would like to push to rather than to a new one and then figure out how to export/import. Can we read in a MLFLOW_TRACKING_URI from somewhere?

Good point! As it stands, you can't set MLFLOW_TRACKING_URI in the docker-compose setup. I'm making a note of that, though; we should support it!
