237 copy purge model storage functionality for main #238

david-thrower · 2025-09-21T22:06:46Z

Item 1: Bug fix:

When a Keras metric returned a non-float value such as "inf", casting to float and computing .min() / .max() to find the best-performing model would fail.

Item 2:

A model for each sub-trial is stored in the experiment folder Cerebros creates.
In most cases only the best one from that meta-trial needs to be preserved.
The cached models can take up a lot of disk space, especially for NLP / CV problems.
When running in a container (unless you have a mounted volume), these are stored in memory (RAM), so the RAM pressure becomes cumulative.
With the new feature, this cache is not purged by default, to maintain backward compatibility. When you explicitly set purge_model_storage_files=True when calling SimpleCerebrosRandomSearch.get_best_model() (in simple_cerebros_random_search.py), the storage is purged after the best model is returned.

Item 3:

Extended the 2 Ames CICD tests with a positive-case and a negative-case test to ensure this feature works when explicitly called and is not triggered unless called explicitly, proving backward compatibility.
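The opt-in behavior from Item 2 and the two test cases from Item 3 can be sketched roughly as follows. This is not the Cerebros implementation: `get_best_model_sketch` and `make_cache` are hypothetical stand-ins, and raw bytes take the place of saved Keras models so the sketch runs without TensorFlow.

```python
import shutil
import tempfile
from pathlib import Path


def get_best_model_sketch(models_dir, best_name, purge_model_storage_files=False):
    """Hypothetical stand-in: return the best model, optionally purging the cache."""
    models_dir = Path(models_dir)
    # The real method would load a Keras model; raw bytes keep the sketch dependency-free.
    best = (models_dir / best_name).read_bytes()
    if purge_model_storage_files:
        shutil.rmtree(models_dir)
    return best


def make_cache(root):
    """Create a fake per-sub-trial model cache with three files."""
    models_dir = Path(root) / "models"
    models_dir.mkdir()
    for i in range(3):
        (models_dir / f"trial_{i}.keras").write_bytes(b"weights-%d" % i)
    return models_dir


# Negative case: the default call leaves the cache untouched.
with tempfile.TemporaryDirectory() as root:
    cache = make_cache(root)
    get_best_model_sketch(cache, "trial_1.keras")
    assert len(list(cache.iterdir())) == 3

# Positive case: explicit opt-in returns the model, then removes the cache.
with tempfile.TemporaryDirectory() as root:
    cache = make_cache(root)
    best = get_best_model_sketch(cache, "trial_1.keras", purge_model_storage_files=True)
    assert best == b"weights-1"
    assert not cache.exists()
```

Keeping the flag's default at False is what makes the negative case pass without any change to existing callers.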

Duplicate vetted bug fix (keras metrics returning "inf" or other non-float types) from #234 , #230 without other non-vetted changes.

Trigger CICD suite to run tests.

Add functionality from #230 to purge model storage after each meta-trial.

Add negative case test for purge_model_storage.

Trigger tests.

Syntax correction

Add positive case test for assert purge_model_storage_files.

Syntax

Added corrected CICD test for positive case for purge_model_storage_files.

Add better CICD test for negative case for purge_model_storage_files.

Syntax correction / typo.

Syntax / typo.

Refactor best metric error handling to allow +/- inf metrics (these actually will work in current versions of pd.Series.min() / .max()), but still exclude other arbitrary types (str, Exception)...
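A minimal sketch of the filtering policy that last commit message describes, written with the standard library only rather than pandas. The function name `select_best_metric` and its `direction` parameter are illustrative assumptions, not names from the Cerebros code. Skipping NaN is also an assumption, chosen to mirror the default `skipna` behavior of `pd.Series.min()`.

```python
import math
import numbers


def select_best_metric(values, direction="minimize"):
    """Pick the best value from raw metric results.

    Real numbers, including +/-inf, are kept. NaN and non-numeric
    entries (str, Exception, None, bool) are skipped.
    """
    usable = [
        float(v)
        for v in values
        if isinstance(v, numbers.Real)
        and not isinstance(v, bool)
        and not math.isnan(v)
    ]
    if not usable:
        raise ValueError("no usable metric values")
    return min(usable) if direction == "minimize" else max(usable)


raw = [0.5, float("inf"), "inf", ValueError("metric failed"), None, float("nan")]
best_low = select_best_metric(raw)                         # inf is kept but loses to 0.5
best_high = select_best_metric(raw, direction="maximize")  # inf wins when maximizing
```

Note that the string "inf" is excluded here while the float inf is kept, which is the distinction the commit message draws between numeric infinities and arbitrary types.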

Thunderblok

✅ What’s good

Fix for non-float metrics (e.g., "inf") avoids best-metric crashes.

New purge_model_storage_files=True option frees disk/RAM after picking best sub-trial model.

Tests include positive & negative cases to prove opt-in and back-compat.

⚠️ Small asks before merge

Metric safety

Coerce values via a helper that treats NaN/±inf as non-finite and falls back cleanly.

Add a tiny table test for ["1.0","inf","-inf","nan", None].
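Something like the following would cover both asks. It is only a sketch: the helper name `coerce_metric` and its `fallback` parameter are hypothetical. It is also stricter than the final refactor in this PR, which keeps ±inf as valid metrics.

```python
import math


def coerce_metric(value, fallback=None):
    """Return value as a finite float, or fallback when it is missing,
    unparseable, NaN, or +/-inf."""
    try:
        number = float(value)
    except (TypeError, ValueError):
        return fallback
    return number if math.isfinite(number) else fallback


# Table test over the inputs listed above.
CASES = [
    ("1.0", 1.0),
    ("inf", None),
    ("-inf", None),
    ("nan", None),
    (None, None),
]
for raw, expected in CASES:
    assert coerce_metric(raw) == expected, (raw, expected)
```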

Purge safety

Guard against symlinks and .. paths; don’t follow links.

Wrap purge in try/except so failure doesn’t break get_best_model().

Log simple stats: {dirs_deleted, files_deleted, bytes_freed, duration_ms}.
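The three purge asks could be combined along these lines. This is a sketch under assumptions: `purge_dir_safely` and `allowed_root` are hypothetical names, and the caller is assumed to pass the experiment folder as `allowed_root` so that anything resolving outside it is refused.

```python
import logging
import os
import time
from pathlib import Path

logger = logging.getLogger(__name__)


def purge_dir_safely(target, allowed_root):
    """Delete target and its contents without following symlinks; never raise."""
    stats = {"dirs_deleted": 0, "files_deleted": 0, "bytes_freed": 0, "duration_ms": 0}
    start = time.monotonic()
    try:
        target = Path(target)
        root = Path(allowed_root).resolve()
        if target.is_symlink():
            raise ValueError(f"refusing to purge symlink: {target}")
        # resolve() collapses ".." segments, so escapes from root are caught here.
        resolved = target.resolve()
        if root not in resolved.parents:
            raise ValueError(f"{target} is not inside {root}")
        # topdown=False visits children before parents; followlinks=False
        # keeps os.walk from descending into symlinked directories.
        for dirpath, dirnames, filenames in os.walk(resolved, topdown=False, followlinks=False):
            for name in filenames:
                path = os.path.join(dirpath, name)
                if not os.path.islink(path):
                    stats["bytes_freed"] += os.lstat(path).st_size
                os.unlink(path)
                stats["files_deleted"] += 1
            for name in dirnames:
                path = os.path.join(dirpath, name)
                if os.path.islink(path):
                    os.unlink(path)  # remove the link itself, not its target
                    stats["files_deleted"] += 1
                else:
                    os.rmdir(path)
                    stats["dirs_deleted"] += 1
        os.rmdir(resolved)
        stats["dirs_deleted"] += 1
    except (OSError, ValueError) as exc:
        # A failed purge is logged, not raised, so the caller still gets its model.
        logger.warning("model storage purge failed: %s", exc)
    stats["duration_ms"] = int((time.monotonic() - start) * 1000)
    logger.info("model storage purge stats: %s", stats)
    return stats
```

Because the function returns its stats and swallows filesystem errors, get_best_model() could call it after selecting the model without an extra try/except at the call site.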

DX/Docs

One line in README: “Use purge_model_storage_files=True to reclaim space (especially in containers).”

👍 Nice-to-have (don’t block)

Warn if artifacts are on Docker overlay (no volume mount).

Env toggle CEREBROS_PURGE_DEFAULT for ops environments.
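One possible shape for the env toggle, assuming the keyword argument's default would change from False to None so that "not specified" can be told apart from an explicit False. The helper name `resolve_purge_flag` and the accepted truthy strings are illustrative assumptions.

```python
import os

_TRUTHY = {"1", "true", "yes", "on"}


def resolve_purge_flag(explicit=None, environ=None):
    """An explicit argument wins; otherwise fall back to
    CEREBROS_PURGE_DEFAULT; otherwise stay opt-in (False)."""
    if explicit is not None:
        return bool(explicit)
    environ = os.environ if environ is None else environ
    return environ.get("CEREBROS_PURGE_DEFAULT", "").strip().lower() in _TRUTHY


# With no argument and no env var, behavior stays backward-compatible.
default_flag = resolve_purge_flag(environ={})
# Ops can flip the default without touching calling code.
ops_flag = resolve_purge_flag(environ={"CEREBROS_PURGE_DEFAULT": "TRUE"})
# An explicit False from the caller still overrides the environment.
override_flag = resolve_purge_flag(False, environ={"CEREBROS_PURGE_DEFAULT": "1"})
```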

✅ Verdict

Approve with the above nits. The change is backward-compatible, fixes a real footgun, and gives operators a clean way to control disk/RAM.

david-thrower added 12 commits September 21, 2025 01:02

Update simple_cerebros_random_search.py

f60de6b

Duplicate vetted bug fix (keras metrics returning "inf" or other non-float types) from #234 , #230 without other non-vetted changes.

Update automerge.yml

852ae14

Trigger CICD suite to run tests.

Update simple_cerebros_random_search.py

d774374

Add functionality from #230 to purge model storage after each meta-trial.

Update regression-example-ames-no-preproc.py

3f9bd0c

Add negative case test for purge_model_storage.

Update automerge.yml

e078d99

Trigger tests.

Update regression-example-ames-no-preproc.py

20a33df

Syntax correction

Update regression-example-ames-no-preproc-val-set.py

5a6b6f6

Add positive case test for assert purge_model_storage_files.

Update regression-example-ames-no-preproc-val-set.py

49d261c

Syntax

Update regression-example-ames-no-preproc-val-set.py

2f23de9

Added corrected CICD test for positive case for purge_model_storage_files.

Update regression-example-ames-no-preproc.py

d88b3dc

Add better CICD test for negative case for purge_model_storage_files.

Update regression-example-ames-no-preproc-val-set.py

fbc8780

Syntax correction / typo.

Update regression-example-ames-no-preproc-val-set.py

c587c68

Syntax / typo.

david-thrower linked an issue Sep 21, 2025 that may be closed by this pull request

copy-purge-model-storage-functionality-for-main #237

Open

3 tasks

david-thrower self-assigned this Sep 21, 2025

david-thrower requested a review from Thunderblok September 21, 2025 22:17

david-thrower mentioned this pull request Sep 22, 2025

236 copy of 234 error handling on finding best metric #239

Open

Update simple_cerebros_random_search.py

fcc2efd

Refactor best metric error handling to allow +/- inf metrics (these actually will work in current versions of pd.Series.min() / .max()), but still exclude other arbitrary types (str, Exception)...

Thunderblok approved these changes Sep 22, 2025
