Problem
PR #436 added an after_batch callback to Chronos2Pipeline.predict (see src/chronos/chronos2/pipeline.py:627-652). The callback is invoked inside the per-batch loop, but the loop is not wrapped in a try / finally, and the callback's resource-safety contract is undocumented.
If a user callback raises (for example the AutoGluon time-limit signalling referenced in #436), any batch-scoped resources acquired in the loop (the DataLoader created at line 627 with pin_memory=True on CUDA, in-flight CUDA streams, open worker processes) are released only when Python's GC happens to run, and any predictions already appended to all_predictions are discarded without being returned to the caller.
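The failure mode can be illustrated with a minimal sketch. It is not the Chronos code: `predict_unprotected`, `TimeLimitExceeded`, and `stop_after_two_batches` are hypothetical names, and the "inference" is a placeholder multiplication. It only shows that when the callback raises, results collected so far never reach the caller.

```python
from typing import Callable, Iterable, List, Optional


class TimeLimitExceeded(Exception):
    """Hypothetical stand-in for a time-limit signal raised by a callback."""


def predict_unprotected(
    batches: Iterable[List[float]],
    after_batch: Optional[Callable[[], None]] = None,
) -> List[float]:
    # Simplified stand-in for the per-batch loop (assumption, not the real code).
    all_predictions: List[float] = []
    for batch in batches:
        all_predictions.extend(x * 2 for x in batch)  # pretend inference
        if after_batch is not None:
            after_batch()  # an exception here propagates immediately
    return all_predictions


calls = {"n": 0}


def stop_after_two_batches() -> None:
    # Raises on the second invocation, mimicking a time budget running out.
    calls["n"] += 1
    if calls["n"] >= 2:
        raise TimeLimitExceeded("time budget exhausted")


partial = None
try:
    partial = predict_unprotected([[1.0], [2.0], [3.0]], stop_after_two_batches)
except TimeLimitExceeded:
    # Two batches were already processed, but the caller has no handle on them.
    pass
```

After the `except` branch runs, `partial` is still `None` even though two batches had been computed.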
Proposal
Pick one of the following and document it in the after_batch parameter docstring:
1. Contract: state that after_batch must not raise; any exception will abort predict and discard in-flight results. Callers are responsible for catching inside the callback.
2. Hardening: wrap the batch loop in try / finally so the DataLoader and pinned host buffers are always released deterministically, and optionally return the partial predictions collected so far when the callback signals early stop via return value.
Option 1 is zero-code; option 2 covers the AutoGluon timeout path without requiring users to audit their callbacks for resource safety.
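Both options can be sketched in a few lines. The following is an illustration under stated assumptions, not a patch: `FakeLoader` and its `release()` method are hypothetical placeholders for whatever cleanup the real loop needs, `never_raises` is a hypothetical caller-side helper for option 1, and the "truthy return value means stop" convention in `predict_hardened` is one possible reading of option 2.

```python
import logging
from typing import Callable, Iterable, List, Optional

logger = logging.getLogger(__name__)


class FakeLoader:
    """Hypothetical stand-in for a loader that owns releasable resources."""

    def __init__(self, batches: Iterable[List[float]]) -> None:
        self._batches = list(batches)
        self.released = False

    def __iter__(self):
        return iter(self._batches)

    def release(self) -> None:
        # Placeholder for real cleanup (worker shutdown, pinned buffers, ...).
        self.released = True


def never_raises(callback: Callable[[], object]) -> Callable[[], None]:
    """Option 1, caller side: swallow and log anything the callback raises."""

    def wrapper() -> None:
        try:
            callback()
        except Exception:
            logger.exception("after_batch callback failed; continuing")

    return wrapper


def predict_hardened(
    loader: FakeLoader,
    after_batch: Optional[Callable[[], object]] = None,
) -> List[float]:
    """Option 2: deterministic cleanup plus early stop via return value."""
    all_predictions: List[float] = []
    try:
        for batch in loader:
            all_predictions.extend(x * 2 for x in batch)  # pretend inference
            # Assumed convention: a truthy return value requests an early stop.
            if after_batch is not None and after_batch():
                break
    finally:
        # Runs on normal completion, early stop, and when the callback raises.
        loader.release()
    return all_predictions


def failing_callback() -> None:
    raise RuntimeError("boom")


# Early stop: the second call returns True, so two batches are returned.
stop_flags = iter([False, True])
early_loader = FakeLoader([[1.0], [2.0], [3.0]])
partial = predict_hardened(early_loader, lambda: next(stop_flags))

# Option 1 wrapper: the exception is logged and all batches are processed.
full_loader = FakeLoader([[1.0], [2.0], [3.0]])
full = predict_hardened(full_loader, never_raises(failing_callback))

# Unwrapped failure: the exception still propagates, but cleanup has run.
raising_loader = FakeLoader([[1.0]])
try:
    predict_hardened(raising_loader, failing_callback)
except RuntimeError:
    pass
```

In this sketch the `finally` clause guarantees cleanup on every exit path, while partial results are returned only on the cooperative early-stop path; an exception from the callback still propagates to the caller.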
References
PR #436 (Chronos-2: Add after_batch callback).
src/chronos/chronos2/pipeline.py:627-652.
ast inspection of main confirms zero try / finally wrapping the after_batch_callback() call at line 652.
Environment
main as of commit 6d68ed7.