refactor_logging_utils #183
Open
Refactored logging_utils.py with improved directory creation, error handling for wandb.init, and a configurable log file path.
Refactored dist_checkpoint_utils.py with improved path handling, directory creation, and error handling.
Added unit tests for logging_utils.py and dist_checkpoint_utils.py to verify the refactorings.
This commit introduces several improvements to `training/utils/logging_utils.py` and `training/utils/dist_checkpoint_utils.py`.

In `logging_utils.py`:
- Replaced `os.system("mkdir -p ...")` with `os.makedirs(..., exist_ok=True)` for safer directory creation.
- Added error handling around `wandb.init()` to catch potential initialization failures.
- Made the `loguru` log file path configurable via arguments, defaulting to "logs/file_{time}.log".

In `dist_checkpoint_utils.py`:
- Replaced `os.system("mkdir -p ...")` with `os.makedirs(..., exist_ok=True)`.
- Improved error handling in `load_checkpoint` by catching more specific exceptions (e.g., `FileNotFoundError`) and providing clearer messages.
- Updated `load_stream_dataloader_state_dict` to use existing variables.

Unit tests have been added for both modules to cover the new functionality and error handling.
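
For reviewers who want a quick picture of the two recurring patterns in this PR (directory creation via `os.makedirs` and catching a specific exception when loading), here is a minimal, standard-library-only sketch. It is not the code from this PR: `ensure_dir`, `load_state`, and the `state.json` file name are hypothetical names chosen for illustration, and the JSON file stands in for a real checkpoint format.

```python
import json
import os
import tempfile
from typing import Optional


def ensure_dir(path: str) -> str:
    """Create `path` and any missing parents; do nothing if it already exists.

    Unlike os.system("mkdir -p ..."), this does not spawn a shell, so the
    path is never interpreted as a shell command, and failures raise OSError.
    """
    os.makedirs(path, exist_ok=True)
    return path


def load_state(checkpoint_dir: str, filename: str = "state.json") -> Optional[dict]:
    """Hypothetical loader: return the saved state, or None if it is missing.

    Only FileNotFoundError is caught; other errors (permissions, corrupt
    JSON) propagate so they are not silently hidden.
    """
    state_path = os.path.join(checkpoint_dir, filename)
    try:
        with open(state_path, "r", encoding="utf-8") as f:
            return json.load(f)
    except FileNotFoundError:
        print(f"No checkpoint found at {state_path}; starting from scratch.")
        return None


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        ckpt_dir = ensure_dir(os.path.join(root, "checkpoints", "run_0"))
        ensure_dir(ckpt_dir)  # second call is a no-op thanks to exist_ok=True

        print(load_state(ckpt_dir))  # file does not exist yet

        with open(os.path.join(ckpt_dir, "state.json"), "w", encoding="utf-8") as f:
            json.dump({"step": 10}, f)
        print(load_state(ckpt_dir))
```

Catching only `FileNotFoundError` keeps the "no checkpoint yet" case distinct from genuine failures, which is the kind of clearer messaging the description refers to.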