refactor_logging_utils #183

Open · wants to merge 1 commit into base: main

carlosrinc

Refactored logging_utils.py with improved directory creation, error handling for wandb.init, and a configurable log file path.
Refactored dist_checkpoint_utils.py with improved path handling, directory creation, and error handling.
Added unit tests for logging_utils.py and dist_checkpoint_utils.py to verify the refactorings.

This commit introduces several improvements to training/utils/logging_utils.py and training/utils/dist_checkpoint_utils.py.

In logging_utils.py:

  • Replaced os.system("mkdir -p ...") with os.makedirs(..., exist_ok=True) for safer directory creation.
  • Added error handling for wandb.init() to catch potential initialization failures.
  • Made the loguru log file path configurable via arguments, defaulting to "logs/file_{time}.log".
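
The three logging changes above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the helper names (ensure_dir, init_wandb_safely, resolve_log_path) are hypothetical, and the wandb initializer is passed in as a callable so the sketch runs without wandb installed.

```python
import os

DEFAULT_LOG_FILE = "logs/file_{time}.log"  # default named in this PR


def ensure_dir(path):
    """Create a directory (and parents) without shelling out to mkdir -p."""
    # exist_ok=True makes repeated calls a no-op instead of an error.
    os.makedirs(path, exist_ok=True)
    return path


def init_wandb_safely(init_fn, **kwargs):
    """Call init_fn (for example wandb.init) and return None if it raises.

    Hypothetical helper: catching Exception broadly is an assumption made
    for illustration; the PR may catch narrower exception types.
    """
    try:
        return init_fn(**kwargs)
    except Exception as exc:
        print(f"wandb initialization failed, continuing without it: {exc}")
        return None


def resolve_log_path(log_file=None):
    """Return the configured log path, falling back to the default."""
    path = log_file or DEFAULT_LOG_FILE
    parent = os.path.dirname(path)
    if parent:
        # Assumption: create the parent directory up front.
        ensure_dir(parent)
    return path
```

With loguru and wandb installed, the helpers would be wired up along the lines of `logger.add(resolve_log_path(args.log_file))` and `run = init_wandb_safely(wandb.init, project=...)`, where `args.log_file` is an assumed argument name. loguru fills in the `{time}` placeholder itself when the path is passed to `logger.add`.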

In dist_checkpoint_utils.py:

  • Replaced os.system("mkdir -p ...") with os.makedirs(..., exist_ok=True).
  • Refactored path joining to reduce redundancy and improve readability.
  • Enhanced error handling in load_checkpoint by catching more specific exceptions (e.g., FileNotFoundError) and providing clearer messages.
  • Corrected path construction in load_stream_dataloader_state_dict to use existing variables.
  • Added basic error handling for saving dataset state dicts.
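
A rough sketch of the checkpoint-side pattern described above: build each path once, create directories with os.makedirs, and catch specific exceptions on load and save. The directory layout (`checkpoint_{step}`), the function names, and the use of JSON instead of the project's real serialization are all assumptions made so the example is self-contained.

```python
import json
import os


def checkpoint_dir(root, step):
    """Build the checkpoint directory once instead of re-joining it inline."""
    # Hypothetical layout; the real naming scheme is not shown in this PR text.
    return os.path.join(root, f"checkpoint_{step}")


def save_state(root, step, name, state):
    """Write a state dict, reporting failure instead of crashing."""
    ckpt = checkpoint_dir(root, step)
    os.makedirs(ckpt, exist_ok=True)
    path = os.path.join(ckpt, name)
    try:
        with open(path, "w") as f:
            json.dump(state, f)
    except (OSError, TypeError) as exc:
        # OSError: disk or permission problems; TypeError: unserializable state.
        print(f"Failed to save state to {path}: {exc}")
        return False
    return True


def load_state(root, step, name):
    """Load a state dict, returning None with a clear message on failure."""
    path = os.path.join(checkpoint_dir(root, step), name)
    try:
        with open(path) as f:
            return json.load(f)
    except FileNotFoundError:
        print(f"No checkpoint file at {path}; starting from scratch.")
    except json.JSONDecodeError as exc:
        print(f"Checkpoint file {path} could not be parsed: {exc}")
    return None
```

Catching FileNotFoundError separately lets a missing checkpoint (often expected on a first run) be reported differently from a corrupt one.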

Unit tests have been added for both modules, covering the new functionality and the added error-handling paths.
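
For illustration, tests of this kind might look like the sketch below. It is not the test suite from this PR: make_run_dir and safe_init are hypothetical stand-ins for the refactored functions, and unittest.mock simulates a failing initializer so no network or wandb install is needed.

```python
import os
import tempfile
import unittest
from unittest import mock


def make_run_dir(base, name):
    """Hypothetical stand-in for the refactored directory creation."""
    path = os.path.join(base, name)
    os.makedirs(path, exist_ok=True)
    return path


def safe_init(init_fn):
    """Hypothetical stand-in for the guarded wandb.init call."""
    try:
        return init_fn()
    except Exception:
        return None


class RefactorSketchTests(unittest.TestCase):
    def test_directory_creation_is_idempotent(self):
        with tempfile.TemporaryDirectory() as tmp:
            first = make_run_dir(tmp, "run")
            second = make_run_dir(tmp, "run")  # must not raise
            self.assertEqual(first, second)
            self.assertTrue(os.path.isdir(first))

    def test_init_failure_returns_none(self):
        failing_init = mock.Mock(side_effect=RuntimeError("no network"))
        self.assertIsNone(safe_init(failing_init))
        failing_init.assert_called_once()
```

Saved to a file, the sketch can be run with `python -m unittest <file>`.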
