
Conversation

@specture724 (Collaborator)

Refactor test_update.py to pytest style and create a workflow for main branch

@specture724 self-assigned this on Oct 20, 2025

Copilot AI left a comment

Pull Request Overview

This PR refactors test_update.py to use the pytest framework and adds CI workflow capabilities. The main change replaces direct test execution with a pytest-based test that uses a subprocess to run distributed training via torchrun; a rough sketch of that structure follows the summary below.

  • Converted test_update.py from standalone script to pytest-based test structure
  • Added pytest mark for GPU-specific tests with multi-GPU requirements
  • Enhanced logging control in ParameterServer to support flexible rank-based logging
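As an illustration of that structure only, a subprocess-driven test might look roughly like the sketch below; the skip condition, torchrun flags, and script path are assumptions, not the exact code in this PR.

import shutil
import subprocess

import pytest
import torch


@pytest.mark.skipif(torch.cuda.device_count() < 2, reason="requires at least 2 GPUs")
def test_update_via_torchrun() -> None:
    torchrun = shutil.which("torchrun")
    assert torchrun is not None, "torchrun must be available on PATH"

    # Launch the distributed workers in a subprocess so pytest itself never
    # joins the process group; torchrun spawns one process per GPU.
    cmd = [
        torchrun,
        f"--nproc_per_node={torch.cuda.device_count()}",
        "tests/test_update.py",
    ]
    result = subprocess.run(cmd, capture_output=True, text=True, check=False)
    assert result.returncode == 0, result.stderr

Running the actual ranks outside the pytest process keeps GPU and NCCL state isolated, so the test only asserts on the subprocess exit code and stderr.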

Reviewed Changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

  • tests/test_update.py: Refactored from a standalone script to the pytest framework with subprocess-based distributed test execution
  • checkpoint_engine/ps.py: Enhanced the logging system to support a configurable logger rank for different update scenarios
  • .pre-commit-config.yaml: Added a ruff ignore rule for the subprocess security warning (S603)


tests/test_update.py (excerpt under review):

assert world_size >= 2, "This test requires at least 2 GPUs."

master_addr = "localhost"
master_port = random.randint(20000, 30000)

Copilot AI Oct 20, 2025

Using random port selection for distributed training can lead to port conflicts in CI environments. Consider using a more robust port selection mechanism or environment variable override.
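One way to address this, as a minimal sketch (the MASTER_PORT variable name and the helper below are illustrative assumptions, not code from this PR), is to let the OS pick a free port and allow CI to override it via an environment variable:

import os
import socket


def find_free_port() -> int:
    # Bind to port 0 so the OS assigns an unused TCP port, then report it.
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind(("localhost", 0))
        return sock.getsockname()[1]


# Let CI pin the port explicitly; otherwise ask the OS for a free one.
master_port = int(os.environ.get("MASTER_PORT", "0")) or find_free_port()

There is still a small window between releasing the probe socket and the training processes binding the port, but this is far less collision-prone than drawing a random port from a fixed range.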

checkpoint_engine/ps.py, comment on lines +681 to 682
self._logger_rank = 0

Copilot AI Oct 20, 2025

The _logger_rank initialization should be documented to explain when and how this value changes during execution.

Suggested change:

- self._logger_rank = 0
+ # _logger_rank determines which rank is responsible for logging output.
+ # By default, it is set to 0, meaning only rank 0 will perform logging.
+ # If logging from other ranks is required, this value can be changed accordingly.
+ self._logger_rank = 0
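For context, a minimal sketch of how a configurable logger rank could gate log output; the helper name and class structure here are assumptions, not the actual checkpoint_engine/ps.py implementation:

import logging

import torch.distributed as dist

logger = logging.getLogger(__name__)


class ParameterServer:
    def __init__(self) -> None:
        # Rank allowed to emit log messages; defaults to 0 and can be
        # reassigned when a different rank drives an update.
        self._logger_rank = 0

    def _log(self, msg: str) -> None:
        # Only the designated logger rank logs; all other ranks stay silent.
        if dist.is_initialized() and dist.get_rank() != self._logger_rank:
            return
        logger.info(msg)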
