Feat/gpu-ci #37
base: main
Conversation
Pull Request Overview
This PR refactors test_update.py to use the pytest framework and adds a CI workflow. The main change replaces direct test execution with a pytest-based structure that launches distributed training via torchrun in a subprocess.
- Converted test_update.py from standalone script to pytest-based test structure
- Added pytest mark for GPU-specific tests with multi-GPU requirements
- Enhanced logging control in ParameterServer to support flexible rank-based logging
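The refactor described in the bullets above can be sketched roughly as follows. This is a hypothetical illustration, not the PR's exact code: the helper name and the torchrun invocation are assumptions.

```python
import sys


def build_torchrun_cmd(nproc_per_node: int, script: str) -> list[str]:
    # `torchrun` is the console entry point for `python -m torch.distributed.run`,
    # so invoking the module keeps the launch tied to the current interpreter.
    return [
        sys.executable,
        "-m",
        "torch.distributed.run",
        f"--nproc-per-node={nproc_per_node}",
        script,
    ]


# A pytest test would mark this GPU-only, skip when fewer than 2 GPUs are
# visible, and assert that the subprocess exits cleanly, e.g.:
#
#   @pytest.mark.gpu
#   def test_update():
#       result = subprocess.run(build_torchrun_cmd(2, "tests/test_update.py"))
#       assert result.returncode == 0
```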
Reviewed Changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| tests/test_update.py | Refactored from standalone script to pytest framework with subprocess-based distributed test execution |
| checkpoint_engine/ps.py | Enhanced logging system to support configurable logger rank for different update scenarios |
| .pre-commit-config.yaml | Added ruff ignore rule for subprocess security warning (S603) |
```python
assert world_size >= 2, "This test requires at least 2 GPUs."

master_addr = "localhost"
master_port = random.randint(20000, 30000)
```
Copilot AI commented on Oct 20, 2025:
Using random port selection for distributed training can lead to port conflicts in CI environments. Consider using a more robust port selection mechanism or environment variable override.
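One way to address this comment is to let the OS hand out a free ephemeral port and allow an explicit override in CI. This is a minimal sketch; the `MASTER_PORT` environment-variable name is an assumption, not something this PR defines.

```python
import os
import socket


def pick_master_port() -> int:
    # Allow an explicit override in CI via an (assumed) MASTER_PORT variable.
    if "MASTER_PORT" in os.environ:
        return int(os.environ["MASTER_PORT"])
    # Bind to port 0 so the OS assigns a currently free ephemeral port,
    # avoiding the collisions random.randint can cause on shared runners.
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.bind(("localhost", 0))
        return s.getsockname()[1]
```

Note there is still a small race between releasing the probe socket and torchrun binding the port, but it is far less likely to collide than a blind random pick from a fixed range.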
```python
self._logger_rank = 0
```
Copilot AI commented on Oct 20, 2025:
The _logger_rank initialization should be documented to explain when and how this value changes during execution.
Suggested change:
```python
# _logger_rank determines which rank is responsible for logging output.
# By default, it is set to 0, meaning only rank 0 will perform logging.
# If logging from other ranks is required, this value can be changed accordingly.
self._logger_rank = 0
```
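The suggestion above could be folded into a small rank-gated logging helper. This is a hypothetical sketch, not the actual ParameterServer implementation in checkpoint_engine/ps.py; the class and method names are assumptions.

```python
import logging

logging.basicConfig(level=logging.INFO)


class ParameterServerSketch:
    def __init__(self, rank: int, logger_rank: int = 0):
        self._rank = rank
        # _logger_rank determines which rank is responsible for logging output.
        # By default only rank 0 logs; callers can point it at another rank
        # for update scenarios where a different rank owns the output.
        self._logger_rank = logger_rank
        self._logger = logging.getLogger(__name__)

    def log(self, msg: str) -> bool:
        # Returns True when this rank actually emitted the message,
        # which also makes the gating easy to unit-test.
        if self._rank != self._logger_rank:
            return False
        self._logger.info(msg)
        return True
```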
Refactor test_update.py to pytest style and add a CI workflow for the main branch.