
@finbarrtimbers (Collaborator) commented on Oct 29, 2025

Previously, the dataset_transformation tests only ran when someone invoked them manually.


Note

Runs dataset_transformation unit tests in CI with a new test suite and updated workflow, and factors dataset config construction into a reusable helper.

  • CI:
    • Use 8-Core-XL-Runner-Ubuntu-Latest and pass HF_TOKEN when running pytest in .github/workflows/tests.yml.
  • Tests:
    • Add open_instruct/test_dataset_transformation.py with unit tests for tokenizer equality, config-hash differences, and cached dataset parity (using small splits and temp caches); a rough sketch of this style of test appears after this list.
  • Refactor:
    • Extract load_dataset_configs(...) and reuse it in get_cached_dataset_tulu_with_statistics(...).
    • Remove embedded test code and __main__ block from open_instruct/dataset_transformation.py.
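
The new test file itself is not reproduced in this conversation; below is a minimal, self-contained sketch of the config-hash style of test described in the Tests bullet. The `DatasetConfig` dataclass and `config_hash` helper are hypothetical stand-ins written for illustration, not the actual open_instruct APIs.

```python
import hashlib
import json
import unittest
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class DatasetConfig:
    """Hypothetical stand-in for a dataset configuration object."""
    dataset_name: str
    split: str
    max_seq_length: int


def config_hash(config: DatasetConfig) -> str:
    """Deterministically hash a config by serializing its fields in sorted order."""
    payload = json.dumps(asdict(config), sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class TestConfigHashing(unittest.TestCase):
    def test_identical_configs_hash_equal(self):
        a = DatasetConfig("allenai/tulu-3-sft-mixture", "train[:16]", 2048)
        b = DatasetConfig("allenai/tulu-3-sft-mixture", "train[:16]", 2048)
        self.assertEqual(config_hash(a), config_hash(b))

    def test_different_configs_hash_differently(self):
        a = DatasetConfig("allenai/tulu-3-sft-mixture", "train[:16]", 2048)
        b = DatasetConfig("allenai/tulu-3-sft-mixture", "train[:16]", 4096)
        self.assertNotEqual(config_hash(a), config_hash(b))


if __name__ == "__main__":
    unittest.main()
```

The real suite additionally exercises tokenizer equality and cached-dataset parity against small splits and temporary caches, per the summary above.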

Written by Cursor Bugbot for commit 4c7ef47.

finbarrtimbers and others added 30 commits October 21, 2025 09:47
Added detailed logging throughout the vLLM generation pipeline to diagnose why the benchmark script hangs:
- Log prompt submission and queue sizes in benchmark_generators.py
- Log actor_manager should_stop status and engine ready status
- Log _prefetch_worker iterations and request processing in vllm_utils.py
- Log async task creation, completion accumulation, and result queue operations

This will help identify where in the pipeline the hang occurs (prompt queue, async tasks, completion queue, or results queue).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
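
The diff itself is not shown here; the following is a minimal sketch of the kind of queue-depth instrumentation the commit above describes, using a hypothetical `submit_prompts` helper and standard-library queues in place of the real benchmark plumbing.

```python
import logging
import queue

logger = logging.getLogger(__name__)


def submit_prompts(prompts, prompt_queue: queue.Queue, results_queue: queue.Queue) -> None:
    """Submit prompts one at a time, logging queue depths to localize a stall."""
    for i, prompt in enumerate(prompts):
        prompt_queue.put(prompt)
        logger.info(
            "submitted prompt %d/%d; prompt_queue=%d results_queue=%d",
            i + 1,
            len(prompts),
            prompt_queue.qsize(),
            results_queue.qsize(),
        )
```

Logging both queue sizes on every submission makes it easier to see whether prompts are piling up unconsumed (a stalled prefetch worker) or being consumed without results ever arriving (a stalled results path).
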
Added detailed logging at critical points in the vLLM engine initialization
pipeline to diagnose where the script hangs:

- benchmark_generators.py: Log ActorManager creation and vLLM engine setup
- vllm_utils.py create_vllm_engines: Log engine creation loop progress
- LLMRayActor.__init__: Log each initialization step
- _setup_and_start_async_engine: Log thread creation and startup sequence

This will help identify whether the hang occurs during Ray actor creation,
actor initialization, or vLLM AsyncLLMEngine startup.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
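
A minimal sketch of step-by-step initialization logging in the spirit of the commit above; `log_step` is a hypothetical helper, and the usage comments reference the real call sites only loosely.

```python
import logging
import time
from contextlib import contextmanager

logger = logging.getLogger(__name__)


@contextmanager
def log_step(name: str):
    """Log when an initialization step starts, finishes, and how long it took."""
    logger.info("starting: %s", name)
    start = time.perf_counter()
    try:
        yield
    finally:
        logger.info("finished: %s (%.1fs)", name, time.perf_counter() - start)


# Example usage inside a hypothetical actor __init__:
#     with log_step("create vLLM engines"):
#         engines = create_vllm_engines(...)
#     with log_step("start async engine thread"):
#         self._setup_and_start_async_engine(...)
```

If the process hangs, the last "starting:" line without a matching "finished:" line points at the step that never returned.
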
The script was hanging because two placement groups were being created:
1. One in setup_vllm_engines() that allocated the GPU
2. Another in create_vllm_engines() that waited forever for the same GPU

The placement group created in setup_vllm_engines was only passed to
create_vllm_engines when single_gpu_mode=True, causing create_vllm_engines
to create a second placement group when single_gpu_mode=False.

Fix: Always pass the placement group to create_vllm_engines, preventing
the duplicate allocation attempt.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
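
A minimal sketch of the shape of the fix, with heavily simplified stand-ins for `setup_vllm_engines` and `create_vllm_engines` (the real functions take many more arguments); it assumes a running Ray cluster with at least `num_engines` GPUs.

```python
from typing import List, Optional

import ray
from ray.util.placement_group import PlacementGroup, placement_group


def create_vllm_engines(num_engines: int, pg: Optional[PlacementGroup] = None) -> List[str]:
    """Hypothetical stand-in: reuse the caller's placement group when one is given."""
    if pg is None:
        # This branch is the failure mode described above: a second placement
        # group waiting forever on GPUs the caller has already reserved.
        pg = placement_group([{"GPU": 1, "CPU": 1}] * num_engines, strategy="PACK")
        ray.get(pg.ready())
    return [f"engine-{i}" for i in range(num_engines)]  # engines would be scheduled into pg


def setup_vllm_engines(num_engines: int, single_gpu_mode: bool = False) -> List[str]:
    """Create one placement group and forward it unconditionally (the fix)."""
    # single_gpu_mode no longer gates whether pg is forwarded.
    pg = placement_group([{"GPU": 1, "CPU": 1}] * num_engines, strategy="PACK")
    ray.get(pg.ready())
    return create_vllm_engines(num_engines, pg=pg)
```
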
@gemini-code-assist bot (Contributor) left a comment

Code Review

This pull request effectively refactors the testing for dataset_transformation.py by moving ad-hoc tests into a proper unittest suite, which is a great improvement for CI and maintainability. The changes also include significant enhancements to the benchmarking scripts, making them more configurable and robust. The updates to benchmark_generators.py and the new shell scripts for launching benchmarks are well-structured. I've found one minor semantic issue in how epoch_number is being set, which I've commented on. Overall, these are solid improvements to the project's testing and benchmarking infrastructure.

@hamishivi (Collaborator) left a comment

Minor nit on naming but otherwise g2g!

@finbarrtimbers finbarrtimbers added this pull request to the merge queue Nov 3, 2025
@hamishivi hamishivi removed this pull request from the merge queue due to a manual request Nov 3, 2025
finbarrtimbers and others added 2 commits November 3, 2025 10:05
Co-authored-by: Hamish Ivison <hamishivi@gmail.com>
Co-authored-by: Hamish Ivison <hamishivi@gmail.com>
@finbarrtimbers finbarrtimbers added this pull request to the merge queue Nov 3, 2025
Merged via the queue into main with commit 456d6a1 Nov 3, 2025
4 checks passed