refactor: Remove coupled Unsloth service, upgrade to vLLM 0.11+ #497
Merged
…orrect Llama naming conventions (#495)
…cessing and checkpoint saving. This change consolidates training input creation and log probability calculation into reusable functions, improving code maintainability and readability.
…ilure tracking and timeout handling. Update LoRA request handling in openai_server_task to ensure compatibility with Unsloth by checking for the lora_tensors attribute.
…unsloth and aiodns, adjust python version requirements, and modify torchao constraints. Enhance yes-no-maybe.ipynb with updated model registration and error handling for missing modules.
…unsloth, vllm, and transformers. Modify yes-no-maybe.ipynb to reflect model name changes and remove deprecated configurations. Enhance engine arguments in get_model_config to support vLLM 0.13+ compatibility, and refactor logging in server.py to utilize NewLineFormatter.
…oduce internal configuration settings. Modify decoupled_service.py to improve PEFT model initialization by checking for existing LoRA adapters, enhancing compatibility with loaded checkpoints.
…edUnslothService to optimize GPU memory usage. Update yes-no-maybe.py to comment out engine arguments for improved configuration management.
…d memory for faster tensor transfers, improving GPU memory management. Update yes-no-maybe.py to include initialization arguments for model configuration.
…ng.version for improved compatibility checks with vllm versioning.
…ig.py and LocalBackend. Delete UnslothService and ModelState classes to streamline architecture, enhancing maintainability and reducing complexity.
…te and DecoupledUnslothService. Streamline model configuration handling in get_model_config.py by eliminating obsolete checks, enhancing code clarity and maintainability.
…ompatibility. Remove obsolete environment variable checks in __init__.py and streamline import logic in server.py and patches.py.
…rted 'expandable_segments' configuration for PyTorch CUDA memory allocation. This change ensures compatibility with sleep mode functionality in the application.
…d related utilities into service.py, while removing the obsolete shared.py module. Update import paths in loss.py and backend.py for improved clarity and compatibility with the new structure.
…r.py for improved clarity. Update EngineArgs in engine.py to enhance readability of deprecated comments. Adjust formatting in backend.py and service.py for better code style consistency. Streamline tensor allocation logic in UnslothState and UnslothService to improve readability and maintainability.
…n guided_completion.py. Update message formatting in format_message.py to ensure proper handling of function tool calls. Simplify tool call representation in trajectories.py by utilizing model_dump for JSON output. Remove deprecated V1 engine checks in engine.py to streamline code.
…l call type. Remove deprecated patch functions from patches.py and update import statements in __init__.py for clarity and maintainability.
…d maintainability.
…at completions, improving response handling, and updating tool call management. Introduce logging for better error tracking and refactor imports for clarity.
…environment variable support for BASE_MODEL and MODEL_NAME. Adjust training step handling to utilize NUM_STEPS from environment variables, improving flexibility and configurability.
…adata and logging configuration. Adjust Python version in notebook metadata for compatibility.
… for isort. Refactor profile.ipynb to improve import order and enhance logging configuration. Update execution metadata in test_tokenize_trajectory_groups.ipynb for consistency and clarity.
…mports and adjusting their order for clarity. Update pyproject.toml to include 'wandb' as a known third-party package for isort. Refactor various scripts and notebooks to improve readability and maintainability, ensuring consistent import practices throughout the codebase.
…ps.ipynb for consistency. Remove unnecessary logging suppression during tokenizer loading to enhance clarity in the notebook's execution flow.
…imports and clearing output commands to enhance clarity. Introduce a new code cell for tokenizer initialization, improving the notebook's structure and readability.
…ne and PEFT arguments
…heir order in cli.py, trajectory_migration.py, log_constant_metrics_wandb.py, and test_trajectory_parquet.py for improved clarity and maintainability.
…message handling. Update `_flatten_message` function to accommodate `Choice` and `Choices` types. Enhance unit tests in `test_trajectory_parquet.py` with additional type checks and assertions for message roles, ensuring robust validation of trajectory data.
Upgrade to vLLM 0.11+, unsloth 2025.12, torch 2.8, and transformers 4.55+. Switch exclusively to decoupled Unsloth service architecture, removing legacy coupled mode and obsolete vLLM patches.
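For anyone checking a local environment against the minimums listed in this description, here is a minimal sketch. The distribution names in `MINIMUMS` and the helper names (`parse_release`, `meets_minimum`, `check_environment`) are assumptions for illustration and are not part of this PR; the comparison looks only at the leading numeric release segment, so pre-release and local-build suffixes are ignored.

```python
import re
from importlib import metadata

# Minimum versions copied from the PR description.
# The distribution names used as keys are assumptions.
MINIMUMS = {
    "vllm": "0.11",
    "unsloth": "2025.12",
    "torch": "2.8",
    "transformers": "4.55",
}


def parse_release(version: str) -> tuple[int, ...]:
    """Return the leading numeric release segment as a tuple of ints.

    Suffixes such as 'rc1' or '+cu128' are dropped, so this is a coarse
    check rather than a full PEP 440 comparison.
    """
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"no numeric release segment in {version!r}")
    return tuple(int(part) for part in match.group().split("."))


def meets_minimum(installed: str, minimum: str) -> bool:
    """Compare two versions after padding the shorter one with zeros."""
    a, b = parse_release(installed), parse_release(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def check_environment(minimums: dict[str, str] = MINIMUMS) -> dict[str, str]:
    """Report, per package, whether the installed version meets the minimum."""
    report = {}
    for name, minimum in minimums.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "not installed"
            continue
        if meets_minimum(installed, minimum):
            report[name] = "ok"
        else:
            report[name] = f"too old ({installed} < {minimum})"
    return report


if __name__ == "__main__":
    for package, status in check_environment().items():
        print(f"{package}: {status}")
```

Running the script prints one line per package; anything other than `ok` suggests the environment predates this upgrade.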