Conversation

@yashuatla
Owner

This PR contains changes from a range of commits from the original repository.

Commit Range: 516843d..c88d236
Files Changed: 47 (22 programming files)
Programming Ratio: 46.8%

Commits included:

ryangyuan and others added 13 commits April 24, 2025 14:25
* feat: replace tutorial link

* replace video link

---------

Co-authored-by: kevin-mindverse <kevin@mindverse.ai>
* Add CUDA support

- CUDA detection
- Memory handling
- Ollama model release after training
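
A minimal sketch of what the CUDA detection and post-training Ollama release could look like; the function names and the localhost endpoint are assumptions for illustration, not the PR's actual code.

```python
import requests
import torch


def cuda_available() -> bool:
    # Detect whether a CUDA-capable GPU is visible to PyTorch.
    return torch.cuda.is_available()


def release_ollama_model(model_name: str, host: str = "http://localhost:11434") -> None:
    # Ask Ollama to unload the model right away by requesting keep_alive=0,
    # freeing GPU/host memory before the training run starts.
    requests.post(
        f"{host}/api/generate",
        json={"model": model_name, "keep_alive": 0},
        timeout=30,
    )
```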

* Fix logging issue

Added a CUDA support flag so the log accurately reflects the CUDA toggle

* Update llama.cpp rebuild

Changed llama.cpp to check whether CUDA support is enabled and, if so, rebuild only during the first build rather than on each run

* Improved VRAM management

Enabled memory pinning and optimizer state offload
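
One common way to combine memory pinning with optimizer-state offload is a DeepSpeed ZeRO config along these lines; whether the PR uses DeepSpeed or another mechanism is an assumption.

```python
# Hypothetical DeepSpeed ZeRO-2 settings: optimizer state lives in host RAM
# ("offload_optimizer") and uses pinned (page-locked) buffers for faster
# host<->device transfers. Batch sizes are placeholders.
deepspeed_config = {
    "zero_optimization": {
        "stage": 2,
        "offload_optimizer": {
            "device": "cpu",
            "pin_memory": True,
        },
    },
    "train_micro_batch_size_per_gpu": 1,
    "gradient_accumulation_steps": 8,
}
```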

* Fix CUDA check

Rewrote the llama.cpp rebuild logic and added a manual y/n prompt asking whether the user wants to enable CUDA support

* Added fast restart and fixed CUDA check command

Added make docker-restart-backend-fast to restart the backend and reflect code changes without causing a full llama.cpp rebuild

Fixed the make docker-check-cuda command to correctly report CUDA support

* Added docker-compose.gpu.yml

Added docker-compose.gpu.yml to fix an error on machines without an NVIDIA GPU, and made sure a "\n" is added before the .env modification

* Fixed cuda toggle

The last push accidentally broke the CUDA toggle

* Code review fixes

Fixed errors resulting from removed code:
- Added return save_path to the end of the save_hf_model function
- Rolled back the download_file_with_progress function
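
A rough sketch of the save_hf_model fix, assuming a Transformers-style model/tokenizer pair; the real signature in the repo may differ.

```python
def save_hf_model(model, tokenizer, save_path: str) -> str:
    # Persist the fine-tuned model and tokenizer in Hugging Face format.
    model.save_pretrained(save_path)
    tokenizer.save_pretrained(save_path)
    # The code-review fix: return the path so callers still receive it.
    return save_path
```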

* Update Makefile

Use CUDA by default when running docker-restart-backend-fast

* Minor cleanup

Removed an unnecessary Makefile command and fixed GPU logging

* Delete .gpu_selected

* Simplified CUDA training code

- Removed the dtype setting to let torch handle it automatically
- Removed VRAM logging
- Removed unnecessary/old comments

* Fixed gpu/cpu selection

Made "make docker-use-gpu/cpu" command work with .gpu_selected flag and changed "make docker-restart-backend-fast" command to respect flag instead of always using gpu

* Fix Ollama embedding error

Added a custom exception class for Ollama embeddings, which appeared to be raised with keyword arguments while the base Python exception class only accepts positional ones
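
A minimal sketch of the kind of exception wrapper described above; the class name and stored fields are assumptions.

```python
class OllamaEmbeddingError(Exception):
    # The Ollama embedding path appeared to raise errors with keyword
    # arguments, which a plain Exception subclass rejects, so accept and
    # store them explicitly.
    def __init__(self, message: str = "", **kwargs):
        super().__init__(message)
        self.details = kwargs

    def __str__(self) -> str:
        base = super().__str__()
        return f"{base} {self.details}" if self.details else base
```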

* Fixed model selection & memory error (mindverse#279)

Fixed training defaulting to the 0.5B model regardless of selection, and fixed the "free(): double free detected in tcache 2" error caused by the CUDA flag being passed incorrectly

* feature: use uv to set up the Python environment

* TrainProcessService: add singleton method get_instance
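
A plausible shape for the get_instance singleton on TrainProcessService; the lock and constructor pass-through are assumptions about the implementation.

```python
import threading


class TrainProcessService:
    _instance = None
    _lock = threading.Lock()

    @classmethod
    def get_instance(cls, *args, **kwargs) -> "TrainProcessService":
        # Lazily create one shared service object; later calls return the
        # same instance regardless of the arguments they pass.
        if cls._instance is None:
            with cls._lock:
                if cls._instance is None:
                    cls._instance = cls(*args, **kwargs)
        return cls._instance
```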

* feat: fix code

* Added CUDA support (mindverse#228)

* fix: train service singleton

---------

Co-authored-by: Zachary Pitroda <30330004+zpitroda@users.noreply.github.com>
* fix: adjust status order

* fix: adjust train status

* fix: split the status of service and train (see the sketch below)
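
One way to picture splitting the status of service and train is two separate enums instead of a single shared status field; the names below are illustrative, not the PR's.

```python
from enum import Enum


class ServiceStatus(Enum):
    STOPPED = "stopped"
    STARTING = "starting"
    RUNNING = "running"


class TrainStatus(Enum):
    PENDING = "pending"
    IN_PROGRESS = "in_progress"
    FAILED = "failed"
    COMPLETED = "completed"
```
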
* Update README.md

Changed to the updated tutorial link

* Update README.md with FAQ

Added a new section for the FAQ doc
* fix: adjust status order

* fix: adjust train status

* fix: split the status of service and train

* feat: adjust train rule
* feat: what? no llama.cpp

* add cache