
Conversation

@rendyhd (Contributor) commented Nov 20, 2025

Clustering was very slow for me, so this PR is my attempt to add GPU support for clustering. In my case it speeds it up by 11-12x.
I had to include a new Dockerfile because of the new dependencies. This made the Docker build incredibly slow, so I changed it to a more optimized version.

I haven't done a regression test on CPU clustering yet.

edit:

rendyhd and others added 9 commits on November 19, 2025 at 11:33:

Claude/add openrouter support 01 c fs8m g an5 ughu femm5yd f3
CONFIRMED: GPU (CUDA) is currently only used for Analysis, not Clustering

RESEARCH FINDINGS:
- GPU acceleration can provide 10-30x speedup for clustering tasks
- KMeans: 10-50x faster, DBSCAN: 5-100x faster, PCA: 10-40x faster
- Example: 5000 clustering runs that take 2-4 hours on CPU can complete in 5-15 minutes on GPU

IMPLEMENTATION:
Implemented GPU-accelerated clustering using RAPIDS cuML as an optional feature:

New Features:
- GPU-accelerated KMeans, DBSCAN, and PCA using RAPIDS cuML
- Automatic fallback to CPU if GPU unavailable or encounters errors
- New environment variable USE_GPU_CLUSTERING (default: false)
- Maintains all existing clustering options and settings
- Compatible with CUDA 12.2+ and NVIDIA GPUs

Files Modified:
1. config.py:
   - Added USE_GPU_CLUSTERING configuration flag

2. tasks/clustering_gpu.py (NEW):
   - Created GPU clustering module with RAPIDS cuML implementations
   - GPU-accelerated classes: GPUKMeans, GPUDBSCAN, GPUPCA
   - CPU-only wrappers: GPUGaussianMixture, GPUSpectralClustering
   - Factory functions for model creation with GPU/CPU selection
   - Automatic GPU availability detection and graceful fallback (a minimal sketch of this selection pattern follows the notes below)

3. tasks/clustering_helper.py:
   - Imported GPU clustering module with fallback handling
   - Updated _perform_single_clustering_iteration to use GPU PCA when enabled
   - Modified _apply_clustering_model to support GPU clustering
   - Maintains full backward compatibility with CPU-only mode

4. Dockerfile:
   - Added cupy-cuda12x and cuml-cu12 installation for NVIDIA builds
   - Only installs GPU packages when BASE_IMAGE is nvidia/cuda
   - CPU builds remain unchanged and lightweight

5. deployment/.env.example:
   - Added USE_GPU_CLUSTERING configuration with documentation
   - Default: false (CPU only, backward compatible)

6. README.md:
   - Added "GPU Acceleration for Clustering" section
   - Documented performance improvements and usage instructions
   - Listed supported algorithms and compatibility requirements
   - Noted that GaussianMixture and SpectralClustering use CPU (no GPU version)

Notes:
- GPU clustering is OPTIONAL and disabled by default
- CPU clustering remains the default for backward compatibility
- All existing clustering parameters and settings are preserved
- GaussianMixture and SpectralClustering always use CPU (no cuML implementation)
- GPU usage: Analysis (ONNX inference) + Clustering (RAPIDS cuML, optional)
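
For illustration, a minimal sketch of the selection/fallback pattern described above. The factory name `create_kmeans` is hypothetical; `USE_GPU_CLUSTERING` and `check_gpu_available` are the names referenced in this PR, but this is not the actual tasks/clustering_gpu.py code:

```python
# Minimal sketch of the GPU/CPU selection pattern (illustrative, not the PR code).
import os

# Mirrors the config.py flag described above: read the env var, default to false.
USE_GPU_CLUSTERING = os.environ.get("USE_GPU_CLUSTERING", "false").lower() == "true"


def check_gpu_available() -> bool:
    """True only if the RAPIDS stack imports cleanly and a CUDA device is visible."""
    try:
        import cupy
        import cuml  # noqa: F401
        return cupy.cuda.runtime.getDeviceCount() > 0
    except Exception:
        # Missing packages, missing driver, or no device: fall back to CPU.
        return False


def create_kmeans(n_clusters: int, **kwargs):
    """Hypothetical factory: cuML KMeans when enabled and available, else scikit-learn."""
    if USE_GPU_CLUSTERING and check_gpu_available():
        from cuml.cluster import KMeans as GPUKMeans
        return GPUKMeans(n_clusters=n_clusters, **kwargs)
    from sklearn.cluster import KMeans
    return KMeans(n_clusters=n_clusters, **kwargs)
```

In such a scheme, GaussianMixture and SpectralClustering would always take the scikit-learn branch, since cuML has no equivalent (as noted above).
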

The bash -lc subshell prevented access to the BASE_IMAGE ARG, causing the GPU packages (cupy, cuml) to never be installed.

Changed to the 'set -ux;' pattern (matching the base stage), which properly accesses Docker ARG variables in the current shell.

This ensures cuML and cupy are installed when building with nvidia/cuda base images, enabling GPU-accelerated clustering.

The new Docker image needed packages that were quite large, resulting in a very slow Docker build.
@rendyhd (Contributor, Author) commented Nov 20, 2025

I just did a clean install and found that this breaks the Analysis; I previously only tested it on Clustering. Please hold while I check.

@NeptuneHub (Owner):

Please let me know when you've finished that fix. And yes, thanks, please do a full round of tests on ALL the functionality, to be sure that even functionality that isn't "directly connected" continues working.

Also, I'm a bit sceptical about having both a clustering.py and a clustering_gpu.py. Is it possible to use your library for both CPU and GPU, so we don't have duplicated code to maintain? Will it also run on old CPUs?

@rendyhd (Contributor, Author) commented Nov 20, 2025

The mistake wasn't in the code; I had built the image with the wrong base image.
Just finished a full test on the GPU/CUDA version. I haven't tested a CPU-only image.
Can confirm:

  • Analysis and Clustering are both GPU accelerated.
  • Improved response handling from OpenRouter.
  • All other pages have been tested and still work.
  • Note: I haven't done any API-specific testing.

@NeptuneHub (Owner):

Thanks for your effort, but I think having a totally separate implementation for clustering is not OK, because it duplicates the code we have to maintain.

Please check whether the same library that supports GPU can also support CPU. In that case, put everything in one file (clustering.py) that calls the same functions.

While doing this, please run some tests to confirm it works well on both CPU and GPU. Also check (maybe in the documentation) that the library works on ARM and on old Intel CPUs too (some users run very old CPUs; you can have a look at past closed issues to read more).

Meanwhile I'm doing some tests to check how it performs as-is on GPU.

@rendyhd (Contributor, Author):

I'll try to look into it. I separated it on purpose because I didn't want to mess up CPU compatibility, and the package already in use didn't support GPU acceleration. In my case clustering took over 50 hours, which made testing very hard.
I understand your concern about having to maintain two versions, though.

@NeptuneHub (Owner) commented Nov 21, 2025:

OK, but please pay attention: in other issues related to old CPU support (generally related to AVX), merely including the library/code, even without running the functionality itself, prevented the entire application from starting.

For Artist Similarity, some days ago I introduced a Python library that uses a C library under the hood requiring AVX, and it prevented the entire application from starting.

If this library requires AVX, the only alternative is to look for an alternative version, or for the possibility of recompiling it without AVX.

Another thing that makes sense for people with big libraries is to run clustering with only 1000 runs instead of 5000, which was a super conservative default.

@rendyhd (Contributor, Author):

That's good input, I hadn't considered that. I've only been looking at optimizing for CUDA.
That also makes it a lot more complex. I appreciate that you take this into account; I can imagine a lot of people in this hobby run their music library on an old server.

@NeptuneHub (Owner):

The cache doesn't seem to be working correctly; each rebuild takes several minutes on these steps even if I don't change a line of code. Please review:

```
 => [audiomuse-ai-flask] exporting to docker image format                                          22.7s
 => => exporting layers                                                                             0.0s
 => => exporting manifest sha256:3c5fe6420f4c612992b198d7fdd8fdb0e082ac2690526d71e1f1763a97b0a5a3   0.0s
 => => exporting config sha256:b00acf6664a8f1a6366be39010fac462b48063e156874a6f08b7794349ebfb03     0.0s
 => => sending tarball                                                                             22.7s
 => [audiomuse-ai-worker] exporting to docker image format                                         22.7s
 => => exporting layers                                                                             0.0s
 => => exporting manifest sha256:1332fa40105ad14378572056f87954783640142115eea4b503b9779abbfed53f   0.0s
 => => exporting config sha256:a6a0206adf191b8af770cf1601ca97fd4f5b7ed41140826c1a484ed3a27bca45     0.0s
 => => sending tarball                                                                             22.7s
```

@rendyhd (Contributor, Author):

I've been experiencing a slow build as well, especially with the new packages for clustering. I've been trying to optimize it, but it's not yet where I'd like it to be.
Regarding the log: the (exporting to docker image format -> sending tarball) steps indicate that the build execution itself is fast (cached), but the export phase is slow. I think the delay is because Docker is moving the final image data from the BuildKit engine to the Docker daemon.
I'll keep looking into this too.

@NeptuneHub (Owner) commented Nov 21, 2025:

See issue #189: in that issue alone, three users are running AudioMuse-AI on older CPUs. While I'd love to leverage modern GPUs, I don't want to exclude these users. Without GPU support they can still run it; with incompatible dependencies, they can't run it at all.

From a quick search I'm also really in doubt that cuML can support old CPUs. So OK, keep the two implementations, BUT you need to check the import of the new one at runtime.

So, with a try/except: if the import of cuML fails, in the except you fall back to the old one (see the sketch below). But please be very careful with this, or those users will knock on your door if it stops working :D

Edit: anyway really thanks for your effort and your patience with my review :)
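
A tiny, purely illustrative sketch of that suggestion: alias the cuML class to the scikit-learn one at import time, so the rest of the code calls the same name either way.

```python
# Import-time fallback sketch: if cuML cannot be imported (no GPU stack, old CPU),
# transparently use the scikit-learn implementation under the same name.
try:
    from cuml.cluster import KMeans  # GPU implementation
    CUML_AVAILABLE = True
except Exception:
    from sklearn.cluster import KMeans  # CPU fallback, same call sites
    CUML_AVAILABLE = False
```

With this approach, clustering.py can keep a single code path; only the import block differs between GPU and CPU environments.
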

rendyhd and others added 4 commits on November 21, 2025 at 18:28:

Dealing with empty responses, SQL handling, and how it retries.
The pre-requirements also used to fail silently; now that's explicit and it asks for a retry.
- Dockerfile: Conditionally install `voyager` only in CUDA/GPU images. `voyager` requires AVX instructions which may crash on old CPUs. The non-CUDA image will now skip `voyager` to ensure compatibility with older hardware (Celeron/ARM).
- tasks/artist_gmm_manager.py:
    - Wrap `import voyager` in try-except block to handle missing library gracefully on old CPUs.
    - Add `VOYAGER_AVAILABLE` checks to disable artist similarity features if the library is missing, preventing crashes (a minimal sketch of this guard follows the list below).
    - Add support for GPU-accelerated GMM using `cuml.mixture.GaussianMixture` if available, falling back to `sklearn` otherwise.
- tasks/voyager_manager.py:
    - Wrap `import voyager` in try-except block.
    - Add `VOYAGER_AVAILABLE` checks to `build_and_store_voyager_index` and `load_voyager_index_for_querying` to prevent crashes on non-AVX systems.
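
For illustration, a minimal sketch of the guard pattern these commit notes describe; the function signature shown is hypothetical, and only the `VOYAGER_AVAILABLE` flag and function name come from the notes above.

```python
# Sketch of the guarded voyager import: on images where voyager is not installed
# (the non-CUDA build skips it for old, non-AVX CPUs), the import raises and the
# artist-similarity features are disabled instead of crashing the whole app.
import logging

logger = logging.getLogger(__name__)

try:
    import voyager
    VOYAGER_AVAILABLE = True
except Exception:
    voyager = None
    VOYAGER_AVAILABLE = False


def build_and_store_voyager_index(embeddings):
    """Hypothetical signature; the point is the graceful early return."""
    if not VOYAGER_AVAILABLE:
        logger.warning("voyager not available (non-AVX CPU build); skipping index build.")
        return None
    index = voyager.Index(voyager.Space.Cosine, num_dimensions=len(embeddings[0]))
    index.add_items(embeddings)
    return index
```
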
@NeptuneHub (Owner):

Meanwhile, checking your code, I see that def check_gpu_available(): already tries to import cuml and cupy in a try/except, and that should be sufficient to not break stuff on old CPUs for this library.

So, on the point of avoiding breaking changes, have a look at whether any other "breaking" library is imported in that way; that should already be sufficient.

Also let me know if you find a way to improve the build time of the image.

When you're finished, just let me know and I'll do another round of tests.

Thanks again for your effort, it's really appreciated!

@rendyhd (Contributor, Author) commented Nov 23, 2025

> (quoting @NeptuneHub's previous comment in full)

I've made a first attempt at looking into it, I just haven't had the time to really test it yet. Still trying to optimize the build time of the image, because testing takes forever now :(

@NeptuneHub (Owner):

You're right.
I think that:

  • The 2-step build is nice for getting a smaller image, but for building multiple times it's a nightmare. For now I'll remove it.
  • Is there something that you're compiling? Maybe a precompiled version of the same library already exists?

As you noticed, we can't afford an image that takes minutes to rebuild without any change to the code. Better to have a bigger image for now.

@rendyhd (Contributor, Author) commented Nov 26, 2025

Just wanted to let you know that I should have some updates this weekend! :)

@rendyhd (Contributor, Author) commented Nov 29, 2025

Sorry that it took a while, I had to replace my SSD. This version should keep the same level of compatibility, and a rebuild took 7 seconds. I tested Analysis and Clustering on GPU, not the artist function; I'll look at that separately.

Kept the uv-based Dockerfile with the requirements files from feat/gpu-clustering.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
@NeptuneHub (Owner):

OK, I'm starting to test it, still building (but maybe because it's the first time I build this new version). Then I would like to check why the unit tests are failing.

In parallel, if you can, do a check on all the functionality; especially have a look that the CPU version also keeps working.

I would like to merge your work and then do a new release, but first I need to be sure that everything is OK :)

@NeptuneHub (Owner) commented Nov 29, 2025

I'm doing some tests; as I go, I'll update this post. Please have a look at them, because I put some suggestions/requests for changes in the middle. Thanks.

TEST 1 - BUILD TIME - BETTER, COULD BE IMPROVED?

For the first test, I'm building it locally with Docker using this docker-compose file:

```yaml
services:
  # Redis service for RQ (task queue)
  redis:
    image: redis:7-alpine
    container_name: audiomuse-redis
    ports:
      - "6379:6379" # Expose Redis port to the host
    volumes:
      - redis-data:/data # Persistent storage for Redis data
    restart: unless-stopped

  # PostgreSQL database service
  postgres:
    image: postgres:15-alpine
    container_name: audiomuse-postgres
    environment:
      POSTGRES_USER: ${POSTGRES_USER:-audiomuse}
      POSTGRES_PASSWORD: ${POSTGRES_PASSWORD:-audiomusepassword}
      POSTGRES_DB: ${POSTGRES_DB:-audiomusedb}
    ports:
      - "5432:5432"
    volumes:
      - postgres-data:/var/lib/postgresql/data
    restart: unless-stopped

  # AudioMuse-AI Flask application service (GPU-enabled)
  audiomuse-ai-flask:
    image: audiomuse-ai:pr195-gpu
    build:
      context: ../       # Build from project root (one level up from deployment/)
      dockerfile: Dockerfile
      args:  # GPU build args
        BASE_IMAGE: nvidia/cuda:12.8.1-cudnn-runtime-ubuntu22.04
    container_name: audiomuse-ai-flask-app
    ports:
      - "8000:8000"
    environment:
      SERVICE_TYPE: "flask"
      MEDIASERVER_TYPE: "navidrome"
      NAVIDROME_URL: "${NAVIDROME_URL}"
      NAVIDROME_USER: "${NAVIDROME_USER}"
      NAVIDROME_PASSWORD: "${NAVIDROME_PASSWORD}"
      POSTGRES_USER: ${POSTGRES_USER:-audiomuse}
      POSTGRES_PASSWORD: ${POSTGRES_PASSWORD:-audiomusepassword}
      POSTGRES_DB: ${POSTGRES_DB:-audiomusedb}
      POSTGRES_HOST: "postgres"
      POSTGRES_PORT: "${POSTGRES_PORT:-5432}"
      REDIS_URL: "${REDIS_URL:-redis://redis:6379/0}"
      AI_MODEL_PROVIDER: "${AI_MODEL_PROVIDER}"
      OPENAI_API_KEY: "${OPENAI_API_KEY}"
      OPENAI_SERVER_URL: "${OPENAI_SERVER_URL}"
      OPENAI_MODEL_NAME: "${OPENAI_MODEL_NAME}"
      GEMINI_API_KEY: "${GEMINI_API_KEY}"
      MISTRAL_API_KEY: "${MISTRAL_API_KEY}"
      TEMP_DIR: "/app/temp_audio"
      USE_GPU_CLUSTERING: "${USE_GPU_CLUSTERING:-false}"
    volumes:
      - temp-audio-flask:/app/temp_audio
    depends_on:
      - redis
      - postgres
    restart: unless-stopped
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              device_ids: ["0"]
              capabilities: [gpu]

  # AudioMuse-AI RQ Worker service (GPU-enabled)
  # NOTE: This service uses the SAME image as flask, built only once
  audiomuse-ai-worker:
    image: audiomuse-ai:pr195-gpu  # Reuses the image built above
    environment:
      SERVICE_TYPE: "worker"
      MEDIASERVER_TYPE: "navidrome"
      NAVIDROME_URL: "${NAVIDROME_URL}"
      NAVIDROME_USER: "${NAVIDROME_USER}"
      NAVIDROME_PASSWORD: "${NAVIDROME_PASSWORD}"
      POSTGRES_USER: ${POSTGRES_USER:-audiomuse}
      POSTGRES_PASSWORD: ${POSTGRES_PASSWORD:-audiomusepassword}
      POSTGRES_DB: ${POSTGRES_DB:-audiomusedb}
      POSTGRES_HOST: "postgres"
      POSTGRES_PORT: "${POSTGRES_PORT:-5432}"
      REDIS_URL: "${REDIS_URL:-redis://redis:6379/0}"
      AI_MODEL_PROVIDER: "${AI_MODEL_PROVIDER}"
      OPENAI_API_KEY: "${OPENAI_API_KEY}"
      OPENAI_SERVER_URL: "${OPENAI_SERVER_URL}"
      OPENAI_MODEL_NAME: "${OPENAI_MODEL_NAME}"
      GEMINI_API_KEY: "${GEMINI_API_KEY}"
      MISTRAL_API_KEY: "${MISTRAL_API_KEY}"
      TEMP_DIR: "/app/temp_audio"
      USE_GPU_CLUSTERING: "${USE_GPU_CLUSTERING:-false}"
    volumes:
      - temp-audio-worker:/app/temp_audio
    depends_on:
      - redis
      - postgres
    restart: unless-stopped
    deploy:
      replicas: 3  # Run 3 worker instances
      resources:
        reservations:
          devices:
            - driver: nvidia
              device_ids: ["0"]
              capabilities: [gpu]

# Define volumes for persistent data and temporary files
volumes:
  redis-data:
  postgres-data:
  temp-audio-flask:
  temp-audio-worker:
```

It took around 2 minutes to rebuild when I only changed an env variable, no code, rather than seconds like the old build. Is there something wrong in my docker-compose file?

TEST 2 - K-MEANS RUNNING TIME - GOOD 3X ON GPU VS CPU

K-Means clustering (40 runs, 100 stratified, ~1100 songs): CPU 00:02:20 → GPU 00:00:46 (~3.0× faster; 94 s saved, ~67% runtime reduction).

So here the result is very good; it should be tested with more runs, but it is already good.

TEST 3 - GPU VERSION - OTHER FUNCTIONALITY TESTS ON A NAVIDROME-LIKE SERVER (AudioMuse-AI Music Server)

The other functionality tests on the GPU version with a Navidrome-like server seem OK. If you can also test with Jellyfin, for example, that would give better coverage (even though your change shouldn't be server-related).

  • Analysis => OK
  • Clustering => OK with K-MEANS, other not tested
  • Instant Playlist => OK
  • Playlist from Similar Song => OK
  • Artist Similarity => OK
  • Song Path => OK
  • Song Alchemy => OK
  • Music Map => OK
  • Sonic Fingerprint => OK
  • Waveform => OK
  • Cleaning => OK

TEST 4 - UNIT TEST

Unit tests pass except for /tests/unit/test_ai.py, which needs a fix in the test itself to address the shorter playlist names created with AI. Please see my review below and fix the test.

@rendyhd (Contributor, Author) commented Nov 29, 2025

I see you have replicas in your compose; I tried that, but it didn't perform well on my setup, so I moved away from it:

    deploy:
      replicas: 3  # Run 3 worker instances

Rebuild with .env change took me 6.1s:
command: docker build --build-arg BASE_IMAGE=nvidia/cuda:12.8.1-cudnn-runtime-ubuntu22.04 -t audiomuse-ai-gpu:test .
[+] Building 6.1s (25/25) FINISHED

Then docker compose up took 12 s:
docker-compose -f docker-compose.dev.yaml up
[+] Running 4/4
✔ Container audiomuse-postgres-dev Running 0.0s
✔ Container audiomuse-redis-dev Running 0.0s
✔ Container audiomuse-ai-flask-app-dev Recreated 12.1s
✔ Container audiomuse-ai-worker-instance-dev Recreated 11.3s
Attaching to audiomuse-ai-flask-app-dev, audiomuse-ai-worker-instance-dev, audiomuse-postgres-dev, audiomuse-redis-dev

However, I've been using "docker compose -f deployment/docker-compose.dev.yaml restart audiomuse-ai-worker" most of the time, which takes about 2 s, with the following compose:

```yaml
services:
  # Redis service for RQ (task queue)
  redis:
    image: redis:7-alpine
    container_name: audiomuse-redis-dev
    ports:
      - "6379:6379" # Expose Redis port to the host
    volumes:
      - redis-data:/data # Persistent storage for Redis data
    restart: unless-stopped

  # PostgreSQL database service
  postgres:
    image: postgres:15-alpine
    container_name: audiomuse-postgres-dev
    environment:
      POSTGRES_USER: ${POSTGRES_USER:-audiomuse}
      POSTGRES_PASSWORD: ${POSTGRES_PASSWORD:-audiomusepassword}
      POSTGRES_DB: ${POSTGRES_DB:-audiomusedb}
    ports:
      - "5432:5432" # Expose PostgreSQL port to the host
    volumes:
      - postgres-data:/var/lib/postgresql/data # Persistent storage for PostgreSQL data
    restart: unless-stopped

  # AudioMuse-AI Flask application service (DEV MODE)
  audiomuse-ai-flask:
    image: audiomuse-ai-gpu:test
    container_name: audiomuse-ai-flask-app-dev
    ports:
      - "8000:8000" # Map host port 8000 to container port 8000
    environment:
      SERVICE_TYPE: "flask" # Tells the container to run the Flask app
      MEDIASERVER_TYPE: "jellyfin" # Specify the media server type
      JELLYFIN_USER_ID: "${JELLYFIN_USER_ID}"
      JELLYFIN_TOKEN: "${JELLYFIN_TOKEN}"
      JELLYFIN_URL: "${JELLYFIN_URL}"
      # DATABASE_URL is now constructed by config.py from the following:
      POSTGRES_USER: ${POSTGRES_USER:-audiomuse}
      POSTGRES_PASSWORD: ${POSTGRES_PASSWORD:-audiomusepassword}
      POSTGRES_DB: ${POSTGRES_DB:-audiomusedb}
      POSTGRES_HOST: "postgres" # Service name of the postgres container
      POSTGRES_PORT: "${POSTGRES_PORT:-5432}"
      REDIS_URL: "${REDIS_URL:-redis://redis:6379/0}" # Connects to the 'redis' service
      AI_MODEL_PROVIDER: "${AI_MODEL_PROVIDER}"
      OPENAI_API_KEY: "${OPENAI_API_KEY}"
      OPENAI_SERVER_URL: "${OPENAI_SERVER_URL}"
      OPENAI_MODEL_NAME: "${OPENAI_MODEL_NAME}"
      GEMINI_API_KEY: "${GEMINI_API_KEY}"
      MISTRAL_API_KEY: "${MISTRAL_API_KEY}"
      TEMP_DIR: "/app/temp_audio"
      FLASK_ENV: "development" # Enable Flask debug mode
      FLASK_DEBUG: "1"
    volumes:
      # Mount all application code as volumes for live editing
      - ../app.py:/app/app.py
      - ../app_alchemy.py:/app/app_alchemy.py
      - ../app_analysis.py:/app/app_analysis.py
      - ../app_artist_similarity.py:/app/app_artist_similarity.py
      - ../app_chat.py:/app/app_chat.py
      - ../app_clustering.py:/app/app_clustering.py
      - ../app_collection.py:/app/app_collection.py
      - ../app_cron.py:/app/app_cron.py
      - ../app_extend_playlist.py:/app/app_extend_playlist.py
      - ../app_external.py:/app/app_external.py
      - ../app_helper.py:/app/app_helper.py
      - ../app_helper_artist.py:/app/app_helper_artist.py
      - ../app_map.py:/app/app_map.py
      - ../app_path.py:/app/app_path.py
      - ../app_sonic_fingerprint.py:/app/app_sonic_fingerprint.py
      - ../app_voyager.py:/app/app_voyager.py
      - ../app_waveform.py:/app/app_waveform.py
      - ../ai.py:/app/ai.py
      - ../config.py:/app/config.py
      - ../rq_janitor.py:/app/rq_janitor.py
      - ../rq_worker.py:/app/rq_worker.py
      - ../rq_worker_high_priority.py:/app/rq_worker_high_priority.py
      - ../verify_changes.py:/app/verify_changes.py
      # Mount directories
      - ../static:/app/static
      - ../templates:/app/templates
      - ../tasks:/app/tasks
      - ../query:/app/query
      - ../test:/app/test
      - ../tests:/app/tests
      # Supervisor config
      - ./supervisord.conf:/etc/supervisor/conf.d/supervisord.conf
      # Temp audio directory
      - temp-audio-flask:/app/temp_audio
    depends_on:
      - redis
      - postgres
    restart: unless-stopped

  # AudioMuse-AI RQ Worker service (DEV MODE)
  audiomuse-ai-worker:
    image: audiomuse-ai-gpu:test
    container_name: audiomuse-ai-worker-instance-dev
    environment:
      SERVICE_TYPE: "worker" # Tells the container to run the RQ worker
      MEDIASERVER_TYPE: "jellyfin" # Specify the media server type
      JELLYFIN_USER_ID: "${JELLYFIN_USER_ID}"
      JELLYFIN_TOKEN: "${JELLYFIN_TOKEN}"
      JELLYFIN_URL: "${JELLYFIN_URL}"
      # DATABASE_URL is now constructed by config.py from the following:
      POSTGRES_USER: ${POSTGRES_USER:-audiomuse}
      POSTGRES_PASSWORD: ${POSTGRES_PASSWORD:-audiomusepassword}
      POSTGRES_DB: ${POSTGRES_DB:-audiomusedb}
      POSTGRES_HOST: "postgres" # Service name of the postgres container
      POSTGRES_PORT: "${POSTGRES_PORT:-5432}"
      REDIS_URL: "${REDIS_URL:-redis://redis:6379/0}" # Connects to the 'redis' service
      AI_MODEL_PROVIDER: "${AI_MODEL_PROVIDER}"
      OPENAI_API_KEY: "${OPENAI_API_KEY}"
      OPENAI_SERVER_URL: "${OPENAI_SERVER_URL}"
      OPENAI_MODEL_NAME: "${OPENAI_MODEL_NAME}"
      GEMINI_API_KEY: "${GEMINI_API_KEY}"
      MISTRAL_API_KEY: "${MISTRAL_API_KEY}"
      TEMP_DIR: "/app/temp_audio"
      NVIDIA_VISIBLE_DEVICES: "0"
      NVIDIA_DRIVER_CAPABILITIES: "compute,utility"
      USE_GPU_CLUSTERING: "${USE_GPU_CLUSTERING:-true}"
    volumes:
      # Mount all application code as volumes for live editing
      - ../app.py:/app/app.py
      - ../app_alchemy.py:/app/app_alchemy.py
      - ../app_analysis.py:/app/app_analysis.py
      - ../app_artist_similarity.py:/app/app_artist_similarity.py
      - ../app_chat.py:/app/app_chat.py
      - ../app_clustering.py:/app/app_clustering.py
      - ../app_collection.py:/app/app_collection.py
      - ../app_cron.py:/app/app_cron.py
      - ../app_extend_playlist.py:/app/app_extend_playlist.py
      - ../app_external.py:/app/app_external.py
      - ../app_helper.py:/app/app_helper.py
      - ../app_helper_artist.py:/app/app_helper_artist.py
      - ../app_map.py:/app/app_map.py
      - ../app_path.py:/app/app_path.py
      - ../app_sonic_fingerprint.py:/app/app_sonic_fingerprint.py
      - ../app_voyager.py:/app/app_voyager.py
      - ../app_waveform.py:/app/app_waveform.py
      - ../ai.py:/app/ai.py
      - ../config.py:/app/config.py
      - ../rq_janitor.py:/app/rq_janitor.py
      - ../rq_worker.py:/app/rq_worker.py
      - ../rq_worker_high_priority.py:/app/rq_worker_high_priority.py
      - ../verify_changes.py:/app/verify_changes.py
      # Mount directories
      - ../static:/app/static
      - ../templates:/app/templates
      - ../tasks:/app/tasks
      - ../query:/app/query
      - ../test:/app/test
      - ../tests:/app/tests
      # Supervisor config
      - ./supervisord.conf:/etc/supervisor/conf.d/supervisord.conf
      # Temp audio directory
      - temp-audio-worker:/app/temp_audio
    depends_on:
      - redis
      - postgres
    restart: unless-stopped
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]

# Define volumes for persistent data and temporary files
volumes:
  redis-data:
    external:
      name: audiomuse_redis-data
  postgres-data:
    external:
      name: audiomuse_postgres-data-test
  temp-audio-flask:
    external:
      name: audiomuse_temp-audio-flask
  temp-audio-worker:
    external:
      name: audiomuse_temp-audio-worker
```

Inline review on the diff hunk where MIN_LENGTH was changed from 10 to 5:

```
    Applies length constraints after getting the name.
    """
-MIN_LENGTH = 10
+MIN_LENGTH = 5
```
@NeptuneHub (Owner):

In /tests/unit/test_ai.py, in def test_applies_length_constraints(self, mock_ollama, mock_clean): at line 502, you need to change this:

        mock_ollama.return_value = "Short"
        mock_clean.return_value = "Short"

to, for example, this:

        mock_ollama.return_value = "Test"
        mock_clean.return_value = "Test"

to reflect the fact that we now accept names with a minimum of 5 characters ("Short" is exactly 5 characters, so it no longer triggers the minimum-length constraint, while the 4-character "Test" still does); otherwise the unit test fails.

@NeptuneHub (Owner):

Now the re-compile time is better; it took around 2 minutes. Is there something I can improve in the docker-compose file to achieve your 7 seconds when building the GPU version?
I still have to check how the CPU version works.

@NeptuneHub merged commit 7cad868 into NeptuneHub:main on Nov 29, 2025 (1 of 2 checks passed).