Updates due to renaming of repositories #91

Merged 1 commit on May 15, 2025.
2 changes: 1 addition & 1 deletion .tekton/pipelinerun.yaml
@@ -644,7 +644,7 @@ spec:
 secretName: "{{ git_auth_secret }}"
 - name: git-auth
 secret:
-secretName: "git-auth-secret-neuralmagic"
+secretName: "git-auth-secret-llm-d"
 # - name: registry-secret
 # secret:
 # secretName: quay-secret-llm-d
4 changes: 2 additions & 2 deletions Dockerfile
@@ -8,12 +8,12 @@ RUN dnf install -y gcc-c++ libstdc++ libstdc++-devel clang && dnf clean all

WORKDIR /workspace

-## NeuralMagic internal repos pull config
+## llm-d internal repos pull config
ARG GIT_NM_USER
ARG NM_TOKEN
### use git token
RUN echo -e "machine github.com\n\tlogin ${GIT_NM_USER}\n\tpassword ${NM_TOKEN}" >> ~/.netrc
-ENV GOPRIVATE=github.com/neuralmagic
+ENV GOPRIVATE=github.com/llm-d
ENV GIT_TERMINAL_PROMPT=1

# Copy the Go Modules manifests
4 changes: 2 additions & 2 deletions README.md
@@ -43,7 +43,7 @@ GIE. If you have something that's _llm-d specific_ then it should go here. If
you're not sure whether your feature belongs here or in the GIE, feel free to
create a [discussion] or ask on [Slack].

-[create an issue]:https://github.com/neuralmagic/llm-d-inference-scheduler/issues/new
+[create an issue]:https://github.com/llm-d/llm-d-inference-scheduler/issues/new
[Gateway API Inference Extension (GIE)]:https://github.com/kubernetes-sigs/gateway-api-inference-extension
-[discussion]:https://github.com/neuralmagic/llm-d-inference-scheduler/discussions/new?category=q-a
+[discussion]:https://github.com/llm-d/llm-d-inference-scheduler/discussions/new?category=q-a
[Slack]:https://llm-d.slack.com/
6 changes: 3 additions & 3 deletions cmd/epp/main.go
@@ -51,9 +51,9 @@ import (
runserver "sigs.k8s.io/gateway-api-inference-extension/pkg/epp/server"
"sigs.k8s.io/gateway-api-inference-extension/pkg/epp/util/logging"

-"github.com/neuralmagic/llm-d-inference-scheduler/internal/controller/runnable"
-"github.com/neuralmagic/llm-d-inference-scheduler/pkg/config"
-"github.com/neuralmagic/llm-d-inference-scheduler/pkg/scheduling/pd"
+"github.com/llm-d/llm-d-inference-scheduler/internal/controller/runnable"
+"github.com/llm-d/llm-d-inference-scheduler/pkg/config"
+"github.com/llm-d/llm-d-inference-scheduler/pkg/scheduling/pd"
)

const (
4 changes: 2 additions & 2 deletions go.mod
@@ -1,4 +1,4 @@
-module github.com/neuralmagic/llm-d-inference-scheduler
+module github.com/llm-d/llm-d-inference-scheduler

go 1.24.1

@@ -10,7 +10,7 @@ require (
github.com/go-logr/logr v1.4.2
github.com/google/go-cmp v0.7.0
github.com/hashicorp/golang-lru/v2 v2.0.7
-github.com/neuralmagic/llm-d-kv-cache-manager v0.0.0-20250508211654-1fbe7c5f15e9
+github.com/llm-d/llm-d-kv-cache-manager v0.0.0-20250515082302-b9deb04c44c5
github.com/prometheus/client_golang v1.22.0
github.com/stretchr/testify v1.10.0
go.uber.org/zap v1.27.0
4 changes: 2 additions & 2 deletions go.sum
@@ -98,6 +98,8 @@ github.com/kr/text v0.2.0 h1:5Nx0Ya0ZqY2ygV366QzturHI13Jq95ApcVaJBhpS+AY=
github.com/kr/text v0.2.0/go.mod h1:eLer722TekiGuMkidMxC/pM04lWEeraHUUmBw8l2grE=
github.com/kylelemons/godebug v1.1.0 h1:RPNrshWIDI6G2gRW9EHilWtl7Z6Sb1BR0xunSBf0SNc=
github.com/kylelemons/godebug v1.1.0/go.mod h1:9/0rRGxNHcop5bhtWyNeEfOS8JIWk580+fNqagV/RAw=
+github.com/llm-d/llm-d-kv-cache-manager v0.0.0-20250515082302-b9deb04c44c5 h1:BB02L+NP4zbsfZ23c5gCeKqxNTmArtzHYwiXIOr91mw=
+github.com/llm-d/llm-d-kv-cache-manager v0.0.0-20250515082302-b9deb04c44c5/go.mod h1:Hu7RvpUg5sP1xnQFfO2dbt96AjGPWKuUvWBWiHj/FUU=
github.com/mailru/easyjson v0.7.7 h1:UGYAvKxe3sBsEDzO8ZeWOSlIQfWFlxbzLZe7hwFURr0=
github.com/mailru/easyjson v0.7.7/go.mod h1:xzfreul335JAWq5oZzymOObrkdz5UnU4kGfJJLY9Nlc=
github.com/modern-go/concurrent v0.0.0-20180228061459-e0a39a4cb421/go.mod h1:6dJC0mAP4ikYIbvyc7fijjWJddQyLn8Ig3JB5CqoB9Q=
@@ -107,8 +109,6 @@ github.com/modern-go/reflect2 v1.0.2 h1:xBagoLtFs94CBntxluKeaWgTMpvLxC4ur3nMaC9G
github.com/modern-go/reflect2 v1.0.2/go.mod h1:yWuevngMOJpCy52FWWMvUC8ws7m/LJsjYzDa0/r8luk=
github.com/munnerz/goautoneg v0.0.0-20191010083416-a7dc8b61c822 h1:C3w9PqII01/Oq1c1nUAm88MOHcQC9l5mIlSMApZMrHA=
github.com/munnerz/goautoneg v0.0.0-20191010083416-a7dc8b61c822/go.mod h1:+n7T8mK8HuQTcFwEeznm/DIxMOiR9yIdICNftLE1DvQ=
-github.com/neuralmagic/llm-d-kv-cache-manager v0.0.0-20250508211654-1fbe7c5f15e9 h1:xqVxrZoIDwCZ+065w6XyU27IZJfe6XOUmvhIpbQvoD8=
-github.com/neuralmagic/llm-d-kv-cache-manager v0.0.0-20250508211654-1fbe7c5f15e9/go.mod h1:VB+KcEemkO1ZKpz/hgUPQMU9oSLv2uCLW6y6c+r8jkQ=
github.com/onsi/ginkgo/v2 v2.23.4 h1:ktYTpKJAVZnDT4VjxSbiBenUjmlL/5QkBEocaWXiQus=
github.com/onsi/ginkgo/v2 v2.23.4/go.mod h1:Bt66ApGPBFzHyR+JO10Zbt0Gsp4uWxu5mIOTusL46e8=
github.com/onsi/gomega v1.37.0 h1:CdEG8g0S133B4OswTDC/5XPSzE1OeP29QOioj2PID2Y=
4 changes: 2 additions & 2 deletions pkg/scheduling/dual/scheduler.go
@@ -18,8 +18,8 @@ import (
"sigs.k8s.io/gateway-api-inference-extension/pkg/epp/scheduling/types"
logutil "sigs.k8s.io/gateway-api-inference-extension/pkg/epp/util/logging"

-"github.com/neuralmagic/llm-d-inference-scheduler/pkg/scheduling/plugins/filter"
-"github.com/neuralmagic/llm-d-inference-scheduler/pkg/scheduling/plugins/scorer"
+"github.com/llm-d/llm-d-inference-scheduler/pkg/scheduling/plugins/filter"
+"github.com/llm-d/llm-d-inference-scheduler/pkg/scheduling/plugins/scorer"
)

// Scheduler implements the dual scheduler concept, along with a threshold
6 changes: 3 additions & 3 deletions pkg/scheduling/pd/scheduler.go
@@ -17,9 +17,9 @@ import (
"sigs.k8s.io/gateway-api-inference-extension/pkg/epp/scheduling/types"
logutil "sigs.k8s.io/gateway-api-inference-extension/pkg/epp/util/logging"

-"github.com/neuralmagic/llm-d-inference-scheduler/pkg/config"
-"github.com/neuralmagic/llm-d-inference-scheduler/pkg/scheduling/plugins/filter"
-"github.com/neuralmagic/llm-d-inference-scheduler/pkg/scheduling/plugins/scorer"
+"github.com/llm-d/llm-d-inference-scheduler/pkg/config"
+"github.com/llm-d/llm-d-inference-scheduler/pkg/scheduling/plugins/filter"
+"github.com/llm-d/llm-d-inference-scheduler/pkg/scheduling/plugins/scorer"
)

const (
6 changes: 3 additions & 3 deletions pkg/scheduling/pd/scheduler_test.go
@@ -14,9 +14,9 @@ import (
backendmetrics "sigs.k8s.io/gateway-api-inference-extension/pkg/epp/backend/metrics" // Import config for thresholds
"sigs.k8s.io/gateway-api-inference-extension/pkg/epp/scheduling/types"

-"github.com/neuralmagic/llm-d-inference-scheduler/pkg/config"
-"github.com/neuralmagic/llm-d-inference-scheduler/pkg/scheduling/pd"
-"github.com/neuralmagic/llm-d-inference-scheduler/pkg/scheduling/plugins/filter"
+"github.com/llm-d/llm-d-inference-scheduler/pkg/config"
+"github.com/llm-d/llm-d-inference-scheduler/pkg/scheduling/pd"
+"github.com/llm-d/llm-d-inference-scheduler/pkg/scheduling/plugins/filter"
)

// Tests the default scheduler configuration and expected behavior.
2 changes: 1 addition & 1 deletion pkg/scheduling/plugins/scorer/kvcache-aware.go
@@ -5,7 +5,7 @@ import (
"fmt"
"os"

-kvcache "github.com/neuralmagic/llm-d-kv-cache-manager/pkg/kv-cache"
+kvcache "github.com/llm-d/llm-d-kv-cache-manager/pkg/kv-cache"

"sigs.k8s.io/controller-runtime/pkg/log"
"sigs.k8s.io/gateway-api-inference-extension/pkg/epp/scheduling/plugins"
2 changes: 1 addition & 1 deletion pkg/scheduling/plugins/scorer/load_aware_scorer_test.go
@@ -13,7 +13,7 @@ import (
"sigs.k8s.io/gateway-api-inference-extension/pkg/epp/scheduling/plugins/picker"
"sigs.k8s.io/gateway-api-inference-extension/pkg/epp/scheduling/types"

-"github.com/neuralmagic/llm-d-inference-scheduler/pkg/scheduling/plugins/scorer"
+"github.com/llm-d/llm-d-inference-scheduler/pkg/scheduling/plugins/scorer"
)

func TestLoadBasedScorer(t *testing.T) {
2 changes: 1 addition & 1 deletion pkg/scheduling/plugins/scorer/prefix_aware_test.go
@@ -14,7 +14,7 @@ import (
backendmetrics "sigs.k8s.io/gateway-api-inference-extension/pkg/epp/backend/metrics"
"sigs.k8s.io/gateway-api-inference-extension/pkg/epp/scheduling/types"

-"github.com/neuralmagic/llm-d-inference-scheduler/pkg/scheduling/plugins/scorer"
+"github.com/llm-d/llm-d-inference-scheduler/pkg/scheduling/plugins/scorer"
)

func TestPrefixAwareScorer(t *testing.T) {
2 changes: 1 addition & 1 deletion pkg/scheduling/plugins/scorer/prefix_store_test.go
@@ -8,7 +8,7 @@ import (
k8stypes "k8s.io/apimachinery/pkg/types"
"sigs.k8s.io/controller-runtime/pkg/log"

-"github.com/neuralmagic/llm-d-inference-scheduler/pkg/scheduling/plugins/scorer"
+"github.com/llm-d/llm-d-inference-scheduler/pkg/scheduling/plugins/scorer"
)

// TestBasicPrefixOperations tests the basic functionality of adding and finding prefixes
2 changes: 1 addition & 1 deletion pkg/scheduling/plugins/scorer/session_affinity.go
@@ -61,7 +61,7 @@ func (s *SessionAffinity) Score(ctx *types.SchedulingContext, pods []types.Pod)
// PostResponse sets the session header on the response sent to the client
// TODO: this should be using a cookie and ensure not overriding any other
// cookie values if present.
-// Tracked in https://github.com/neuralmagic/llm-d-inference-scheduler/issues/28
+// Tracked in https://github.com/llm-d/llm-d-inference-scheduler/issues/28
func (s *SessionAffinity) PostResponse(ctx *types.SchedulingContext, pod types.Pod) {
ctx.Req.Headers[sessionTokenHeader] = base64.StdEncoding.EncodeToString([]byte(pod.GetPod().NamespacedName.String()))
}