
Conversation

@justinyeh1995 justinyeh1995 commented Dec 19, 2025

Why are these changes needed?

Add shell completion for two specific commands: kubectl ray get workergroup and kubectl ray get node.

The expected behaviors are as follows:

$ kubectl ray get workergroup <TAB> # list out workergroups in the default namespace
$ kubectl ray get workergroup -n <namespace> <TAB> # list out workergroups in this <namespace>
$ kubectl ray get workergroup --ray-cluster <raycluster name> <TAB> # list out workergroups in this specific RayCluster
$ kubectl ray get workergroup -n <namespace> --ray-cluster <raycluster name> <TAB> # list out workergroups in the specific RayCluster in this namespace
$ kubectl ray get workergroup -A # no shell completion (falls back to file completion)

If no resources are fetched, kubectl will fall back to the default completion behavior, showing directories.

The same goes for kubectl ray get node.
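
For example (the names below are illustrative):

$ kubectl ray get node <TAB>
raycluster-sample-head-x2p9q    raycluster-sample-small-group-worker-6k7vd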

Demo video

Screen.Recording.2025-12-20.at.4.12.28.PM.mov

Manual Testing Steps

  1. Build and install the local kubectl-ray binary

From the KubeRay repository root:

pushd kubectl-plugin > /dev/null
go build cmd/kubectl-ray.go
mv kubectl-ray ~/.krew/bin
popd > /dev/null

  2. Follow this doc to enable shell completion for kubectl ray:

kubectl krew install --manifest-url https://raw.githubusercontent.com/marckhouzam/kubectl-plugin_completion/v0.1.0/plugin-completion.yaml

kubectl plugin-completion generate
export PATH=$PATH:$HOME/.kubectl-plugin-completion
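
  3. Reload your shell config and verify completion works (zsh shown):

source ~/.zshrc
kubectl ray get workergroup <TAB>
kubectl ray get node <TAB>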

Related issue number

Closes #3051

Checks

  • I've made sure the tests are passing.
  • Testing Strategy
    • Unit tests
    • Manual tests

@justinyeh1995 justinyeh1995 force-pushed the feature/3051-kubectl-shell-completion-ray-get-workergroups-nodes branch from 01d8732 to b7c557f on December 21, 2025 09:47
Comment on lines +174 to +190
// createRayNodeLabelSelectors creates a map of K8s label selectors for Ray nodes
// TODO: duplicated function as in kubectl/pkg/cmd/get/get_nodes.go
func createRayNodeLabelSelectors(cluster string) map[string]string {
	labelSelectors := map[string]string{
		util.RayIsRayNodeLabelKey: "yes",
	}
	if cluster != "" {
		labelSelectors[util.RayClusterLabelKey] = cluster
	}
	return labelSelectors
}

Contributor Author
@justinyeh1995 justinyeh1995 Dec 21, 2025

Similarly to joinLabelMap, I grabbed this function from kubectl-plugin/pkg/cmd/get/get_nodes.go.

Collaborator
@machichima machichima Dec 25, 2025

For this one, I think it's OK to expose it as a public function to be used in different places. Maybe we can put it under pkg/util and import it from there? Would like to know what others think about this one.
@seanlaii @troychiu @JiangJiaWei1103 @AndySung320

Contributor

I agree that factoring this out into a shared helper makes sense.
The only open question might be what the most appropriate name / package would be, given that this is CLI-specific logic.

Collaborator

I noticed there are many createRayxxxLabelSelectors functions; maybe we can extract these into a shared utility, such as pkg/util/selector.go.

This can be a follow-up PR.
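
For instance, the shared helper could look roughly like this under pkg/util (a sketch reusing the existing util label-key constants; the final name and location are still open):

package util

// RayNodeLabelSelectors returns the K8s label selectors for Ray node pods,
// optionally scoped to a single RayCluster.
func RayNodeLabelSelectors(cluster string) map[string]string {
	labelSelectors := map[string]string{
		RayIsRayNodeLabelKey: "yes",
	}
	if cluster != "" {
		labelSelectors[RayClusterLabelKey] = cluster
	}
	return labelSelectors
}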

Contributor Author
@justinyeh1995 justinyeh1995 Dec 26, 2025

Yes, thank you for pointing this out. I agree this can be in a follow-up PR.

@justinyeh1995 justinyeh1995 changed the title [Feat][kubectl-plugin][WIP] Add shell completion for kubectl ray get [workergroups|nodes] [Feat][kubectl-plugin] Add shell completion for kubectl ray get [workergroups|nodes] Dec 22, 2025
@justinyeh1995 justinyeh1995 force-pushed the feature/3051-kubectl-shell-completion-ray-get-workergroups-nodes branch 2 times, most recently from c202da3 to 6c671e9 on December 23, 2025 06:18
@justinyeh1995 justinyeh1995 marked this pull request as ready for review December 23, 2025 07:26
@justinyeh1995
Contributor Author

cc @win5923 and @Future-Outlier to review or help redistribute review work.

Member
@Future-Outlier Future-Outlier left a comment

Contributor
Copilot AI left a comment

Pull request overview

This PR adds shell completion support for kubectl ray get workergroup and kubectl ray get node commands. The completion functions respect namespace filtering (-n/--namespace), cluster filtering (--ray-cluster), and disable completion when --all-namespaces is set, falling back to file completion in that case.
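
For reference, a minimal sketch of what such a completion function can look like (flag handling per the description above; the listWorkerGroups helper is hypothetical, not the PR's exact code):

package completion

import (
	"strings"

	"github.com/spf13/cobra"
)

// WorkerGroupCompletionFunc sketches the described behavior: respect
// --namespace and --ray-cluster, disable completion under --all-namespaces
// (falling back to file completion), and prefix-filter the candidates.
func WorkerGroupCompletionFunc() func(*cobra.Command, []string, string) ([]string, cobra.ShellCompDirective) {
	return func(cmd *cobra.Command, args []string, toComplete string) ([]string, cobra.ShellCompDirective) {
		if all, _ := cmd.Flags().GetBool("all-namespaces"); all {
			return nil, cobra.ShellCompDirectiveDefault // let the shell fall back to file completion
		}
		namespace, _ := cmd.Flags().GetString("namespace")
		if namespace == "" {
			namespace = "default"
		}
		cluster, _ := cmd.Flags().GetString("ray-cluster")
		groups, err := listWorkerGroups(namespace, cluster) // hypothetical helper
		if err != nil {
			return nil, cobra.ShellCompDirectiveDefault
		}
		var out []string
		for _, g := range groups {
			if strings.HasPrefix(g, toComplete) {
				out = append(out, g)
			}
		}
		return out, cobra.ShellCompDirectiveNoFileComp
	}
}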

Key changes:

  • Implemented WorkerGroupCompletionFunc and NodeCompletionFunc to provide intelligent shell completion
  • Added comprehensive unit tests covering various scenarios including namespace filtering, cluster filtering, prefix matching, and edge cases
  • Integrated the completion functions into the respective command definitions

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.

File | Description
kubectl-plugin/pkg/util/completion/completion.go | Implements worker group and node completion functions with namespace and cluster filtering support
kubectl-plugin/pkg/util/completion/completion_test.go | Adds comprehensive unit tests for both completion functions covering multiple scenarios
kubectl-plugin/pkg/cmd/get/get_workergroup.go | Integrates WorkerGroupCompletionFunc into the workergroup command
kubectl-plugin/pkg/cmd/get/get_nodes.go | Integrates NodeCompletionFunc into the node command


namespace = "default"
}

rayClusterList, err := k8sClient.RayClient().RayV1().RayClusters(namespace).List(context.Background(), v1.ListOptions{})
Contributor

Suggested change
- rayClusterList, err := k8sClient.RayClient().RayV1().RayClusters(namespace).List(context.Background(), v1.ListOptions{})
+ ctx, cancel := context.WithTimeout(context.Background(), 500*time.Millisecond)
+ defer cancel()
+ rayClusterList, err := k8sClient.RayClient().RayV1().RayClusters(namespace).List(ctx, v1.ListOptions{})

Since this API call runs during shell completion, would it be safer to guard it with a context timeout?
If the API server is slow or unreachable, blocking here could freeze the terminal, which can be a pretty rough UX.

Contributor Author

I think this is a solid suggestion, but the common pattern in kubectl-plugin/ seems to be using cmd.Context(), which is actually context.Background(). I have yet to figure out why that is. I am open to this change but would like to hear others' thoughts on this one first.

Collaborator

I think kubectl defaults to no timeout and we can set it through --request-timeout (see docs). I think we can just follow the kubectl setting here?
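
For example:

kubectl get pods --request-timeout=5s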

Contributor

That's a good point, but I think shell completion is a special case here.
When a user types kubectl ray get workergroup <TAB>, the completion code runs in the background before they've finished typing the command. This means:

  • The user can't pass --request-timeout because they haven't completed the command yet.
  • Users generally expect tab completion to be instant, or to fail quickly, especially if the API server is slow or unreachable.

Because of that, it seems safer for the completion path to guard itself with a short timeout, independent of the normal kubectl --request-timeout behavior for command execution.
Happy to hear your thoughts if there's an established pattern I'm missing!

Collaborator
@machichima machichima Dec 26, 2025

Makes sense! I think the normal kubectl commands have no timeout, so sometimes auto completion will block the terminal for a while. I think having a timeout here would be great, but I have a few concerns:

  1. The timeout we set here will only be applied to querying the RayCluster. There's still no timeout for auto-completing namespaces (as that is provided by native kubectl), so there's a bit of inconsistency here.
  2. After the timeout, what shall we show to the user? With a normal kubectl command, if nothing is shown when attempting auto completion, it can mean either: 1) the auto completion plugin is not installed by the user; or 2) the resource is not found. I'm just wondering if this will cause any confusion for the user.

Generally I think adding a timeout here is a good idea, but I just want to prevent any confusion users might have.

cc @seanlaii @troychiu @JiangJiaWei1103 for some ideas here

Collaborator

Nice, I think we can follow the kubectl-plugin pattern.

Collaborator

Great find! However, I think the reason why AttachablePodForObjectFn has a timeout is that it's watching the pods to meet a specific condition (code path: AttachablePodForObjectFn -> attachablePodForObject -> GetFirstPod -> GetPodList here)

For commands like kubectl get pods, they call resourceTypeAndNameCompletionFunc here, which does not have a timeout set.

In this case I think no timeout here aligns with the behavior of the kubectl get commands.

Contributor Author
@justinyeh1995 justinyeh1995 Dec 28, 2025

Thanks for the follow-up! This is a great find, and it made me dig a little bit deeper.

After tracing the code paths, I think the two scenarios are different (ref file).

The code path that I traced:

ResourceAndPortCompletionFunc | ContainerCompletionFunc
-> convertResourceNameToPodName (100ms timeout)
-> AttachablePodForObjectFn

Here's why:

AttachablePodForObjectFn is used because convertResourceNameToPodName needs to convert a resource reference to a single running pod name. However, after the conversion, kubectl always calls the actual resource-listing functions like CompGetContainers, CompGetPodContainerPorts, etc., with no timeout.

For example,
ContainerCompletionFunc
->convertResourceNameToPodName (100ms timeout) finds the pod
-> CompGetContainers (no timeout) lists containers in that pod

We're directly listing all resources just as CompGetResource does, so following what they do seems more reasonable to me. Also, for larger clusters with many resources, this might be slow, and we should avoid false negatives that would confuse users.

Would love to discuss further if you see it differently.

Contributor

Thank you all for the deep dive and tracing the code paths. This provides great clarity on why kubectl applies timeouts selectively.
I agree that staying consistent with the native CompGetResource behavior makes sense here to avoid false negatives and maintain a predictable user experience.
I think we could proceed without the timeout for now. If we receive feedback regarding terminal hangs in high-latency clusters in the future, we can revisit more advanced completion strategies then.

Thanks again for the great discussion~

Contributor Author

Likewise, thanks for the thorough review and discussion!

}

labelSelectors := createRayNodeLabelSelectors(cluster)
pods, err := k8sClient.KubernetesClient().CoreV1().Pods(namespace).List(
Contributor

Same concern applies here as well regarding using a context timeout for this API call during shell completion.

@Future-Outlier
Member

cc @JiangJiaWei1103 to take a look, thank you!

@JiangJiaWei1103 JiangJiaWei1103 moved this to Work in progress in My Kuberay & Ray Dec 24, 2025
@JiangJiaWei1103 JiangJiaWei1103 moved this from Work in progress to In review in My Kuberay & Ray Dec 24, 2025
namespace = "default"
}

rayClusterList, err := k8sClient.RayClient().RayV1().RayClusters(namespace).List(context.Background(), v1.ListOptions{})
Collaborator

When listing RayClusters, I think we can provide the cluster name in the list options, so that in the following loop we do not need to use:

if cluster != "" && rayCluster.Name != cluster {
	continue
}

Contributor Author

Thanks for the suggestion! I'll update it to use FieldSelector in the list option.
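
Something like this (a sketch; import paths assumed from the kuberay repo layout):

package completion

import (
	"context"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/fields"

	rayv1 "github.com/ray-project/kuberay/ray-operator/apis/ray/v1"
	rayclient "github.com/ray-project/kuberay/ray-operator/pkg/client/clientset/versioned"
)

// listRayClusters filters by cluster name server-side via a metadata.name
// field selector instead of looping over the results client-side.
func listRayClusters(c rayclient.Interface, namespace, cluster string) (*rayv1.RayClusterList, error) {
	opts := metav1.ListOptions{}
	if cluster != "" {
		opts.FieldSelector = fields.OneTermEqualSelector("metadata.name", cluster).String()
	}
	return c.RayV1().RayClusters(namespace).List(context.Background(), opts)
}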

Contributor Author

During the fix, I found an issue with the fake client implementation.

NewSimpleClientset and its successor NewClientset do not support FieldSelector filtering in List operations. To make the unit tests pass, the client-side rayCluster.Name check is still needed as a fallback.
Also, NewSimpleClientset is deprecated in favor of NewClientset for better server-side apply testing. A future PR could migrate all usages.

Collaborator

Found this related issue in kubernetes: kubernetes/kubernetes#78824. They suggest using reactors. We can do it in the follow-up PR.
Thank you!
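
For reference, the reactor approach could look roughly like this (a sketch; the fake clientset path is assumed from the kuberay repo layout):

package completion_test

import (
	"k8s.io/apimachinery/pkg/fields"
	"k8s.io/apimachinery/pkg/runtime"
	k8stesting "k8s.io/client-go/testing"

	rayv1 "github.com/ray-project/kuberay/ray-operator/apis/ray/v1"
	rayfake "github.com/ray-project/kuberay/ray-operator/pkg/client/clientset/versioned/fake"
)

// newFakeRayClient returns a fake clientset whose List honors field selectors
// via a prepended reactor, per kubernetes/kubernetes#78824 (namespace
// filtering omitted for brevity).
func newFakeRayClient(clusters ...*rayv1.RayCluster) *rayfake.Clientset {
	objs := make([]runtime.Object, 0, len(clusters))
	for _, rc := range clusters {
		objs = append(objs, rc)
	}
	client := rayfake.NewSimpleClientset(objs...)
	client.PrependReactor("list", "rayclusters", func(action k8stesting.Action) (bool, runtime.Object, error) {
		sel := action.(k8stesting.ListAction).GetListRestrictions().Fields
		list := &rayv1.RayClusterList{}
		for _, rc := range clusters {
			if sel.Matches(fields.Set{"metadata.name": rc.Name}) {
				list.Items = append(list.Items, *rc)
			}
		}
		return true, list, nil
	})
	return client
}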

Contributor Author

Awesome find! Thanks for pointing this out.

Contributor Author

I will add an issue for this follow up. Thanks!

@justinyeh1995 justinyeh1995 force-pushed the feature/3051-kubectl-shell-completion-ray-get-workergroups-nodes branch from 5e1e1b0 to 9a85485 on December 25, 2025 14:39
Member
@Future-Outlier Future-Outlier left a comment

cc @win5923 to take a look, thank you!

…nFunc to accept client.Client parameter for testability. Add initial unit tests.

justinyeh1995 and others added 4 commits December 26, 2025 23:35
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
…workergroup deduplication

@justinyeh1995 justinyeh1995 force-pushed the feature/3051-kubectl-shell-completion-ray-get-workergroups-nodes branch from 83d6ccd to 9ac0ee3 on December 26, 2025 15:35
Member
@Future-Outlier Future-Outlier left a comment

Hi, @justinyeh1995
I can't reproduce your result in my zsh. Can you give me a full guide so that I can just copy-paste and test this PR on my laptop?

@justinyeh1995
Contributor Author

Sure. I'll check again and write a follow-up here. Thanks for reviewing.

@justinyeh1995
Contributor Author

justinyeh1995 commented Dec 30, 2025

Hi @Future-Outlier, would you mind trying this out in zsh:

pushd kubectl-plugin > /dev/null
go build cmd/kubectl-ray.go
mv kubectl-ray ~/.krew/bin
popd > /dev/null

kubectl krew install --manifest-url https://raw.githubusercontent.com/marckhouzam/kubectl-plugin_completion/v0.1.0/plugin-completion.yaml

kubectl plugin-completion generate
export PATH=$PATH:$HOME/.kubectl-plugin-completion

source ~/.zshrc

You should be able to verify it via:

kubectl ray get workergroup <TAB>
kubectl ray get node <TAB>

Thanks!

Member
@Future-Outlier Future-Outlier left a comment

LGTM, cc @rueian to merge, thank you!

Thank you for the guidance @justinyeh1995


@Future-Outlier Future-Outlier moved this from In Progress to can be merged in @Future-Outlier's kuberay project Dec 31, 2025
Collaborator
@machichima machichima left a comment

LGTM! Thank you


Successfully merging this pull request may close these issues.

[Feature] shell completion for kubectl ray get [workergroups|nodes] CLI commands
