Skip to content

Conversation

@fg91
Copy link
Member

@fg91 fg91 commented Jan 9, 2026

Why are the changes needed?

Currently, the flytepropeller ray plugin hardcodes the creation of an ingress for the ray cluster's head node, see here.

This is a potential security problem as the ray head node might be exposed to the public internet in clusters with a default ingress class without the user knowing.

What changes were proposed in this pull request?

  • Add field EnableIngress to the ray plugin config
  • Disable the creation of the ingress by default.

How was this patch tested?

  • Added a unit test.

Check all the applicable boxes

  • I updated the documentation accordingly.
  • All new and existing tests passed.
  • All commits are signed-off.

Related PRs

flyteorg/flytekit#3365

Signed-off-by: Fabio Grätz <fabio@cusp.ai>
@fg91 fg91 added security Issues related to Security improvements changed For changes in existing functionality labels Jan 9, 2026
@codecov
Copy link

codecov bot commented Jan 9, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 56.96%. Comparing base (14b6bbf) to head (a4280de).
⚠️ Report is 1 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #6852      +/-   ##
==========================================
+ Coverage   56.95%   56.96%   +0.01%     
==========================================
  Files         929      929              
  Lines       58152    58152              
==========================================
+ Hits        33121    33127       +6     
+ Misses      21989    21983       -6     
  Partials     3042     3042              
Flag Coverage Δ
unittests-datacatalog 53.51% <ø> (ø)
unittests-flyteadmin 53.14% <ø> (+0.03%) ⬆️
unittests-flytecopilot 43.06% <ø> (ø)
unittests-flytectl 64.02% <ø> (ø)
unittests-flyteidl 75.71% <ø> (ø)
unittests-flyteplugins 60.14% <100.00%> (ø)
unittests-flytepropeller 53.63% <ø> (ø)
unittests-flytestdlib 63.29% <ø> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

…buf message

Signed-off-by: Fabio Grätz <fabio@cusp.ai>
@fg91 fg91 requested a review from pingsutw January 11, 2026 12:07
enableIngress := true
cfg := GetConfig()

enableIngress := cfg.EnableIngress
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I think we can put EnableIngress: & cfg.EnableIngress directly in HeadGroupSpec, as it's the only place that use this

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could but the ServiceType - which is also part of the RayCluster's HeadGroupSpec - is also in our top level plugin config. My preference would be to try to keep it consistent here at least?

This is not a strong opinion though.

pingsutw
pingsutw previously approved these changes Jan 13, 2026
@machichima
Copy link
Member

@fg91 I think the flyteplugins test CI failed. Could you please take a look? Thanks!

@fg91
Copy link
Member Author

fg91 commented Feb 10, 2026

#6905 and this PR tackled the same issue (ray clusters being hardcoded to create ingresses) in parallel with the difference that I argue for actually disabling the ingress by default, see PR description:

This is a potential security problem as the ray head node might be exposed to the public internet in clusters with a default ingress class without the user knowing.

As a platform engineer, I'd admittedly be quite alarmed to discover that jobs launched by platform users have been creating potentially public ingresses without me being aware. This is why I don't like the default of enabling the ingress.

@EngHabu here you said

I like booleans to default to false... So I think this should be DisableIngress

upon which you, @mickjermsurawong-openai, changed your initial EnableIngress (with default true) to DisableIngress with (default false).

Would you be ok with merging this PR too which goes back to EnableIngress but with default false?

This PR also adds a unit test for BuildResource + enabling/disabling the ingress.

@fg91 fg91 requested a review from EngHabu February 10, 2026 09:43
@mickjermsurawong-openai
Copy link
Contributor

Fabio, curious without ingress, is Ray dashboard still available? My understanding is no (pls correct me i just started using the ray plugin). My thought is i'd setup KubeRay API for dashboard instead, although it's experimental feature, and we can set pod-template for ray job to add the right annotation.

@mickjermsurawong-openai
Copy link
Contributor

I'm fine changing the flag back to EnableIngress!

@Sovietaced Sovietaced added the breaking Breaking changes label Feb 10, 2026
Copy link
Member

@Sovietaced Sovietaced left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Marking this as a breaking change since the behavior does change and requires folks to update things.

@machichima
Copy link
Member

Fabio, curious without ingress, is Ray dashboard still available? My understanding is no (pls correct me i just started using the ray plugin). My thought is i'd setup KubeRay API for dashboard instead, although it's experimental feature, and we can set pod-template for ray job to add the right annotation.

Hi @mickjermsurawong-openai,
Do you mean KubeRay "APIServer"? I think by using APIServer you can connect to Ray dashboard without needed to set the annotation

@machichima
Copy link
Member

As a platform engineer, I'd admittedly be quite alarmed to discover that jobs launched by platform users have been creating potentially public ingresses without me being aware. This is why I don't like the default of enabling the ingress.

I think it's safer to keep the default as enable that aligns with the current behavior, to prevent users's confusion. For example when they upgrade the newer flyte version, and Ray Dashboard cannot be accessed, they might not know what's happening

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

breaking Breaking changes changed For changes in existing functionality security Issues related to Security improvements

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants