Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

api: Strict port number handling and --enable-rpc deprecation #6459

Merged
merged 4 commits into from
Sep 14, 2021
Merged

api: Strict port number handling and --enable-rpc deprecation #6459

merged 4 commits into from
Sep 14, 2021

Conversation

ahmetb
Copy link
Contributor

@ahmetb ahmetb commented Aug 18, 2021

Fixes #6425
Fixes #6417
Fixes #6375
Related: #6425 (proposal), supersedes #6443

Description

This path implements the proposal in #6425 and contains various breaking
changes to most users of Skaffold APIs regarding enabling the Skaffold API,
and how port numbers behave.

This patch also fixes the test flakes for the following integration tests (each
ran 100 times) by behaving more strictly around busy ports.

User facing changes (remove if N/A)

  • API is no longer enabled by default for "dev" and "debug" commands.
    This might be a breaking change for some users who were relying on the
    behavior.

  • APIs are only enabled if --rpc-port or --rpc-http-port (or both) are set.
    These flags no longer have defaults 50051 and 50052.

  • --enable-rpc flag is now marked as "deprecated" for 1.31 release.
    It will be removed in 3 months or 1 release (whichever is longer) since the
    Skaffold API is a beta feature.

  • If given port is busy, skaffold now fails with an error instead of trying
    to find an available port and printing it to logs.

Follow-up Work

Since Skaffold API feature is beta, --enable-rpc flag deprecation will
start in the release this is incorporated in, and will be removed in the next
release or 3 months (whichever is longer) per deprecation policy.

Subsequently, we can remove the tests and code looking for this flag.

@ahmetb ahmetb requested a review from a team as a code owner August 18, 2021 22:35
@ahmetb ahmetb requested a review from aaron-prindle August 18, 2021 22:35
@google-cla google-cla bot added the cla: yes label Aug 18, 2021
@codecov
Copy link

codecov bot commented Aug 18, 2021

Codecov Report

Merging #6459 (4fb717b) into main (290280e) will decrease coverage by 0.03%.
The diff coverage is 76.78%.

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #6459      +/-   ##
==========================================
- Coverage   70.48%   70.44%   -0.04%     
==========================================
  Files         515      517       +2     
  Lines       23150    23298     +148     
==========================================
+ Hits        16317    16413      +96     
- Misses       5776     5826      +50     
- Partials     1057     1059       +2     
Impacted Files Coverage Δ
cmd/skaffold/app/cmd/flags.go 89.00% <0.00%> (-1.82%) ⬇️
cmd/skaffold/skaffold.go 0.00% <ø> (ø)
pkg/skaffold/build/buildpacks/logger.go 0.00% <ø> (ø)
pkg/skaffold/config/options.go 100.00% <ø> (ø)
pkg/skaffold/hooks/env.go 100.00% <ø> (ø)
pkg/skaffold/server/server.go 49.61% <60.86%> (+1.99%) ⬆️
pkg/skaffold/event/v2/event.go 84.22% <66.66%> (-0.37%) ⬇️
cmd/skaffold/app/cmd/cmd.go 70.32% <100.00%> (-0.73%) ⬇️
pkg/skaffold/build/buildpacks/lifecycle.go 83.59% <100.00%> (ø)
pkg/skaffold/config/types.go 100.00% <100.00%> (ø)
... and 37 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 76cee85...4fb717b. Read the comment docs.

@ahmetb ahmetb changed the title Strict port number handling and --enable-rpc deprecation api: Strict port number handling and --enable-rpc deprecation Aug 18, 2021
@ahmetb ahmetb added the kokoro:force-run forces a kokoro re-run on a PR label Aug 19, 2021
@kokoro-team kokoro-team removed the kokoro:force-run forces a kokoro re-run on a PR label Aug 19, 2021
@@ -189,29 +191,24 @@ func newGRPCServer(preferredPort int, usedPorts *util.PortSet) (func() error, in
}, port, nil
}

func newHTTPServer(preferredPort, proxyPort int, usedPorts *util.PortSet) (func() error, error) {
Copy link
Member

@briandealwis briandealwis Aug 20, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WDYT: should we add the ports to the usedPorts set? It's a convenience but prevents our port-finding from trying these ports. I don't feel strongly about this though.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

probably not. in existing code, the PortSet instance was not shared with the other PortSet instance used in port-forwrding, so it already had no use. the code on the other side already checks for availibility, too, so it seemed redundant to me.


func (s *IntOrUndefined) String() string {
if s.value == nil {
return ""
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the value used in the help text. So you could use <unset> or something like that to help users:

      --rpc-http-port=: tcp port to expose the Skaffold API over HTTP REST
      --rpc-port=: tcp port to expose the Skaffold API over gRPC

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually wanted to get rid of the =: part (just need :), but could not figure it out. I'll probably do what you said.

Comment on lines 100 to 123
if !opts.EnableRPC && opts.RPCPort.Value() == nil && opts.RPCHTTPPort.Value() == nil {
logrus.Debug("skaffold API not starting as it's not requested")
return emptyCallback, nil
}

var usedPorts util.PortSet

grpcCallback, rpcPort, err := newGRPCServer(opts.RPCPort, &usedPorts)
preferredGRPCPort := 0 // bind to an available port atomically
if opts.RPCPort.Value() != nil {
preferredGRPCPort = *opts.RPCPort.Value()
}
grpcCallback, grpcPort, err := newGRPCServer(preferredGRPCPort)
if err != nil {
return grpcCallback, fmt.Errorf("starting gRPC server: %w", err)
}
httpCallback := emptyCallback

httpCallback, err := newHTTPServer(opts.RPCHTTPPort, rpcPort, &usedPorts)
if opts.RPCHTTPPort.Value() != nil {
httpCallback, err = newHTTPServer(*opts.RPCHTTPPort.Value(), grpcPort)
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If --enable-rpc was specified alone, then we need to implicitly work as if the user specified --rpc-port=50051 --rpc-http-port=50052.

Since you're disabling the --enable-rpc default for dev and debug, failing if the ports are unavailable is the right thing to do in this situation since the user can't really figure out the ports anyways.

I think you should extract this port-checking-and-determination to a separate function that can be tested. Something like

grpcPort, httpPort := evaluateOptions(opts)
if grpcPort < 0 && httpPort < 0 {
    logrus.Debug("skaffold API not starting as it's not requested")
    return emptyCallback, nil
}

Port 0 may make sense for users who don't care about the port but are streaming events for debugging purposes.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that --enable-rpc is gone, here are the possible scenarios:

input1 input2 http port grpc port
-rpc-port=unset -http-port=unset not started not started
-rpc-port=X -http-port=unset not started X
-rpc-port=X -http-port=Y Y X
-rpc-port=unset -http-port=Y Y bind to a random port with :0

We also don't need 50051 & 50052 defaults anymore (CC does not rely on these).

We can extract to a fn and try to test it but the check is basically:

if both grpc and http port opts are unset {
   return
}

@briandealwis thoughts?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This table is great it needs --enable-rpc too: we need to support --enable-rpc until we remove --enable-rpc after the deprecation period.

We also don't need 50051 & 50052 defaults anymore (CC does not rely on these).

But users may. CC is not the only user of the gRPC + REST APIs.

Copy link
Contributor Author

@ahmetb ahmetb Aug 23, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This table is great it needs --enable-rpc too: we need to support --enable-rpc until we remove --enable-rpc after the deprecation period.

Currently, this is what I had in mind without incorporating the default ports

--enable-rpc --rpc-port --http-port http port grpc port
false * * not started not started
true unset unset bind random (:0) 2 bind random (:0) 2
true or unset X unset not started X
true or unset X Y Y X
true or unset unset Y Y bind random (:0)

1: This is breaking change for users who were relying on skaffold dev/debug commands starting the server implicitly by default.

2: This is breaking change for users who were relying on default ports. But this change is default as the --enable-rpc will go away soon.

But users may. CC is not the only user of the gRPC + REST APIs.

Given the complexity of the table above, I'm not sure if we can roll it forward without breaking anyone. Let me know if you see a way. My plan was to roll this forward in lockstep with CC team.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Refined the table above, make sure to refer to the one on GitHub UI.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is my suggestion

--enable-rpc --rpc-port --http-port http port grpc port
false * * not started not started
true unset unset bind random (:0) port 50052, fail if unavailable bind random (:0) port 50051, fail if unavailable
true X unset port 50052, fail if unavailable X
true unset Y Y port 50051, fail if unavailable
true or unset X unset not started X
true or unset X Y Y X
true or unset unset Y Y bind random (:0)

So the only change is that --enable-rpc will fail when either port 50051 or 50052 are unavailable whereas we previously allocating a new nearby port. The warning text can recommend using --rpc-port=0 or --rpc-http-port=0 for a randomly assigned port.

We could even collapse the --enable-rpc == false with the --enable-rpc == unset cases

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(I think this agrees with Nick)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@briandealwis problem with the changes suggested in rows 3 and 4 is that we are removing "if busy find adjacent port" logic.

So if you start HTTP server on :X, but we default gRPC to port 50051, the second instance of skaffold running simultaneously will fail to launch.

Therefore, I recommend retaining the recommended behavior in rows 3 and 4.

We can implement your recommendation in row 2 and add a warning message for the case both ports are unset.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

problem with the changes suggested in rows 3 and 4 is that we are removing "if busy find adjacent port" logic.

To be clear, this is going to happen if someone's explicitly setting the --enable-rpc (i.e. IDE integration or otherwise) and they're running multiple skaffold processes. Cloud Code currently does this (it specifies either --rpc-port or --rpc-http-port), so if they somehow they start a 2nd skaffold instance, we'd fail if we choose the legacy default port and another process was using it.

Copy link
Contributor Author

@ahmetb ahmetb Sep 1, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@briandealwis @nkubala I am inclined to keep the initially proposed behavior here

So the only change is that --enable-rpc will fail when either port 50051 or 50052 are unavailable whereas we previously allocating a new nearby port. The warning text can recommend using --rpc-port=0 or --rpc-http-port=0 for a randomly assigned port.

This will be a problem for two reasons.

  1. if we keep such default ports, they can't be "default values" on the flags, because setting the flag now means enable the API, which we're moving away from
  2. it's problematic for Cloud Code implementations such as VSCode, where they choose -rpc-http-port explicitly. if "we choose -rpc-port as 50051 and fail if busy", then multiple skaffold instances ran by CC for VSC will fail, we don't want this during the rollout of this.

The only thing I want to add is: If --enable-rpc=true is specified but no ports whatsoever, print a warning saying skaffold is choosing ports.

This path implements the proposal in issue 6425 and contains various breaking
changes to most users of Skaffold APIs regarding enabling the API.

* API is no longer enabled by default for "dev" and "debug" commands.
  This might be a breaking change for some users who were relying on the
  behavior.

* APIs are only enabled if --rpc-port or --rpc-http-port (or both) are set.
  These flags no longer have defaults 50051 and 50052.

* --enable-rpc flag is now marked as "deprecated" for 1.31 release.
  It will be removed in 3 months or 1 release (whichever is longer) since the
  Skaffold API is a beta feature.

* If given port is busy, skaffold now fails with an error instead of trying
  to find an available port and printing it to logs.

This patch also fixes the test flakes for the following integration tests (each
ran 100 times) by behaving more strictly around busy ports.

- TestEventsRPC
- TestRunPortForward

Signed-off-by: Ahmet Alp Balkan <ahmetb@google.com>
Signed-off-by: Ahmet Alp Balkan <ahmetb@google.com>
@ahmetb ahmetb added kokoro:run runs the kokoro jobs on a PR kokoro:force-run forces a kokoro re-run on a PR and removed kokoro:run runs the kokoro jobs on a PR labels Sep 2, 2021
@kokoro-team kokoro-team removed the kokoro:force-run forces a kokoro re-run on a PR label Sep 2, 2021
@ahmetb
Copy link
Contributor Author

ahmetb commented Sep 7, 2021

I added 1fd70e2 which provides the warning message.
@nkubala @briandealwis take another look please, this should be ok to move forward.

Copy link
Contributor

@tejal29 tejal29 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lets make it more user friendly by showing the random port instead of warning message forcing users to quit skaffold.

Signed-off-by: Ahmet Alp Balkan <ahmetb@google.com>
@ahmetb
Copy link
Contributor Author

ahmetb commented Sep 13, 2021

Can we just show the random port assigned in warn instead of info if the opts.EnableRPC && opts.RPCPort.Value() == nil && opts.RPCHTTPPort.Value() == nil and remove this warn message here.

@tejal29 Ack. This only applies to the GRPC port, since we don't start HTTP by default as part of the proposal. But I updated the warning in d81eb12 + 4fb717b.

that way "-enable-rpc -rpc-http-port=X" users don't see a warning.

Signed-off-by: Ahmet Alp Balkan <ahmetb@google.com>
@ahmetb
Copy link
Contributor Author

ahmetb commented Sep 14, 2021

@tejal29 PTAL if you can, it seems like today is the release cutoff date. cc: @nkubala

@tejal29 tejal29 merged commit d64637c into GoogleContainerTools:main Sep 14, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
5 participants