[Flake] TestEventsRPC is flaky #6417

Closed
tejal29 opened this issue Aug 11, 2021 · 3 comments · Fixed by #6459
Labels
kind/tech-debt meta/test-flake tech-debt Issues that relate to paying back technical debt.

Comments

@tejal29
Contributor

tejal29 commented Aug 11, 2021

I have seen this test fail multiple times:

rpc_test.go:93: error retrieving event log: rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing dial tcp :36288: connect: connection refused"
    helper.go:194: Skaffold log:
         Listing files to watch...
        >  - test-dev
        > Generating tags...
        >  - test-dev -> gcr.io/k8s-skaffold/test-dev:v1.29.0-35-gba4e2a8-dirty
        > Starting build...
        > Found [kind-kind] context, using local docker daemon.
        > Building [test-dev]...
        > Sending build context to Docker daemon  3.072kB
        > Step 1/3 : FROM busybox
        >  ---> 69593048aa3a
        > Step 2/3 : COPY foo /foo
        >  ---> Using cache
        >  ---> 608862fe6e54
        > Step 3/3 : CMD while true; do cat /foo; sleep 1; done
        >  ---> Using cache
        >  ---> c302772802f6
        > Successfully built c302772802f6
        > Successfully tagged gcr.io/k8s-skaffold/test-dev:v1.29.0-35-gba4e2a8-dirty
        > Starting test...
        > Tags used in deployment:
        >  - test-dev -> gcr.io/k8s-skaffold/test-dev:c302772802f6c4ce0c7d4d200f03fdcd0cdac4ac74ae980763254fdd55ebb205
        > Starting deploy...
        > Loading images into kind cluster nodes...
        >  - gcr.io/k8s-skaffold/test-dev:c302772802f6c4ce0c7d4d200f03fdcd0cdac4ac74ae980763254fdd55ebb205 -> Found
        > Images loaded in 194.78357ms
        >  - deployment.apps/test-dev created
        > Press Ctrl+C to exit
        > Watching for changes...
        > [test-dev]foo
        > [test-dev]foo
        > [test-dev]foo
        > [test-dev]foo
        > [test-dev]foo
        > [test-dev]foo
        > [test-dev]foo
        > [test-dev]foo
        > [test-dev]foo
        > [test-dev]foo
        > [test-dev]foo
        > [test-dev]foo
        > [test-dev]foo
        > [test-dev]foo
        > [test-dev]foo
        > [test-dev]foo
        > 
--- FAIL: TestEventsRPC (20.84s)
tejal29 added kind/tech-debt meta/test-flake tech-debt Issues that relate to paying back technical debt. labels Aug 11, 2021
tejal29 changed the title from "[Flake" to "[Flake] TestEventsRPC is flaky" Aug 11, 2021
@ahmetb
Contributor

ahmetb commented Aug 12, 2021

I've reproduced this locally with -count=100.

The issue seems to be what I mentioned the other day in the meeting: when you specify a port and it is already occupied, skaffold picks a different port instead of failing.

So the output just before what you pasted above is likely something like:

time="2021-08-12T13:51:28-07:00" level=info msg="Running [skaffold dev --namespace skaffoldrshwc --default-repo gcr.io/k8s-skaffold --cache-artifacts=false --rpc-port 59013 --status-check=false] in testdata/dev"
time="2021-08-12T13:51:28-07:00" level=warning msg="starting gRPC server on port 59014. (59013 is already in use)"

As a result, the test keeps trying to connect on the port it originally requested, and it eventually fails:

    rpc_test.go:87: waiting for connection...
    rpc_test.go:87: waiting for connection...
    rpc_test.go:93: error retrieving event log: rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing dial tcp :59013: connect: connection refused"

Do we have any strong feelings about keeping this "skaffold picks a different available port" logic? IMO it makes the tool less predictable, and there's no way to communicate the picked port to the caller (short of the caller parsing the unstructured output, which seems undesirable). @nkubala @tejal29

@ahmetb
Contributor

ahmetb commented Aug 12, 2021

/assign

@ahmetb
Contributor

ahmetb commented Aug 12, 2021

Same root cause as #6375.
