porch: kpt alpha rpkg get fails when there are a couple hundred branches #3882

Open
johnbelamaric opened this issue Mar 14, 2023 · 7 comments
Labels
area/porch, bug (Something isn't working), triaged (Issue has been triaged by adding an `area/` label)

Comments

@johnbelamaric
Contributor

Expected behavior

Valid list of package revisions is returned.

Actual behavior

jbelamaric@jbelamaric:~/proj/tmp/cachingdns-topology$ kpt alpha rpkg get
Error: Get "https://35.192.14.90/apis/porch.kpt.dev/v1alpha1/namespaces/default/packagerevisions": stream error: stream ID 1; INTERNAL_ERROR; received from peer 
jbelamaric@jbelamaric:~/proj/tmp/cachingdns-topology$ k get packagerevisions
Unable to connect to the server: stream error: stream ID 1; INTERNAL_ERROR; received from peer
jbelamaric@jbelamaric:~/proj/tmp/cachingdns-topology$ k get po -n porch-system
NAME                                 READY   STATUS    RESTARTS       AGE
function-runner-77946d6686-jv8kk     1/1     Running   0              5d18h
function-runner-77946d6686-rn57r     1/1     Running   0              5d18h
porch-controllers-5d67bb9fdf-4fs4l   1/1     Running   0              22h
porch-server-78dd559589-qmrvl        1/1     Running   17 (23h ago)   5d

Information

Due to #3877 there are a couple hundred branches after running overnight (see image below).

Porch v0.0.15
kpt v1.0.0-beta.23

image

Steps to reproduce the behavior

johnbelamaric added the bug (Something isn't working) label Mar 14, 2023
@johnbelamaric
Contributor Author

porch-server.log

@johnbelamaric
Contributor Author

I didn't see any obvious crashes in the porch server logs.

@johnbelamaric
Contributor Author

FYI, I manually deleted all those 200+ branches and now it's working again.

@natasha41575
Contributor

natasha41575 commented Mar 16, 2023

Hmm, I'm not able to reproduce this one either. I thought your packages might be too large, but they all seem reasonably small. I tried to reproduce with https://github.com/natasha41575/blueprints (which has 333 branches at the moment); it does take a second or two, but kpt alpha rpkg get still works with porch running both in kind and locally.

Might this be similar to #3877 (comment), where porch may have entered a strange error state near the beginning? Would you be able to recreate the 200 branches and see if the issue is still there?

If you need a quick way to create the branches: I created my 200 branches by setting deletionPolicy: orphan in my PackageVariant (PV) and then running for i in {1..200}; do kubectl delete -f packagevariant.yaml; sleep 0.5; kubectl apply -f packagevariant.yaml; sleep 0.5; done.
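
For readability, the delete/apply loop above can also be written as a small script. This is only a sketch under assumptions: packagevariant.yaml is in the current directory and already sets deletionPolicy: orphan, and the names KUBECTL, MANIFEST, and churn_packagevariant are illustrative, not part of kpt or porch.

```shell
#!/bin/sh
# Sketch of the branch-churn loop from the comment above.
# Like the one-liner, it does not stop if a single kubectl call fails.
set -u

KUBECTL="${KUBECTL:-kubectl}"                 # assumed override hook, handy for dry runs
MANIFEST="${MANIFEST:-packagevariant.yaml}"   # manifest name taken from the comment

# churn_packagevariant COUNT [PAUSE_SECONDS]
# Deletes and re-applies the manifest COUNT times, pausing after each step.
churn_packagevariant() {
  count="$1"
  pause="${2:-0.5}"
  i=1
  while [ "$i" -le "$count" ]; do
    "$KUBECTL" delete -f "$MANIFEST"
    sleep "$pause"
    "$KUBECTL" apply -f "$MANIFEST"
    sleep "$pause"
    i=$((i + 1))
  done
}

# Example (not run automatically): churn_packagevariant 200
```

Calling churn_packagevariant 200 should behave like the one-liner: 200 delete/apply rounds with a 0.5 second pause after each step.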

natasha41575 added the triaged (Issue has been triaged by adding an `area/` label) and norepro (the issue was investigated but could not be reproduced) labels Mar 16, 2023
@johnbelamaric
Contributor Author

I wonder if it has to do with running on an autopilot cluster with guaranteed pods (not burstable):

        name: porch-server
        resources:
          limits:
            cpu: 250m
            ephemeral-storage: 1Gi
            memory: 512Mi
          requests:
            cpu: 250m
            ephemeral-storage: 1Gi
            memory: 512Mi
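
Because requests equal limits for CPU and memory here, Kubernetes should assign the Guaranteed QoS class, provided every container in the pod is configured the same way. One way to confirm what the cluster actually recorded is sketched below. The pod name in the example comes from the listing earlier in this issue and will differ elsewhere; KUBECTL and qos_class are illustrative names.

```shell
#!/bin/sh
set -u

KUBECTL="${KUBECTL:-kubectl}"   # assumed override hook

# qos_class POD_NAME
# Prints the QoS class (Guaranteed, Burstable, or BestEffort) from the pod status.
qos_class() {
  "$KUBECTL" get pod "$1" -n porch-system -o jsonpath='{.status.qosClass}'
}

# Example (not run automatically): qos_class porch-server-78dd559589-qmrvl
```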

natasha41575 removed the norepro (the issue was investigated but could not be reproduced) label Mar 16, 2023
@natasha41575
Contributor

natasha41575 commented Mar 16, 2023

Could you share the memory utilization of your pods so we can see whether it is going over the limits? I spun up an autopilot cluster with the same limits to try it out, but again did not hit the same issue.
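
A possible way to collect that information is sketched below. It assumes the metrics API (for example metrics-server) is available for kubectl top; KUBECTL, NAMESPACE, show_usage, and last_termination_reason are illustrative names. Since the pod listing in the issue shows porch-server with 17 restarts, the last termination reason (for example OOMKilled) may be worth checking alongside current usage.

```shell
#!/bin/sh
set -u

KUBECTL="${KUBECTL:-kubectl}"             # assumed override hook
NAMESPACE="${NAMESPACE:-porch-system}"    # namespace taken from the issue

# Print current CPU and memory usage per container in the namespace.
show_usage() {
  "$KUBECTL" top pod -n "$NAMESPACE" --containers
}

# last_termination_reason POD_NAME
# Prints why each container last terminated; empty if it has never been restarted.
last_termination_reason() {
  "$KUBECTL" get pod "$1" -n "$NAMESPACE" \
    -o jsonpath='{.status.containerStatuses[*].lastState.terminated.reason}'
}

# Example (not run automatically):
#   show_usage
#   last_termination_reason porch-server-78dd559589-qmrvl
```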

@natasha41575
Contributor

I said this on the other issue too, but I'm going to try to reproduce your setup with the script you sent me so I can investigate more productively.
