Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

request help: pkg/apisix/upstream.go List Returns when an error is encountered #814

Open
allenhaozi opened this issue Dec 26, 2021 · 15 comments
Labels
discuss question Further information is requested

Comments

@allenhaozi
Copy link

Issue description

The following logic, if I run into an error, causes all nodes after this node to be skipped

Is this by design

        var items []*v1.Upstream
	for i, item := range upsItems.Node.Items {
		ups, err := item.upstream()
		if err != nil {
			log.Errorw("failed to convert upstream item",
				zap.String("url", u.url),
				zap.String("upstream_key", item.Key),
				zap.Error(err),
			)
			return nil, err
		}
		items = append(items, ups)
		log.Debugf("list upstream #%d, body: %s", i, string(item.Value))
	}
	return items, nil
2021-12-26T10:34:56+08:00	error	apisix/upstream.go:122	failed to convert upstream item	{"url": "http://apisix-admin.ingress-apisix.svc.cluster.local:9180/apisix/admin/upstreams", "upstream_key": "/apisix/upstreams/387087894457615097", "error": "unmarshal upstream failure,err:unexpected non-empty object,value:{\"timeout\":{\"send\":6,\"connect\":6,\"read\":6},\"type\":\"roundrobin\",\"scheme\":\"http\",\"update_time\":1640251964,\"id\":\"387087894457615097\",\"pass_host\":\"pass\",\"nodes\":{\"10.42.8.234:80\":100},\"name\":\"test_iam-frontend_80\",\"create_time\":1640251964,\"keepalive_pool\":{\"size\":320,\"idle_timeout\":60,\"requests\":1000}}: unexpected non-empty object", "errorVerbose": "unexpected non-empty object\nunmarshal upstream failure,err:unexpected non-empty object,value:{\"timeout\":{\"send\":6,\"connect\":6,\"read\":6},\"type\":\"roundrobin\",\"scheme\":\"http\",\"update_time\":1640251964,\"id\":\"387087894457615097\",\"pass_host\":\"pass\",\"nodes\":{\"10.42.8.234:80\":100},\"name\":\"test_iam-frontend_80\",\"create_time\":1640251964,\"keepalive_pool\":{\"size\":320,\"idle_timeout\":60,\"requests\":1000}}\ngithub.com/apache/apisix-ingress-controller/pkg/apisix.(*item).upstream\n\t/build/pkg/apisix/resource.go:132\ngithub.com/apache/apisix-ingress-controller/pkg/apisix.(*upstreamClient).List\n\t/build/pkg/apisix/upstream.go:120\ngithub.com/apache/apisix-ingress-controller/pkg/apisix.(*cluster).syncCacheOnce\n\t/build/pkg/apisix/cluster.go:211\ngithub.com/apache/apisix-ingress-controller/pkg/apisix.(*cluster).syncCache.func2\n\t/build/pkg/apisix/cluster.go:185\nk8s.io/apimachinery/pkg/util/wait.ConditionFunc.WithContext.func1\n\t/go/pkg/mod/k8s.io/apimachinery@v0.22.4/pkg/util/wait/wait.go:217\nk8s.io/apimachinery/pkg/util/wait.runConditionWithCrashProtectionWithContext\n\t/go/pkg/mod/k8s.io/apimachinery@v0.22.4/pkg/util/wait/wait.go:230\nk8s.io/apimachinery/pkg/util/wait.runConditionWithCrashProtection\n\t/go/pkg/mod/k8s.io/apimachinery@v0.22.4/pkg/util/wait/wait.go:223\nk8s.io/apimachinery/pkg/util/wait.ExponentialBackoff\n\t/go/pkg/mod/k8s.io/apimachinery@v0.22.4/pkg/util/wait/wait.go:418\ngithub.com/apache/apisix-ingress-controller/pkg/apisix.(*cluster).syncCache\n\t/build/pkg/apisix/cluster.go:182\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1371"}

Environment

  • current master
@tao12345666333 tao12345666333 added the question Further information is requested label Dec 26, 2021
@tao12345666333
Copy link
Member

Is this by design

yes. If it doesn't return directly here, the user may not find this error.

@tokers
Copy link
Contributor

tokers commented Dec 26, 2021

And it looks like the problem with the old type format of apisix.

@allenhaozi
Copy link
Author

A node problem causes other nodes to be unavailable. Whether this logic makes sense ?

Looking for an error of one node can provide other reasonable ways ?

@pottekkat
Copy link
Member

pottekkat commented Feb 24, 2022

@tao12345666333 Can't the error just be logged and the return skipped? What is the issue with that approach?

@tao12345666333
Copy link
Member

@tao12345666333 Can't the error just be logged and the return skipped? What is the issue with that approach?

Making this modification is not difficult.
Can you describe your scenario and expectations?

@pottekkat
Copy link
Member

Just removing the return statement and only logging the errors. So, even if one node fails, it will log the error but the loop will continue for other nodes.

@tao12345666333
Copy link
Member

Obviously we should make some changes.

Do you remember why this is done here? @gxthrj @tokers

@tokers
Copy link
Contributor

tokers commented Feb 24, 2022

Obviously we should make some changes.

Do you remember why this is done here? @gxthrj @tokers

A half-successful upstream set could cause other problems, which will be hard to troubleshoot, just let it fail can pin the problem and won't propagate.

@gxthrj
Copy link
Contributor

gxthrj commented Mar 2, 2022

I think the issue is related to #712, they have similar points:
The id in upstream is a string type, but the id from upstreams get from the List is int.
We need to consider the compatibility of this field.

@tao12345666333
Copy link
Member

@gxthrj For #712 I don't think we need to provide compatibility with dashboard upstream resources. The project directly manipulates etcd, and in the new roadmap of APISIX I see that this function will be weakened. Compatible with it probably doesn't make sense.

I have seen some issues recently. Most of the scenarios are to use ApisixUpstream resources to perceive changes in Kubernetes services/endpoints, and then use it with Dashboard to create routes.

@gxthrj
Copy link
Contributor

gxthrj commented Mar 6, 2022

I have seen some issues recently. Most of the scenarios are to use ApisixUpstream resources to perceive changes in Kubernetes services/endpoints, and then use it with Dashboard to create routes.

Yes, That is true.

@gxthrj
Copy link
Contributor

gxthrj commented Mar 6, 2022

For #712 I don't think we need to provide compatibility with dashboard upstream resources. The project directly manipulates etcd, and in the new roadmap of APISIX I see that this function will be weakened. Compatible with it probably doesn't make sense.

If that, we can not resolve this issue and #712. What is the plan about this issue?

@tao12345666333
Copy link
Member

As I said above, there will not be full support for the scenarios mentioned by #712.

But we can consider using APISIX Ingress to define ApisixUpstream, and route resources are maintained by others.

@gxthrj
Copy link
Contributor

gxthrj commented Mar 6, 2022

As I said above, there will not be full support for the scenarios mentioned by #712.

But we can consider using APISIX Ingress to define ApisixUpstream, and route resources are maintained by others.

OK, I see, so the ApisixUpstream resource will be supported first.

@tao12345666333
Copy link
Member

Yes. I've seen a lot of feedback on that issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
discuss question Further information is requested
Projects
None yet
Development

No branches or pull requests

5 participants