Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove curl as an Envoy dependency #11816

Open
htuch opened this issue Jun 30, 2020 · 38 comments
Open

Remove curl as an Envoy dependency #11816

htuch opened this issue Jun 30, 2020 · 38 comments

Comments

@htuch
Copy link
Member

htuch commented Jun 30, 2020

Currently we depend on libcurl for URL fetching in AWS extension common utils

static size_t curlCallback(char* ptr, size_t, size_t nmemb, void* data) {
and I think OpenCensus (@g-easy can you confirm?)

The use of curl is largely redundant, since Envoy itself can do HTTP fetch. In addition, Curl does not have a compatible threading and observability model with Envoy. The recent disclosures of CVE-2020-8169 and CVE-2020-8177 provide an example of why we should eliminate this from our trusted compute base.

Opening this tracking ticket to discuss further whether we can remove this dep.

@htuch htuch changed the title Remove curl as an Envoy depedency Remove curl as an Envoy dependency Jun 30, 2020
@mattklein123
Copy link
Member

+1 for removal. I was against adding in the first place.

@moderation
Copy link
Contributor

+1 for removal

@g-easy
Copy link
Contributor

g-easy commented Jul 1, 2020

Yes, OpenCensus uses curl in its zipkin exporter (to push traces to zipkin).

@kyessenov
Copy link
Contributor

Having an indirect networking API instead of curl in libraries would also help compiling them to Wasm binaries for sandboxing. The API would be implemented via Wasm ABI.

@htuch
Copy link
Member Author

htuch commented Jul 1, 2020

@g-easy would it be possible to modify the Zipkin exporter to work with some more generic HTTP interface that Envoy can supply?

@moderation
Copy link
Contributor

As per #9958 at some point OpenCensus should be replaced by the new OpenTracing + OpenCensus project called OpenTelemetry. One would hope that OpenTelemetry doesn't require curl

@dio
Copy link
Member

dio commented Jul 5, 2020

For opencensus, since the "issue" is with the Zipkin span exporter implementation (opencensus_exporter_zipkin). I think we could own Envoy-specific implementation of that in our tree (I can try to own it if it is desired)? At minimum:

class ZipkinSpanExporterHandler : public ::opencensus::trace::exporter::SpanExporter::Handler,
                                  public Http::AsyncClient::Callbacks {
  ...
};

Which basically "retrieves" the spans stored by opencensus-cpp worker thread implementation and send that to Envoy's thread which has access to a dispatcher (I think server's dispatcher is OK to use here). As a "bonus", we also need to serialize the opencensus::trace::exporter::SpanData to Zipkin's v2 via ProtobufWkt::Struct (instead of rapidjson employed by the current opencensus-cpp Zipkin span exporter implementation).

However, this will "force" the configuration to require a defined collector cluster to let Envoy's async-client to work (I think this is also relevant to the approach for the "aws" utility). Even the dynamic forward proxy requires a cluster definition. I wonder if we can/are allowed to somehow construct a cluster without defining in the config, but build it on the fly when it is required (#3479 cc. @ramaraochavali).

@moderation, when we are needed to "upgrade" to OpenTelemetry, if the approach is the same (i.e. data is gathered in worker thread managed by the "tracer" library) and we want to support Zipkin exporter, we still need to provide an Envoy's async-client-friendly implementation for that.

@g-easy
Copy link
Contributor

g-easy commented Jul 6, 2020

One would hope that OpenTelemetry doesn't require curl

This is tracked in open-telemetry/opentelemetry-cpp#6

@g-easy
Copy link
Contributor

g-easy commented Jul 6, 2020

#11816 (comment) sounds pretty great to me.

Note that opencensus's existing background thread calls into Handler::Export() periodically.

@dio
Copy link
Member

dio commented Jul 21, 2020

@mattklein123 what is the best way of resolving URI for async client calls without explicitly defining a cluster for it? Is #3479 the way to do it? Need your advice on this. Thank you!

@mattklein123
Copy link
Member

@dio yeah we would probably need to add new functionality to AsyncClient to optionally expose the dynamic forward proxy functionality.

@wrowe
Copy link
Contributor

wrowe commented Sep 11, 2020

PR13063 is another example, which picks up the fix of

CVE-2020-8231: libcurl: wrong connect-only connection

As Envoy does NOT use the affected CURLOPT_CONNECT_ONLY toggle,
this change simply satisfies overly-simplistic audits.

htuch pushed a commit that referenced this issue Sep 16, 2020
Picks up fix of CVE-2020-8231: libcurl: wrong connect-only connection
(Envoy does NOT use the affected CURLOPT_CONNECT_ONLY toggle,
this change simply satisfies overly-simplistic audits, and this further
reinforces issue #11816 )

Resolve premature force-AF_UNIX toggle in our working branch and
disables UNIX_SOCKETS on windows, until afunix.h is used for sys/un.h
(Breakage is between curl 7.69 and curl 7.72, which doesn't pick up afunix.h
on windows; will address and resolve upstream before revisiting at envoy.)

Work around misplaced -MD cflag on Windows build
See bazel-contrib/rules_foreign_cc#426
for comprehensive discussion of the defect. This workaround
could be dropped if rules_foreign_cc works around it for us.

Preparing for specific corrections to get curl building curl.lib
with cmake using clang-cl on Windows, starting at envoy and pushing
the fix(es) to CMakeLists.txt upstream.

Risk Level: low
Testing: local
Docs Changes: n/a
Release Notes: n/a

Signed-off-by: William A Rowe Jr <wrowe@vmware.com>
Co-authored-by: Sunjay Bhatia <sunjayb@vmware.com>
@htuch
Copy link
Member Author

htuch commented Oct 6, 2020

As part of my EnvoyCon talk, I looked at all CVEs for our dependencies since 2018. Curl is by far the worst here, with 14 CVEs and 23 releases since 2018. Further evidence that we really want to make some progress here.

htuch added a commit to htuch/envoy that referenced this issue Jan 6, 2021
Prefer to have operators use Envoy's native Zipkin tracer, since Zipkin
implies libcurl in opencensus. Once we complete a deprecation cycle
here, we should be able to remove this source of libcurl dependency.

Part of envoyproxy#11816.

Risk level: Low
Testing: bazel test //test/...

Signed-off-by: Harvey Tuch <htuch@google.com>
htuch added a commit that referenced this issue Jan 7, 2021
Prefer to have operators use Envoy's native Zipkin tracer, since Zipkin
implies libcurl in opencensus. Once we complete a deprecation cycle
here, we should be able to remove this source of libcurl dependency.

Part of #11816.

Risk level: Low
Testing: bazel test //test/...

Signed-off-by: Harvey Tuch <htuch@google.com>
@htuch
Copy link
Member Author

htuch commented Apr 8, 2021

Bumping priority given the high CVE rate (almost always false positives as well).

@moderation
Copy link
Contributor

OpenTelemetry went 1.0.0 is Feb - https://medium.com/opentelemetry/opentelemetry-specification-v1-0-0-tracing-edition-72dd08936978. Looks the Zipkin exporter still needs Curl

I was thinking we could deprecate OpenCensus in favor of OT but if Curl is still present it doesn't look like it will help us. Hopefully the OT experts can provide some more context

@tonya11en
Copy link
Member

You can assign this to me. It looks like it's mostly the AWS filters.

@wrowe
Copy link
Contributor

wrowe commented May 17, 2021

Its because http[s]_proxy vars are a very well accepted convention, and it makes the stock envoy config more mutable. That doesn't make it ideal, but you raised the question why, and I think we can and should revisit the earlier design decisions

@zirain
Copy link
Contributor

zirain commented Dec 25, 2021

@tonya11en are you still working on this?

@tonya11en
Copy link
Member

@zirain no, this turned out to be more work than I was expecting to do in my free time :(. Feel free to pick it up and I'll happily help review.

@alyssawilk
Copy link
Contributor

@lavignes as the other AWS filter owner, would you be able to take a look? If not we may need to bump the aws filter to contrib/

@suniltheta
Copy link
Contributor

If we decide to move it to contrib due to not able to provide fix at the right time, I can take up the action item to move it to contrib.

@alyssawilk
Copy link
Contributor

Awesome, thank you!

@suniltheta
Copy link
Contributor

We are planning to move ahead with removing libcurl dependency from AWS extensions. I have put a small doc to indicate the approach and consequences.

https://docs.google.com/document/d/1MxREfLX4BONhOCNr7dvHwdr1_kArLb3OOSvhb-b02oQ

I look forward to all of your opinions on this. I am mainly seeing your thoughts on

  1. Whether (Option #2, don’t need explicitly defined cluster config) mentioned in the doc is a workable solution.
  2. Put the existing config/code path behind a feature flag ( marked as deprecated to be cleaned in future ) so that users of these extensions transitioning from earlier versions are not forced to instantly switch to the new configuration model.

@daixiang0
Copy link
Member

Is it still in process?

@suniltheta
Copy link
Contributor

suniltheta commented Apr 13, 2022

Yes, working on this. At testing stage now. I will start with couple of PRs in next few weeks. Sorry for the delay.

@mr-miles
Copy link

Sorry to bug - do you think it will land? I've ended up here via detective work trying to find out why IRSA wasn't working as I expected! (... and I've learnt a lot about aws in the process)

@suniltheta
Copy link
Contributor

suniltheta commented Oct 14, 2022

I am still working on the changes. I haven't got enough time to dedicate to this task.

In the current state I am able to make http calls to fetch the credentials but don't know how it can be fetched in main thread and updated on all the worker thread. main...suniltheta:envoy:cm_ref_2

Also, the change is no where close to the version where I would be able to raise the PR.


If you want to make IRSA work, then I can point you towards a patch #23408. But unfortunate that still uses curl.

@mr-miles
Copy link

Thank you so much for the quick response. I would offer to help but this is miles away from my knowledge area...

I'm sure you thought of this, but just for my own knwoledge - since the lambda filter is still beta and already uses curl, is there a downside of incorporating the patch in the interim, while libcurl is being expunged? The patch didn't seem to be increasing the problem, but my reading of the patch is pretty basic

@suniltheta
Copy link
Contributor

Made some progress. Still working on improving the unit testing. I will be raise the PRs shortly after finalizing the PR breakup.

@suniltheta
Copy link
Contributor

/assign

@mr-miles
Copy link

mr-miles commented Dec 9, 2022

thanks for the update! if you're able to publish an image somewhere i can get at from a helm chart then can give it a spin

@ceastman-r7
Copy link

@suniltheta Any update on this?

@suniltheta
Copy link
Contributor

suniltheta commented Oct 19, 2023

Currently in progress in #29880.

@ceastman-r7 are you using any particular filter?

@ceastman-r7
Copy link

This was more of a concern of vulnerability scans flagging curl. for example:

CVE-2023-38545 curl:7.81.0-1ubuntu1.13 HIGH SOCKS5 heap buffer overflow

CVE-2023-38545 curl:7.81.0-1ubuntu1.13 HIGH SOCKS5 heap buffer overflow

which I was removing curl from the istio-pilot operating system image but another istio developer questioned if that mattered since curl was built into envoy.

@phlax
Copy link
Member

phlax commented Oct 23, 2023

... questioned if that mattered since curl was built into envoy.

it doesnt matter to envoy - as said its compiled in

not sure where curl is coming from - we dont seem to have it any of our images

my advice would be to use the distroless base so you dont have to worry about extraneous deps, not sure how istio pilot is built - for some reason i thought they did use a distroless base

htuch pushed a commit that referenced this issue Nov 7, 2023
Following the merge of #29880 and #30626 we can mark the curl usage as deprecated. Meanwhile bazel/repositories.bzl had stale info that OpenCensus tracer was still using libcurl.

We can continue to keep the Issue #11816 open until curl is removed entirely after the deprecation time (Probably for v1.31 release).

Signed-off-by: Sunil Narasimhamurthy <sunnrs@amazon.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests