Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support Fetch Cilium Observe data to monitoring Cilium Service network traffic #12393

Merged
merged 11 commits into from
Jul 2, 2024

Conversation

mrproliu
Copy link
Contributor

@mrproliu mrproliu commented Jul 1, 2024

In this PR, I mainly implemented the following features:

  1. Combining data collection using Peer and Flow services provided by Cilium itself and converting it to SkyWalking data, finally displaying it in UI.
  2. Added a labelCount function in the OAL engine for creating count statistics based on tags.
  3. Rewrite FieldsHelper to support ALS as well as Cilium.

NOTE: Following cilium/cilium#32175, we found that some flow data would be missing some metadata, so we cannot get both side metrics. Then, we decide to ignore the server-side data and convert the client-side data to the server-side.

Screenshots

All Cilium Services Overview

image
image
image

Service Overview

image
image
image
image

Service Relation Overview

image
image

Service Instance Overview

image
image

Service Instance Relaiton Overview

image

Endpoint Overview

image
image

@mrproliu mrproliu marked this pull request as draft July 1, 2024 10:46
@mrproliu mrproliu added UI UI and front end related. feature New feature submodule update Change submodule commit ID, including network protocol, query GraphQL and UI labels Jul 1, 2024
@mrproliu mrproliu added this to the 10.1.0 milestone Jul 1, 2024
@mrproliu mrproliu marked this pull request as ready for review July 1, 2024 11:36
@wu-sheng
Copy link
Member

wu-sheng commented Jul 1, 2024

UI submodule seems including more changes. Could you update the changelogs of UI side?

@wu-sheng
Copy link
Member

wu-sheng commented Jul 1, 2024

How about e2e of Cilium?

@wu-sheng
Copy link
Member

wu-sheng commented Jul 1, 2024

Cilium integration docs should be added in the marketplace, https://skywalking.apache.org/docs/main/next/en/setup/backend/backend-k8s-monitoring/.
And the dashboard on the UI should point to it.

The docs should include

  1. Integration architecture and data flow
  2. API dependencies
  3. Process logic and the brief about how we process the data
  4. How OAP cluster works in fetching mode.

Comment on lines 48 to 49
public class FieldsHelper {
private static final Map<Class<?>, FieldsHelper> HELPER_MAP = new ConcurrentHashMap<>();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you explain this? It seems to you expand the scope of this class to cover more field processes?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated. Yes, Before we need to use FieldsHelper.SINGLE for getting the instance, but it only works for the ALS message. But for now, the cilium fetcher still needs this for the inflate metadata, so I make it an instance with the target inflate class. When the user needs to get the instance of FieldsHelper needs to call forClass(Class).

Comment on lines +1 to +2
// SPDX-License-Identifier: Apache-2.0
// Copyright Authors of Cilium
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should update the LICENSE of source codes, as we copied these from a new project.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have update the LICENSE file, please check.

@mrproliu
Copy link
Contributor Author

mrproliu commented Jul 2, 2024

UI submodule seems including more changes. Could you update the changelogs of UI side?

Updated.

@mrproliu
Copy link
Contributor Author

mrproliu commented Jul 2, 2024

How about e2e of Cilium?

I will trying to create other PR for the E2E Testing.

@mrproliu
Copy link
Contributor Author

mrproliu commented Jul 2, 2024

Cilium integration docs should be added in the marketplace, https://skywalking.apache.org/docs/main/next/en/setup/backend/backend-k8s-monitoring/. And the dashboard on the UI should point to it.

The docs should include

  1. Integration architecture and data flow
  2. API dependencies
  3. Process logic and the brief about how we process the data
  4. How OAP cluster works in fetching mode.

Thanks. I have update the documentation following your mentioned.

| Call CPM | Count | HTTP Request calls per minutes. |
| Duration | Nanoseconds | Total HTTP Response use duration. |
| Success CPM | Count | Total HTTP Response success(status < 500) count. |
| Status 1/2/3/4/5xx | Count | HTTP Response status code group by 1xx/2xx/3xx/4xx/5xx. |
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is 2xx in the same graph? 2xx will be super high.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, They are in the same graph.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should list all traffic in one graph and not-2xx in one graph. This should be more clear. Traffic and Anormal Traffic

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

like this?
image

@wu-sheng
Copy link
Member

wu-sheng commented Jul 2, 2024

Two questions

  1. Is Hubble peer always deployed as one pod? Because I only see one hostname and one port.
  2. How does OAP cluster work with Hubble peer? How many pods of OAP are going to fetch and process metrics from Hubble peer?

@mrproliu
Copy link
Contributor Author

mrproliu commented Jul 2, 2024

  1. Is Hubble peer always deployed as one pod? Because I only see one hostname and one port.

Hubble peer service is correlated with the cilium node, not a only pod. And Kubernetes will load balance with wich pod should be listen.

  1. How does OAP cluster work with Hubble peer? How many pods of OAP are going to fetch and process metrics from Hubble peer?

I have write a selection about this: Load Balance for Cilium Fetcher with OAP cluster. All the OAP nodes will listen bubble peer service for getting nodes. And each OAP node will calucate how many cilium node should be monitoring.

@wu-sheng wu-sheng merged commit 2e8b4f9 into apache:master Jul 2, 2024
156 checks passed
@mrproliu mrproliu deleted the cilium branch July 2, 2024 07:42
Comment on lines +117 to +162
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-dependency-plugin</artifactId>
<executions>
<execution>
<id>unpack</id>
<phase>generate-sources</phase>
<goals>
<goal>unpack</goal>
</goals>
<configuration>
<artifactItems>
<artifactItem>
<groupId>com.github.davidmoten</groupId>
<artifactId>flatbuffers-compiler</artifactId>
<version>1.12.0.1</version>
<type>${fbs.compiler.artifact.type}</type>
<classifier>distribution-${os.detected.name}</classifier>
<overWrite>true</overWrite>
<outputDirectory>${project.build.directory}</outputDirectory>
</artifactItem>
</artifactItems>
</configuration>
</execution>
</executions>
</plugin>
<plugin>
<groupId>org.codehaus.mojo</groupId>
<artifactId>build-helper-maven-plugin</artifactId>
<version>${build-helper-maven-plugin.version}</version>
<executions>
<execution>
<id>add-source</id>
<phase>generate-sources</phase>
<goals>
<goal>add-source</goal>
</goals>
<configuration>
<sources>
<source>${fbs.generated.sources}</source>
</sources>
</configuration>
</execution>
</executions>
</plugin>
</plugins>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we really need this bunch of plugins here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The fbs and flatbuffers can be deleted.

})
@ToString(of = {
"address",
"connected"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This field doesn’t even exist in the class

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

connected can be deleted, I think I just forget to delete it when field changes.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think these file need to be added to the LICENSE file

skywalking/LICENSE

Lines 215 to 226 in 87b73cb

The following components are provided under the Apache License. See project link for details.
The text of each license is the standard Apache 2.0 license.
proto files from cncf/udpa: https://github.com/cncf/udpa Apache 2.0
proto files from envoyproxy/data-plane-api: https://github.com/envoyproxy/data-plane-api Apache 2.0
proto files from prometheus/client_model: https://github.com/prometheus/client_model Apache 2.0
proto files from opentelemetry: https://github.com/open-telemetry/opentelemetry-proto/tree/main/opentelemetry/proto Apache 2.0
proto files from opentelemetry: https://github.com/open-telemetry/opentelemetry-proto/tree/v0.7.0 Apache 2.0
flatbuffers files from istio/proxy: https://github.com/istio/proxy Apache 2.0
mvnw files from https://github.com/apache/maven-wrapper Apache 2.0
svg files from skywalking-ui/src/assets/icons: https://github.com/google/material-design-icons Apache 2.0
ZipkinQueryHandler.java reference from zipkin2.server.internal.ZipkinQueryApiV2: https://github.com/openzipkin/zipkin Apache 2.0

kezhenxu94 added a commit to kezhenxu94/skywalking that referenced this pull request Sep 7, 2024
kezhenxu94 added a commit to kezhenxu94/skywalking that referenced this pull request Sep 7, 2024
kezhenxu94 added a commit to kezhenxu94/skywalking that referenced this pull request Sep 7, 2024
kezhenxu94 added a commit that referenced this pull request Sep 7, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature New feature submodule update Change submodule commit ID, including network protocol, query GraphQL and UI UI UI and front end related.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants