-
Notifications
You must be signed in to change notification settings - Fork 41
NETOBSERV-584: Support for writing logs to Loki (distributor) using gRPC #1973
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Skipping CI for Draft Pull Request. |
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #1973 +/- ##
==========================================
- Coverage 71.02% 70.27% -0.75%
==========================================
Files 75 75
Lines 10011 10143 +132
==========================================
+ Hits 7110 7128 +18
- Misses 2513 2622 +109
- Partials 388 393 +5
Flags with carried forward coverage won't be shown. Click here to find out more.
🚀 New features to boost your workflow:
|
New images:
They will expire after two weeks. To deploy this build: # Direct deployment, from operator repo
IMAGE=quay.io/netobserv/network-observability-operator:7c27f48 make deploy
# Or using operator-sdk
operator-sdk run bundle quay.io/netobserv/network-observability-operator-bundle:v0.0.0-sha-7c27f48 Or as a Catalog Source: apiVersion: operators.coreos.com/v1alpha1
kind: CatalogSource
metadata:
name: netobserv-dev
namespace: openshift-marketplace
spec:
sourceType: grpc
image: quay.io/netobserv/network-observability-operator-catalog:v0.0.0-sha-7c27f48
displayName: NetObserv development catalog
publisher: Me
updateStrategy:
registryPoll:
interval: 1m |
New images:
They will expire after two weeks. To deploy this build: # Direct deployment, from operator repo
IMAGE=quay.io/netobserv/network-observability-operator:6425002 make deploy
# Or using operator-sdk
operator-sdk run bundle quay.io/netobserv/network-observability-operator-bundle:v0.0.0-sha-6425002 Or as a Catalog Source: apiVersion: operators.coreos.com/v1alpha1
kind: CatalogSource
metadata:
name: netobserv-dev
namespace: openshift-marketplace
spec:
sourceType: grpc
image: quay.io/netobserv/network-observability-operator-catalog:v0.0.0-sha-6425002
displayName: NetObserv development catalog
publisher: Me
updateStrategy:
registryPoll:
interval: 1m |
6cd213f
to
582bdd9
Compare
ce60a96
to
9200b33
Compare
@leandroberetta: This pull request references NETOBSERV-584 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.21.0" version, but no target version was set. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
This PR adds support for writing logs to Loki using gRPC. In terms of performance, results are pretty similar to the existing http implementation. I was investigating this and the current implementation uses http but messages are protocol buffers compressed (snappy), so it's already using less memory I guess. In other hands, gRPC establishes a connection and has mechanisms to reconnect, keep alive. That take more resources (according to what I read). Probably we can do a better performance testing until we can decide. |
New images:
They will expire after two weeks. To deploy this build: # Direct deployment, from operator repo
IMAGE=quay.io/netobserv/network-observability-operator:770611c make deploy
# Or using operator-sdk
operator-sdk run bundle quay.io/netobserv/network-observability-operator-bundle:v0.0.0-sha-770611c Or as a Catalog Source: apiVersion: operators.coreos.com/v1alpha1
kind: CatalogSource
metadata:
name: netobserv-dev
namespace: openshift-marketplace
spec:
sourceType: grpc
image: quay.io/netobserv/network-observability-operator-catalog:v0.0.0-sha-770611c
displayName: NetObserv development catalog
publisher: Me
updateStrategy:
registryPoll:
interval: 1m |
type LokiGRPCConfig struct { | ||
//+kubebuilder:validation:Minimum=1024 | ||
//+kubebuilder:validation:Maximum=67108864 | ||
//+kubebuilder:default:=67108864 | ||
// `maxRecvMsgSize` is the maximum message size in bytes the gRPC client can receive. Default: 64MB. | ||
MaxRecvMsgSize int `json:"maxRecvMsgSize,omitempty"` | ||
|
||
//+kubebuilder:validation:Minimum=1024 | ||
//+kubebuilder:validation:Maximum=67108864 | ||
//+kubebuilder:default:=16777216 | ||
// `maxSendMsgSize` is the maximum message size in bytes the gRPC client can send. Default: 16MB. | ||
MaxSendMsgSize int `json:"maxSendMsgSize,omitempty"` | ||
|
||
//+kubebuilder:default:="30s" | ||
// `keepAlive` is the gRPC keep-alive interval. | ||
KeepAlive *metav1.Duration `json:"keepAlive,omitempty"` | ||
|
||
//+kubebuilder:default:="5s" | ||
// `keepAliveTimeout` is the gRPC keep-alive timeout. | ||
KeepAliveTimeout *metav1.Duration `json:"keepAliveTimeout,omitempty"` | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure we need those exposed in the API.
On top of that, what's the difference between maxSendMsgSize
and the existing writeBatchSize
?
Maybe we can reuse the same field here and put the other options as Env map[string]string
in the advanced
section
Here is an example for eBPF agent consuming envs: https://github.com/netobserv/network-observability-operator/blob/main/internal/controller/ebpf/agent_controller.go#L33-L75
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes I think we can probably start with not exposing anything at all (not even the clientType
) and have it under a env-based feature gate. Probably users want us to make the choice for them, whether to use it or not.
If after more tests we see that there are no clear winner, just pros and cons, then maybe we will expose it?
b3004b4
to
188c7f5
Compare
188c7f5
to
7f98af0
Compare
f944dbe
to
6b4e3db
Compare
6b4e3db
to
120a626
Compare
@leandroberetta: The following test failed, say
Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
Description
Support for writing logs to Loki (distributor) using gRPC.
Ref: https://issues.redhat.com/browse/NETOBSERV-584
Dependencies
netobserv/loki-client-go#3
netobserv/flowlogs-pipeline#1086
netobserv/network-observability-console-plugin#1021
Checklist
If you are not familiar with our processes or don't know what to answer in the list below, let us know in a comment: the maintainers will take care of that.