diff --git a/docs/processing.md b/docs/processing.md index 3780dc61747..cae4f47ca0c 100644 --- a/docs/processing.md +++ b/docs/processing.md @@ -147,49 +147,54 @@ the design phase. The Python-like language currently assumes statements can only Remove a forbidden attribute such as `http.request.header.authorization` from spans only -`delete(attributes["http.request.header.authorization"]) from traces` -`delete(attributes["http.request.header.authorization"])` +`delete(attributes["http.request.header.authorization"]) from traces` +`delete(attributes["http.request.header.authorization"])` Remove all attributes except for some -`keep_keys(attributes, "http.method", "http.status_code") from metrics` -`keep_keys(attributes, "http.method", "http.status_code")` +`keep_keys(attributes, "http.method", "http.status_code") from metrics` +`keep_keys(attributes, "http.method", "http.status_code")` Reduce cardinality of an attribute -`replace_wildcards("/user/*/list/*", "/user/{userId}/list/{listId}", attributes["http.target"]) from traces` -`replace_wildcards("/user/*/list/*", "/user/{userId}/list/{listId}", attributes["http.target"])` +`replace_match(attributes["http.target"], "/user/*/list/*", "/user/{userId}/list/{listId}") from traces` +`replace_match(attributes["http.target"], "/user/*/list/*", "/user/{userId}/list/{listId}")` Reduce cardinality of a span name -`replace_wildcards("GET /user/*/list/*", "GET /user/{userId}/list/{listId}", name) from traces` -`replace_wildcards("GET /user/*/list/*", "GET /user/{userId}/list/{listId}", name)` +`replace_match(name, "GET /user/*/list/*", "GET /user/{userId}/list/{listId}") from traces` +`replace_match(name, "GET /user/*/list/*", "GET /user/{userId}/list/{listId}")` + +Reduce cardinality of any matching attribute + +`replace_all_matches(attributes, "/user/*/list/*", "/user/{userId}/list/{listId}") from traces` +`replace_all_matches(attributes, "/user/*/list/*", "/user/{userId}/list/{listId}")` Decrease the size of the telemetry payload by removing large resource attributes -`delete(resource.attributes["process.command_line"]) from traces` -`delete(resource.attributes["process.command_line"])` +`delete(resource.attributes["process.command_line"]) from traces` +`delete(resource.attributes["process.command_line"])` Filtering out signals such as by removing all metrics with a `http.target` of `/health` -`drop() where attributes["http.target"] = "/health" from metrics` -`if attributes["http.target"] = "/health": drop()` +`drop() where attributes["http.target"] = "/health" from metrics` +`if attributes["http.target"] = "/health": drop()` Attach information from resource into telemetry, for example adding certain resource fields as metric attributes -`set(attributes["k8s_pod"], resource.attributes["k8s.pod.name"]) from metrics` -`attributes["k8s_pod"] = resource.attributes["k8s.pod.name"]` +`set(attributes["k8s_pod"], resource.attributes["k8s.pod.name"]) from metrics` +`attributes["k8s_pod"] = resource.attributes["k8s.pod.name"]` Group spans by trace ID -`group_by(trace_id, 2m) from traces` -`group_by(trace_id, 2m)` +`group_by(trace_id, 2m) from traces` +`group_by(trace_id, 2m)` Create utilization metric from base metrics. Because navigation expressions only operate on a single piece of telemetry, helper functions for reading values from other metrics need to be provided. -`create_gauge("pod.cpu.utilized", read_gauge("pod.cpu.usage") / read_gauge("node.cpu.limit") from metric` -`create_gauge("pod.cpu.utilized", read_gauge("pod.cpu.usage") / read_gauge("node.cpu.limit")` +`create_gauge("pod.cpu.utilized", read_gauge("pod.cpu.usage") / read_gauge("node.cpu.limit") from metric` +`create_gauge("pod.cpu.utilized", read_gauge("pod.cpu.usage") / read_gauge("node.cpu.limit")` A lot of processing. Queries are executed in order. While initially performance may degrade compared to more specialized processors, the expectation is that over time, the query processor's engine would improve to be able to apply optimizations @@ -254,27 +259,27 @@ implementation, it would likely be little overhead to support a YAML approach in Functions should be named and formatted according to the following standards. - Function names MUST start with a verb. - Function names that contain multiple words MUST separate those words with `_`. -- Functions that interact with multiple items MUST have plurality in the name. Ex: `truncate_all`, `keep_keys`, `replace_wildcards`. -- Functions that interact with a single item MUST NOT have plurality in the name. If a function would interact with multiple items due to a condition, like `where`, it is still considered singular. Ex: `set`, `delete`, `drop`. +- Functions that interact with multiple items MUST have plurality in the name. Ex: `truncate_all`, `keep_keys`, `replace_all_matches`. +- Functions that interact with a single item MUST NOT have plurality in the name. If a function would interact with multiple items due to a condition, like `where`, it is still considered singular. Ex: `set`, `delete`, `drop`, `replace_match`. - Functions that change a specific target MUST set the target as the first parameter. - Functions that take a list MUST set the list as the last parameter. ## Implementing a processor function -The `replace_wildcards` function may look like this. +The `replace_match` function may look like this. ```go -package replacewildcards +package replaceMatch import "regexp" import "github.com/open-telemetry/opentelemetry/processors" // Assuming this is not in "core" -processors.register("replace_wildcards", replace_wildcards) +processors.register("replace_match", replace_match) -func replace_wildcards(pattern regexp.Regexp, replacement string, path processors.TelemetryPath) processors.Result { +func replace_match(path processors.TelemetryPath, pattern regexp.Regexp, replacement string) processors.Result { val := path.Get() if val == nil { return processors.CONTINUE @@ -287,7 +292,7 @@ func replace_wildcards(pattern regexp.Regexp, replacement string, path processor } ``` -Here, the processor framework recognizes the first parameter of the function is `regexp.Regexp` so will compile the string +Here, the processor framework recognizes the second parameter of the function is `regexp.Regexp` so will compile the string provided by the user in the config when processing it. Similarly for `path`, it recognizes properties of type `TelemetryPath` and will resolve it to the path within a matched telemetry during execution and pass it to the function. The path allows scalar operations on the field within the telemetry. The processor does not need to be aware of telemetry filtering,