[pkg/ottl] add User Agent parsing #34172

Merged
merged 20 commits into from
Aug 21, 2024
Changes from 15 commits
Commits
2031ca3
Initial naive impl for user-agent parsing
pchila Jul 10, 2024
379e954
add uap-go dependency for user-agent parsing
pchila Jul 10, 2024
0e6bf76
Initial implementation of user-agent parsing
pchila Jul 11, 2024
186f7b3
wire up UserAgentFactory and add e2e test
pchila Jul 12, 2024
b63cd55
add documentation for UserAgent converter
pchila Jul 23, 2024
850b781
Merge remote-tracking branch 'upstream/main' into user-agent-parsing
pchila Aug 9, 2024
6ac23c0
Move User-Agent parser initialization out of the `UserAgent` function
pchila Aug 12, 2024
7538cf0
Add unit test for parsing collector UserAgent string
pchila Aug 12, 2024
bb5c618
Update pkg/ottl/ottlfuncs/README.md
pchila Aug 12, 2024
aa9d317
Merge remote-tracking branch 'upstream/main' into user-agent-parsing
pchila Aug 12, 2024
90fd086
fixup! Merge remote-tracking branch 'upstream/main' into user-agent-p…
pchila Aug 12, 2024
095d8e4
Move User-Agent parser initialization out of the user agent converter
pchila Aug 13, 2024
50abd0d
Merge remote-tracking branch 'upstream/main' into user-agent-parsing
pchila Aug 13, 2024
fd20a75
Fix otelcontribcol build
pchila Aug 15, 2024
1fe0f3a
fixup! add documentation for UserAgent converter
pchila Aug 15, 2024
767d528
Merge remote-tracking branch 'upstream/main' into user-agent-parsing
pchila Aug 19, 2024
33c58f2
make gotidy
pchila Aug 19, 2024
f6b67c3
make generate
pchila Aug 19, 2024
347e1aa
Added changelog
pchila Aug 20, 2024
7da12e4
Merge remote-tracking branch 'upstream/main' into user-agent-parsing
pchila Aug 21, 2024
1 change: 1 addition & 0 deletions cmd/otelcontribcol/go.mod
@@ -738,6 +738,7 @@ require (
github.com/tinylib/msgp v1.2.0 // indirect
github.com/tklauser/go-sysconf v0.3.14 // indirect
github.com/tklauser/numcpus v0.8.0 // indirect
github.com/ua-parser/uap-go v0.0.0-20240611065828-3a4781585db6 // indirect
github.com/valyala/fastjson v1.6.4 // indirect
github.com/vincent-petithory/dataurl v1.0.0 // indirect
github.com/vishvananda/netlink v1.1.1-0.20201029203352-d40f9887b852 // indirect
3 changes: 3 additions & 0 deletions cmd/otelcontribcol/go.sum

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

9 changes: 9 additions & 0 deletions pkg/ottl/e2e/e2e_test.go
@@ -774,6 +774,15 @@ func Test_e2e_converters(t *testing.T) {
m.PutStr("bar", "pass")
},
},
{
statement: `set(attributes["test"], UserAgent("curl/7.81.0"))`,
want: func(tCtx ottllog.TransformContext) {
m := tCtx.GetLogRecord().Attributes().PutEmptyMap("test")
m.PutStr("user_agent.original", "curl/7.81.0")
m.PutStr("user_agent.name", "curl")
m.PutStr("user_agent.version", "7.81.0")
},
},
}

for _, tt := range tests {
3 changes: 3 additions & 0 deletions pkg/ottl/go.mod
@@ -12,6 +12,7 @@ require (
github.com/open-telemetry/opentelemetry-collector-contrib/internal/coreinternal v0.107.0
github.com/open-telemetry/opentelemetry-collector-contrib/pkg/pdatatest v0.107.0
github.com/stretchr/testify v1.9.0
github.com/ua-parser/uap-go v0.0.0-20240611065828-3a4781585db6
go.opentelemetry.io/collector/component v0.107.0
go.opentelemetry.io/collector/pdata v1.13.0
go.opentelemetry.io/collector/semconv v0.107.0
@@ -31,6 +32,7 @@ require (
github.com/go-logr/logr v1.4.2 // indirect
github.com/go-logr/stdr v1.2.2 // indirect
github.com/gogo/protobuf v1.3.2 // indirect
github.com/hashicorp/golang-lru v0.5.4 // indirect
github.com/magefile/mage v1.15.0 // indirect
github.com/modern-go/concurrent v0.0.0-20180306012644-bacd9c7ef1dd // indirect
github.com/modern-go/reflect2 v1.0.2 // indirect
@@ -51,6 +53,7 @@ require (
google.golang.org/genproto/googleapis/rpc v0.0.0-20240701130421-f6361c86f094 // indirect
google.golang.org/grpc v1.65.0 // indirect
google.golang.org/protobuf v1.34.2 // indirect
gopkg.in/yaml.v2 v2.4.0 // indirect
gopkg.in/yaml.v3 v3.0.1 // indirect
)

7 changes: 7 additions & 0 deletions pkg/ottl/go.sum

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

29 changes: 29 additions & 0 deletions pkg/ottl/ottlfuncs/README.md
@@ -461,6 +461,7 @@ Available Converters:
- [UnixMilli](#unixmilli)
- [UnixNano](#unixnano)
- [UnixSeconds](#unixseconds)
- [UserAgent](#useragent)
- [UUID](#UUID)
- [Year](#year)

@@ -1568,6 +1569,34 @@ Examples:

- `UnixSeconds(Time("02/04/2023", "%m/%d/%Y"))`

### UserAgent

`UserAgent(value)`

The `UserAgent` Converter parses the string argument and tries to match it against well-known user-agent strings.

`value` is a string or a path to a string. If `value` is not a string, an error is returned.

The parsing results are returned as a map containing `user_agent.name`, `user_agent.version`, and `user_agent.original`,
as defined in semconv v1.25.0.

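Parsing is done using the [uap-go package](https://github.com/ua-parser/uap-go). The formats it recognizes are listed in [regexes.yaml](https://github.com/ua-parser/uap-core/blob/master/regexes.yaml). A user agent that matches none of them is reported with `user_agent.name` set to `Other` and an empty `user_agent.version`.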

Examples:

- `UserAgent("curl/7.81.0")`
```yaml
"user_agent.name": "curl"
"user_agent.version": "7.81.0"
"user_agent.original": "curl/7.81.0"
```
- `UserAgent("Mozilla/5.0 (X11; Linux x86_64; rv:126.0) Gecko/20100101 Firefox/126.0")`
```yaml
"user_agent.name": "Firefox"
"user_agent.version": "126.0"
"user_agent.original": "Mozilla/5.0 (X11; Linux x86_64; rv:126.0) Gecko/20100101 Firefox/126.0"
```
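
The matching behaviour described above can be pictured as an ordered list of patterns with a fallback. The following is a minimal, dependency-free sketch, not the uap-go implementation: the `rule` type, the two regular expressions, and `parseUserAgent` are hypothetical and far simpler than the real `regexes.yaml` entries. It only illustrates how a first-match-wins rule list produces the three documented keys, and why an unrecognized string ends up as `Other` with an empty version.

```go
package main

import (
	"fmt"
	"regexp"
)

// rule is a hypothetical, simplified stand-in for one pattern entry.
type rule struct {
	family string
	re     *regexp.Regexp
}

// rules are checked in order; the first match wins.
var rules = []rule{
	{"Firefox", regexp.MustCompile(`Firefox/(\d+(?:\.\d+)*)`)},
	{"curl", regexp.MustCompile(`^curl/(\d+(?:\.\d+)*)`)},
}

// parseUserAgent returns a map shaped like the converter's documented result.
// Strings that match no rule fall back to "Other" with an empty version.
func parseUserAgent(ua string) map[string]any {
	name, version := "Other", ""
	for _, r := range rules {
		if m := r.re.FindStringSubmatch(ua); m != nil {
			name, version = r.family, m[1]
			break
		}
	}
	return map[string]any{
		"user_agent.name":     name,
		"user_agent.version":  version,
		"user_agent.original": ua,
	}
}

func main() {
	fmt.Println(parseUserAgent("curl/7.81.0"))
	fmt.Println(parseUserAgent("foobar/1.2.3 (foo; bar baz)"))
}
```

The original string is always echoed back under `user_agent.original`, so no information is lost when the name and version cannot be determined.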

### URL

`URL(url_string)`
46 changes: 46 additions & 0 deletions pkg/ottl/ottlfuncs/func_useragent.go
@@ -0,0 +1,46 @@
// Copyright The OpenTelemetry Authors
// SPDX-License-Identifier: Apache-2.0

package ottlfuncs // import "github.com/open-telemetry/opentelemetry-collector-contrib/pkg/ottl/ottlfuncs"
import (
"context"
"fmt"

"github.com/open-telemetry/opentelemetry-collector-contrib/pkg/ottl"
"github.com/ua-parser/uap-go/uaparser"

Check failure on line 10 in pkg/ottl/ottlfuncs/func_useragent.go
GitHub Actions / govulncheck (connector, processor-0, processor-1, cmd-1, internal, exporter-1)
could not import github.com/ua-parser/uap-go/uaparser (invalid package name: "")
semconv "go.opentelemetry.io/collector/semconv/v1.25.0"
)

type UserAgentArguments[K any] struct {
UserAgent ottl.StringGetter[K]
}

func NewUserAgentFactory[K any]() ottl.Factory[K] {
return ottl.NewFactory("UserAgent", &UserAgentArguments[K]{}, createUserAgentFunction[K])
}

func createUserAgentFunction[K any](_ ottl.FunctionContext, oArgs ottl.Arguments) (ottl.ExprFunc[K], error) {
args, ok := oArgs.(*UserAgentArguments[K])
if !ok {
return nil, fmt.Errorf("UserAgentFactory args must be of type *UserAgentArguments[K]")
}

return userAgent[K](args.UserAgent), nil
}

func userAgent[K any](userAgentSource ottl.StringGetter[K]) ottl.ExprFunc[K] { //revive:disable-line:var-naming
parser := uaparser.NewFromSaved()

return func(ctx context.Context, tCtx K) (any, error) {
userAgentString, err := userAgentSource.Get(ctx, tCtx)
if err != nil {
return nil, err
}
parsedUserAgent := parser.ParseUserAgent(userAgentString)
return map[string]any{
semconv.AttributeUserAgentName: parsedUserAgent.Family,
semconv.AttributeUserAgentOriginal: userAgentString,
semconv.AttributeUserAgentVersion: parsedUserAgent.ToVersionString(),
}, nil
}
}
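
In the file above, the parser is constructed when the expression function is created, and the returned closure reuses it on every evaluation. The sketch below isolates that pattern with standard-library code only. `expensiveParser`, `newExpensiveParser`, and the `constructions` counter are hypothetical stand-ins for illustration and are not part of uap-go or OTTL.

```go
package main

import (
	"fmt"
	"strings"
)

// expensiveParser is a hypothetical stand-in for a parser that is costly to build.
type expensiveParser struct{}

// constructions counts how many parsers have been built.
var constructions int

func newExpensiveParser() *expensiveParser {
	constructions++
	return &expensiveParser{}
}

// family returns everything before the first "/" as a toy "parse" result.
func (p *expensiveParser) family(ua string) string {
	name, _, _ := strings.Cut(ua, "/")
	return name
}

// newConverter builds the parser once and returns a closure that reuses it.
func newConverter() func(ua string) string {
	parser := newExpensiveParser() // runs once per converter, not once per call
	return func(ua string) string {
		return parser.family(ua)
	}
}

func main() {
	convert := newConverter()
	for _, ua := range []string{"curl/7.81.0", "wget/1.21"} {
		fmt.Println(convert(ua))
	}
	fmt.Println("parsers constructed:", constructions)
}
```

Building the parser inside the inner closure instead would repeat the construction cost for every processed record.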
117 changes: 117 additions & 0 deletions pkg/ottl/ottlfuncs/func_useragent_test.go
@@ -0,0 +1,117 @@
// Copyright The OpenTelemetry Authors
// SPDX-License-Identifier: Apache-2.0

package ottlfuncs // import "github.com/open-telemetry/opentelemetry-collector-contrib/pkg/ottl/ottlfuncs"
import (
"context"
"testing"

"github.com/open-telemetry/opentelemetry-collector-contrib/pkg/ottl"
"github.com/stretchr/testify/assert"
"github.com/stretchr/testify/require"
semconv "go.opentelemetry.io/collector/semconv/v1.25.0"
)

func TestUserAgentParser(t *testing.T) {
testCases := []struct {
Name string
UAString string
ExpectedMap map[string]any
}{
{
Name: "Firefox",
UAString: "Mozilla/5.0 (X11; Linux x86_64; rv:126.0) Gecko/20100101 Firefox/126.0",
ExpectedMap: map[string]any{
semconv.AttributeUserAgentOriginal: "Mozilla/5.0 (X11; Linux x86_64; rv:126.0) Gecko/20100101 Firefox/126.0",
semconv.AttributeUserAgentName: "Firefox",
semconv.AttributeUserAgentVersion: "126.0",
},
},
{
Name: "Chrome",
UAString: "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/51.0.2704.103 Safari/537.36",
ExpectedMap: map[string]any{
semconv.AttributeUserAgentOriginal: "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/51.0.2704.103 Safari/537.36",
semconv.AttributeUserAgentName: "Chrome",
semconv.AttributeUserAgentVersion: "51.0.2704",
},
},
{
Name: "Mobile Safari",
UAString: "Mozilla/5.0 (iPhone; CPU iPhone OS 13_5_1 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/13.1.1 Mobile/15E148 Safari/604.1",
ExpectedMap: map[string]any{
semconv.AttributeUserAgentOriginal: "Mozilla/5.0 (iPhone; CPU iPhone OS 13_5_1 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/13.1.1 Mobile/15E148 Safari/604.1",
semconv.AttributeUserAgentName: "Mobile Safari",
semconv.AttributeUserAgentVersion: "13.1.1",
},
},
{
Name: "Edge",
UAString: "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36 Edg/91.0.864.59",
ExpectedMap: map[string]any{
semconv.AttributeUserAgentOriginal: "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36 Edg/91.0.864.59",
semconv.AttributeUserAgentName: "Edge",
semconv.AttributeUserAgentVersion: "91.0.864",
},
},
{
Name: "Opera",
UAString: "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/51.0.2704.106 Safari/537.36 OPR/38.0.2220.41",
ExpectedMap: map[string]any{
semconv.AttributeUserAgentOriginal: "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/51.0.2704.106 Safari/537.36 OPR/38.0.2220.41",
semconv.AttributeUserAgentName: "Opera",
semconv.AttributeUserAgentVersion: "38.0.2220",
},
},
{
Name: "curl",
UAString: "curl/7.81.0",
ExpectedMap: map[string]any{
semconv.AttributeUserAgentOriginal: "curl/7.81.0",
semconv.AttributeUserAgentName: "curl",
semconv.AttributeUserAgentVersion: "7.81.0",
},
},
{
Name: "Unknown user agent",
UAString: "foobar/1.2.3 (foo; bar baz)",
ExpectedMap: map[string]any{
semconv.AttributeUserAgentOriginal: "foobar/1.2.3 (foo; bar baz)",
semconv.AttributeUserAgentName: "Other",
semconv.AttributeUserAgentVersion: "",
},
},
{
Name: "Otel collector 0.106.1 linux/amd64 user agent",
UAString: "OpenTelemetry Collector Contrib/0.106.1 (linux/amd64)",
ExpectedMap: map[string]any{
semconv.AttributeUserAgentOriginal: "OpenTelemetry Collector Contrib/0.106.1 (linux/amd64)",
semconv.AttributeUserAgentName: "Other",
semconv.AttributeUserAgentVersion: "",
},
},
}

for _, tt := range testCases {
t.Run(tt.Name, func(t *testing.T) {
source := &ottl.StandardStringGetter[any]{
Getter: func(_ context.Context, _ any) (any, error) {
return tt.UAString, nil
},
}

exprFunc := userAgent[any](source) //revive:disable-line:var-naming
res, err := exprFunc(context.Background(), nil)
require.NoError(t, err)
require.IsType(t, map[string]any{}, res)
resMap := res.(map[string]any)
assert.Equal(t, tt.ExpectedMap, resMap)
assert.Len(t, resMap, len(tt.ExpectedMap))
for k, v := range tt.ExpectedMap {
if assert.Containsf(t, resMap, k, "key not found %q", k) {
assert.Equal(t, v, resMap[k])
}
}
})
}
}
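
Several expectations in the table above are shorter than the version in the raw string: `Chrome/51.0.2704.103` is expected as `51.0.2704` and `Edg/91.0.864.59` as `91.0.864`. That is consistent with a version string assembled from at most three components (major, minor, patch). The sketch below illustrates that assumption only; `versionString` is a hypothetical helper and not the uap-go code.

```go
package main

import (
	"fmt"
	"strings"
)

// versionString joins the leading non-empty components with dots.
// Hypothetical helper: it assumes a version is made of at most
// major, minor, and patch, and stops at the first empty component.
func versionString(major, minor, patch string) string {
	parts := make([]string, 0, 3)
	for _, p := range []string{major, minor, patch} {
		if p == "" {
			break
		}
		parts = append(parts, p)
	}
	return strings.Join(parts, ".")
}

func main() {
	fmt.Println(versionString("51", "0", "2704")) // fourth component is never part of the input
	fmt.Println(versionString("126", "0", ""))
	fmt.Printf("%q\n", versionString("", "", ""))
}
```

Under this assumption, an unrecognized user agent with no parsed components yields an empty version, which matches the `Other` cases in the table.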
1 change: 1 addition & 0 deletions pkg/ottl/ottlfuncs/functions.go
@@ -88,6 +88,7 @@ func converters[K any]() []ottl.Factory[K] {
NewUnixSecondsFactory[K](),
NewUUIDFactory[K](),
NewURLFactory[K](),
NewUserAgentFactory[K](),
NewAppendFactory[K](),
NewYearFactory[K](),
NewHexFactory[K](),