Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow storing tags as object fields in ES - better kibana support #1018

Merged

Conversation

pavolloffay
Copy link
Member

@pavolloffay pavolloffay commented Aug 23, 2018

Resolves #906
Supersedes #980, #982

  • Store configured or all tags as object fields. For example: {"tag": {"http@method": "GET", "span@kind": "server"}}.
  • Binary tags are always stored as nested objects - there is no way how to reproduce the type - it would be always a string.
  • The default behaviour does not use tag fields. It's a subject to a change. By default we could store only standard tags as object field - no breaking change
  • with default index.mapping.total_fields.limit" it is possible to store ~940 unique span tag keys. If the limit is reached no new unique tag can be stored - ES throws an exception. It basically means span will not be stored in the current index
  • tag with float value 1.0 will be always returned as int 123 Stored number as interface is always a float olivere/elastic#891

Introduced 3 flags:

      --es.tag-as-fields                                      Store all span and process tags as object fields. If true .tag-file-path is ignored.
      --es.tag-file-path string                               Optional path to a file containing tag keys which will be stored as object fields. Each key should be on a separate line.
      --es.tag-key-de-dot-char string                         The character used to replace dots (".") in tag keys stored as object fields. (default "@")

Other changes

  • copied json model to plugin/storage/es/spanwriter/dbmodel

Todos after this is merged:

Adding screenshot from kibana. The search pattern is tag.span@kind:server AND process.tag.ip:10.33.144.223
screenshot of kibana

@codecov
Copy link

codecov bot commented Aug 23, 2018

Codecov Report

Merging #1018 into master will not change coverage.
The diff coverage is 100%.

Impacted file tree graph

@@           Coverage Diff           @@
##           master   #1018    +/-   ##
=======================================
  Coverage     100%    100%            
=======================================
  Files         139     141     +2     
  Lines        6419    6604   +185     
=======================================
+ Hits         6419    6604   +185
Impacted Files Coverage Δ
model/converter/thrift/jaeger/to_domain.go 100% <ø> (ø) ⬆️
plugin/storage/es/spanstore/dbmodel/from_domain.go 100% <100%> (ø)
plugin/storage/es/spanstore/dbmodel/to_domain.go 100% <100%> (ø)
.../storage/es/spanstore/dbmodel/process_hashtable.go 100% <100%> (ø)
plugin/storage/es/factory.go 100% <100%> (ø) ⬆️
plugin/storage/es/spanstore/writer.go 100% <100%> (ø) ⬆️
plugin/storage/es/options.go 100% <100%> (ø) ⬆️
plugin/storage/es/spanstore/reader.go 100% <100%> (ø) ⬆️
plugin/storage/es/spanstore/service_operation.go 100% <100%> (ø) ⬆️
cmd/ingester/app/consumer/offset/manager.go 100% <0%> (ø) ⬆️
... and 3 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 5a67504...8a713a0. Read the comment docs.

@pavolloffay pavolloffay changed the title WIP Es schema object disable raw WIP: Change ES tag schema to support tag search in Kibana - final Aug 23, 2018
@pavolloffay pavolloffay force-pushed the es-schema-object-disable-raw branch 4 times, most recently from 6bb7f85 to 524b0a6 Compare August 29, 2018 09:20
@pavolloffay pavolloffay changed the title WIP: Change ES tag schema to support tag search in Kibana - final Allow storing tags as object fields in ES - better kibana support Aug 29, 2018
Tags []KeyValue `json:"tags"`
ServiceName string `json:"serviceName"`
Tags []KeyValue `json:"tags"`
TagsMap map[string]interface{} `json:"tagsMap,omitempty"`
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please comment if you know a better name.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ideally the name used in the kibana query would be tags, but I guess then that may cause backward compatibility issues?
Another alternative would be to use tag instead of tagsMap - so when referenced it would be tag.orderId (for example).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes we cannot use tags as it is used for legacy tags (as nested objects).

+1 on tag.

dKey := strings.Replace(k, ":", ".", -1)
tag := model.KeyValue{Key: dKey}
// TODO v.(int/64/32) didn't work it is always a float
if pInt, err := strconv.ParseInt(fmt.Sprintf("%v", v), 10, 64); err == nil {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ES always returns float64 as a number - even if it was stored as int. This works but it's not that pretty.

}

// TODO we could query tagsMap only if the tag is configured
// but configuration can change over time
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean what if previous deployment was storing tags as object fields. If the new deployment is not using tags as fields query would not find old tags.

@pavolloffay
Copy link
Member Author

This is ready for review

@@ -37,6 +37,9 @@ const (
suffixBulkActions = ".bulk.actions"
suffixBulkFlushInterval = ".bulk.flush-interval"
suffixIndexPrefix = ".index-prefix"
suffixTagsFile = ".tag-file-path"
suffixAllTagsAsObject = ".tag-as-fields"
suffixTagDeDotChar = ".tag-key-de-dot-char"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure '-de-' part is required in the name? I think tag-key-dot-charis good enough.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@@ -75,6 +78,7 @@ func NewOptions(primaryNamespace string, otherNamespaces ...string) *Options {
BulkWorkers: 1,
BulkActions: 1000,
BulkFlushInterval: time.Millisecond * 200,
TagDeDotChar: "@",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What was the issue with using ':'?

From my ES past, I vaguely remember there being able to mark a field as not_analyzed so it was indexed but using the actual value without being processed. I see that this has been replaced by the concept of keyword in 5.x. Doesn't this avoid the issue of using '.' in the tag name?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What was the issue with using ':'?

It would have to be escaped. It's used in kibana query tag.key:foo. To avoid dots in field name we would have to set enabled:false on the field.

@pavolloffay
Copy link
Member Author

I have published docker images in https://hub.docker.com/u/pavolloffay/ with tag PR1018. For example pavolloffay:jaeger-collector:PR1018. Or use docker run --net=host -e SPAN_STORAGE_TYPE=elasticsearch -e ES_TAG_AS_FIELDS=true pavolloffay/all-in-one:PR1018 to simply run it (it expects ES to be running on the host).

Copy link
Contributor

@jpkrohling jpkrohling left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Haven't finished yet, but wanted to share my review sooner.

if kvs == nil {
kvs = make([]KeyValue, 0)
}
return kvs, tagsMap
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If keyValues is empty (Len=0), you'll end up with a kvs with Len=0 and a nil tagsMap. Wouldn't it make more sense to initialize the tagsMap on line 89?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good question, it make sense if tagMap is used by default. By default it's not used and not present in the index.

@@ -0,0 +1,137 @@
// Copyright (c) 2017 Uber Technologies, Inc.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copyright :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(as well as other files in this PR)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what is wrong with it? It's automatically handled by make fmt

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/2017 Uber Technologies, Inc/2018 The Jaeger Authors/

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The updadeLicense script seems to be correct. Actually these files were copied (with changes) from domain package. Do you want to change the license?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If they are just copies, then no, keep the old copyright.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if with changes you can add 2018 The Jaeger Authors line above

// References and ProcessID, respectively.
// When converting to ES model, ProcessID and Warnings should be omitted. Even if
// included, ES with dynamic settings off will automatically ignore unneeded fields.
type Span struct {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this struct specific to ES?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes it's in es/dbmodel package. I think the comment is not accurate. I will fix it

var expectedSpan Span
require.NoError(t, json.Unmarshal(jsonStr, &expectedSpan))

testJSONEncoding(t, i, jsonStr, embeddedSpan)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you wrap each test in a t.Run() with an appropriate name? Like, "TestFromDomainEmbedProcess#fixture%d"

}

type toDomain struct{}
// ToDomain is used to convert Span to model.Span
type ToDomain interface {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same question as before: is this specific to ES? Couldn't this be made at a higher level, to be used by other storage backends? Otherwise, what's the point of changing from a struct to an interface?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is ES specific. Each storage define it's own model e.g. see C* package.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The biggest advantage is keeping the struct private and exposing only the interface with builder function. It makes it clear how it is constructed.

for k, v := range tagsMap {
dKey := strings.Replace(k, deDotChar, ".", -1)
tag := model.KeyValue{Key: dKey}
// TODO v.(int/64/32) didn't work it is always a float
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the TODO here? Sounds more like a regular comment to me :)

if pInt, err := strconv.ParseInt(fmt.Sprintf("%v", v), 10, 64); err == nil {
tag.VType = model.Int64Type
tag.VInt64 = pInt
} else if val, ok := v.(float64); ok {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't it more idiomatic in Go to use a switch/case with the type?

switch v.(type) {
case float64:
  tag.VType = model.Float64Type
  tag.VFloat64 = val
  ...
}

} else if val, ok := v.([]byte); ok {
tag.VType = model.BinaryType
tag.VBinary = val
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In theory, it can also be something else. How about adding a log statement for that case?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That should not really happen. In this case it's better to return an error.

plugin/storage/es/spanstore/reader_test.go Show resolved Hide resolved
plugin/storage/es/spanstore/writer_test.go Show resolved Hide resolved
s := &model.Span{Tags: []model.KeyValue{{Key: "foo", VStr: "bar"}},
Process: &model.Process{Tags: []model.KeyValue{{Key: "bar", VStr: "baz"}}}}
for _, test := range testCases {
mSpan := test.writer.spanConverter.FromDomainEmbedProcess(s)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wrap every iteration with its own t.Run() with an appropriate name. In case of a failure, it should be easy to identify what's the combination that failed.

@@ -34,11 +34,18 @@ func CompareSliceOfTraces(t *testing.T, expected []*model.Trace, actual []*model
}
if !assert.EqualValues(t, expected, actual) {
for _, err := range pretty.Diff(expected, actual) {
//fmt.Println("Actual")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will this be removed before merging?

@pavolloffay
Copy link
Member Author

@Monnoroch @golonzovsky could you please review/run it?

@yurishkuro
Copy link
Member

@pavolloffay as I understand it, this is backwards compatible?

Regarding the CLI flags, proposal for minor adjustment:

      --es.tags-as-fields.always                                      Store all span and process tags as object fields. If true .tag-file-path is ignored.
      --es.tags-as-fields.config-file string                          Optional path to a file containing tag keys which will be stored as object fields. Each key should be on a separate line.
      --es.tags-as-fields.dot-replacement string                      The character used to replace dots (".") in tag keys stored as object fields. (default "@")

This way if people are using config files instead of CLI then all the new options are in the same section.

@black-adder can you review this?

@pavolloffay
Copy link
Member Author

pavolloffay commented Sep 5, 2018

It is backward compatible. The proposed flags look fine except I would rename .always to all.

// v.(int/64/32) does not work. It is always a float
if pInt, err := strconv.ParseInt(fmt.Sprintf("%v", v), 10, 64); err == nil {
tag.VType = model.Int64Type
tag.VInt64 = pInt
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should use constructor functions from model instead of direct struct manipulation, e.g.

return model.Int64(k, pint)

return kvs, nil
}

func convertTagField(v interface{}, k string, deDotChar string) (*model.KeyValue, error) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

return by value, no need to cause heap allocations: (model.KeyValue, error)

func SpanToDomain(span *json.Span) (*model.Span, error) {
return toDomain{}.spanToDomain(span)
// NewToDomain creates ToDomain
func NewToDomain(deDotChar string) ToDomain {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do we need this struct public? Would it be enough to have a public function SpanToDomain()?

MetricsFactory metrics.Factory
serviceOperationStorage *ServiceOperationStorage
IndexPrefix string
TagDeDotChar string
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we name this consistent with the flag, i.e. TagDotReplacement

plugin/storage/es/spanstore/reader.go Show resolved Hide resolved
OperationName string `json:"operationName"`
References []Reference `json:"references"`
StartTime uint64 `json:"startTime"` // microseconds since Unix epoch
// Span adds a StartTimeMillis field to the standard JSON span.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rm this line

Tags []KeyValue `json:"tags"`
Tag map[string]interface{} `json:"tag,omitempty"`
Logs []Log `json:"logs"`
Process *Process `json:"process,omitempty"`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

any reason why this has to be a pointer and 'omitempty'? it's not optional here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good catch, we can make it mandatory it was just copied from json package.

StartTimeMillis uint64 `json:"startTimeMillis"`
Duration uint64 `json:"duration"` // microseconds
Tags []KeyValue `json:"tags"`
Tag map[string]interface{} `json:"tag,omitempty"`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I recommend moving StartTimeMillis and Tag fields to the bottom of the struct and adding comments to Tag.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This affects how the json is stored to ES. I would like to keep tags and tag close to each other.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The order the fields are defined in the Go struct are indicative of how they are stored in ES? That's good to know! Would probably be nice to have a code comment with this remark.

type Process struct {
ServiceName string `json:"serviceName"`
Tags []KeyValue `json:"tags"`
Tag map[string]interface{} `json:"tag,omitempty"`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

comment

"github.com/jaegertracing/jaeger/model"
)

type processHashtable struct {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

when is this used? if we're embedding the Process in the Span we don't need to do deduping. Are we deduping at query time? Do we need to? I think domain-to-ui converter will dedupe anyway.

@codecov-io
Copy link

codecov-io commented Sep 5, 2018

Codecov Report

Merging #1018 into master will not change coverage.
The diff coverage is 100%.

Impacted file tree graph

@@           Coverage Diff           @@
##           master   #1018    +/-   ##
=======================================
  Coverage     100%    100%            
=======================================
  Files         139     140     +1     
  Lines        6451    6620   +169     
=======================================
+ Hits         6451    6620   +169
Impacted Files Coverage Δ
model/converter/thrift/jaeger/to_domain.go 100% <ø> (ø) ⬆️
plugin/storage/es/spanstore/dbmodel/from_domain.go 100% <100%> (ø)
plugin/storage/es/spanstore/dbmodel/to_domain.go 100% <100%> (ø)
plugin/storage/es/factory.go 100% <100%> (ø) ⬆️
plugin/storage/es/spanstore/writer.go 100% <100%> (ø) ⬆️
plugin/storage/es/options.go 100% <100%> (ø) ⬆️
plugin/storage/es/spanstore/reader.go 100% <100%> (ø) ⬆️
plugin/storage/es/spanstore/service_operation.go 100% <100%> (ø) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 774c01b...0ca2187. Read the comment docs.

@pavolloffay pavolloffay force-pushed the es-schema-object-disable-raw branch 2 times, most recently from 2777b56 to a422014 Compare September 6, 2018 07:40
@pavolloffay
Copy link
Member Author

can somebody re-review/approve?

@kacper-jackiewicz
Copy link

@pavolloffay
I am just testing Jaeger with your new images. Everything seems to work fine on Kibana side(searches/visualizations). Just one remark on Jaeger-Query side. Searching for tags works only if we use new format so for ex. http@method="GET", but the tags are shown with old dotted format, which might be misleading for users.

@pavolloffay
Copy link
Member Author

@kacper-jackiewicz thanks for running it. It's a bug query side should use the original key format.

Signed-off-by: Pavol Loffay <ploffay@redhat.com>
Signed-off-by: Pavol Loffay <ploffay@redhat.com>
Signed-off-by: Pavol Loffay <ploffay@redhat.com>
@pavolloffay
Copy link
Member Author

Can somebody re-review so it can be merged?

cc) @jpkrohling @yurishkuro @black-adder

Copy link
Contributor

@jpkrohling jpkrohling left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM and I think it's good to go for a RC.

return nil, err
}
}
p := esSpanStore.SpanWriterParams{Client: f.primaryClient,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need to change IMO, but I'm wondering why did you assign it to a var instead of passing it directly to the NewSpanWriter like you did with the other calls.

StartTimeMillis uint64 `json:"startTimeMillis"`
Duration uint64 `json:"duration"` // microseconds
Tags []KeyValue `json:"tags"`
Tag map[string]interface{} `json:"tag,omitempty"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The order the fields are defined in the Go struct are indicative of how they are stored in ES? That's good to know! Would probably be nice to have a code comment with this remark.

@yurishkuro
Copy link
Member

I have published docker images in https://hub.docker.com/u/pavolloffay/ with tag PR1018.

@pavolloffay can you ask someone to test with those images? Ideally existing users, as well as those who asked for the new functionality

@masteinhauser
Copy link
Member

I have published docker images in https://hub.docker.com/u/pavolloffay/ with tag PR1018.

@pavolloffay can you ask someone to test with those images? Ideally existing users, as well as those who asked for the new functionality

Hi, how can I help? We have been using the new Elasticsearch backend for ~100 days now with a fairly significant amount of span volume and lots of additional tags being added. This feature is incredibly relevant for us.

Signed-off-by: Pavol Loffay <ploffay@redhat.com>
@pavolloffay
Copy link
Member Author

pavolloffay commented Sep 12, 2018

@masteinhauser or anybody else could you please test this PR?

Images with this functionality are available in pavolloffay/ with tag PR1018 https://hub.docker.com/u/pavolloffay/. Here are binaries https://drive.google.com/open?id=15jujV5Y-lEXaXAgsQgWh2Y-urxrgenqFhttps://drive.google.com/open?id=15jujV5Y-lEXaXAgsQgWh2Y-urxrgenqF I could not upload binaries here due to the size limit.

Use --es.tags-as-fields.all flag to enable this feature.

Signed-off-by: Pavol Loffay <ploffay@redhat.com>
@masteinhauser
Copy link
Member

I finally had some time to roll this out and test it this morning. Everything appears to be working as expected, once I actually set the correct environment variables for the collector. No other config changes appeared to be needed, although I deployed agent, collector, and query based on the images made available here https://hub.docker.com/u/pavolloffay/

This of course only works for newly collected spans, which is perfectly acceptable for us. We'll likely be maintaining this image and deployment in production until the next Jaeger release is made with this available.

Incredibly useful feature for our developers. Thank you for the work on getting this done!

@pavolloffay
Copy link
Member Author

@masteinhauser thanks for trying it out! I guess we can merge it and release it.

@pavolloffay pavolloffay merged commit 3fd0677 into jaegertracing:master Sep 18, 2018
@ghost ghost removed the review label Sep 18, 2018
Client: f.primaryClient,
Logger: f.logger,
MetricsFactory: f.metricsFactory,
MaxLookback: cfg.GetMaxSpanAge(),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pavolloffay why did we rename it? It's confusing that config params do not match the storage params.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we renamed anything in this PR

The flag name is --es.max-span-age and configuration cfg.GetMaxSpanAge(). MaxLookback is only used in reader maybe also writer.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should change everything to maxSpanAge

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can change it, there are also other refactoring opportunities

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants