Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for dropwizard format #2846

Merged
merged 6 commits into from
Jan 8, 2018
Merged

Conversation

atzoum
Copy link
Contributor

@atzoum atzoum commented May 23, 2017

Required for all PRs:

  • Signed CLA.
  • Associated README.md updated.
  • Has appropriate unit tests.

@atzoum
Copy link
Contributor Author

atzoum commented May 24, 2017

docker image available here

@nhaugo
Copy link
Contributor

nhaugo commented Jun 1, 2017

Can you please provide additional information as to what dropwizard is, including documentation.

@atzoum
Copy link
Contributor Author

atzoum commented Jun 1, 2017

Dropwizard metrics, previously known as codehale or yammer metrics, is a popular java library for capturing jvm and application level metrics. Apache Kafka for instance is a well known open source project which is internally using the dropwizard (yammer at the time) library to collect and report its own metrics.
Dropwizard has support for a number of different metric types such as gauges, meters, histograms, etc. and provides a set of ready-to-use metric reporters including the well-known jmx reporter.
However, dropwizard metrics also provide a "standard" json format for representing its metrics registry and this is what this pull request is all about:
The implemented telegraf dropwizard format is able to parse a full dropwizard metric registry represented in the "standard" json format at once (eg. using a kafka or http input) in a much more efficient manner than scraping custom jmx attributes through telegraf's jmx input plugin.
Finally, it is worth noting that the current implementation is already used in our production environment for consuming dropwizard metrics stored in json format from kafka topics with much success.

@danielnelson
Copy link
Contributor

@atzoum I have some config changes I'm working on that will improve the way we do Parsers/Serializers and allow us to add new ones more easily, without modifying the code in internal. Lets revisit this once this PR once I finish that.

@atzoum
Copy link
Contributor Author

atzoum commented Jun 8, 2017

@danielnelson no problem, do you have a PR already that I can track?

@danielnelson
Copy link
Contributor

No code yet, still in early stages. The issue for it is #272, I'll try to remember to update here as well otherwise ping me again in about a month.

@danielnelson danielnelson added feat Improvement on an existing feature such as adding a new setting/mode to an existing plugin and removed triage labels Sep 13, 2017
@danielnelson
Copy link
Contributor

@atzoum Thanks for your patience, due to how long it's taking me to finish the config changes I think we should go ahead and work on getting this included. I noticed that #2369 and #3292 could both make use of this Parser, in addition to it's use with inputs supporting data_format. Before I look over the code though, I'd like to propose a few changes to the schema.

One issue with the hierarchical measurements dropwizard uses is that they can be difficult to query because all of the information is embedded in the measurement name. We should try to normalize information out of the measurement name into other areas of the metric to simplify queries.

To start off with I would move the metric type into a tag:

- counter.measurement,tag1=green,tag2=yellow count=1 1487766783662000000
+ measurement,tag1=green,tag2=yellow,metric_type=counter count=1 1487766783662000000

Another change would be similar to the graphite parser
where we use a template system to translate the metrics. I believe we could use this with few modifications. This would allow us to make translations like:

- vm.memory.pools.Compressed-Class-Space.max value=0
- vm.memory.pools.Compressed-Class-Space.used value=1
+ vm,memory_pool=Compressed-Class-Space max=0 used=1

@atzoum
Copy link
Contributor Author

atzoum commented Oct 4, 2017

@danielnelson indeed moving the metric type into a tag is an improvement and I can do this asap.

With regards to using a template system similar to the one that graphite parser is using, I am afraid I don't understand how this should work in practice since in dropwizard, with the exception of counters and gauges, all metric types have multiple fields. Thus, I don't think that we can really aggregate multiple dropwizard measurements into a single telegraf/influx measurement. I guess we could benefit from a templating mechanism for extracting tags from dropwizard's measurement names, however the current behavior described in the documentation seems to at least cover all the use cases that we have used it for

Tags are parsed from metric names as if they were actual influxdb line protocol keys (measurement<,tag_set>)

@danielnelson
Copy link
Contributor

So, if I recall correctly, all the segments that you classify as field are concatenated together using the separator and if there are multiple named fields they are appended as well.

separator = "_"
templates = [
    "measurement.measurement.field.field"
]
{
    "metrics": {
        "counters": {
            "jenkins.job.building.duration": {
                "max": 75.0,
                "mean": 42.0
            }
        }
    }
}
jenkins_job,metric_type=timer building_duration_mean=42.0 building_duration_max=75.0

I missed the "measurement,tag1=green" style format on my first read through. This is a pretty nice way to encode tags and is similar to our extensions to statsd format, but I think this style is probably very rare when working with existing services where you don't have full control over the measurement names.

@ctrlok
Copy link
Contributor

ctrlok commented Oct 5, 2017

I'm not sure that we can do measurement like this from dropwizzard, because "jenkins.job.building.duration" is a full metric name and can be anything like just "duration" or "pokemons". So, from dropwizzard you can only set full metric name as measurement. I.E.

jenkins_job_building_duration,metric_type=timer mean=42.0 max=75.0

BTW, I can do what you want in my Jenkins plugin without using dropwizzard input parser.

@atzoum
Copy link
Contributor Author

atzoum commented Oct 5, 2017

@ctrlok is correct, field names are readily available in the dropwizard format, I don't see any reason why we should do anything special in order to parse them from the measurement name.
The only place where a template could be valuable would be to extract tags from the measurement name, if measurement names don't already adhere to the default line protocol format.

@danielnelson
Copy link
Contributor

I agree that you can't apply a complex template in general, but that's why the template is an option that can be specialized for particular services. The default template would be "measurement*" which would give the metric like above.

My Jenkins example is bad perhaps, but with the Jenkins input wouldn't you rather, at the very least, have the data categorized into measurement names such as jenkins, http, and vm just for organization? I worry that have so many measurement names will make using the measurements difficult especially in GUIs like Chronograf or Grafana.

jenkins,metric=job_building_duration,metric_type=timer mean=42.0 max=75.0

In addition to organization, if a user can converge multiple counters and gauges into a fewer series it will decrease the cost of storage and memory use on the database and you can only use functions and operators easily with fields on the same series.

Here is an example in the Jenkins input where several series could be reduced easily:

vm.memory.heap.committed.value
vm.memory.heap.init.value
vm.memory.heap.max.value
vm.memory.heap.usage.value
vm.memory.heap.used.value
vm.memory.non-heap.committed.value
vm.memory.non-heap.init.value
vm.memory.non-heap.max.value
vm.memory.non-heap.usage.value
vm.memory.non-heap.used.value
vm_memory,pool=heap committed=1,init=2,max=3,usage=4,used=5
vm_memory,pool=non-heap committed=1,init=2,max=3,usage=4,used=5

@atzoum
Copy link
Contributor Author

atzoum commented Oct 6, 2017

ok, I haven't seen the templating code in graphite yet, do we want to reuse, is this easy, or simply duplicate and modify appropriately?

@danielnelson
Copy link
Contributor

Honestly, it is probably not trivial. The code is in plugins/parsers/graphite/parser.go, perhaps we can extract the matcher and template types into internal/templates.go and share it. Might also want to look at plugins/inputs/statsd/statsd.go.

@grange74
Copy link
Contributor

grange74 commented Dec 1, 2017

Sometimes a Dropwizard app returns a metric float value which is a very small exponential number (e.g 5.647645996854652E-23). It is also discussed here dropwizard/metrics#1106.

It would be nice if it would possible to format the float values (or atleast be able to specify the number of decimal places).

@danielnelson
Copy link
Contributor

@grange74 This is annoying, and I know with the line protocol serializer we currently will output it in non scientific notation which can result in very long lines. I think this should probably be the job of the output plugin to handle. What outputs/data_formats are you using where this is a problem?

@grange74
Copy link
Contributor

grange74 commented Dec 4, 2017

@danielnelson we are only currently using the influxdb output so we aren't getting any problems like the cloudwatch ones referenced in dropwizard/metrics#1106.

It would just be nice to be able to format the float values to reduce the payloads that we send to influxdb as in most cases we only need 3 decimal places instead of the 15-18 that are often returned.

Here is an example from a basic dropwizard app:

meters" : {
    "ch.qos.logback.core.Appender.all" : {
      "count" : 16,
      "m15_rate" : 3.112334326772315,
      "m1_rate" : 2.109570016641421,
      "m5_rate" : 2.9441421268138344,
      "mean_rate" : 0.5247118152721372,
      "units" : "events/second"
    }
}

"timers" : {
    "io.dropwizard.jetty.MutableServletContextHandler.get-requests" : {
      "count" : 2,
      "max" : 0.10200000000000001,
      "mean" : 0.059500000000000004,
      "min" : 0.017,
      "p50" : 0.10200000000000001,
      "p75" : 0.10200000000000001,
      "p95" : 0.10200000000000001,
      "p98" : 0.10200000000000001,
      "p99" : 0.10200000000000001,
      "p999" : 0.10200000000000001,
      "stddev" : 0.0425,
      "m15_rate" : 0.002191574188921952,
      "m1_rate" : 0.027072376727683665,
      "m5_rate" : 0.006394670392516757,
      "mean_rate" : 0.0677765092949584,
      "duration_units" : "seconds",
      "rate_units" : "calls/second"
    }
}

@danielnelson
Copy link
Contributor

We could add an option to control the max precision of the line protocol serializer, but I'm not sure if it is useful enough to warrant extending the interface. What would be interesting check is if we took a batch of 1000 average lines, and compared the size with with 3 digits of precision vs 15. We should compare the size with gzip compression (make sure you are using this if you are concerned about payload size).

@atzoum atzoum force-pushed the dropwizard branch 5 times, most recently from d5bddb4 to 5f54499 Compare January 4, 2018 17:32
@atzoum
Copy link
Contributor Author

atzoum commented Jan 4, 2018

@danielnelson I managed to find some time to work on the agreed changes:

  • Move the metric type into a tag.
  • Extract the template matching engine from Graphite and move it to internal/templating.
  • Reuse the extracted templating engine in Dropwizard parser.
  • Update the documentation accordingly.

Please review

@atzoum atzoum force-pushed the dropwizard branch 2 times, most recently from e221c75 to a173866 Compare January 5, 2018 09:11
.gitignore Outdated
@@ -5,3 +5,4 @@ tivan
.idea
*~
*#
.vscode
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you remove this, you could put it into a ~/.gitignore file and add an excludesfile to your ~/.gitconfig. I'm going to remove some of the other entries in here that don't belong.

if timeFormat == "" {
timeFormat = time.RFC3339
}
time, err := time.Parse(timeFormat, gjson.GetBytes(buf, p.TimePath).String())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it should be an error if the time path is not found, or if the time is wrong format.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

var jsonOut map[string]interface{}
err := json.Unmarshal(registryBytes, &jsonOut)
if err != nil {
err = fmt.Errorf("unable to parse dropwizard metric registry from JSON document, %s", err)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would like to see a more precise error message if the path is not found.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

dropwizard_time_format = "2006-01-02T15:04:05Z07:00"
dropwizard_tags_path = "tags"
## tag paths per tag are supported too, eg.
#[inputs.yourinput.dropwizard_tag_paths]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it common for tags to be distributed throughout the document? We may want to remove this functionality and use the measurement filtering options (taginclude/tagexclude) if all we need to filter the tags.

Copy link
Contributor Author

@atzoum atzoum Jan 6, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, this is not about filtering tags. The purpose of this optional configuration option is to allow the user to add tags which are not included in the measurement name but are contained in another property within the same json file.
I have an example in test case TestParseValidEmbeddedCounterJSON where the following json contains both arbitraty tags and a dropwizard metric registry like below

{
	"time" : "2017-02-22T14:33:03.662+02:00",
	"tags" : {
		"tag1" : "green",
		"tag2" : "yellow"
	},
	"metrics" : {
		"counters" : 	{ 
			"measurement" : {
				"count" : 1
			}
		},
		"meters" : 		{},
		"gauges" : 		{},
		"histograms" : 	{},
		"timers" : 		{}
	}
}

We are already using this functionality in our prod environment and it is necessary for our use cases. I don't expect it to be frequently used by other users, however it is a nice addition and I don't see any harm in keeping it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This example could be done with dropwizard_tags_path = "tags", do you have an example where you need dropwizard_tag_paths?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lets say we want the value inside tag1 but want to rename the tag to mytag:

[inputs.yourinput.dropwizard_tag_paths]
  mytag = "tags.tag1"

The above config would add a tag mytag=green in all measurements

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you see it being used outside of renaming? Is this something you are using right now?

As an aside, we will probably want to have a general purpose renaming method in the future, probably as a processor plugin, but no ETA on that.

Copy link
Contributor Author

@atzoum atzoum Jan 8, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given that filtering can be done already through taginclude and tagexclude, renaming is the only use case for it. Yes we are using this on two occasions to consolidate metrics from different sources (produced by different teams).
Indeed, tag renaming is a cross-cutting concern.

@danielnelson danielnelson added this to the 1.6.0 milestone Jan 5, 2018
@danielnelson danielnelson merged commit 317de40 into influxdata:master Jan 8, 2018
@atzoum atzoum deleted the dropwizard branch January 9, 2018 07:22
maxunt pushed a commit that referenced this pull request Jun 26, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feat Improvement on an existing feature such as adding a new setting/mode to an existing plugin
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants