Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(models): filter out reserved tag keys in points parser #16412

Merged
merged 2 commits into from
Jan 24, 2020

Conversation

GeorgeMac
Copy link
Contributor

@GeorgeMac GeorgeMac commented Jan 6, 2020

This is an improvement to the validation done when parsing points. Certain tag keys are reserved and should not be used. This adds a validation step to the points parsing in the models package.
This step throws an error whenever any of the following tag keys are used:
"\xff"
"\x00"
"_measurement"
"_field"
"time"

Before this change was introduced the parsing benchmarks looked like this:

goos: darwin
goarch: amd64
pkg: github.com/influxdata/influxdb/models
BenchmarkParsePointsTagsSorted2-16       	 2000000	       664 ns/op	  76.76 MB/s
BenchmarkParsePointsTagsSorted5-16       	 2000000	       855 ns/op	  97.07 MB/s
BenchmarkParsePointsTagsSorted10-16      	 1000000	      1206 ns/op	 118.54 MB/s
BenchmarkParsePointsTagsUnSorted2-16     	 2000000	       789 ns/op	  64.57 MB/s
BenchmarkParsePointsTagsUnSorted5-16     	 1000000	      1117 ns/op	  74.26 MB/s
BenchmarkParsePointsTagsUnSorted10-16    	 1000000	      1838 ns/op	  77.78 MB/s
PASS
ok  	github.com/influxdata/influxdb/models	11.309s

Afterwards they look like this:

goos: darwin
goarch: amd64
pkg: github.com/influxdata/influxdb/models
BenchmarkParsePointsTagsSorted2-16       	 2000000	       745 ns/op	  68.38 MB/s
BenchmarkParsePointsTagsSorted5-16       	 2000000	       993 ns/op	  83.52 MB/s
BenchmarkParsePointsTagsSorted10-16      	 1000000	      1475 ns/op	  96.94 MB/s
BenchmarkParsePointsTagsUnSorted2-16     	 2000000	       874 ns/op	  58.33 MB/s
BenchmarkParsePointsTagsUnSorted5-16     	 1000000	      1243 ns/op	  66.73 MB/s
BenchmarkParsePointsTagsUnSorted10-16    	 1000000	      2037 ns/op	  70.18 MB/s
PASS
ok  	github.com/influxdata/influxdb/models	12.796s

So there is some slow down (roughly 10 - 15% slower). Any suggestions for improvements welcomed.

  • CHANGELOG.md updated with a link to the PR (not the Issue)
  • Well-formatted commit messages
  • Rebased/mergeable
  • Tests pass
  • http/swagger.yml updated (if modified Go structs or API)
  • Documentation updated or issue created (provide link to issue/pr)
  • Signed CLA (if not already signed)

@GeorgeMac GeorgeMac marked this pull request as ready for review January 7, 2020 15:00
Copy link
Contributor

@stuartcarnie stuartcarnie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@GeorgeMac this looks good to me 👍

@GeorgeMac
Copy link
Contributor Author

@stuartcarnie Is there anything we need to do or be prepared for in order to roll this out safely?

Copy link
Contributor

@e-dard e-dard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@GeorgeMac this seems like a pretty reasonable change, and the slowdown is to be expected. I'm not overly concerned about it since parsing is never really the ingest bottleneck.

It might be possible to eek a few more ns out by moving the checks into scanTagsKey and perhaps do them inline, but I wouldn't worry about that now.

One other suggestion: benchstat is good for statistically comparing results: $ go get golang.org/x/perf/cmd/benchstat.

Something like:

$ go -benchmem -count 10 -run=^$ github.com/influxdata/influxdb/models -bench ^BenchmarkParsePointsTags > /tmp/master.txt
$ go -benchmem -count 10 -run=^$ github.com/influxdata/influxdb/models -bench ^BenchmarkParsePointsTags > /tmp/gm.txt
$
$ benchstat /tmp/master.txt /tmp/gm.txt
name                         old time/op    new time/op    delta
ParsePointsTagsSorted2-8        704ns ± 1%     720ns ± 1%   +2.24%  (p=0.000 n=9+8)
ParsePointsTagsSorted5-8        893ns ± 0%     979ns ± 1%   +9.64%  (p=0.000 n=10+10)
ParsePointsTagsSorted10-8      1.21µs ± 1%    1.49µs ± 0%  +22.98%  (p=0.000 n=10+7)
ParsePointsTagsUnSorted2-8      817ns ± 1%     833ns ± 1%   +1.96%  (p=0.000 n=10+9)
ParsePointsTagsUnSorted5-8     1.12µs ± 0%    1.20µs ± 2%   +7.27%  (p=0.000 n=9+10)
ParsePointsTagsUnSorted10-8    1.86µs ± 1%    2.02µs ± 1%   +8.72%  (p=0.000 n=10+9)

name                         old speed      new speed      delta
ParsePointsTagsSorted2-8     72.3MB/s ± 0%  70.8MB/s ± 1%   -2.12%  (p=0.000 n=8+8)
ParsePointsTagsSorted5-8     92.9MB/s ± 0%  84.7MB/s ± 1%   -8.80%  (p=0.000 n=10+10)
ParsePointsTagsSorted10-8     118MB/s ± 1%    96MB/s ± 0%  -18.68%  (p=0.000 n=10+7)
ParsePointsTagsUnSorted2-8   62.4MB/s ± 1%  61.1MB/s ± 1%   -1.95%  (p=0.000 n=10+9)
ParsePointsTagsUnSorted5-8   73.8MB/s ± 1%  68.9MB/s ± 2%   -6.66%  (p=0.000 n=10+10)
ParsePointsTagsUnSorted10-8  77.0MB/s ± 1%  70.8MB/s ± 1%   -8.02%  (p=0.000 n=10+9)

name                         old alloc/op   new alloc/op   delta
ParsePointsTagsSorted2-8         368B ± 0%      368B ± 0%     ~     (all equal)
ParsePointsTagsSorted5-8         432B ± 0%      432B ± 0%     ~     (all equal)
ParsePointsTagsSorted10-8        544B ± 0%      544B ± 0%     ~     (all equal)
ParsePointsTagsUnSorted2-8       400B ± 0%      400B ± 0%     ~     (all equal)
ParsePointsTagsUnSorted5-8       496B ± 0%      496B ± 0%     ~     (all equal)
ParsePointsTagsUnSorted10-8      672B ± 0%      672B ± 0%     ~     (all equal)

name                         old allocs/op  new allocs/op  delta
ParsePointsTagsSorted2-8         4.00 ± 0%      4.00 ± 0%     ~     (all equal)
ParsePointsTagsSorted5-8         4.00 ± 0%      4.00 ± 0%     ~     (all equal)
ParsePointsTagsSorted10-8        4.00 ± 0%      4.00 ± 0%     ~     (all equal)
ParsePointsTagsUnSorted2-8       5.00 ± 0%      5.00 ± 0%     ~     (all equal)
ParsePointsTagsUnSorted5-8       5.00 ± 0%      5.00 ± 0%     ~     (all equal)
ParsePointsTagsUnSorted10-8      5.00 ± 0%      5.00 ± 0%     ~     (all equal)

@e-dard
Copy link
Contributor

e-dard commented Jan 8, 2020

@GeorgeMac in terms of preparing for a deployment - we probably need to think about the fact this may break existing users or clients. Please check with @bpalani and explain the impact of this change.

@GeorgeMac
Copy link
Contributor Author

GeorgeMac commented Jan 8, 2020

neat will crack out benchstat! Cheers!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants