-
Notifications
You must be signed in to change notification settings - Fork 101
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: Add an API to send multiple samples at once #235
feat: Add an API to send multiple samples at once #235
Conversation
recordDistributionValues is similar to recordDistributionValue but it lets the client sends multiple samples in one message using dogstatsd 1.1 protocol. Because this is a shift compared to how other methods are behaving, these methods are provided only in DirectStatsDClient which provides direct access to some low level dogstatsd protocol features. This is recommended in high performance cases were the overhead of the statsd library might be significant and the sampling is already done by the client. Port of DataDog/datadog-go#296.
|
||
@Override | ||
protected void writeValuesTo(StringBuilder builder) { | ||
for (long value : values) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What happens if there are more values than we have space in the payload for?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@iksaif how are you handling that on the go side? As far as I have seen there is no specific handling.
So in this case, I suppose we will throw an exception, but I'm not sure to see how it differs from someone sending a lot a tags for instance, are we bounding that? And it is explicitly stated in the javadoc that it is the responsibility of the caller to bound the number of values.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After discussing with @iksaif, I will port something similar to https://github.com/DataDog/datadog-go/blob/master/statsd/buffer.go#L75.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
e5d517f
to
15f60b4
Compare
When we send multiple samples at once, handle gracefully the case where the output payload is bigger than the max packet size by splitting the samples into several payloads.
15f60b4
to
55ebdc5
Compare
int previousLength = builder.length(); | ||
writeHeadMetadata(builder); | ||
writeTailMetadata(builder, containerID); | ||
metadataSize = builder.length() - previousLength; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This counts characters, while the capacity passed into writeTo
is in bytes, which can cause payload drops if the direct client is used with non-ascii data. Would it be possible to account for encoded length here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch! I have fixed that in 37f858d
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have again pushed the commit to use the target encoding:
final int previousEncodedLength = builder.toString().getBytes(StandardCharsets.UTF_8).length;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This code is allocating, if it is a blocker, I can intern this code from guava: https://github.com/google/guava/blob/master/guava/src/com/google/common/base/Utf8.java#L49
3fbd013
to
ec54735
Compare
@vickenty does the PR looks good to you now and can be merged?
|
2e4d9c3
to
0f0ee89
Compare
src/main/java/com/timgroup/statsd/NonBlockingDirectStatsDClient.java
Outdated
Show resolved
Hide resolved
To compute the available capacity in multi valued messages use the encoded length of the metadata instead of the length of the string builder as a character can be encoded on several bytes if not ascii.
0f0ee89
to
82eea75
Compare
writeValueTo(builder, i); | ||
if (builder.length() > maxLength) { | ||
builder.setLength(previousLength); | ||
offset += i; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I could be missing something, but this looks like it will skip values for chunks after second one: let's say we start with offset=3 and write 4 values at positions 3, 4, 5 and 6 before hitting overflow at i=7. This sets offset to 12 (3+7), loosing samples 7-11.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are right, I have updated the test and the implementation to fix that in d4f3cf8 as i
is tracking the absolute index in the array not the relative one compare to the offset.
When we need to split multi valued messages in more than two output messages to enforce the max packet size there is an issue in the way offset in the values array is tracked.
recordDistributionValues is similar to recordDistributionValue but it lets the client sends multiple samples in one message using dogstatsd 1.1 protocol.
Because this is a shift compared to how other methods are behaving, these methods are provided only in DirectStatsDClient which provides direct access to some low level dogstatsd protocol features.
This is recommended in high performance cases were the overhead of the statsd library might be significant and the sampling is already done by the client.
Port of DataDog/datadog-go#296.