-
Notifications
You must be signed in to change notification settings - Fork 129
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
generate error/fault metrics by aws sdk status code #924
Conversation
|
f0a872e
to
ab85418
Compare
@@ -56,6 +61,18 @@ private enum ExpectedStatusMetric { | |||
|
|||
private AwsSpanMetricsProcessor awsSpanMetricsProcessor; | |||
|
|||
static class ThrowableWithMethodGetStatusCode extends Throwable { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lets add a test covering a Throwable with a statusCode
method and a Throwable with neither statusCode
nor getStatusCode
methods.
return; | ||
httpStatusCode = getAwsStatusCode(spanData); | ||
|
||
if (httpStatusCode == null || httpStatusCode <= 0L) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The range of HTTP status codes is defined in https://datatracker.ietf.org/doc/html/rfc9110#name-status-codes and could be checked as well.
All valid status codes are within the range of 100 to 599, inclusive.
ab85418
to
527e00c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Hi @trask , all other approval requirements met, please take a look when you have a chance. |
Hi @jack-berg , all other approval requirements met, please take a look when you have a chance. |
@Nullable | ||
private static Long getAwsStatusCode(SpanData spanData) { | ||
InstrumentationScopeInfo scopeInfo = spanData.getInstrumentationScopeInfo(); | ||
if (scopeInfo == null) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is unnecessary as spanData.getInstrumentationScopeInfo() is not annotated with @Nullable
and does not return null.
} | ||
|
||
String scopeName = scopeInfo.getName(); | ||
if (scopeName == null || !scopeName.contains("aws-sdk")) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again, no null check required for scopeInfo.getName()
.
Throwable throwable = exceptionEvent.getException(); | ||
|
||
try { | ||
Method method = throwable.getClass().getMethod("getStatusCode", new Class<?>[] {}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code owners for this are @wangzlei and @srprash. Seems like a bad idea to do reflection on the hot path.
The javadoc for the method above mentions that this is a short term fix until this issue is resolved. I won't block this if the code owners want to proceed with this solution, but it would be nice to see a solution proposed to address the root of the problem.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
527e00c
to
352dbc5
Compare
Hi @trask , @jack-berg , could you please take a look when you have a chance. |
Description:
A short term solution to generate error/fault metrics by aws sdk status code.
When the AWS SDK calls an API then returns non-200 status code, the AWS SDK simply throws an exception. The exception thrown by the AWS SDK is stored within the produced spans and is accessible in the AwsSpanMetricsProcessor, where we generate Fault/Error metrics.
We attempt to get the throwable object and call its getStatusCode method to obtain the status code, then generate metrics based on the corresponding status code.
Existing Issue(s):
http.status_code attribute in span was not being populated when AWS APIs were returning non-200 status codes
Testing:
unit tests
Documentation:
N/A
Outstanding items:
N/A