-
Notifications
You must be signed in to change notification settings - Fork 245
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[main][stats beat] implement stats beat in application insights #2489
base: main
Are you sure you want to change the base?
Conversation
Current StatsBeat Logic Cons StatsBeat Data Handling |
_networkCounter.totalRequest++; | ||
_networkCounter.requestDuration += utcNow() - payloadData["statsBeatData"]["startTime"]; | ||
} | ||
let retryArray = [401, 403, 408, 429, 500, 503]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I followed the status code from https://github.com/aep-health-and-standards/Telemetry-Collection-Spec/blob/main/ApplicationInsights/statsbeat/statsbeat.md, I could update the code there and specify its for js use.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not specifically for js use, all languages should be doing so. Let's merge the spec change and then you can make the change here too.
} else if (throttleArray.includes(status)) { | ||
_networkCounter.throttle++; | ||
} else if (status !== 307 && status !== 308) { | ||
_networkCounter.failure[status] = (_networkCounter.failure[status] || 0) + 1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing logic for partial success 206
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in the new update, I am treating 206 neither a success or a failure (it would not be counted in any category except totalReuqeustCounter)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting, Python is treating 206 as success, I guess we did not explicitly specify what this is. @JacksonWeber @TimothyMothra @jeanbisutti do we know how each language treats 206 in terms of success/failure?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@lzchen we treat 206 as retriable and persist the failed envelopes to disk.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would expect a 206 to be treated as a success as it most likely means that breeze has either
- applied server side sampling
- As our data is going to be mixed in with theirs (the single transport) we may be getting throttled, so when we extend this to client side details, we would also want to treat this as success -- unless we get told which ones failed and we add code to "parse" out the dropped events
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To be a little more simplistic, I would expect any 200 (200-299) response to be treated as a success, as a 204 response (no content) is a valid response that we get from the 1DS Collector.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, it looks like Node.js is out of spec treating these as retriable. I'll create a work item.
|
||
|
||
|
||
_self.countException = (endpoint: string, exceptionType: string) => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not super clear on this logic, should this be only incremented on exception?
|
||
function _setupTimer() { | ||
if (!_timeoutHandle) { | ||
_timeoutHandle = scheduleTimeout(() => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Confirm that scheduleTimeout
returns ms
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, based on the comment in ts-utils
* @param timeout - The time, in milliseconds that the timer should wait before the specified
_cikey = ikey; | ||
_language = STATSBEAT_LANGUAGE; | ||
_os = STATSBEAT_TYPE; | ||
_runTimeVersion = STATSBEAT_TYPE; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a more useful value we can track for runtime version for web?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might be able to parse the userAgent to get this information. This might be good default information though if none can be retrieved from that header.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As this component is in the core we don't really have access to the existing code that parses out the user-agent, we do a bunch of stuff in the 1DS code and some additional in the OsPlugin.
We could look at enhancing this in a later release -- probably as part of adding the OsPlugin by default (which we are planning in a future release) - not sure of the timing now though.
"runtimeVersion": _runTimeVersion, | ||
"os": _os, | ||
"language": _language, | ||
"version": _sdkVersion, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the other sdks, we do not populate this with sdkVersion
but rather JUST the version of the SDK.
iKey: INSTRUMENTATION_KEY, | ||
name: name, | ||
baseData: { | ||
name: name, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this the correct way to build the TelemetryItem
? In Python we have type(TelemetryItem.baseData) = MetricsData
and type(MetricsData.metrics) = List[MetricDataPoint]
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, as we will do some envelop transform before we sent out telemetry, I will post some screenshot of how the telemetry will look like when go out of door here later
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
let _os: string; | ||
let _runTimeVersion: string; | ||
dynamicProto(Statsbeat, this, (_self, _base) => { | ||
_self.initialize = (core: IAppInsightsCore, ikey: string, endpoint: string, version?: string) => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lets update the signature to pass the values as a config object (iKey, endpoint, version) as I'm feeling that we might want to update this in the future.
We could also then make these dynamic, so we could "update" then for things like passing in the browser version etc. So basically follow the normal type of "onConfigChange()" with setting defaults -- the code "supports" defining this for internal types (not part of the global config object (IConfiguration)), but it may need some signature updates or we fudge it.
For now just create an interface with these 3 and pass them in (ie. don't treat as dynamic or use the onConfigChange)
@@ -2,8 +2,8 @@ | |||
"name": "Telemetry Viewer - M3", | |||
"short_name": "Telemetry Viewer M3", | |||
"description": "A browser extension that provides a real time view of what's happening in Application Insights including what telemetry is being logged by the web application", | |||
"version": "0.7.5", | |||
"version_name": "0.7.5", | |||
"version": "0.7.6", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changes to this file really shouldn't be in this PR.
@@ -283,6 +288,11 @@ export class Sender extends BaseTelemetryPlugin implements IChannelControls { | |||
} | |||
} | |||
|
|||
if (!config.disableStatsBeat && _statsBeat && !_statsBeat.isInitialized()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You need to "set" a default otherwise disableStatsBeat
will be undefined and not dynamic.
I'm also in 2 minds on this as we currently don't have any throttling in place, so once someone turns this one (by looking at the code or the tests), this will spam (overload) our statsbeat subscription with no option to stop it....
So we "might" want to check this code in with all of the "enable" commented out so there is zero chance of someone accidently enabling this. And then as part of the follow up PR to add the throttling we then add the config and support for "enabling" / "disabling".
Basically, we don't want someone turning this on a killing ALL of our statsbeat metrics. As the end-user (customers) subscription won't see any of this data.
No description provided.