-
Notifications
You must be signed in to change notification settings - Fork 225
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: changes to metadata.cloud.project.* fields collected for GCP #3618
Conversation
- cloud.project.id per the shared APM spec change - fix the bignum parsing of instance.id (we were losing precision before this change) - add instance.name (missed in the original impl?) Closes: #3614
const URL = require('url').URL; | ||
const JSONBigInt = require('json-bigint'); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewer note: This is the same lib that OTel is indirectly using. They use the Google-provided "gcp-metadata" package. I didn't want to use that because it adds yet another HTTP lib that we don't particularly need (gaxios, a wrapper around node-fetch that provides an interface like axios
).
The "bignumber.js" dep (used by json-bigint
) adds about 1MB to the install size, I think, which is unfortunate. We could get into shenanigans like using a regex to parse out the "id". :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Checking the metadata docs and also the JSON fixture you added I think we can be sure that there are only two numeric values in the response: numericProjectId
and id
(within instance)
{
"instance": {
"attributes": {},
"id": 5737554347302044216,
...
},
"project": {
"attributes": {
...
},
"numericProjectId": 523926462582,
"projectId": "acme-eng"
...
}
}
Since it is so specific I think we can opt-in to a custom solution to capture that info. You mentioned RegExp
and maybe is the way to go using capture groups with String.match
. We can capture the numeric values with something similar to the snippet below and use JSON.parse
for the rest
const respBody = '...' // The response text with all the metadata
const instanceIdRegExp = /"id":(\d+),/;
const projectIdRegExp = /"numericProjectId":(\d+),/;
const instanceIdMatch = respBody.match(instanceIdRegExp)
const projectIdMatch = respBody.match(projectIdRegExp)
// The match is already an String so no need to convert
const instanceId = instanceIdMatch && instanceMatch[1];
const projectId = projectIdMatch && projectIdMatch[1];
NOTE: regexes should be reviewed since they do not cover the case of being the last property of the object
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wasn't sure if numeric attributes can be added. E.g. could a user have a project.attributes.id: <some numeric>
. I think regex matching would still be fine... we'd just assume it is the first "id":...
match.
Playing with this a little bit, I added metadata with gcloud compute project-info --project "[REDACTED]" add-metadata --metadata=trentmfoo=42
and it at least defaults to setting metadata values as a string:
{
"key": "trentmfoo",
"value": "42"
}
So I think it would be fine.
@david-luna Do you have a preference either way? (Regex or using json-bigint?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
David and I discussed. The added "bignumber.js" dep is a bit of size, but still relatively low compared to a number of other deps, so we think it is better to have the clearer implementation and not get into possibly brittle regex usage just to save some install size.
// Note: We could also guard on the response having the | ||
// 'Metadata-Flavor: Google' header as done by: | ||
// https://github.com/googleapis/gcp-metadata/blob/v6.0.0/src/index.ts#L109-L112 | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewer note: Add a sanity guard on the response from the Google metadata API. If it isn't a 200, we shouldn't get into parsing it. This is my justification for dropping some of the (unnecessary, IMO) defensive coding and testing cases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is my justification for dropping some of the (unnecessary, IMO) defensive coding and testing cases.
I agree that we can assume safely that a 200 status comes with the proper metadata described in their docs. Also it simplifies the code so IMO this is a good change :)
instance: { | ||
id: null, | ||
id: data.instance.id.toString(), // We expect this to be a BigInt. | ||
name: data.instance.name, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewer note: This function has been simplified (IMO). The instance.name
had been missed before (or perhaps the APM spec changed after original impl?). The data.instance.id
is the interesting BigInt case.
guestAttributes: {}, | ||
hostname: | ||
'username-temp-delete-me-cloud-metadata.c.elastic-apm.internal', | ||
id: 7684572792595385000, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewer note: I've changed this test fixture to use an unparsed actual response on a test GCP VM. The problem with having an already parsed object, is that it accidentally included the BigInt loss of precision. That 7684572792595385000
is wrong. The actual instance.is was some 7684572792595385???
value with something other than zeros. It was just not noticed originally.
@@ -254,28 +254,6 @@ tape('cloud metadata: aws empty data', function (t) { | |||
}); | |||
}); | |||
|
|||
tape('cloud metadata: gcp empty data', function (t) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewer note: I've dropped a couple unnecessarily defensive test cases, IMHO. See the note above about the statusCode==200 guard.
// Note: We could also guard on the response having the | ||
// 'Metadata-Flavor: Google' header as done by: | ||
// https://github.com/googleapis/gcp-metadata/blob/v6.0.0/src/index.ts#L109-L112 | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is my justification for dropping some of the (unnecessary, IMO) defensive coding and testing cases.
I agree that we can assume safely that a 200 status comes with the proper metadata described in their docs. Also it simplifies the code so IMO this is a good change :)
const URL = require('url').URL; | ||
const JSONBigInt = require('json-bigint'); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Checking the metadata docs and also the JSON fixture you added I think we can be sure that there are only two numeric values in the response: numericProjectId
and id
(within instance)
{
"instance": {
"attributes": {},
"id": 5737554347302044216,
...
},
"project": {
"attributes": {
...
},
"numericProjectId": 523926462582,
"projectId": "acme-eng"
...
}
}
Since it is so specific I think we can opt-in to a custom solution to capture that info. You mentioned RegExp
and maybe is the way to go using capture groups with String.match
. We can capture the numeric values with something similar to the snippet below and use JSON.parse
for the rest
const respBody = '...' // The response text with all the metadata
const instanceIdRegExp = /"id":(\d+),/;
const projectIdRegExp = /"numericProjectId":(\d+),/;
const instanceIdMatch = respBody.match(instanceIdRegExp)
const projectIdMatch = respBody.match(projectIdRegExp)
// The match is already an String so no need to convert
const instanceId = instanceIdMatch && instanceMatch[1];
const projectId = projectIdMatch && projectIdMatch[1];
NOTE: regexes should be reviewed since they do not cover the case of being the last property of the object
…astic#3618) - cloud.project.id per the shared APM spec change - fix the bignum parsing of instance.id (we were losing precision before this change) - add instance.name (missed in the original impl) Closes: elastic#3614
this change)
Closes: #3614