Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: changes to metadata.cloud.project.* fields collected for GCP #3618

Merged
merged 1 commit into from
Sep 13, 2023

Conversation

trentm
Copy link
Member

@trentm trentm commented Sep 8, 2023

  • cloud.project.id per the shared APM spec change
  • fix the bignum parsing of instance.id (we were losing precision before
    this change)
  • add instance.name (missed in the original impl?)

Closes: #3614

- cloud.project.id per the shared APM spec change
- fix the bignum parsing of instance.id (we were losing precision before
  this change)
- add instance.name (missed in the original impl?)

Closes: #3614
@trentm trentm self-assigned this Sep 8, 2023
@github-actions github-actions bot added the agent-nodejs Make available for APM Agents project planning. label Sep 8, 2023
const URL = require('url').URL;
const JSONBigInt = require('json-bigint');
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewer note: This is the same lib that OTel is indirectly using. They use the Google-provided "gcp-metadata" package. I didn't want to use that because it adds yet another HTTP lib that we don't particularly need (gaxios, a wrapper around node-fetch that provides an interface like axios).

The "bignumber.js" dep (used by json-bigint) adds about 1MB to the install size, I think, which is unfortunate. We could get into shenanigans like using a regex to parse out the "id". :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Checking the metadata docs and also the JSON fixture you added I think we can be sure that there are only two numeric values in the response: numericProjectId and id (within instance)

{
  "instance": {
    "attributes": {},
    "id": 5737554347302044216,
    ...
  },
  "project": {
    "attributes": {
      ...
    },
    "numericProjectId": 523926462582,
    "projectId": "acme-eng"
    ...
  }
}

Since it is so specific I think we can opt-in to a custom solution to capture that info. You mentioned RegExp and maybe is the way to go using capture groups with String.match. We can capture the numeric values with something similar to the snippet below and use JSON.parse for the rest

const respBody = '...' // The response text with all the metadata

const instanceIdRegExp = /"id":(\d+),/;
const projectIdRegExp = /"numericProjectId":(\d+),/;

const instanceIdMatch = respBody.match(instanceIdRegExp)
const projectIdMatch = respBody.match(projectIdRegExp)

// The match is already an String so no need to convert
const instanceId = instanceIdMatch && instanceMatch[1];
const projectId = projectIdMatch && projectIdMatch[1];

NOTE: regexes should be reviewed since they do not cover the case of being the last property of the object

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wasn't sure if numeric attributes can be added. E.g. could a user have a project.attributes.id: <some numeric>. I think regex matching would still be fine... we'd just assume it is the first "id":... match.

Playing with this a little bit, I added metadata with gcloud compute project-info --project "[REDACTED]" add-metadata --metadata=trentmfoo=42 and it at least defaults to setting metadata values as a string:

      {
        "key": "trentmfoo",
        "value": "42"
      }

So I think it would be fine.

@david-luna Do you have a preference either way? (Regex or using json-bigint?)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

David and I discussed. The added "bignumber.js" dep is a bit of size, but still relatively low compared to a number of other deps, so we think it is better to have the clearer implementation and not get into possibly brittle regex usage just to save some install size.

// Note: We could also guard on the response having the
// 'Metadata-Flavor: Google' header as done by:
// https://github.com/googleapis/gcp-metadata/blob/v6.0.0/src/index.ts#L109-L112

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewer note: Add a sanity guard on the response from the Google metadata API. If it isn't a 200, we shouldn't get into parsing it. This is my justification for dropping some of the (unnecessary, IMO) defensive coding and testing cases.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is my justification for dropping some of the (unnecessary, IMO) defensive coding and testing cases.

I agree that we can assume safely that a 200 status comes with the proper metadata described in their docs. Also it simplifies the code so IMO this is a good change :)

instance: {
id: null,
id: data.instance.id.toString(), // We expect this to be a BigInt.
name: data.instance.name,
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewer note: This function has been simplified (IMO). The instance.name had been missed before (or perhaps the APM spec changed after original impl?). The data.instance.id is the interesting BigInt case.

guestAttributes: {},
hostname:
'username-temp-delete-me-cloud-metadata.c.elastic-apm.internal',
id: 7684572792595385000,
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewer note: I've changed this test fixture to use an unparsed actual response on a test GCP VM. The problem with having an already parsed object, is that it accidentally included the BigInt loss of precision. That 7684572792595385000 is wrong. The actual instance.is was some 7684572792595385??? value with something other than zeros. It was just not noticed originally.

@@ -254,28 +254,6 @@ tape('cloud metadata: aws empty data', function (t) {
});
});

tape('cloud metadata: gcp empty data', function (t) {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewer note: I've dropped a couple unnecessarily defensive test cases, IMHO. See the note above about the statusCode==200 guard.

// Note: We could also guard on the response having the
// 'Metadata-Flavor: Google' header as done by:
// https://github.com/googleapis/gcp-metadata/blob/v6.0.0/src/index.ts#L109-L112

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is my justification for dropping some of the (unnecessary, IMO) defensive coding and testing cases.

I agree that we can assume safely that a 200 status comes with the proper metadata described in their docs. Also it simplifies the code so IMO this is a good change :)

const URL = require('url').URL;
const JSONBigInt = require('json-bigint');
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Checking the metadata docs and also the JSON fixture you added I think we can be sure that there are only two numeric values in the response: numericProjectId and id (within instance)

{
  "instance": {
    "attributes": {},
    "id": 5737554347302044216,
    ...
  },
  "project": {
    "attributes": {
      ...
    },
    "numericProjectId": 523926462582,
    "projectId": "acme-eng"
    ...
  }
}

Since it is so specific I think we can opt-in to a custom solution to capture that info. You mentioned RegExp and maybe is the way to go using capture groups with String.match. We can capture the numeric values with something similar to the snippet below and use JSON.parse for the rest

const respBody = '...' // The response text with all the metadata

const instanceIdRegExp = /"id":(\d+),/;
const projectIdRegExp = /"numericProjectId":(\d+),/;

const instanceIdMatch = respBody.match(instanceIdRegExp)
const projectIdMatch = respBody.match(projectIdRegExp)

// The match is already an String so no need to convert
const instanceId = instanceIdMatch && instanceMatch[1];
const projectId = projectIdMatch && projectIdMatch[1];

NOTE: regexes should be reviewed since they do not cover the case of being the last property of the object

@trentm trentm merged commit e07d826 into main Sep 13, 2023
21 checks passed
@trentm trentm deleted the trentm/gcp-metadata-cloud-project-id branch September 13, 2023 21:15
fpm-peter pushed a commit to fpm-git/apm-agent-nodejs that referenced this pull request Aug 20, 2024
…astic#3618)

- cloud.project.id per the shared APM spec change
- fix the bignum parsing of instance.id (we were losing precision before
  this change)
- add instance.name (missed in the original impl)

Closes: elastic#3614
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
agent-nodejs Make available for APM Agents project planning.
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

[META 826] changes to metadata.cloud.project.* fields collected for GCP
2 participants