Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

module: Add SourceMap.findOrigin #47790

Merged
merged 1 commit into from
Jun 23, 2023

Conversation

isaacs
Copy link
Contributor

@isaacs isaacs commented Apr 30, 2023

This adds the SourceMap.findOrigin(lineNumber, columnNumber) method, for finding the origin source file and 1-indexed line and column numbers corresponding to the 1-indexed line and column numbers from a call site in generated source code.

Fix: #47770

@nodejs-github-bot
Copy link
Collaborator

Review requested:

  • @nodejs/loaders
  • @nodejs/modules

@nodejs-github-bot nodejs-github-bot added the needs-ci PRs that need a full CI run. label Apr 30, 2023
* @return {?Array}
* @param {number} 0-indexed line offset in compiled resource
* @param {number} 0-indexed column offset in compiled resource
* @return {object} representing start of range if found, or empty object
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The object type doesn't say much so I think it's better to declare an object with the properties that will/might be returned.

For example:

/**
 * @returns {{
 *   foo: string;
 *   bar?: number;
 * }}
 */

/**
* @param {number} 1-indexed line number in compiled resource call site
* @param {number} 1-indexed column number in compiled resource call site
* @return {object} representing origin call site if found, or empty object
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto.

Copy link
Member

@benjamingr benjamingr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like a good start, looks like there are some lint errors

This adds the `SourceMap.findOrigin(lineNumber, columnNumber)` method,
for finding the origin source file and 1-indexed line and column numbers
corresponding to the 1-indexed line and column numbers from a call site
in generated source code.

Fix: nodejs#47770
@isaacs isaacs force-pushed the isaacs/source-map-find-origin branch from 9f3cf34 to d4c62c0 Compare May 1, 2023 04:20
@isaacs
Copy link
Contributor Author

isaacs commented May 1, 2023

Fixed the lint errors.

@VoltrexKeyva I followed the same pattern used throughout the file. (Eg, I didn't actually edit the annotations for findEntry except to add that it is 0-indexed line/column offset, rather than a 1-indexed line/column number.)

I agree that object is not particularly informative, but I'm not familiar with the format used, and since this is not making it worse, I don't think it should be a reason to block merging.

An effort to improve the type annotations throughout this API would certainly be worthwhile. If anyone can provide guidance as to what those annotations should look like, then I have no objection, of course.

Copy link
Member

@benjamingr benjamingr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM (I'm comfortable with improvements going in other PRs)

Comment on lines +215 to +224
* generatedLine: {number} The line offset of the start of the
range in the generated source
* generatedColumn: {number} The column offset of start of the
range in the generated source
* originalSource: {string} The file name of the original source,
as reported in the SourceMap
* originalLine: {number} The line offset of the start of the
range in the original source
* originalColumn: {number} The column offset of start of the
range in the original source
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The column limit is 130 now, I think. You could use more space to make this more readable.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it makes the most sense to follow the conventions in the file when adding new content. It's a bit weird to have some of the docs wrapped at 130, and others at 80, no? And even worse to have a commit that touches the whole file when only a single section was added/edited.

Better to reformat markdown as a subsequent PR, or better yet, reformat all the markdown in the project in one atomic commit that does nothing else.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it’s more readable when wrapped at 80 chars, at least it is on the GitHub web UI because they wrap suggestions at 80 chars 🤷‍♂️

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Clearly, what we need is more input on this important matter. 😅

@benjamingr benjamingr added the request-ci Add this label to start a Jenkins CI on a PR. label May 1, 2023
@github-actions github-actions bot removed the request-ci Add this label to start a Jenkins CI on a PR. label May 1, 2023
@nodejs-github-bot
Copy link
Collaborator

return {};
}
const lineOffset = lineNumber - range.generatedLine;
const columnOffset = columnNumber - range.generatedColumn;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried this PR out locally and noticed that if you create a source map from an existing coverage report some inputs return negative column numbers. Is this function expected to work with the source map data generated via NODE_V8_COVERAGE?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh that's weird. No, the column number should be at minimum 1. Can you share the steps you used to get to a negative column number? Maybe you're passing it offsets instead of column numbers?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But yes, I'm using this now in tap on coverage map data produced by V8, and it seems to work fine, so if you found a weird edge case, I'm very interested.

Copy link
Contributor

@cjihrig cjihrig May 16, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@isaacs - I'll have to see if I still have the code on a branch, but the gist of it was:

Given a line in the generated file, I was trying to find all of the corresponding lines in the original source file. I created a SourceMap from the V8 coverage data and called map.findOrigin(row, 1) and map.findOrigin(row, Infinity). Note - I'm not sure if Infinity should be supported here, but I did try valid finite numbers. Some of the returned values had negative column offsets. Using zero-based rows and columns, this did seem to work correctly with map.findEntry(), including passing Infinity.

I think this is related to the fact that V8 ranges can start with whitespace on a non-zero column (for example around the { in if (foo) {).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would assume that findOrigin(row, Infinity) would produce a very strange value indeed, but I'd expect a column of Infinity rather than a negative value.

What it's doing is calculating the row and column deltas from the start of the generated line/column offsets to the 1-indexed line/column numbers you give it, then applying that same delta to the source line/column values from the map entry. So if you told it your origin is line: 100, column: Infinity then findEntry should give you a generatedLine of some number less than 100 (whatever line that range starts on), and a generatedColumn of the 0-index start of that range. The line offset then would be some number less than 100, and the column offset would be Infinity - (some finite n), so Infinity. The source line and column would be some arbitrary number corresponding to the start of that range, and then it would add the line delta (some number less than 100) to the entry's sourceLine, and the column delta (Infinity) to the entry sourceColumn.

I'm not sure just looking for line, 0 and line, Infinity would give you all ranges that affect that line, though, since more than 2 ranges can exist on a given line, and they might be arbitrarily intermixed.

As I describe this, I'm realizing that this strategy might only work because TypeScript doesn't change token lengths, though.

Like, let's say you had a source like

const someLongName = otherLongName.x()

And that got minified to

const _l=_o.x()

And let's say the error being thrown is that otherLongName/_o is undefined, so _o.x() throws an error.

Lines are the same, so no issue there.

But the range will start at {generatedColumn:0,sourceColumn:0}, and the error origin callsite will be column 13:

can't call method x() of undefined blah blah blah
const _l=_o.x()
            ^

If we add 13 to the sourceColumn, that isn't going to point at the right spot, and you'll get an output like:

const someLongName = otherLongName.x()
            ^

It'll work if the minifier is smart enough to make a new range where the two start matching, then it's fine, but I think the heuristic might have to be a little bit smarter, take into account the end of the range as well, and idk, say "It's probably somewhere in the general vicinity of these characters, good luck".

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, so the problem is with findEntry. There's no entry in the sourcemap that covers offset 10, 0. So, findEntry returns the entry at 9,30, since it uses the highest start of the range that is less than the provided values.

The delta from the range 9,30 to the call site location 11,1 is +2,-29. But really, that's invalid, because 10, 0 isn't covered by the range that starts at offset 9,30. There's no way to ever have a call site that references line 11 column 1, and even if it did, that doesn't map to anything in the source file. It's an invalid input, so you get an invalid result.

Similarly, the range offset 10, Infinity (incorrectly) returns the range at 10, 1 (because Infinity is greater than the last range in line 9, and 9 is less than the first offset in line 10, so you get that one). This maps to the source offset of 9,1. The delta from 10, 1 to 11, Infinity is +1, +Infinity, so you get a resulting origin call site of 10, Infinity.

I think that findEntry should return an empty object (like it does when it can't find a range) if the provided offsets are outside the end of the offset it finds. The code seems to assume that there's no un-mapped ranges in the generated file, and that's not the case.

Also, just to un-XY this problem, if your goal is "find all the lines in the origin source that map to a given line in the generated source", this strategy of checking column 1 and column Infinity is not ever going to work reliably. Any line in the generated source may be composed of an arbitrary number of entries from an arbitrary set of locations within the origin source, so if you only test the first and last in the line, you'll miss everything in the middle, and there's no guarantee that they line up. Consider, for example, a munger that shuffles functions around, or a minifier that puts every line in the origin into a single line in the output.

What you need to do is get the set of offset ranges from the sourcemap entries that have a line offset of line - 1, and then add 1 to each of the origin lines of those entries, and that'd give you the result. However, since sourceMap.#mappings is private, and only exposed via the very limited findEntry method, that's not possible without parsing the source map in userland, unless we also add a way to get at the raw entries themselves.

Copy link
Contributor Author

@isaacs isaacs May 18, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ie, suggesting this change to findEntry, to have it return an empty range if the supplied offsets are not within the entry it finds (haven't tested, this might break other stuff, idk)

diff --git a/lib/internal/source_map/source_map.js b/lib/internal/source_map/source_map.js
index 3112a026b64..08ad65b919a 100644
--- a/lib/internal/source_map/source_map.js
+++ b/lib/internal/source_map/source_map.js
@@ -189,10 +189,7 @@ class SourceMap {
       }
     }
     const entry = this.#mappings[first];
-    if (!first && entry && (lineOffset < entry[0] ||
-        (lineOffset === entry[0] && columnOffset < entry[1]))) {
-      return {};
-    } else if (!entry) {
+    if (!entry || lineOffset !== entry[0] || columnOffset < entry[1]) {
       return {};
     }
     return {

With that, it's much more clear that the approach of testing column 1 and column Infinity is invalid, because you walked off the source map.

Copy link
Contributor Author

@isaacs isaacs May 18, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, that's only half of the fix, because we're not tracking the end of the range in this.#mappings, only the start, so the test program you shared will find 2 obviously missing results, and 2 that are confusingly infinite.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, just to un-XY this problem, if your goal is "find all the lines in the origin source that map to a given line in the generated source", this strategy of checking column 1 and column Infinity is not ever going to work reliably.

So this was just me doing some random testing, not an actual goal. I agree that it won't work for anything useful - a generated file can be a single line, which would then map to every line in the original source, making it pretty useless for something like code coverage.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can definitely see some potential for something like "from l1, c1 to l2, c2, find all the ranges in the origin that are affected", for which "find all the origin lines that went into line X in generated source" is sort of a simplified case. But we can't do that without parsing out the ends of the ranges from the sourcemap payload.

@isaacs
Copy link
Contributor Author

isaacs commented Jun 15, 2023

Can someone with permission to view the Jenkins test results help out here? I can't view the output to see why they might be failing.

@isaacs
Copy link
Contributor Author

isaacs commented Jun 15, 2023

Or even better, can someone grant me the Overall/Read permission?

@richardlau
Copy link
Member

@isaacs FYI Access to the Jenkins CI has been restricted for preparing next week's security releases. It'll be opened up again after those releases are out.

In any case, if this is referring to the CI from 1 May those results have been expunged -- we only keep Jenkins results around for 20 days.

@isaacs
Copy link
Contributor Author

isaacs commented Jun 15, 2023

Thanks for the update @richardlau

What's the best path forward here then? Wait until the sec releases are out and then push a change to trigger a re-run?

@richardlau
Copy link
Member

Yes, wait until the security releases are out and CI opened up again. No need to push new changes, we can trigger a new CI run with a request-ci Add this label to start a Jenkins CI on a PR. label.

@benjamingr benjamingr added the request-ci Add this label to start a Jenkins CI on a PR. label Jun 22, 2023
@github-actions github-actions bot removed the request-ci Add this label to start a Jenkins CI on a PR. label Jun 22, 2023
@nodejs-github-bot
Copy link
Collaborator

@lpinca lpinca added the commit-queue Add this label to land a pull request using GitHub Actions. label Jun 23, 2023
@nodejs-github-bot nodejs-github-bot removed the commit-queue Add this label to land a pull request using GitHub Actions. label Jun 23, 2023
@nodejs-github-bot nodejs-github-bot merged commit e26ffe7 into nodejs:main Jun 23, 2023
@nodejs-github-bot
Copy link
Collaborator

Landed in e26ffe7

RafaelGSS pushed a commit that referenced this pull request Jul 3, 2023
This adds the `SourceMap.findOrigin(lineNumber, columnNumber)` method,
for finding the origin source file and 1-indexed line and column numbers
corresponding to the 1-indexed line and column numbers from a call site
in generated source code.

Fix: #47770
PR-URL: #47790
Fixes: #47770
Reviewed-By: Benjamin Gruenbaum <benjamingr@gmail.com>
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
Reviewed-By: Geoffrey Booth <webadmin@geoffreybooth.com>
@RafaelGSS RafaelGSS mentioned this pull request Jul 3, 2023
Ceres6 pushed a commit to Ceres6/node that referenced this pull request Aug 14, 2023
This adds the `SourceMap.findOrigin(lineNumber, columnNumber)` method,
for finding the origin source file and 1-indexed line and column numbers
corresponding to the 1-indexed line and column numbers from a call site
in generated source code.

Fix: nodejs#47770
PR-URL: nodejs#47790
Fixes: nodejs#47770
Reviewed-By: Benjamin Gruenbaum <benjamingr@gmail.com>
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
Reviewed-By: Geoffrey Booth <webadmin@geoffreybooth.com>
Ceres6 pushed a commit to Ceres6/node that referenced this pull request Aug 14, 2023
This adds the `SourceMap.findOrigin(lineNumber, columnNumber)` method,
for finding the origin source file and 1-indexed line and column numbers
corresponding to the 1-indexed line and column numbers from a call site
in generated source code.

Fix: nodejs#47770
PR-URL: nodejs#47790
Fixes: nodejs#47770
Reviewed-By: Benjamin Gruenbaum <benjamingr@gmail.com>
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
Reviewed-By: Geoffrey Booth <webadmin@geoffreybooth.com>
ruyadorno pushed a commit that referenced this pull request Sep 10, 2023
This adds the `SourceMap.findOrigin(lineNumber, columnNumber)` method,
for finding the origin source file and 1-indexed line and column numbers
corresponding to the 1-indexed line and column numbers from a call site
in generated source code.

Fix: #47770
PR-URL: #47790
Fixes: #47770
Reviewed-By: Benjamin Gruenbaum <benjamingr@gmail.com>
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
Reviewed-By: Geoffrey Booth <webadmin@geoffreybooth.com>
@ruyadorno ruyadorno mentioned this pull request Sep 10, 2023
ruyadorno pushed a commit that referenced this pull request Sep 13, 2023
This adds the `SourceMap.findOrigin(lineNumber, columnNumber)` method,
for finding the origin source file and 1-indexed line and column numbers
corresponding to the 1-indexed line and column numbers from a call site
in generated source code.

Fix: #47770
PR-URL: #47790
Fixes: #47770
Reviewed-By: Benjamin Gruenbaum <benjamingr@gmail.com>
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
Reviewed-By: Geoffrey Booth <webadmin@geoffreybooth.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
needs-ci PRs that need a full CI run.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

findSourceMap/findEntry provides confusing/misleading line and column numbers
9 participants