Skip to content

Sourcemap: stop producing sourcemap mappings with negative lines and columns #1445

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from

Conversation

pmwhite
Copy link
Contributor

@pmwhite pmwhite commented Apr 4, 2023

This patch began by changing the mapping type in compiler/lib/source_map.ml so that it more obviously encodes the possible cases for a particular mapping. All other changes were made in response to the type errors that ensued.

No new tests were added because this change is mostly behavior preserving, and the situation in which behavior changes is demonstrated by changes to existing tests.

The motivation for this change is that the sourcemaps produced by js_of_ocaml are rejected by the Mozilla sourcemap library. In particular, it's this error message that gets triggered.

…columns

This patch began by changing the mapping type in
compiler/lib/source_map.ml so that it more obviously encodes the
possible cases for a particular mapping. All other changes were made in
response to the type errors that ensued.
@hhugo
Copy link
Member

hhugo commented Apr 4, 2023

Is a mapping ever useful when ori_location is none ?

@pmwhite
Copy link
Contributor Author

pmwhite commented Apr 4, 2023

Maybe it isn't, but then why are those mappings even allowed? It seems strange for the spec to mention this kind of mapping if they aren't useful. That said, I can't think of a use for them, so maybe you're right.

@dyedgreen
Copy link

Seeing that the ocaml compiler seems to generate these maybe we can ask someone who know more about that code why these are generated?

@hhugo
Copy link
Member

hhugo commented Apr 5, 2023

Seeing that the ocaml compiler seems to generate these maybe we can ask someone who know more about that code why these are generated?

I don't think the OCaml compiler has anything to do with this. What makes you think that ?

Maybe it isn't, but then why are those mappings even allowed? It seems strange for the spec to mention this kind of mapping if they aren't useful. That said, I can't think of a use for them, so maybe you're right.

I initially misread your patch.
Replying to myself, I think having a mapping when there are no source info is important. The logic was added by 0e78d6d. It convey the information that the previous mapping no longer apply.

If you have the following list of token M X X X and have source location for M only and if you only emit a mapping for M, the program interpreting the sourcemap will consider that mapping also apply to X X X.

Do you know which part of this PR fixes your issue ? Have you measured the impact of boxing ori_location ?

&& c.ori_source = a.(prev).ori_source
&& c.ori_line = a.(prev).ori_line
&& c.ori_col = a.(prev).ori_col
then (* We already are at this location *)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why was this logic removed ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed it because it didn't look like it was necessary, but I'm not at all confident about that. If I put it back, then we have to decide what happens in the other three cases besides Some, Some. Do you know what situation causes this case to be hit?

@@ -198,14 +197,13 @@ let mapping_of_string str =
readline 0 0 []

let maps ~sources_offset ~names_offset x =
let gen_line = x.gen_line in
let ori_source = x.ori_source + sources_offset in
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems that this line was turning no source into an invalid source.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I believe that's correct. Turning the type into an option pointed this position out, but if we want to make this patch is minimal as possible, maybe this would be the only line that has to change. Let me know if you think this is what we should try to do.

@dyedgreen
Copy link

dyedgreen commented Apr 6, 2023

I don't think the OCaml compiler has anything to do with this. What makes you think that ?

My understanding was that js_of_ocaml reads a source map produced for the ocaml byte code (by the ocaml compiler) and then translates the mappings to match the generated JavaScript, but this might be inaccurate?

Your explanation makes sense though!

@pmwhite
Copy link
Contributor Author

pmwhite commented Apr 6, 2023

If you have the following list of token M X X X and have source location for M only and if you only emit a mapping for M, the program interpreting the sourcemap will consider that mapping also apply to X X X.

Thanks for explaining; this makes sense to me.

Do you know which part of this PR fixes your issue?

I think the line you pointed out in one of your review comments is indeed the location of bug. However, I'd argue there are several places where the bug could have been. The whole patch fixes the issue for good because you can't forget to handle both cases.

Have you measured the impact of boxing ori_location?

No, I have not. I think it's actually two extra boxes - one for the option and one for the new record. We could eliminate one of them by introducing a specialized option type where the Some case uses an inline record. We could eliminate both boxes if we make mapping an abstract type with functions that don't let you access certain fields without checking that they aren't negative one; this seems annoying, though.

@hhugo
Copy link
Member

hhugo commented Apr 7, 2023

We could eliminate both boxes if we make mapping an abstract type with functions that don't let you access certain fields without checking that they aren't negative one; this seems annoying, though.

You can also turn the mapping record type into a variant with 3 constructors, one for each case defined in the spec.

each segment is made up of 1,4 or 5 variable length fields.

@pmwhite
Copy link
Contributor Author

pmwhite commented Apr 10, 2023

We could eliminate both boxes if we make mapping an abstract type with functions that don't let you access certain fields without checking that they aren't negative one; this seems annoying, though.

You can also turn the mapping record type into a variant with 3 constructors, one for each case defined in the spec.

each segment is made up of 1,4 or 5 variable length fields.

Yes, good point; that would elegantly avoid introducing more boxes. That was actually approach I initially considered, not because I wanted to avoid boxing, but just because I thought it a good idea to have the types match the spec. I didn't end up doing that because I thought it would cause the PR to be a bit too large.

I'd be happy to re-write the PR to follow your suggestion. Does that sound good to you? I think if we do that, we probably don't need to measure impact after all because we aren't adding or removing boxes. (aside: I'm not sure how I would get a definitive answer about how much impact the changes had, anyway; it seems like a hard measurement to make without putting a lot time into building good benchmarks)

@pmwhite
Copy link
Contributor Author

pmwhite commented Apr 11, 2023

Actually, I just started trying that re-write, but I'm quickly discouraged from it because it greatly increases the friction of accessing gen_line and gen_col, which are accessed rather frequently. On top of the code friction (which could be reduced by using a helper function that matches on all the cases and extracts the relevant field), I would think that the cost of matching on the type constantly would be similar to the savings from not having an extra box. Would you be okay with sticking with this PR's approach?

@hhugo
Copy link
Member

hhugo commented Apr 20, 2023

@pmwhite, I've done a minimal fix in #1449 and tried the 3 constructors approach in #1450. It seems to improve both speed and memory consumption.

I run js_of_ocaml link with large js files containing sourcemaps. I did 4-5 runs and kept the best one against this PR, master, and #1450.

Here are the results

master

4.52user 0.31system 0:04.84elapsed 99%CPU (0avgtext+0avgdata 1118328maxresident)k
0inputs+161560outputs (0major+319701minor)pagefaults 0swaps

This PR

4.73user 0.44system 0:05.18elapsed 99%CPU (0avgtext+0avgdata 1230924maxresident)k
0inputs+161568outputs (0major+347848minor)pagefaults 0swaps

#1450

4.02user 0.33system 0:04.36elapsed 100%CPU (0avgtext+0avgdata 917084maxresident)k
0inputs+161568outputs (0major+269390minor)pagefaults 0swaps

#1450 is 10 to 15% faster and consume 17% to 25% less.

@hhugo
Copy link
Member

hhugo commented Apr 20, 2023

I've merged #1450

@hhugo hhugo closed this Apr 20, 2023
@pmwhite
Copy link
Contributor Author

pmwhite commented Apr 20, 2023

Oh, that is significant. Thank you very much for doing this!

@dyedgreen
Copy link

Thank you both for doing this!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants