Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refresh.py: basic windows support #28

Closed
wants to merge 1 commit into from

Conversation

lummax
Copy link

@lummax lummax commented Feb 14, 2022

No description provided.

Copy link
Contributor

@cpsauer cpsauer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yay! Thank you, Lukas. Super exciting to bring this to Windows :) I really appreciate your great work.

So I know (and in the absence of a description): Is there more to come here/is this a WIP? Or should I be looking to merge?

In the meantime, I'll read through and write down some quick comments.

[Also, I don't have a Windows machine handy at the moment so I'm going to have to ask you to be responsible for making sure it works. But it sounds like you're already running/using some variant of this.]

Thanks again!
Chris

@@ -50,6 +51,12 @@ def _get_headers(compile_args: typing.List[str], source_path_for_sanity_check: t
# As an alternative approach, you might consider trying to get the headers by inspecing the Middlemen actions in the aquery output, but I don't see a way to get just the ones actually #included--or an easy way to get the system headers--without invoking the preprocessor's header search logic.
# For more on this, see https://github.com/hedronvision/bazel-compile-commands-extractor/issues/5#issuecomment-1031148373

if "cl.exe" in compile_args[0]:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is intended to detect both msvc and clang-cl, right? (If so, great!)

A key thing to double check from other platforms: Bazel is fond of wrapping the compiler in a shell script on some platforms. (e.g. the Grailbio Clang toolchain you'd raised in #15 and Bazel Apple). You aren't seeing that with Windows, are you?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually did not test this with clang-cl, so I don't know how that behaves. I did not see wrapper scripts for MSVC.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you check the clang toolchain you'd posted in that other issue, and commenting what you find?
(Great to hear no wrappers for msvc.)

Also: thoughts on using endswith?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bazel-contrib/toolchains_llvm#4 there is no support for Windows yet. :(

endswith(): good catch :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whoops. Sorry. That's on me.
[I'd searched windows in their codebase, seen some references, and wrongly concluded there was support.]

Minor, but any chance we could add a comment mentioning that we're also trying to catch clang-cl with the endswith?
(I can also do on merge, but writing this as a note, rather than resolving, to make sure that collectively we don't forget.)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done and rebased.

refresh.template.py Outdated Show resolved Hide resolved
refresh.template.py Outdated Show resolved Hide resolved
refresh.template.py Outdated Show resolved Hide resolved
refresh.template.py Show resolved Hide resolved
refresh.template.py Show resolved Hide resolved
return [] # Worst case, we couldn't get the headers,

headers = (
# TODO This is not enough. Based on the locale, `cl.exe` will emit different marker strings. This should be extendend to multiple languages.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ugh, I'm so sorry MSVC does this.

Brainstorming some crazy ideas:

  • We could try to call Bazel's code? (Not sure why they're extracting headers, but I remember that code you linked.)
  • Is there a way to set the compiler's locale?
  • We could invoke the compiler twice, with and without /showIncludes, and scan the lines that differ for the last instance of ": "?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could try to call Bazel's code? (Not sure why they're extracting headers, but I remember that code you linked.)

That is Java code, so I guess not.

Is there a way to set the compiler's locale?

No idea. Might depend on installed language packs for MSVC.

We could invoke the compiler twice, with and without /showIncludes, and scan the lines that differ for the last instance of ": "?

Would increase generation time for the compile_commands.json quite drastically.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd really like it we shipped something more robust here, and it feels like you've mostly figured out some good ways to do this better.

I did do some more digging, and maybe setting VSLANG=1033 would do the trick, getting everything always printed in english? (I saw ninja wrestling with it here) If it works, that would be so much better.

(Another creative idea: We could also run a test invocation, extract the prefix, and then match on that?)

If that fails, at a minimum, could we match the language prefixes that Bazel does and you linked to? (Languages also listed, maybe more easily, in that ninja issue.

Anyway, up to you on how, but I'd really like it if we solved this more robustly.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did do some more digging, and maybe setting VSLANG=1033 would do the trick, getting everything always printed in english? (I saw ninja-build/ninja#613) If it works, that would be so much better.

I tried setting it to turkish just for the header finding cl.exe call using this trick and it did not work. A comment says "before calling vcvars32/64" and that is out of our control.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If that fails, at a minimum, could we match the language prefixes that Bazel does and you linked to? (Languages also listed, maybe more easily, in that ninja issue.

Done.

refresh.template.py Outdated Show resolved Hide resolved
refresh.template.py Outdated Show resolved Hide resolved
refresh.template.py Outdated Show resolved Hide resolved
@lummax
Copy link
Author

lummax commented Feb 15, 2022

So I know (and in the absence of a description): Is there more to come here/is this a WIP? Or should I be looking to merge?

This is working as-is for us. But this is "basic" as there are still open points about multi-language/encoding support. I can't really answer them.

One option for the multi-language support would be to research and include all the marker strings. Don't know if we want that or if we wait for users to pop in and add support one-by-one (and provide a great opportunity for a good first PR :D)

[Also, I don't have a Windows machine handy at the moment so I'm going to have to ask you to be responsible for making sure it works. But it sounds like you're already running/using some variant of this.]

Yes.

@lummax
Copy link
Author

lummax commented Feb 15, 2022

Sorry, the force-push hides the fix-up commits. There are several accessible from 5252003

@cpsauer
Copy link
Contributor

cpsauer commented Feb 16, 2022

Not a problem on the force pushes. This is all small enough that it's easy for me to scan through. Thanks for having kept things incremental!

I would really like to ask for thing things to be a bit more robust before we merge, especially because my ability to support Windows folks well is more limited. Tapped back some comments above. Thanks for your patience on these :) I really appreciate your help.

@cpsauer
Copy link
Contributor

cpsauer commented Feb 18, 2022

About to head to sleep over here, but I got curious when I saw your comment about vcvars, and started reading up

Looks like there might be something very interesting there that could solve multiple problems for us. In addition to language, it looks like that command is what's supposed to be responsible for setting up the required environment variables, like INCLUDE. Running it in the script might solve both problems!

Some quick links, to hand things off in our cross-timezone relay:

@cpsauer
Copy link
Contributor

cpsauer commented Feb 24, 2022

@lummax, just checking in to make sure I'm not holding things up here. Super excited to merge this, and we're almost there!
But I'd love to ask for your help investigating the vcvars stuff (above) on your windows machine, in the hope that it might radically simplify things

@cpsauer cpsauer mentioned this pull request Feb 24, 2022
@cpsauer
Copy link
Contributor

cpsauer commented Feb 26, 2022

Noting down to either you or future me that we also want to update the README as part of this change. Decidedly awesome windows support, rather than basic, I'd say.

@lummax
Copy link
Author

lummax commented Mar 1, 2022

Sorry for being unresponsive. I have to squeeze working on this into the other obligations.

investigating the vcvars stuff

I see three ways that I both don't like:

  • call that vsvarsall.bat once and extractinng the required environment
  • create a wrapper that invokes every compilation action inside the correct environment by calling vsvarsall.bat for each one
  • detect Windows at the very start of the extract.py and somehow call the script again but under an environment where vsvarsall.bat has run

I don't like any of them and would rather depend on the user to have that script called in their shell environment.

update the README as part of this change
I'll leave that up to you :)

@cpsauer
Copy link
Contributor

cpsauer commented Mar 2, 2022

Totally get it, and I appreciate your help.

[Same situation over here, so I empathize! :) Sorry to be pushing for quality and not coding. Worst case I can spin up a Windows VM and dive in; however, I would really appreciate your finishing this out if you can, since you have things set up.]

Re the vcvars options, wouldn't any of those three better than the current hardcoding, if they work? That said, there might be even easier options. How about:
(1) For the language settings: what about doing a manual test run of vcvarsall.bat with and without the VSLANG setting, inspecting the environment to see how vcvars is setting cl.exe to English (assuming that does indeed work, might also need chcp 437?)). Then just directly set whatever environment variable vcvarsall uses in the python subprocess call.
(2) For INCLUDE: Looks like you might be able to get Bazel to give it to you via starlark by calling in through the toolchain (or through the link above).

@cpsauer
Copy link
Contributor

cpsauer commented Mar 19, 2022

@lummax, could I ask you to chase down those last couple leads so we can get this merged in?
Would be pretty awesome to know that we're as robust and clean as can be.

@lummax
Copy link
Author

lummax commented Mar 21, 2022

(1) For the language settings: what about doing a manual test run of vcvarsall.bat with and without the VSLANG setting, inspecting the environment to see how vcvars is setting cl.exe to English (assuming that does indeed work, might also need chcp 437?)). Then just directly set whatever environment variable vcvarsall uses in the python subprocess call.

I can't see revelant changes when setting VSLANG and re-running vcvarsall.bat.

I am skeptical of using anything related to the vcvarsall.bat because it is not-trivial to find it. bazel does for the autoconfigure:

So this is already hard to find the correct vcvarsall.bat and to do it in a robust fashion. And configurable with some environment variables: https://bazel.build/docs/windows#build_cpp. This is just the autoconfigure of the bazel CC toolchain. The user is free to setup their own CC toolchain.

I get the wish to make it as robust as possible. But IMHO this is a little out of the scope of the bazel-compile-commands-extractor if the user can get the same result by calling vcvarsall.bat themselves or setting up the right environment variables.

@cpsauer
Copy link
Contributor

cpsauer commented Apr 6, 2022

Wahoo! We're merged--in good final form 3b14e77. Many thanks, @lummax for laying the foundation of Windows support!

In retrospect, I made a rookie maintainer mistake by taking not merging this way earlier, and having us go incrementally from there. I also somewhat butchered git history dealing with the Windows \r\n. [Enough that GitHub didn't tie the commit to this PR, so I'll have to manually close.] I hope you'll forgive me.

Regardless, all your code is in, with many thanks.

If you're curious how some of the above resolved:

  • Turns out setting the compiler language via VSLANG does work...it just it doesn't work if you don't have the target language pack installed. Since we can't guarantee people will have English, that makes your hardcode-all-languages approach the cleanest we've got, I think. I added a note to that effect--and also wrote that to the Ninja folks
  • It was indeed possible to get the results to INCLUDE from the (top-level) Bazel toolchain. They really should be providing it in aquery, though. Updated my macOS issue to include that.

@cpsauer cpsauer closed this Apr 6, 2022
@lummax lummax deleted the basic-windows-support branch April 7, 2022 06:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants