-
Notifications
You must be signed in to change notification settings - Fork 715
Deduplicate hsc2hs args #11005
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Deduplicate hsc2hs args #11005
Conversation
0ac7b37
to
86c4392
Compare
You might do better with a test that tells |
86c4392
to
0cd5fad
Compare
It seems that this is more of an issue in Cabal already passes arguments to Not saying this isn't a good workaround for now but good to work out what the longer term correct thing to do is. |
b31bb26
to
c3cedc9
Compare
@geekosaur I did the change of introducing custom gcc that checks for duplicate arguments. The only thing is that it’s a bit too much work to do in Windows so there the successful compilation itself will do the check. @mpickering It does seem like a good idea to fix hsc2hs to use response files. Somehow I found out that it does use them sometimes, but not when the command-line is too long. Still, this change seems desirable because with enough dependencies there can be appalling amount of duplication which is just redundant. |
If extra-lib-dirs and/or extra-include-dirs are specified via either command line or project files, they’re going to be added for each package in the dendency graph. With enough long enough directories the command-line length limits for gcc may be hit and calls to the C compiler made from hsc2hs will start failing. Ideally hsc2hs should be using repsonse files, but needlessly specifying myriad of command-line parameters is redundant anyway.
c3cedc9
to
cd7e055
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note to reviewers: the first diff is the actual change, the last is the changelog entry, the rest are tests with lots of files to try to trigger the error.
Synopsis
TLDR it’s all too easy to make cabal pass too many command-line arguments to
hsc2hs
so that it will stop working at all.Descirption
When
extra-lib-dirs
and/orextra-include-dirs
are specified (on either command-line or viacabal.project
orcabal.config
) and project has hsc2hs files that need processing, the cabal will instruct hsc2hs to use specified directories by passing them via--cflag
and--ldflag
arguments.The problem is that when, say,
N
different directories are specified, and current cabal file depends onK
packages then each directory will be passed via--ldflag
or--cflag
in totalK
times because cabal will pass it once for each package in the graph. Namely,extra-*-dirs
are added to all packages in graph (either by default which is probably wrong or with some effort on the user’s side despite cabal’s best judgement).Therefore hsc2hs receives
N * K
command-line arguments with lots of duplication and thes passes them togcc
. If the hsc2hs itself if recent enough then it’ll receive arguments via response file. But it will pass arguments to gcc on command line so the limit on the number of arguments will be hit there.In either case it seems reasonable to stop passing duplicate arguments regardless of where they came from which is what this PR adresses.
I have added a somewhat big test that needs to be this way to test
N * K
multiplication of command-line options and total number of options has to be big enough to not fit into command-line argument length limit which is nontrivial nowadays.The fix
The fix is to first reorder hsc2hs arguments into two groups, one that goes into
--cflag
and another one that goes into--ldflag
and deduplicate each separately. Relative order between--cflag
and--ldflag
was changed but it shouldn’t matter since they’re from different "namespaces". Order within each group was preserved. One commit just reorders groups and another adds deduplication.A smaller patch is possible, e.g. https://github.com/haskell-infra/hackage-doc-builder-config/blob/master/cabal-hsc2hs-args-patch.diff, but it will go around deduplicating among arguments that start with different prefixes every time which is going to be extra work.
Real-world significance
NB this is not a theoretical toy, it has been preventing generation of documentation on Hackage no less. It’s going to take some investigation if you want to check that out, but in brief:
ghcup
package, which depends on lots of cabal packages, fails to generate https://hackage.haskell.org/package/ghcup-0.1.50.2/reports/4 with an error like this:Error: cannot execute /nix/store/2agih0y5ns3sgbviw2xhivdgg59b41g9-gcc-14-20241116/libexec/gcc/x86_64-unknown-linux-gnu/14.2.1/cc1: posix_spawn: Argument list too long
hackage-build
tool that uses nix and leaves at https://github.com/haskell-infra/hackage-doc-builder-config/tree/58dfa4643d74c2de595407d40da7a6f2869d511b at the momenthackage-build
making up acabal.config
file with foreign dependencies https://github.com/haskell-infra/hackage-doc-builder-config/blob/58dfa4643d74c2de595407d40da7a6f2869d511b/run-hackage-build.nix#L13 that specifiesextra-lib-dirs
andextra-include-dirs
for a modest list of dependencies defined at https://github.com/haskell-infra/hackage-doc-builder-config/blob/58dfa4643d74c2de595407d40da7a6f2869d511b/build-depends.nixghcup
it callshsc2hs
to do the preprocessing and that step failsTemplate Α: This PR modifies behaviour or interface
Include the following checklist in your PR:
significance: significant
in the changelog file.