Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Better support for running scripts. #7842

Closed
bacchanalia opened this issue Nov 30, 2021 · 28 comments · Fixed by #7851
Closed

Better support for running scripts. #7842

bacchanalia opened this issue Nov 30, 2021 · 28 comments · Fixed by #7851

Comments

@bacchanalia
Copy link
Collaborator

cabal run can run scripts, but the utility of this is limited because under the current design it recompiles on every invocation.

I propose adding the following:

  • fake packages for scripts to be written under ~/.local/share (or os equivalent)
    • specifically ~/.local/share/cabal/<abs path of script without extension>/ to avoid collisions
  • the option --tmp-script-artifacts (current behaviour. keep as default?)
  • the option --persist-script-artifacts (new behaviour)
  • the option --script-artifact-dir=DIR (place fake package in user supplied directory)
  • script support for cabal build and cabal repl
  • updates to the relevant docs

I have done a quick test with hard coded paths, and it works as expected without further changes. The only changes to run necessary are to add the flags and pass the correct directory based on them. I haven't scoped out the work needed to add support to build and repl, but I assume it would be very similar to how it works now in run.

@jneira
Copy link
Member

jneira commented Nov 30, 2021

@Mikolaj
Copy link
Member

Mikolaj commented Nov 30, 2021

also #7393

@jneira
Copy link
Member

jneira commented Nov 30, 2021

the option --tmp-script-artifacts (current behaviour. keep as default?)
the option --persist-script-artifacts (new behaviour)
the option --script-artifact-dir=DIR (place fake package in user supplied directory)

bikeshedding:

  • no option set or -no-persist-fake-package > current behavior
  • persist-fake-package > save to ./local/share
  • persist-fake-package = dir > save to dir

Trying to generalize the behaviour for fake packages and minimize the number of flags, independently of where are being used.
Not sure if the generalization makes sense, in that case replace persist-fake-package by persist-script-build.
Otoh the proper solution should be remove the need of fake-package at all in the entire codebase (see #6977), which could open the door to other improvements

@bacchanalia
Copy link
Collaborator Author

bacchanalia commented Nov 30, 2021

When else are fake packages used other than to build scripts?

edit: This is discussed in issue #6977. I shouldn't comment before coffee.

@bacchanalia
Copy link
Collaborator Author

Cache folders on different OS's (please correct me if I'm wrong):

  • on Linux save to ~/.cache (not ~/.local/share)
  • on Windows save to %LOCALAPPDATA%
  • on OSX save to ~/Library/Caches

It probably makes sense to add a utility function (in Distribution.Simple.Utils?) to get the cache folder.
Rather than putting the fake-package directory structures directly in <cache>/cabal, it might be a better idea to put them in <cache>/cabal/fake-packages in case other uses for <cache>/cabal come up.

@bacchanalia
Copy link
Collaborator Author

bacchanalia commented Nov 30, 2021

@Mikolaj I'm not sure I see the relevance of #7393?

@fendor What's the status of the work you were doing on #6149?

@jneira re #6977: I'm not sure caching makes sense for other uses of fake package. At the very least you'd need a completely separate scheme for where to place/how to identify cached instances.

@Mikolaj
Copy link
Member

Mikolaj commented Nov 30, 2021

@bacchanalia: re #7393, you complain "it recompiles on every invocation", and ghc --run would be able to execute a script without producing any compilation artefacts (and without linking) at all. Therefore, it could potentially be used as, in addition to or instead of --tmp-script-artifact. If we assume scripts are never computationally intensive, perhaps it could be used instead of cabal run always? Am I missing something?

@bacchanalia
Copy link
Collaborator Author

@Mikolaj running a script with cabal run allows you do have a cabal block

{- cabal:
build-depends: ...
-}

while ghc --run presumably just uses the global environment.

If we assume scripts are never computationally intensive

I don't think this is a good assumption. Scripts and packages have different ergonomics, and when you might prefer one over the doesn't neatly fall on that boundary, for me, anyway.

Another consideration is start-up time. On my machine running hello.hs with runghc takes 25ms, while cached cabal run only takes 15ms (and hopefully that could go lower, but that's out of scope) which is important in some applications.

@Mikolaj
Copy link
Member

Mikolaj commented Nov 30, 2021

Fine. And what about ghc --run (or --eval, but I don't remember the details) underlying --tmp-script-artifact? Is there a value in the temporary artifacts being written somewhere at all?

@fendor
Copy link
Collaborator

fendor commented Nov 30, 2021

What's the status of the work you were doing on #6149?

Stalled. I don't know how much everyone else knows about this feature, so I'll repeat a bit how this currently works: The script section (which is quite literally an executable section from a .cabal file [btw, you can break it if you put main-is: SomeName.hs into the script section]) is parsed and based on it, a .cabal file is generated and written to a project file in $TMP. The source is copied to that directory as well, iirc, and then cabal build && cabal exec is basically executed on that project in $TMP.

Doing the same for v2-repl File.hs doesn't cut it, since updates to our original script file won't be noticed, we lose the current working directory, etc... I played around with workarounds, e.g. after invoking cabal v2-repl in that directory, cd to the previous working directory, and listing the full path to the script instead of copying the source script... I can't remember whether there was a blocking issue with cding and listing the full path to the source file, but it was so weird that I wondered whether I would make everything worse by applying duct-tape to this design and stopped with my experiments.

@bacchanalia
Copy link
Collaborator Author

@Mikolaj That might be preferable in the --tmp-script-artifact case, but there might be some value in having consistency between the cases both in terms of implementation and use? Also a user might want to inspect the output. I don't really feel strongly about it either way.

@Mikolaj
Copy link
Member

Mikolaj commented Nov 30, 2021

Sure, it's your call as the implementer and, personally, I'd not expend any extra effort (unless minimal) until users express interest.

@bacchanalia
Copy link
Collaborator Author

@fendor In the thread for #6149 you mention that stack handles :r correctly. Did you investigate what they do?

@fendor
Copy link
Collaborator

fendor commented Nov 30, 2021

Did you investigate what they do?

No

@jneira
Copy link
Member

jneira commented Nov 30, 2021

Well, i think the goal is clear: avoid script recompilation in each run (as reported in #6354).
But i would try to not make the cli for caching the script compilation depends on the existence of a fake package, as it will be removed sooner or later.

Imho cabal runscript.hs should be cached by default. Like cabal build or cabal install cache everything by default without any special flag. The cache should be invalidated automatically in front of source changes, maybe adding cabal clean runscript.hs to consistently wipe out a corrupted cache.

That behaviour could use underneath the cache of fake-package build products for now. If fake-package is removed it could cache only the executable itself and nothing more. But no cli will be harmed in the process :-)

@jneira
Copy link
Member

jneira commented Dec 1, 2021

cabal repl script.hs and cabal build script.hs is another different beast, it will suppose change those commands and dont know how cabal build could be done without writing to disk the build products as the intent afaiu is only build the script without executing it but reuse all the work done. So they will need the fake-package or something similar in a known location.

i think if we would have the script cached cabal build would have less sense? we could use cabal run script.hs --only-dependencies

But i think cabal repl script.hs could be done on the fly without writing anything to disk without fake-package.

So i would separate these things:

  1. cache script using fake packages as described above
  2. change cabal repl to add support for scripts
  3. consider add support in cabal build. I think there are some value in being consistent between commands and make them work with the same targets.

And i would mark the first one as pr welcomed right now if everyone agree

@fendor
Copy link
Collaborator

fendor commented Dec 1, 2021

But i think cabal repl script.hs could be done on the fly without writing anything to disk without fake-package.

Currently we have to go over lib:Cabal (as we need to build stuff and that only happens in lib:Cabal), we can't pass an in-memory PackageDescription to lib:Cabal, so to have external dependencies, we currently have to serialise a PackageDescription to disk and point lib:Cabal to it.
However, I agree, we should do this on-the-fly without writing anything to disk. This requires us (probably) to remove fake-package logic and provide a proper implementation in exe:cabal (violating a bit the separation between lib:Cabal and exe:cabal, but who likes separation of concern anyway? ;D).

@bacchanalia
Copy link
Collaborator Author

bacchanalia commented Dec 1, 2021

i think if we would have the script cached cabal build would have less sense?

The point of cabal build would precisely be to cache the script so you can choose to pay the compilation time a different time from when you run it.

@bacchanalia
Copy link
Collaborator Author

@fendor I was just looking at this code. exe:cabal calls itself, which then uses lib:Cabal as a library. I don't really understand why that's necessary and the the original invocation of exe:cabal can't just use the library directly.

@jneira
Copy link
Member

jneira commented Dec 1, 2021

The point of cabal build would precisely be to cache the script so you can choose to pay the compilation time a different time from when you run it.

yeah thanks for confirming my guess:

dont know how cabal build could be done without writing to disk the build products as the intent afaiu is only build the script without executing it but reuse all the work done.


But i think cabal repl script.hs could be done on the fly without writing anything to disk without fake-package.

... but it would not be cached, fire up a repl is faster than compiling and we avoid the linker step. For small/medium scripts it would not be noticeable but maybe it will do for big ones.

Another approach could be reconsider the remove of fake-package of course


Nevertheless i think we could focus in the first step i mentioned above (cache script using fake packages as described above) in cabal run and start to work on this, thoughts about that @bacchanalia?

if the work in cabal run can be reused later in build and repl, we could extract the relevant code at that moment

@bacchanalia
Copy link
Collaborator Author

@jneira I agree with you about the way to proceed

  • cabal run is straight forward to make changes to with the existing fake package machinery
  • cabal build should work the same way and be able to make use of that work
  • cabal clean should probably also get work, if we have build artifacts around we'll want to be able to get rid of them

cabal repl is the outlier in the group in that it's harder to make work with a fake package for the reasons described in #6149
What I'd be tempted to do is bring in fsnotify and use it to keep the fake package in sync with the script. Alternately wait on it until after the work on #6977

@jneira
Copy link
Member

jneira commented Dec 1, 2021

What I'd be tempted to do is bring in fsnotify and use it to keep the fake package in sync with the script

hmm i think there is an actual machinery in cabal for that, i would try to reuse it if possible

https://github.com/haskell/cabal/blob/master/cabal-install/src/Distribution/Client/FileMonitor.hs

@bacchanalia
Copy link
Collaborator Author

After discussing with @jneira, the new proposal is change with default behavior without adding flags. There are several reasons for this:

  • The implementation of the new flags does not seem to be straightforwards
  • The behavior is more consistent with the way the rest of cabal works
  • cabal already has too many flags
  • My main motivation for the flags was a desire to be conservative and keep the old behavior around, but I'm not sure it's something people actually want.

@bacchanalia
Copy link
Collaborator Author

bacchanalia commented Dec 1, 2021

Does any have thoughts on <cachedir>/cabal/script-builds vs <cabal_dir>/script-cache?

@Mikolaj
Copy link
Member

Mikolaj commented Dec 1, 2021

Does XDG specification give any hints? We are moving to XDG slowly (and backward-compatibly).

Edit: see #7386 and #680, but there are also tickets about freeze files caches and perhaps there are any other caches already?

@bacchanalia
Copy link
Collaborator Author

bacchanalia commented Dec 1, 2021

That's an argument in favor of <cachedir>/cabal/script-builds. On linux <cachedir> would be ${XDG_CACHE_HOME:-~/.cache}

edit: or maybe not? The argument for <cabal_dir> is that it can be set with an environment variable, but also presumably once the XDG proposals go though <cabal_dir> would be in <cachedir> anyway.

@bacchanalia
Copy link
Collaborator Author

I had an idea for cabal clean which is that if cabal clean is run without args outside of a project context it should go through script-builds find every one that does not have an associated script on the filesystem, and delete them.

bacchanalia added a commit to bacchanalia/cabal that referenced this issue Dec 1, 2021
Enable caching of script builds by changing the location of the fake
package directory from a tmp directory to:
<cabal_dir>/scipt-builds/abs/path/to/script/

Resolves: haskell#6354
WIP: haskell#7842
bacchanalia added a commit to bacchanalia/cabal that referenced this issue Dec 2, 2021
Added module Distribution.Client.ScriptUtils for code to deal with
scripts that is common between commands.

WIP: haskell#7842
bacchanalia added a commit to bacchanalia/cabal that referenced this issue Dec 2, 2021
This changes the behaviour of cabal clean to accept extra args, which it
now interprets as script files. The behaviour of cabal clean is the same
when given extra args. When given extra args it instead removes the
caches for those scripts and also any orphaned caches (caches for which
the script no longer exists)

In addition this commit changes the cache to use hashes of paths because
this significantly simplifies the implementation of clean, and more
importantly it prevents collisions when a script has the name of the
subdirectory of a previously cached script.

WIP: haskell#7842
@fgaz
Copy link
Member

fgaz commented Dec 2, 2021

I think it'd be best to avoid mixing xdg and non-xdg. I think xdg should be adopted as a whole (#680)

What about using the store as cache by just installing the fake package as a remote package? The only problem I see is that it could quickly increase the size of the store since it's immutable

I had an idea for cabal clean which is that if cabal clean is run without args outside of a project context it should go through script-builds find every one that does not have an associated script on the filesystem, and delete them.

or it could be a part of #3333

bacchanalia added a commit to bacchanalia/cabal that referenced this issue Dec 2, 2021
repl starts in the correct directory and points directly to rather than
a dummy, so that reloading works properly.

There is a downside to the current approach which is that it uses a
different fake-project.cabal file from run and build, so it cannot share
the same cache with them.

WIP: haskell#7842
WIP: haskell#6149
@jneira jneira linked a pull request Dec 3, 2021 that will close this issue
bacchanalia added a commit to bacchanalia/cabal that referenced this issue Dec 7, 2021
bacchanalia added a commit to bacchanalia/cabal that referenced this issue Dec 13, 2021
Enable caching of script builds by changing the location of the fake
package directory from a tmp directory to:
<cabal_dir>/scipt-builds/abs/path/to/script/

Resolves: haskell#6354
WIP: haskell#7842
bacchanalia added a commit to bacchanalia/cabal that referenced this issue Dec 13, 2021
Added module Distribution.Client.ScriptUtils for code to deal with
scripts that is common between commands.

WIP: haskell#7842
bacchanalia added a commit to bacchanalia/cabal that referenced this issue Dec 13, 2021
This changes the behaviour of cabal clean to accept extra args, which it
now interprets as script files. The behaviour of cabal clean is the same
when given extra args. When given extra args it instead removes the
caches for those scripts and also any orphaned caches (caches for which
the script no longer exists)

In addition this commit changes the cache to use hashes of paths because
this significantly simplifies the implementation of clean, and more
importantly it prevents collisions when a script has the name of the
subdirectory of a previously cached script.

WIP: haskell#7842
bacchanalia added a commit to bacchanalia/cabal that referenced this issue Dec 13, 2021
repl starts in the correct directory and points directly to rather than
a dummy, so that reloading works properly.

There is a downside to the current approach which is that it uses a
different fake-project.cabal file from run and build, so it cannot share
the same cache with them.

WIP: haskell#7842
WIP: haskell#6149
bacchanalia added a commit to bacchanalia/cabal that referenced this issue Dec 13, 2021
bacchanalia added a commit to bacchanalia/cabal that referenced this issue Dec 23, 2021
Enable caching of script builds by changing the location of the fake
package directory from a tmp directory to:
<cabal_dir>/scipt-builds/abs/path/to/script/

Resolves: haskell#6354
WIP: haskell#7842
bacchanalia added a commit to bacchanalia/cabal that referenced this issue Dec 23, 2021
Added module Distribution.Client.ScriptUtils for code to deal with
scripts that is common between commands.

WIP: haskell#7842
bacchanalia added a commit to bacchanalia/cabal that referenced this issue Dec 23, 2021
This changes the behaviour of cabal clean to accept extra args, which it
now interprets as script files. The behaviour of cabal clean is the same
when given extra args. When given extra args it instead removes the
caches for those scripts and also any orphaned caches (caches for which
the script no longer exists)

In addition this commit changes the cache to use hashes of paths because
this significantly simplifies the implementation of clean, and more
importantly it prevents collisions when a script has the name of the
subdirectory of a previously cached script.

WIP: haskell#7842
bacchanalia added a commit to bacchanalia/cabal that referenced this issue Dec 23, 2021
repl starts in the correct directory and points directly to rather than
a dummy, so that reloading works properly.

There is a downside to the current approach which is that it uses a
different fake-project.cabal file from run and build, so it cannot share
the same cache with them.

WIP: haskell#7842
WIP: haskell#6149
bacchanalia added a commit to bacchanalia/cabal that referenced this issue Dec 23, 2021
jneira pushed a commit to bacchanalia/cabal that referenced this issue Dec 31, 2021
Enable caching of script builds by changing the location of the fake
package directory from a tmp directory to:
<cabal_dir>/scipt-builds/abs/path/to/script/

Resolves: haskell#6354
WIP: haskell#7842
jneira pushed a commit to bacchanalia/cabal that referenced this issue Dec 31, 2021
Added module Distribution.Client.ScriptUtils for code to deal with
scripts that is common between commands.

WIP: haskell#7842
jneira pushed a commit to bacchanalia/cabal that referenced this issue Dec 31, 2021
This changes the behaviour of cabal clean to accept extra args, which it
now interprets as script files. The behaviour of cabal clean is the same
when given extra args. When given extra args it instead removes the
caches for those scripts and also any orphaned caches (caches for which
the script no longer exists)

In addition this commit changes the cache to use hashes of paths because
this significantly simplifies the implementation of clean, and more
importantly it prevents collisions when a script has the name of the
subdirectory of a previously cached script.

WIP: haskell#7842
jneira pushed a commit to bacchanalia/cabal that referenced this issue Dec 31, 2021
repl starts in the correct directory and points directly to rather than
a dummy, so that reloading works properly.

There is a downside to the current approach which is that it uses a
different fake-project.cabal file from run and build, so it cannot share
the same cache with them.

WIP: haskell#7842
WIP: haskell#6149
jneira pushed a commit to bacchanalia/cabal that referenced this issue Dec 31, 2021
@mergify mergify bot closed this as completed in #7851 Dec 31, 2021
mergify bot pushed a commit that referenced this issue Dec 31, 2021
* Add support for script build caching to cabal run

Enable caching of script builds by changing the location of the fake
package directory from a tmp directory to:
<cabal_dir>/scipt-builds/abs/path/to/script/

Resolves: #6354
WIP: #7842

* Add support for scripts to cabal build.

Added module Distribution.Client.ScriptUtils for code to deal with
scripts that is common between commands.

WIP: #7842

* Add script support to cabal clean.

This changes the behaviour of cabal clean to accept extra args, which it
now interprets as script files. The behaviour of cabal clean is the same
when given extra args. When given extra args it instead removes the
caches for those scripts and also any orphaned caches (caches for which
the script no longer exists)

In addition this commit changes the cache to use hashes of paths because
this significantly simplifies the implementation of clean, and more
importantly it prevents collisions when a script has the name of the
subdirectory of a previously cached script.

WIP: #7842

* Add script support to cabal repl

repl starts in the correct directory and points directly to rather than
a dummy, so that reloading works properly.

There is a downside to the current approach which is that it uses a
different fake-project.cabal file from run and build, so it cannot share
the same cache with them.

WIP: #7842
WIP: #6149

* Added changelog for pr #7851

* Fix `cabal run script.hs` issue with --builddir

Fixes tests:
cabal-testsuite/PackageTests/NewBuild/CmdRun/Script/cabal.test.hs
cabal-testsuite/PackageTests/NewBuild/CmdRun/ScriptLiterate/cabal.test.hs

* Fixes for `build script` and `repl script`

- Fix build issue introduced in 079c5f0, where build was being passed
the wrong target filter
- Fix repl issue where script didn't work in a project context.
- Refactor code to share logic between repl and build/run
- Ensure temp directories are only created when needed

* Bug fixes relating to script support

ScriptUtils:
- Hash prefix for cache dirs was applied incorrectly.
- Overwriting fake-package files causes repeated work in some cases.
CmdClean:
- Clean distdir for script when --builddir is passed
- Always clean orphans because because there is no good way to specify
they should be cleaned. This may be bad behaviour in some obscure cases
(a cache is temporarily orphaned and an unrelated clean is run), but at
worst results in a cache rebuild.

* Add tests for improved script support

- Basic script support for build/repl/clean which checks for cached
project files
- Add check for cached project files to basic run script test
- No repeated work for build/build, build/run, run/run, and repl/repl
- Clean does not remove cache for existing scripts
- Clean does remove orphaned script caches

* Fix clean bug uncovered by 5fad121

- clean was trying to read source-builds even if it didn't exist
- add test specific to this case with other clean tests

* Update documentation for better script support

Ready for review: #7851
May close: #7842, #6354, #6149

* Attempt to fix `repl script` on Windows

PR #7851

* Attempt to fix remote test failures

Test logs showed that the failures where because the tests depended on a
module from cabal-install that some ghc versions could not find.

Instead of depending on cabal-install, I copied the needed function into
Test.Cabal.Prelude (It seemed like an acceptable place for it)

PR #7851

* Attempt to fix `repl script` on Windows

PR #7851

* Attempt to fix tests on old ghc versions

Tests failing on pre-AMP ghcs due to unsanctioned use of (<$>)

PR #7851

* Feedback: Update docs and formatting

PR #7851

* Feedback: code style changes

- remove partial selectors
- make a constant for fake-package.cabal

PR #7851

* Feedback: make hidden control flow explicit

PR #7851

* Feedback: add expected fail script run tests

PR #7851

* Fix `repl script` when cwd is deeper than cachedir

PR #7851

* Use script in-place for build or run

- Set the hs-source-dir to the location of the script for build and run,
  the same as with repl
- This removes the need to copy the script
- repl no longer needs a separate cache because all three commands
  use identical project files
- Adds multi-module support to scripts for free (#6787)
- Add new build/repl test and run multi-module test

PR #7851

* Fix file-locking issue on Windows

PR #7851

* Fix script recompilation based on cwd

- Pass info about cwd to repl through --repl-options instead of hacking
  it into the project file.
- Improve paths output by makeRelativeCanonical, makeRelativeToDir, and
  makeRelativeToCwd.
- Script multi-module support works, but with warning in repl.
- Remove script multi-module mention support in docs.

PR #7851

* Make `repl script` respect --repl-no-load

* Feedback: minor refactor

Move argument truncation from targetStrings out of
withScriptContextAndSelectors to runAction

PR #7851

* Feedback: refactor and comments for repl options

PR #7851

* Don't use hs-source-dirs for scripts.

- instead pass absolute path to script in main-is
- resolves issue with relative paths on Windows
- simplifies code and gives prettier build output
- update tests because build output has changed
- removes ability to use multi-module scripts
  (which was never officially endorsed)
- remove test for multi-module scripts
- add checks for unsupported fields in scripts

PR #7851

* Update changelog for PR #7851
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants