Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

High CPU use with WSL #182

Open
tomjaguarpaw opened this issue Jul 18, 2018 · 25 comments
Open

High CPU use with WSL #182

tomjaguarpaw opened this issue Jul 18, 2018 · 25 comments

Comments

@tomjaguarpaw
Copy link

ghcid on Windows was constantly using 25% CPU (which I guess is all of one core but I'm not certain) even when there were no changes to any files that it's watching. Is there anything in particular that I should be tweaking?

$ ~/.cabal/bin/ghcid --version
Auto reloading GHCi daemon v0.6.6
$ ghc --version
The Glorious Glasgow Haskell Compilation System, version 7.10.3
@ndmitchell
Copy link
Owner

Is it the ghcid process itself, or the ghci it spawns?

@tomjaguarpaw
Copy link
Author

ghcid, apparently:

image

@ndmitchell
Copy link
Owner

Weird. I can only imagine it's the file watching code going nuts, or GC making a freak out, but with only 1.2Mb active mem GC seems unlikely. Can you reproduce it?

@tomjaguarpaw
Copy link
Author

It seems to happen every time I launch ghcid on this particular project. I am using 7.10 though so it may be something that's not longer a problem for people who are living in the present day.

@ndmitchell
Copy link
Owner

Can you launch with --verbose? Can you launch with +RTS -S? If you can profile that's even better. Trying with a ghcid compiled with a newer GHC, even if the project remains on 7.10, would be interesting.

@tomjaguarpaw
Copy link
Author

With +RTS -S I get lots of lines like the below. Does that help?

   522240       496    358568  0.000  0.000    3.844    4.817    0    0  (Gen:  0)
   522240       496    358568  0.000  0.000    3.859    4.825    0    0  (Gen:  0)
   522240       496    358568  0.000  0.000    3.859    4.832    0    0  (Gen:  0)
   522240       496    358568  0.000  0.000    3.875    4.838    0    0  (Gen:  0)
   522240       496    358568  0.000  0.000    3.875    4.844    0    0  (Gen:  0)
   522240       496    358568  0.000  0.000    3.891    4.856    0    0  (Gen:  0)
   522240       496    358568  0.000  0.000    3.891    4.867    0    0  (Gen:  0)
   522240       496    358568  0.000  0.000    3.906    4.874    0    0  (Gen:  0)
   522240       496    358568  0.000  0.000    3.906    4.881    0    0  (Gen:  0)
   522240       496    358568  0.000  0.000    3.922    4.887    0    0  (Gen:  0)
   522240       496    358568  0.000  0.000    3.922    4.897    0    0  (Gen:  0)
   522240       496    358568  0.000  0.000    3.938    4.906    0    0  (Gen:  0)
   522240       496    358568  0.000  0.000    3.938    4.912    0    0  (Gen:  0)

@ndmitchell
Copy link
Owner

How frequent are they? How many lines/sec while idle do you see?

@ndmitchell
Copy link
Owner

Does +RTS -I0 make them go away?

@tomjaguarpaw
Copy link
Author

I think those incrementing numbers are seconds so it looks to be in the order of 100 per second. Unfortunately it seems I haven't compiled in support for -I0. If I get a moment to recompile I'll try it.

@tomjaguarpaw
Copy link
Author

I realised this is under WSL rather than a Windows native binary. The plot thickens.

@ndmitchell
Copy link
Owner

Any news on -I0? That's easy to fix and used to be an issue with Shake.

@tomjaguarpaw
Copy link
Author

Haven't tried yet, sorry.

@Wizek
Copy link

Wizek commented Aug 28, 2018

I have also ran into this issue under WSL. Trying to run with +RTS -I0 gives ghcid: Most RTS options are disabled. Link with -rtsopts to enable them. So I would need to build ghcid from scratch for this to try it seems. I wonder if there may be a nix one-liner snippet to do so quickly.

Btw, may be relevant: I have observed this issue when using WSL GHCid with NTFS (/mnt/c/...), haven't tried the 'more native' lxss/ext file system approach (/home/...).

@Wizek
Copy link

Wizek commented Aug 29, 2018

I've recompiled GHCid with -rtsopts from master and ran it with +RTS -I0; no discernible change in CPU usage.

image

I've also tried +RTS -S, and like @tomjaguarpaw, I get many of those (maybe dozens per second) once GHCi finished loading (only sparsely interspersed between loading messages before that). +RTS -I0 -S Doesn't seem to change that.

  1044480       368   1794320  0.000  0.000   57.375  320.851    0    0  (Gen:  0)
  1044480       368   1794320  0.000  0.000   57.375  320.859    0    0  (Gen:  0)
  1044480       368   1794320  0.000  0.000   57.375  320.870    0    0  (Gen:  0)
  1044480       368   1794320  0.000  0.000   57.375  320.892    0    0  (Gen:  0)
  1044480       368   1794320  0.000  0.000   57.391  320.908    0    0  (Gen:  0)
  1044480       368   1794320  0.000  0.000   57.391  320.915    0    0  (Gen:  0)
  1044480       368   1794320  0.000  0.000   57.391  320.942    0    0  (Gen:  0)
  1044480       368   1794320  0.000  0.000   57.391  320.951    0    0  (Gen:  0)
  1044480       368   1794320  0.000  0.000   57.406  320.976    0    0  (Gen:  0)
  1044480       368   1794320  0.000  0.000   57.406  320.981    0    0  (Gen:  0)
  1044480       368   1794320  0.000  0.000   57.406  320.994    0    0  (Gen:  0)
  1044480       368   1794320  0.000  0.000   57.406  321.002    0    0  (Gen:  0)
  1044480       368   1794320  0.000  0.000   57.406  321.012    0    0  (Gen:  0)
  1044480       368   1794320  0.000  0.000   57.406  321.026    0    0  (Gen:  0)
  1044480       368   1794320  0.000  0.000   57.422  321.047    0    0  (Gen:  0)

Here is the last summary I got after ctrl+c with -I0 -S:

  25,681,470,392 bytes allocated in the heap
      16,876,008 bytes copied during GC
       1,639,056 bytes maximum residency (4 sample(s))
          73,072 bytes maximum slop
               5 MB total memory in use (0 MB lost due to fragmentation)

                                     Tot time (elapsed)  Avg pause  Max pause
  Gen  0     24584 colls,     0 par    0.672s   1.106s     0.0000s    0.0075s
  Gen  1         4 colls,     0 par    0.016s   0.015s     0.0037s    0.0087s

  TASKS: 6 (1 bound, 5 peak workers (5 total), using -N1)

  SPARKS: 0 (0 converted, 0 overflowed, 0 dud, 0 GC'd, 0 fizzled)

  INIT    time    0.000s  (  0.002s elapsed)
  MUT     time   56.797s  (321.214s elapsed)
  GC      time    0.688s  (  1.121s elapsed)
  EXIT    time    0.000s  (  0.008s elapsed)
  Total   time   57.484s  (322.345s elapsed)

  Alloc rate    452,163,440 bytes per MUT second

  Productivity  98.8% of total user, 99.7% of total elapsed

gc_alloc_block_sync: 0
whitehole_spin: 0
gen[0].sync: 0
gen[1].sync: 0

Is that helpful? Any next idea @ndmitchell or @tomjaguarpaw?

@tomjaguarpaw
Copy link
Author

tomjaguarpaw commented Aug 29, 2018 via email

@ndmitchell
Copy link
Owner

The fact most of the time is in MUT suggests if you compiled with profiling and ran it might tell you what where the time was going in the profile. I'd recommend that as the next step.

@tomjaguarpaw - are you on WSL? Windows?

@Wizek - does this happen for all projects or just one? Which GHC version?

@tomjaguarpaw
Copy link
Author

tomjaguarpaw commented Aug 30, 2018 via email

@ndmitchell
Copy link
Owner

See #121 (comment) - seems a more recent WSL has improved things. It also reminds me that if you are having issues, --poll might be your friend.

@ndmitchell ndmitchell changed the title High CPU use High CPU use with WSL Sep 2, 2018
@Wizek
Copy link

Wizek commented Sep 3, 2018

I can quickly comment that I dimly remember having observed the same high CPU use with --poll as well. Maybe I can try --poll 100 to verify that the CPU use is independent from file watching.

I'm also glad to read that this may be improved in an windows insider build, I'll attempt to verify that.

@Wizek
Copy link

Wizek commented Sep 3, 2018

Just checked: --poll=100 still uses 25% CPU constantly. And as expected it is taking a long time to pick up on file changes, so polling seems to be in effect indeed.

@Wizek
Copy link

Wizek commented Sep 3, 2018

Profiling done: https://gist.github.com/Wizek/f520108a5282aefb9333bca081fd565d

Does this give you any ideas? Next up: might be trying the insider build.

@Wizek
Copy link

Wizek commented Sep 3, 2018

Verified, insider build 17744 works well! Tried 2 combinations: editing files from a WSL editor vs a win32 editor; both on NTFS, both without poll, both picked up file changes and both work without excess CPU use. Wonderful!

For the sake of my curiosity, if one of you can tell me if you can make any sense of my attached prof file, I would appreciate knowing what was going wrong in the stable builds. Also, it may take months before this fix lands on stable.

@ndmitchell
Copy link
Owner

Unfortunately that profile is garbage, it says that it spent 100% of the time executing main which is <built in>. I've never seen anything like that before. Suggests something has upset the runtime quite substantially.

@lspitzner
Copy link
Contributor

doesn't that just mean that the profiled build is lacking the -fprof-auto (or related flags) in the right places (the right deps)?

@ndmitchell
Copy link
Owner

Probably @lspitzner - I confess I've never tried generating a profile without setting the flags first. My standard flags are -prof -auto-all -caf-all.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants