Running several counters concurrently

`perf` can only monitor a specific OS process running on a specific CPU core (or on all cores). It's unaware of Haskell's RTS and of how the RTS schedules work onto OS threads.

I expect that running several counters concurrently may give strange, confusing results. Running a test that uses a counter at the same time as other (non cpu-instruction-counter) tests will also be confusing.

Now, some test runners (e.g. tasty) do parallel test execution by default. This may be a great source of confusion for an unaware user.

I have several ideas of varying complexity that can help here, but ultimately we have to play around and investigate this.

- Add a visible notice to the README telling users not to run counters concurrently
- Make a global lock that is taken by `startInstructionCounter`
  - if the lock is taken, the next `startInstructionCounter` can fail with a meaningful error message;
  - or it could just wait until the lock is released, which will serialize `cpu-counter` tests
  - BUT: all of this seems hacky and won't help if you concurrently run non cpu-instruction-counter tests
- We can investigate how to actually make it work concurrently:
  - `startInstructionCounter` can return a `Handle` that lets the caller work with this specific counter and tracks the information related to it
  - we can use `forkOn` to run on a specific capability, which usually corresponds to a core. It's implementation dependent, but we only work on Linux so it's probably fine
    - but there's probably a more reliable way to fork onto a specific core, I don't know
  - I scanned through the manpage and noticed interesting constants like `PERF_SAMPLE_ID`, `PERF_FORMAT_ID`, `PERF_SAMPLE_GROUP`. I didn't look any closer yet, but maybe these can be used for reliably tracking several counters. This [stackoverflow question](https://stackoverflow.com/questions/42088515/perf-event-open-how-to-monitoring-multiple-events) may be related, but I didn't read it closely.
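
To make the global-lock idea concrete, here is a minimal sketch of both variants (wait vs. fail). The names `counterLock`, `withCounterLock` and `withCounterLockOrFail` are hypothetical and not part of the library; the real `startInstructionCounter` call is replaced by a stand-in action, since how it would be wired into the lock is still an open question.

```haskell
import Control.Concurrent.MVar (MVar, newMVar, putMVar, takeMVar, tryTakeMVar)
import Control.Exception (ErrorCall (..), bracket, bracket_, throwIO, try)
import Control.Monad (when)
import Data.Maybe (isJust)
import System.IO.Unsafe (unsafePerformIO)

-- | Hypothetical process-wide lock.
-- Full MVar means "free"; empty MVar means "a counter section is running".
counterLock :: MVar ()
counterLock = unsafePerformIO (newMVar ())
{-# NOINLINE counterLock #-}

-- | Waiting variant: blocks until the lock is free, so counter
-- sections run one at a time.
withCounterLock :: IO a -> IO a
withCounterLock = bracket_ (takeMVar counterLock) (putMVar counterLock ())

-- | Failing variant: throws right away if another counter section is active.
withCounterLockOrFail :: IO a -> IO a
withCounterLockOrFail action = bracket (tryTakeMVar counterLock) release run
  where
    -- Only give the lock back if we actually took it.
    release acquired = when (isJust acquired) (putMVar counterLock ())
    run Nothing   = throwIO (ErrorCall "another instruction counter is already running")
    run (Just ()) = action

main :: IO ()
main = do
  -- Stand-in for: start the counter, run the measured code, read the counter.
  r <- withCounterLock (pure (21 * 2 :: Int))
  print r
  -- A second counter section started while the lock is held is rejected
  -- instead of silently producing mixed-up numbers.
  res <- try (withCounterLock (withCounterLockOrFail (pure ())))
  putStrLn (either (\e -> "rejected: " ++ show (e :: ErrorCall)) (const "ok") res)
```

As noted above, this only serializes code that goes through the lock; other tests running concurrently in the same process are not affected by it.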

I can only be sure about the first option (warn users in the README). In any case, `cpu-instruction-counter` only works on Linux and uses FFI, so the best practice should be that all instruction-counting tests/benchmarks live in a separate executable that is run with `+RTS -N1` (this is a runtime option, not a compile-time one; it can be baked in at build time with `-with-rtsopts=-N1`), which should eliminate the problem.
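
A separate executable could also check at startup that it really runs on a single capability. The sketch below is only an illustration: `requireSingleCapability` is a hypothetical helper, and the actual test runner would be called where the final `putStrLn` is.

```haskell
import Control.Concurrent (getNumCapabilities)
import Control.Monad (when)
import System.Exit (die)

-- | Hypothetical guard: refuse to run counter tests when the RTS
-- was started with more than one capability.
requireSingleCapability :: IO ()
requireSingleCapability = do
  n <- getNumCapabilities
  when (n /= 1) $
    die ("instruction-counting tests expect one capability, but the RTS has "
         ++ show n ++ "; run with +RTS -N1")

main :: IO ()
main = do
  requireSingleCapability
  -- The real test or benchmark runner would go here.
  putStrLn "single capability: running counter tests"
```

This does not stop a test runner from interleaving Haskell threads on that one capability, so it complements the README notice and does not replace it.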
