@TheLortex TheLortex commented Jun 9, 2023

Context

The runtime events infrastructure offers an API to write events to a ring buffer. These events are then consumed asynchronously by other programs. For eio, there are two potential consumers:

  • mirage-trace-viewer: displays the fiber execution trace.
  • meio: a monitor for eio's state. It displays the tree of fibers and cancellation contexts, along with useful data such as GC latency and scheduler metrics.
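
As a rough illustration of the producer/consumer split described above, here is a minimal sketch using the `Runtime_events` custom-event API from OCaml 5.1. The tag `Fiber_created` and the event name `example.fiber_created` are made up for this example and are not eio's actual definitions. The sketch also reads its own ring buffer for brevity, whereas a tool such as meio would attach to another process. It assumes the program is linked against the `runtime_events` library.

```ocaml
(* Link with the [runtime_events] library (OCaml >= 5.1),
   e.g. (libraries runtime_events) in dune. *)

(* Hypothetical tag and event, for illustration only. *)
type Runtime_events.User.tag += Fiber_created

let fiber_created : int Runtime_events.User.t =
  Runtime_events.User.register "example.fiber_created" Fiber_created
    Runtime_events.Type.int

(* Fiber ids observed by the consumer, most recent first. *)
let seen : int list ref = ref []

let () =
  (* Producer side: create the ring buffer for this process. *)
  Runtime_events.start ();
  (* Consumer side: [None] means "read this process's own ring".
     An external monitor would pass [Some (dir, pid)] instead. *)
  let cursor = Runtime_events.create_cursor None in
  let callbacks =
    Runtime_events.Callbacks.create ()
    |> Runtime_events.Callbacks.add_user_event Runtime_events.Type.int
         (fun _ring_id _timestamp ev id ->
            match Runtime_events.User.tag ev with
            | Fiber_created -> seen := id :: !seen
            | _ -> ())
  in
  (* Producer side: write two events to the ring buffer. *)
  Runtime_events.User.write fiber_created 1;
  Runtime_events.User.write fiber_created 2;
  (* Consumer side: drain whatever is currently available. *)
  ignore (Runtime_events.read_poll cursor callbacks None : int);
  Runtime_events.free_cursor cursor;
  List.iter (Printf.printf "fiber %d created\n") (List.rev !seen)
```

The writer never waits for the reader: events are appended to the ring and the consumer polls at its own pace, which is what lets tracing tools run outside the traced program.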

This PR switches from the CTF format to using that new event infrastructure. Commits should be reviewable individually and the PR can be split as needed.

Commits

Switch from CTF to custom runtime events

A sub-library is created, defining custom event types, tags and events for eio. The event type is not modified, though we could perhaps remove unused fields.

Add events related to cancellation contexts + add note_parent

Cancellation contexts are additionally tracked, so that meio can display the cancellation context + fiber tree.
A note_parent event is added for that purpose, along with a Cancellation_context resource type.
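
To make the consumer side more concrete, here is a small sketch of how a monitor might rebuild the tree from parent notifications. The `event` type below is a simplified stand-in invented for this example; the real events carry eio's own ids and resource types.

```ocaml
(* Hypothetical, simplified event shapes (not eio's actual types). *)
type kind = Fiber | Cancellation_context

type event =
  | Create of int * kind      (* a resource with this id was created *)
  | Note_parent of int * int  (* child id, parent id *)

(* Fold a stream of events into a parent -> children table,
   keeping children in the order they were reported. *)
let children_of (events : event list) : (int, int list) Hashtbl.t =
  let tbl = Hashtbl.create 16 in
  List.iter
    (function
      | Create _ -> ()
      | Note_parent (child, parent) ->
        let siblings =
          Option.value (Hashtbl.find_opt tbl parent) ~default:[]
        in
        Hashtbl.replace tbl parent (siblings @ [ child ]))
    events;
  tbl

(* Example trace: a root context 0 with a fiber 1 and a nested context 2,
   which in turn holds fiber 3. *)
let example =
  [ Create (0, Cancellation_context);
    Create (1, Fiber); Note_parent (1, 0);
    Create (2, Cancellation_context); Note_parent (2, 0);
    Create (3, Fiber); Note_parent (3, 2) ]

let tree = children_of example
```

Looking up `0` in `tree` gives the direct children of the root context, and walking the table recursively yields the full context and fiber tree.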

separate Ctf.label in Ctf.set_name and Ctf.log

label is used both to set the name of a fiber and to log what happens while it runs (such as "readv", "writev"). The function is split into set_name and log to better reflect the semantics.

Name cancellation context and give names to internal fibers

Add the ability to name cancellation contexts such as switches. Names are given to internal fibers.

tracing: create a task for system thread

The system thread has a task ID, but its creation is not traced. This task is displayed in meio to show the overhead of eio's runtime system.

API to provide source location

In the default case, resources won't have a user-defined name, but it would still be useful to know what led to their creation. This PR adds an event to attach a location to resources.

sprinkle location a bit everywhere

This adds an optional ?loc:string parameter to public eio functions.

automatic caller location

Uses the callstack to figure out the location when it's not user-specified. This probably has a huge cost, so a better solution should be investigated.
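
As a rough sketch of the idea (not the code in this PR), an optional `?loc` argument with a callstack fallback could look like the following. The names `caller_location`, `resolve_loc` and `fork` are hypothetical, and the number of frames to skip is an assumption that depends on the backend and on inlining.

```ocaml
(* Format a backtrace slot as "file:line" when debug info is available. *)
let location_of_slot slot =
  match Printexc.Slot.location slot with
  | Some l ->
    Some (Printf.sprintf "%s:%d" l.Printexc.filename l.Printexc.line_number)
  | None -> None

(* Hypothetical helper: inspect the frame [skip] levels up the callstack.
   Returns [None] when the program has no debug info (built without -g)
   or when the stack is shorter than expected. *)
let caller_location ~skip =
  let stack = Printexc.get_callstack (skip + 1) in
  match Printexc.backtrace_slots stack with
  | Some slots when Array.length slots > skip -> location_of_slot slots.(skip)
  | _ -> None

(* A user-supplied location always wins; the callstack is only a fallback.
   [~skip:2] is a guess at "the caller of the public function". *)
let resolve_loc ?loc () =
  match loc with
  | Some _ -> loc
  | None -> caller_location ~skip:2

(* Hypothetical public function taking the optional parameter. *)
let fork ?loc f =
  let loc = resolve_loc ?loc () in
  (loc, f ())
```

Capturing a callstack on every call is not free, and a call in tail position leaves no frame for its caller, so the fallback may report the wrong site or nothing at all. Passing `~loc` explicitly avoids both problems.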

Notes

  • The last part about caller location makes big changes to eio's API, so I don't know if it's worth it at first. Happy to remove the last three commits. Using the callstack is also unreliable in some situations, for example when the eio function is called in tail position. Better ways of obtaining the caller location are being investigated here: Automatically insert source location ocaml/ocaml#126
  • @patricoferris initiated the work; the attribution was lost in the rebase, but I won't forget to add it back before the PR is merged.
  • Will only work starting from OCaml 5.1

@TheLortex
Author

@talex5 now CI passes for 5.1

@avsm
Contributor

avsm commented Jun 27, 2023

@TheLortex this is great, thanks! Do you plan to add a conditional compilation later to this PR so that it'll continue to work on OCaml 5.0? If you think it's too complex, I'm not opposed to making a future version of Eio depend on OCaml 5.1.0+ only, but it'll take a lot longer to merge since it's gated on that release.

@TheLortex
Author

Sure, I can add conditional compilation, it shouldn't be too complex

@avsm
Contributor

avsm commented Jun 28, 2023

Thanks; that would be the preferred option to get this support into eio soon, and deprecate 5.0 at a later stage.

@avsm
Contributor

avsm commented Jul 7, 2023

Just a heads-up that there are quite a few conflicts against main for this PR, @TheLortex

@talex5
Collaborator

talex5 commented Jul 11, 2023

OCaml 5.1 is already in beta, so it shouldn't be long before we can just use that.

@TheLortex TheLortex force-pushed the runtime-events-tracing branch from 61ac96f to 9859eb9 Compare July 12, 2023 09:47
@avsm
Contributor

avsm commented Jul 31, 2023

Discussed at dev meeting: we can bump eio to be OCaml 5.1.0 only as soon as it is released, and therefore do not need compatibility with 5.0 in this PR.

@patricoferris
Collaborator

As an update, I have a rebased PR at https://github.com/patricoferris/eio/tree/runtime-events-tracing which includes the objects-to-FCM switch and supports only the Runtime Events (OCaml 5.1.0) tracing.

@talex5 talex5 marked this pull request as draft November 6, 2023 11:15
@talex5 talex5 mentioned this pull request Nov 20, 2023
5 tasks
@talex5
Collaborator

talex5 commented Dec 21, 2023

Most of the remaining work here was merged (in an updated form) in #656, so I think we can close this.

The main change is that I removed note_parent, as that makes it seem that fibers could move to any CC. In fact, this only happens when running a new CC, and conceptually the fiber doesn't move (instead, the CC is created inside the fiber).

The remaining parts are adding locations everywhere (which is probably best left until we have some support from the compiler) and adding more labels, which I'll work on separately.

Thanks!

@talex5 talex5 closed this Dec 21, 2023
@talex5 talex5 mentioned this pull request Jan 17, 2024