Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/trace: add absolute (system) timestamp #69869

Open
aktau opened this issue Oct 14, 2024 · 15 comments
Open

cmd/trace: add absolute (system) timestamp #69869

aktau opened this issue Oct 14, 2024 · 15 comments
Assignees
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. FeatureRequest NeedsFix The path to resolution is known, but the work has not been done.
Milestone

Comments

@aktau
Copy link
Contributor

aktau commented Oct 14, 2024

Proposal Details

Runtime traces don't contain (or display) an absolute timestamp for events. This would be useful for comparing traces with other (tracing) tools, logs et cetera.

An example would be combining a Go runtime trace with perf sched events. This would allow better investigation of kernel scheduling events and how they impact the latency of goroutines.

IIRC, runtime traces currently use a relative (monotonic) timestamps for performance reasons. It's fine to keep this as-is. But it would be helpful if (e.g.) the start of a trace batch could contain an absolute system time (perhaps something lie nanoseconds since the UNIX epoch) together with the relative timestamp it was derived from. Tools like go tool trace could then adjust all the relative timestamps post-hoc, or at least display the trace start and end time.

cc @mknyszek

@aktau
Copy link
Contributor Author

aktau commented Oct 14, 2024

(I don't know if this needs to be a proposal. It's not really a noteworthy feature, as it's an optional addition and only concerns very few users of a sub-tool, please adjust accordingly.)

@ianlancetaylor ianlancetaylor moved this to Incoming in Proposals Oct 14, 2024
@prattmic
Copy link
Member

In the new trace format, we do usually store absolute (monotonic) timestamps. The start of each batch has the full absolute timestamp. Each event within the batch contains the time delta from the previous event.

The exception is Windows, where time has low resolution, so in that case we use CPU ticks (RDTSC).

cc @felixge

@aktau
Copy link
Contributor Author

aktau commented Oct 17, 2024

That's nice. I don't know much about catapult, but could it be made to display things in absolute time?

@prattmic
Copy link
Member

In theory, I believe this field just needs to be in absolute time: https://cs.opensource.google/go/go/+/master:src/internal/trace/traceviewer/format/format.go;l=25;drc=70f4717e96cf9d8ca8a5f439d7d6487ec822ce49;bpv=1;bpt=1

The current values tend to just come from computations like https://cs.opensource.google/go/go/+/master:src/cmd/trace/goroutinegen.go;l=137;drc=70f4717e96cf9d8ca8a5f439d7d6487ec822ce49;bpv=1;bpt=1, which currently are event.Time - startTime.

The format says the time should be in microseconds. I don't know how nicely Catapult will display the values, given they will probably be huge.

@aktau
Copy link
Contributor Author

aktau commented Oct 19, 2024

That's just reading AFAICT. I spent some time looking for writing and found traceClockNow:

func traceClockNow() traceTime {
if osHasLowResClock {
return traceTime(cputicks() / traceTimeDiv)
}
return traceTime(nanotime() / traceTimeDiv)
}

nanotime is (on Linux) clock_gettime(CLOCK_MONOTONIC). Which is (quoting man7) time since boot:

CLOCK_MONOTONIC
              A nonsettable system-wide clock that represents monotonic time
              since—as described by POSIX—"some unspecified point in the
              past".  On Linux, that point corresponds to the number of sec‐
              onds that the system has been running since it was booted.

              The CLOCK_MONOTONIC clock is not affected by discontinuous
              jumps in the system time (e.g., if the system administrator
              manually changes the clock), but is affected by the incremen‐
              tal adjustments performed by [adjtime(3)](https://man7.org/linux/man-pages/man3/adjtime.3.html) and NTP.  This clock
              does not count time that the system is suspended.  All
              CLOCK_MONOTONIC variants guarantee that the time returned by
              consecutive calls will not go backwards, but successive calls
              may—depending on the architecture—return identical (not-
              increased) time values.

The absolute value can thus be calculated if we knew the time since boot. But I haven't been able to find that anywhere in a trace. Did I miss it?

@prattmic
Copy link
Member

prattmic commented Oct 21, 2024

The absolute value can thus be calculated if we knew the time since boot. But I haven't been able to find that anywhere in a trace. Did I miss it?

I think I am confused by what you mean by "absolute time". I thought that meant you wanted the CLOCK_MONOTONIC value, which we have. Do you actually mean the CLOCK_REALTIME value (wall time)? We don't record that. I don't think that would be as useful for correlation with other trace events on the machine, since the real time clock can change arbitrarily.

@aktau
Copy link
Contributor Author

aktau commented Oct 21, 2024

Do you actually mean the CLOCK_REALTIME value (wall time)? We don't record that. I don't think that would be as useful for correlation with other trace events on the machine, since the real time clock can change arbitrarily.

For correlation with other trace events on the machine, CLOCK_MONOTONIC is fine (ideal even), if one has these events available from the same epoch (boot time in Linux's case).

In my case, I don't always:

  • I'm checking metrics collected by a centralized metric system. Its timestamps use the UNIX epoch, as the metrics come from many different machines.
  • I'm checking logs printed by a given task (up to millisecond accuracy). Unless the task in question printed clock_gettime(CLOCK_MONOTONIC) somewhere, I don't have a way to line up this timestamp with trace timestamps.

Specifically, I get a crash which has exported a trace around crash time using the flight recorder. The monitoring service tells me the crash timestamp is 01:03:53 21-10-2024 UTC. I'd like to know when the trace started, and until what time before the crash it actually ran. I'd like the "when" to be absolute here, not some machine-relative timestamp.

It seems to me like this could be solved by (also) adding a timestamp relative to the UNIX epoch in every generation event. Then, tooling (go tool trace or something custom) could render all events in either a relative timeline (current default) or an absolute timeline.

That, or I'm missing something that currently exists, or a different way to correlate the data I want. Any ideas?

@prattmic
Copy link
Member

That makes sense. There is a caveat that it will probably be difficult to precisely line up events using wall time, but I agree it would be useful even just for getting a vague sense of where within logs, etc the trace occurs (within a few hundred ms).

I think including a real time timestamp in each generation would be reasonable. I think that would be sufficient. If the real time clock changes during tracing then computed timestamps may be wrong, but that should be rare.

@prattmic
Copy link
Member

(cc @mknyszek for thoughts on including a real time timestamp in each generation)

@aktau
Copy link
Contributor Author

aktau commented Oct 21, 2024

There is a caveat that it will probably be difficult to precisely line up events using wall time, but I agree it would be useful even just for getting a vague sense of where within logs, etc the trace occurs (within a few hundred ms).

Up to a few hundred ms is much better than what's currently available. It would already help me. I wonder how easy it is to get the equivalent of time.Now().UnixNano() in the runtime...

@prattmic
Copy link
Member

runtime.walltime

@mknyszek
Copy link
Contributor

We could just add this as a new event type that always lives in the same batch as EvFrequency, which currently always lives in its own batch. Old toolchains won't understand it and fail, but new toolchains will still be able to understand old traces without having to interpret anything differently.

With respect to exposing this in the trace API, the easiest thing to do would be to make it a property of the Sync event type; we could then also emit a Sync event at the start of the trace so we have a place to put the information for the first generation. Sync events always precede a generation, so you can always count on having the value you need to interpret all timestamps in a given generation.

@mknyszek
Copy link
Contributor

I don't think this needs to be a proposal. The trace parsing API is still experimental.

@mknyszek mknyszek changed the title proposal: cmd/trace: add absolute (system) timestamp cmd/trace: add absolute (system) timestamp Oct 21, 2024
@mknyszek mknyszek removed this from Proposals Oct 21, 2024
@mknyszek mknyszek added compiler/runtime Issues related to the Go compiler and/or runtime. and removed Proposal labels Oct 21, 2024
@mknyszek mknyszek modified the milestones: Proposal, Backlog Oct 21, 2024
@mknyszek mknyszek added the NeedsFix The path to resolution is known, but the work has not been done. label Oct 21, 2024
@aktau
Copy link
Contributor Author

aktau commented Oct 21, 2024

We could just add this as a new event type that always lives in the same batch as EvFrequency... With respect to exposing this in the trace API, the easiest thing to do would be to make it a property of the Sync event type.

I don't know much about this, but this sounds contradictory, unless internal trace events and trace API events are fully decoupled. In which case, SGTM.

@mknyszek
Copy link
Contributor

We could just add this as a new event type that always lives in the same batch as EvFrequency... With respect to exposing this in the trace API, the easiest thing to do would be to make it a property of the Sync event type.

I don't know much about this, but this sounds contradictory, unless internal trace events and trace API events are fully decoupled. In which case, SGTM.

Sorry, yes. They are completely decoupled. The Sync event type is an event type in the parsing API, the new event type would be part of the wire format.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. FeatureRequest NeedsFix The path to resolution is known, but the work has not been done.
Projects
Development

No branches or pull requests

4 participants