-
Notifications
You must be signed in to change notification settings - Fork 14.1k
[llvm]Add a simple Telemetry framework #102323
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Objective: - Provide a common framework in LLVM for collecting various usage metrics - Characteristics: + Extensible and configurable by: * tools in LLVM that want to use it * vendors in their downstream codebase * tools users (as allowed by vendor) Background: The framework was originally proposed only for LLDB, but there were quite a few requests that it should be moved to llvm/lib given telemetry is a common usage to a lot of tools, not just LLDB. See more details on the design and discussions here on the RFC: https://discourse.llvm.org/t/rfc-lldb-telemetry-metrics/64588/20?u=oontvoo
some nits: just I think I'm missing the background to properly review the contents of this |
Please follow https://llvm.org/docs/CodingStandards.html#name-types-functions-variables-and-enumerators-properly. |
@jh7370 Here's the patch adding Telemetry as a common llvm framework that you'd requested. Please review. Thanks |
Again, as mentioned there is NO current user of this - but it will be used by LLDB (see attached patch on my first comment), along with the RFC. I think that should be sufficient to demonstrate how this can be used. |
There are no contributions without tests. |
I'll try to find time next week to take a look at this. |
Sorry, I haven't got around to looking at this yet: I've got some fires in my main line of work to deal with before I can sink my teeth into this. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As noted elsewhere, please redo all your variable and function names to conform with the coding standards. It's surprisingly distracting having them wrong! Also, please reflow your comments to properly use the 80 character width, as it makes it a little harder to read when they are broken at somewhat arbitrary lengths.
To help reviewers, could you add some comments (potentially doxygen style) to the classes that you've added that describe their intended purpose, along with possibly a top-level comment highlighting the overall structure and how a tool might use this telemetry code - I'm aware that some of this is in the RFC, but ultimately people won't want to have to refer to an old RFC thread to discuss this. It might be even best to put the overall design and usage description under the llvm/docs folder.
I think you need at least one concrete implementation that can be used to write to a raw_ostream instance, which can be used for unit testing, and which helps to illustrate the interface.
virtual std::string name() const = 0; | ||
}; | ||
|
||
class Telemeter { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a weird name. Could you try to use English names only, please.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is English, AFAIK. https://www.oxfordlearnersdictionaries.com/us/definition/english/telemeter_2
P.S: Basically means a device for recording & sending telemetry, which is the appropriate name for this class
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shows what I know 😄
Not sure it's a well-known word though, so it doesn't really help convey meaning though, which is the key thing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please don't resolve conversations I've been involved with (see https://discourse.llvm.org/t/rfc-github-pr-resolve-conversation-button/73178 for a lengthy discussion). I'll resolve these when I'm happy with the resolution. In particular, as noted, I don't think the name "Telemeter" is going to convey the meaning clearly enough. Furthermore, as noted earlier, having all the classes prefixed with Telemetry
in their name seems unnecessary, given their presence in the telemetry
namespace. Could they simply be called Destination
, Config
, Event
etc?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've renamed the Destiantion, Config, etc to get rid of the prefix.
What are your suggestions for alternatives for Telemeter
?
(prefer something that does not have Log
or Logger
in it - as someone pointed out it might cause confusions)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
P.S: How about TelemetryCollector
?
(It doesn't fully communicate the aspect of "also transmitting the collected data" which Telemeter
does, but I suppose that's close enough)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIW, I had a similar reaction as @jh7370 when I saw the class name. Without the Telemetry
prefix, it seems reasonable to use that as the class name? That seems to convey everything the class is doing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would go with Session
, personally, because the class seems to most closely resemble our same-named class in our downstream implementation. I'd love to avoid Collector
, mostly because for us our "telemetry collector" is actually a separate executable that monitors the file system for telemetry data files created by the individual tools, before forwarding that data onto the servers (in a roundabout way it actually does what this class is trying to do, just at a different point in the process, but I digress...)! However, neither of these points are especially strong, given they're more to do with how our downstream implementation works.
I see this class as being the thing that a) receives the user-created configuration data, b) receives the telemetry data, c) does things related to setup and teardown of the telemetry system. It then forwards that data onto the Destinations and generally manages things to do with telemetry. I therefore could be persuaded by Session
, Manager
, Service
or even System
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like Session
, assuming this class is responsible for generating/storing the unique session ID. I would imagine in LLDB a session is created during Initialization and cleaned up during Termination.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(I believe this is resolved - we've agreed to name it telemetry::Manager
)
Added the set of tests. PTAL. |
✅ With the latest revision this PR passed the C/C++ code formatter. |
- simplified tests + add more test to cover write(Key, Map) - make dispatch virtual with default impl, add an additional preDispatch - move Destination list management to base Manager
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm happy enough with the current state for this to be landed. We can iterate further after it's in tree, should we feel the need to.
I've got a final couple of comments, but neither of them are essential.
It might be worth getting another LLDB person to sign off, but I'll leave you to judge that.
using Enable_If_Map = | ||
std::enable_if_t<Is_DenseMap<T>::value || Is_StdMap<T>::value>; | ||
|
||
template <typename T, typename = Enable_If_Map<T>> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had a look, and it might be enough to just do typename = typename T::mapped_type
instead of the Enable_if_Map
template argument, to cover all map types (or more generally any type with a mapped_type
sub-type.
I believe you can further restrict it to map types that reference a serializable type, as we have similar code in our codebase to achieve this. Loosely adapted, the code looks like this:
class Serializer;
// Provide a base definition for IsSerializableType which is "false".
template <typename T, typename SerializerTy = Serializer,
typename = std::true_type>
struct IsSerializableType : std::false_type {};
// Specialize the template to provide a true_type instead, if the input type can be
// passed as the second argument to a write method of a SerializerTy& with
// StringRef first argument.
template <typename T, typename SerializerTy>
struct IsSerializableType<
T, SerializerTy,
typename std::is_same<decltype(std::declval<SerializerTy &>().write(
std::declval<StringRef>(), std::declval<T>())),
void>::type> : std::true_type {};
Putting these together, the following might work, although I haven't rigorously tested it:
template <typename T, typename MappedType = typename T::mapped_type, typename = std::enable_if_t<IsSerializableType<typename MappedType>::value>>
void write(StringRef KeyName, const T &Map) {
// ...
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 for using mapped_type
. I probably wouldn't bother checking that its serializable and just let the user get a compile error on the write(key, mapped_type())
call
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done - i've removed all the complex enable_if with the simple mapped_type restriction.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can be the "lldb person" :)
virtual void beginObject(StringRef KeyName) = 0; | ||
virtual void endObject() = 0; | ||
|
||
template <typename T> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Move these helpers (if you do end up having them) into a private:
block.
using Enable_If_Map = | ||
std::enable_if_t<Is_DenseMap<T>::value || Is_StdMap<T>::value>; | ||
|
||
template <typename T, typename = Enable_If_Map<T>> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 for using mapped_type
. I probably wouldn't bother checking that its serializable and just let the user get a compile error on the write(key, mapped_type())
call
virtual void beginObject(StringRef KeyName) = 0; | ||
virtual void endObject() = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Question (for both of you) do you want begin/end object to be public, or should it be a private interface, accessible only through the map wrapper?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it should be public to allow for nesting objects.
Eg.,
s.beginObject("Parent");
s.beginObject("Child");
s.write("ChildAge", 100);
s.beginObject("GrandChild");
s.write("GrandKidAage", 10);
s.endObject();
s.endObject();
....
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense. Some of that could be achieved through types like map<string, map<string, ??>>
, but this is definitely more flexible. (At the cost of potentially forgetting to call endObject, but if needed, we could solve that with a scoped RAII helper.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have an assert in the finalize()
call to make sure all children have been "ended" properly.
virtual void write(StringRef KeyName, bool Value) = 0; | ||
virtual void write(StringRef KeyName, int Value) = 0; | ||
virtual void write(StringRef KeyName, size_t Value) = 0; | ||
virtual void write(StringRef KeyName, unsigned long long Value) = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just repinging my idea for redirecting everything to two overloads (for signed vs. unsigned). (Doing that because you seems to have converged on something similar for map types as well).
if (T == nullptr) | ||
return false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still believe there shouldn't be a null check here. we have llvm::(dyn_)cast*_if_present* for that.
Co-authored-by: Pavel Labath <pavel@labath.sk>
Co-authored-by: Pavel Labath <pavel@labath.sk>
- combine all the write(int-variants) into one (pavel's suggestion) - replace the complex enable_if logic with simply mapped_type and let users deal with compile errors if mis-use
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/130/builds/7860 Here is the relevant piece of the build log for the reference
|
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/160/builds/10454 Here is the relevant piece of the build log for the reference
|
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/89/builds/13167 Here is the relevant piece of the build log for the reference
|
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/180/builds/10452 Here is the relevant piece of the build log for the reference
|
Reverts #102323 Reason: broke CI
(already reverted - Will look at the build failures and rollforward) |
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/51/builds/8062 Here is the relevant piece of the build log for the reference
|
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/168/builds/6819 Here is the relevant piece of the build log for the reference
|
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/80/builds/7959 Here is the relevant piece of the build log for the reference
|
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/97/builds/3787 Here is the relevant piece of the build log for the reference
|
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/118/builds/3761 Here is the relevant piece of the build log for the reference
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I appreciate what @labath was saying about signed/unsigned overloads, but I don't think those are going to be always sufficient. There are a couple of different cases.
-
Our downstream API (which isn't part of our downstream LLVM code, so doesn't have access to LLVM types) uses a non-LLVM JSON implementation for its serialization. That JSON library has different overloads for 32-bit and 64-bit integers, I presume to provide a faster path for writing the 32-bit version. Switching to (effectively) only have
long long
andunsigned long long
support would mean that performance advantage is lost, because we'd be forced to use the 64-bit version. -
Admittedly, I believe it's only used by testing code at the moment, but we have a downstream concrete Serializer type that converts the values into a
std:::variant
-like class to be stored in a dictionary. We don't want to lose the type information in this context, i.e. we want to be able to distinguish between e.g.int
andlong long
. Although this case is primarily for our unit tests, I don't see why there couldn't be a real-world use-case that requires the original type information somewhere down the line and since implementing the overloads isn't exactly complex, I would really like to have them all.
Finally, although the overloads were removed, the documentation wasn't updated, so anybody blindly copying it would end up with build failures with the current version.
That makes sense. I'm not pushing for this, I just wanted to make sure that this option is considered. Implementation that don't want/need the type information can always implement the forwarding themselves. |
Thanks - i'll revert the code back to previous revision (with all the overloads listed explicitly)l |
FWIW, I had a brainwave over the Christmas break (and only just got back to working, hence the delayed comment) that we could get the best of both worlds by providing a default implementation for all the shorter types (i.e. |
SGTM (@oontvoo is OOO for the next two weeks) |
Reverts llvm/llvm-project#102323 Reason: broke CI
Local branch amd-gfx 4088678 Merged main:53d080c5b5df into amd-gfx:f0fbc982ed3b Remote branch main 8c00900 [llvm]Add a simple Telemetry framework (llvm#102323)
Objective:
- tools in LLVM that want to use it
- vendors in their downstream codebase
- tools users (as allowed by vendor)
Background:
The framework was originally proposed only for LLDB, but there were quite a few requests that it should be moved to llvm/lib given telemetry is a common usage to a lot of tools, not just LLDB.
See more details on the design and discussions here on the RFC: https://discourse.llvm.org/t/rfc-lldb-telemetry-metrics/64588/20?u=oontvoo
See also PR/98528 where the framework is extended and used.