Success markers telemetry #10065
This PR adds events specified in this comment: #9830 (comment)
docs/docs/telemetry/events.json
Outdated
]
},
"Markers Stats Computed": {
"description": "Triggered when marker stats has been computed.",
- "description": "Triggered when marker stats has been computed.",
+ "description": "Triggered when marker stats have been computed.",
docs/docs/telemetry/events.json
Outdated
"required": [
"count"
]
I think this is saying that `count` is required; however, it's only required when strategy is `first_n` or `sample`. Or am I misunderstanding how this is used?
Yep, that's a mistake, fixed it in the latest commit.
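The conditional requirement raised above can be sketched as a small Python check. This is only an illustration: `count_is_valid` is a hypothetical helper, not a function from this PR, and the strategy names are taken from the review comment rather than from the code.

```python
from typing import Optional

# Strategies that need a count, as named in the review comment above.
STRATEGIES_REQUIRING_COUNT = {"first_n", "sample"}


def count_is_valid(strategy: str, count: Optional[int]) -> bool:
    """Return True unless the strategy needs a count and none was given.

    Hypothetical helper for illustration; not part of this PR.
    """
    if strategy in STRATEGIES_REQUIRING_COUNT:
        return count is not None
    # Any other strategy does not need a count, so a missing value is fine.
    return True
```

With this shape, a missing `count` is only flagged for the two strategies that actually consume it.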
rasa/cli/evaluate.py
Outdated
@@ -123,6 +124,10 @@ def _run_markers(
stats_file: (Optional) Path to write out statistics about the extracted
markers.
"""
telemetry.track_markers_evaluation_initiated(
different name suggestion: `track_evaluate_markers_initiated`?
the event is now called "Markers Extraction Initiated" (based on one of your suggestions) so I renamed this function to have a similar name
@usc-m sorry I didn't read your comment until after I finished my review:
I made a similar observation but suggested a different name. I like your suggestion: "Evaluation - Success Markers"
Re-requesting your review because I changed all the names and fixed the issue with
I think we can do this if it makes sense. When does this event need to be tracked?
What I mean is the number of markers in the config file. So as soon as the config file is processed we would know the number of markers. I can dig into the code to see where this happens. @usc-m anything else you want to track? You suggested a measure of marker complexity. It's a good idea, but maybe not as trivial to implement so maybe we can add it later. What do you think?
Number of markers in the config file should be a matter of counting the number of sub-markers of the marker returned from
Not strictly necessary but something like maximum depth of nested markers, or how wide they get (maximum number of sub-markers under a marker that isn't top-level). Don't think these would be hard to add but also aren't as necessary - I think we'd get some useful info out about how the feature is used but it's something we could add later perhaps?
It tells us how complex the conditions get, and thus how people are using (or potentially abusing?) Markers.
Definitely useful but not urgent. Let's just count how many Markers are defined for now 👍
"required": [
"strategy",
"only_extract",
"seed",
"count"
@usc-m are all of these required? 🤔 or just the top two?
From the perspective of telemetry I think it makes sense - we'd want to know if it's being used or not. The types of those are `string` or `null`, so if it's not present we'd see an explicit null. Would be good to check with someone who understands the telemetry schema here better (though I think this is only used in the docs to explain what data we send back to users and isn't used to actually validate anything internally)
docs/docs/telemetry/events.json
Outdated
]
},
"Markers Stats Computed": {
"description": "Triggered when marker stats has been computed.",
- "description": "Triggered when marker stats has been computed.",
+ "description": "Triggered when marker statistics have been computed.",
Let's use the full word in text, and reserve "stats" for code :)
No way to actually change config path, now fixed with nargs
02c60c0
to
1b4535a
Compare
OK, I think I've added the extra telemetry. Are there any other changes we need to make? Anything more to collect, names all fine etc.?
{
"strategy": strategy,
"only_extract": only_extract,
"seed": seed,
This will report the actual seed used by the user - do we want the actual seed or do we want to just know if they used a seed? Is this something that's worth changing or are we considering the actual seed value not important enough to be careful about not collecting?
We only want to know if the user set a seed. I don't think it counts as private info so maybe it doesn't matter if we collect it. But if we can do true/false that might be better.
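A minimal sketch of the true/false option discussed above, assuming the properties dict keeps the same four keys shown in the diff. The function name `extraction_properties` is hypothetical and is not the telemetry function added in this PR.

```python
from typing import Any, Dict, Optional


def extraction_properties(
    strategy: str, only_extract: bool, seed: Optional[int], count: Optional[int]
) -> Dict[str, Any]:
    """Build telemetry properties without sending the seed value itself.

    Hypothetical helper for illustration; not part of this PR.
    """
    return {
        "strategy": strategy,
        "only_extract": only_extract,
        # Report only whether a seed was set, not which one.
        "seed": seed is not None,
        "count": count,
    }
```

Comparing against `None` rather than relying on truthiness means a seed of `0` is still reported as set.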
@@ -139,6 +147,10 @@ def _run_markers(
"Please see errors listed above and fix before running again."
)

# Subtract one to remove the virtual OR over all markers
num_markers = len(markers) - 1
this is the total number of all conditions and operators used in all marker configurations that were evaluated -- to get the number of user-defined markers we should
- get rid of the special case here and always add an "ANY_MARKER
... My bad, iter gives us all conditions and operators but len is just the sub-markers, all good 👍 - but because of 1. there might not be sub-markers if there is a single user-defined marker
- `len(markers.sub_markers)` instead of `len(markers)`
@usc-m , I could help and add that while you work on other comments if you like?
we could also track something like `max(len(sub_marker) for sub_marker in markers)`, i.e. the maximum number of conditions and operators used in a single user-defined marker, to get a glimpse of how complex the queries are
Yeah, I suggested that too, also the branching factor (maximum number of children under any marker)
Oh yeah, that's also a nice idea -- will definitely tell us if people miss being able to re-use a marker definition
@@ -608,6 +609,9 @@ def evaluate_trackers(
if tracker:
tracker_result = self.evaluate_events(tracker.events)
processed_trackers[tracker.sender_id] = tracker_result

processed_trackers_count = len(processed_trackers)
do we really want the number of processed trackers or processed sessions (or both)?
I think in this case processed trackers is actually useful because we get it before, as a command line argument (the user's intent), and after (what they actually got). Might help to highlight issues in our models of how tracker stores work. Right now we don't collect session info anywhere so I'm not sure what info we could get from it, but we could also just collect it as well if you think it would be useful
I have no clue how people use sender_ids and trackers usually, so no idea if that is better - but I guess it won't hurt if we don't add this now (and maybe later)
rasa/cli/evaluate.py
Outdated
telemetry.track_markers_parsed_count(num_markers)
max_depth = markers.depth() - 1
# Find maximum branching of marker
branching_factor = max(len(marker) - 1 for marker in markers)
`branching_factor = max(len(sub_marker) - 1 for marker in markers.sub_markers for sub_marker in marker)` to exclude the artificial Or marker?
Oh yep, otherwise we can end up with cases where the number of markers (which we already get) is returned twice - good spot
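To make the three complexity measures in this thread concrete, here is a self-contained sketch on a toy tree. The `Node` class is a stand-in written for this example and is not Rasa's marker class; its `len`/iteration semantics are deliberately replaced by explicit `walk()` and `sub_markers` so the assumptions are visible. The root is assumed to be the artificial Or that wraps all user-defined markers.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterator, List


@dataclass
class Node:
    """Toy stand-in for a marker: a condition (no children) or an operator."""

    name: str
    sub_markers: List["Node"] = field(default_factory=list)

    def depth(self) -> int:
        # A leaf has depth 1; an operator is one deeper than its deepest child.
        return 1 + max((child.depth() for child in self.sub_markers), default=0)

    def walk(self) -> Iterator["Node"]:
        # Yield this node and every descendant.
        yield self
        for child in self.sub_markers:
            yield from child.walk()


def complexity(root: Node) -> Dict[str, int]:
    """Compute stats for the user-defined markers below an artificial root."""
    user_markers = root.sub_markers
    nodes = [node for marker in user_markers for node in marker.walk()]
    return {
        "num_markers": len(user_markers),
        # Subtract one so the artificial root does not count as a level.
        "max_depth": root.depth() - 1,
        # Largest number of direct children of any node below the root.
        "branching_factor": max((len(n.sub_markers) for n in nodes), default=0),
    }


# Two user-defined markers under the artificial root.
root = Node(
    "or",
    [
        Node("and", [Node("a"), Node("b"), Node("not", [Node("c")])]),
        Node("d"),
    ],
)
stats = complexity(root)
```

Skipping the root when computing the branching factor is what keeps the total marker count from being reported a second time.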
LGTM
Looks good. There are still a couple of unresolved typos. Did you miss them? (one spotted by you even 😄)
* Markers telemetry
* Everything without tests
* Specified events.json
* Test added
* Changelog entry
* Naming fixes
* Fix lint, fix CLI bug

  No way to actually change config path, now fixed with nargs
* Add markers parsed telemetry
* Document telemetry functions
* always add ANY_MARKER; add test
* Add complexity telemetry
* Skip root marker to avoid double reporting total marker count

Co-authored-by: Matthew Summers <m.summers@rasa.com>
Co-authored-by: ka-bu <kathrin.bujna@gmail.com>