Success Markers CLI #9824

usc-m · 2021-10-07T16:51:31Z

Proposed changes:

Adds a CLI for the Success Markers feature (closes Success markers: create new markers CLI #9658)

Status (please check what you already did):

added some tests for the functionality
updated the documentation
updated the changelog (please check changelog for instructions)
reformat files using black (please check Readme for instructions)

usc-m · 2021-10-07T17:12:19Z

So I've reached a bit of an impasse on a few fronts here:

I'm not sure if this is the CLI design we want
I don't have an interface to go against for writing out marker files or computing them from the trackers that I can go against to complete implementation

I've made this PR a draft to facilitate discussion on these points (@TyDunn I've tagged you because you have opinions on CLI design/might be aware of others who do as well) before I proceed further with implementing them. At the moment the sketch of an implementation I have here is pretty much exactly what we have in the implementation proposal, however, there was some discussion of unifying them into a single command, and also discussion around where these should live.

Here's some ideas I've seen floated or had:

Current
This splits extract and stats into two subcommands under markers, extract pulls trackers in from a tracker store specified by the endpoint config (which has its argument handled by the existing default argument stuff for that) according to a strategy (--by - would need a default if it were optional), up to a number --num-trackers (useless if strategy is all). If a seed is set that is used to set up the random number generator (useless if strategy is not sample).
Optionally, stats can be extracted to a file at this stage.

stats takes a file produced by extract and produces a stats file.

Unique properties this has (can be interpreted positively or negatively depending on what our priorities are):

Markers has it's own subcommand, quite a prominent feature on its own
-- Could be useful if we were to expand what could be done with markers
Extract and stats are separated

Unanswered questions for this one that I can think of:

Are these the correct optional/positional parameters?
For the optionals, what should their defaults be?
Does this model the most common way of using the feature? (i.e. are people always going to want to be working with files or are they looking for quick stats printed to the terminal?)
Should we also print stats to the terminal?

rasa markers extract [--stats <FILENAME>] [--num-trackers <NUM>] [--by <first-n|sample|all>] [--seed <SEED>] [--endpoints <ENDPOINT_CONFIG>] <OUTPUT_FILENAME>

rasa markers stats <INPUT_FILENAME> <OUTPUT_FILENAME>

Variation
One idea I had was to try and roll both commands into one and sit it under an evaluate namespace, as I feel like that's more likely to be useful as a top-level subcommand later. This version assumes we want to print stats by default and moves file exporting to an option for both extraction and summary statistics. Some problems I've encountered with this though are:

Not fully clear what the default invocation would be
Still no clarity on default values of parameters
Not sure if this models the intended workflow better, but feels a bit cleaner to me

rasa evaluate markers [--extract <FILENAME>] [--stats-file <FILENAME>] [--endpoints <ENDPOINT_CONFIG>]  [--num-trackers <NUM>] [--by <first-n|sample|all>] [--no-stats]

As for the issue of interfaces, we don't yet have a central markers class/data structure which I can use to implement functions to write out/read in markers from disk, or write out stats. This is proving to be a bit of a blocker for this ticket and the other ticket I have at the moment (#9661), since I'm not sure how to access marker data in memory.

We have samples which I can derive a schema of the on-disk representation from the PR closing #9662, but I don't want to duplicate effort which would otherwise be done in #9660. If we can define some of these interfaces now that would unblock me and we can somewhat parallelise the work (though the merge order will probably be weird, and it might be difficult to test this PR until #9660 is in).

usc-m · 2021-10-14T16:11:11Z

After discussing with the team and voting on options, we've decided on rasa evaluate markers as the overall command. The extract and stats components will be rolled into one, with the default being to extract and produce stats (and stats can be disabled with a flag). This would look roughly like:

rasa evaluate markers [--no-stats] [--endpoints <ENDPOINTS_CONFIG>] <by_first_n|by_sample|all> [-n <NUM_TRACKERS>] [--seed <SEED>] <OUTPUT_FILENAME>

Extract would extract from the tracker store specified, using the strategy specified (-n and --seed will be put into a mutually exclusive group depending on the strategy), and save the results into OUTPUT_FILENAME (e.g. markers.json). Stats would be saved into a modified version of the output filename (e.g. markers.stats.json), unless disabled by the flag

ka-bu

🚀

rasa/core/evaluation/marker_base.py

rasa/cli/evaluate.py

rasa/cli/arguments/evaluate.py

rasa/cli/evaluate.py

joejuzl

🚀 Cool to see it all coming together :)

joejuzl · 2021-10-28T12:35:07Z

rasa/cli/arguments/evaluate.py

+        default="markers.yml",
+        type=str,
+        help="The config file(s) containing marker definitions. This can be a single "
+        "YAML file, or a directory that contains several files with domain "


what does "domain" mean here? Like domain-specific configs?

I think I just copied this over from the domain version of this absentmindedly, fixing it to be "marker definitions"

joejuzl · 2021-10-28T12:37:57Z

rasa/cli/arguments/evaluate.py

+
+    add_endpoint_param(
+        parser,
+        help_text="Configuration file for the model server and the connectors as a "


I guess the key thing worth mentioning here is that it contains the configuration for connecting to the tracker store right?

joejuzl · 2021-10-28T12:39:36Z

rasa/cli/arguments/evaluate.py

+        "--seed", type=int, help="Seed to use if selecting trackers by 'sample'"
+    )
+    parser.add_argument(
+        "count", type=int, help="The number of trackers to extract markers from",


Does this need to be different from the "count" in set_markers_first_n_arguments?

Not strictly but it arises because of sub-parsers (already commented on them on another comment)

joejuzl · 2021-10-28T12:43:45Z

rasa/cli/evaluate.py

+
+    markers_subparser = marker_parser.add_subparsers(dest="strategy")
+
+    markers_first_n_subparser = markers_subparser.add_parser(


Could this be done with argument i.e. with add_mutually_exclusive_group? Feels a bit weird doing it with subparsers...

I will check again, I looked into how rasa test does it and it looks like it uses sub-parsers (though I agree it's a bit odd). I think I could do it with add_mutually_exclusive_group and choices though.

So I had a look, and I'm not sure that's possible easily - add_mutually_exclusive_group lets us force one of the args in the group to be picked, but we have a situation where one option has 2 args that are allowed, one has one arg, and one has none. Those are all dependent on the value of strategy (which we could force with choices), but I'm not sure we can do mutual exclusion another way without the use of sub-parsers here (which I agree looks quite nasty)

joejuzl · 2021-10-28T12:45:38Z

rasa/cli/evaluate.py

+    seed: int,
+    count: int,


Looking at _run_markers_cli these can be None? (Optional)

joejuzl · 2021-10-28T12:50:43Z

rasa/cli/evaluate.py

+    processed_trackers = {}
+
+    for tracker in tracker_loader.load():
+        tracker_result = markers.evaluate_events(tracker.events)


Was there a reason we don't pass a list of trackers in so all the logic is encapsulated in markers?

didn't want to have a function that just loops but back then I didn't think of writing results .... so not really, we could (think you're right, we should) create a function in markers.py that gets an iterator over markers and a filestream and do the work there (plus a just a path to the statsfile)

joejuzl · 2021-10-28T12:51:43Z

rasa/cli/evaluate.py

+    return processed_trackers
+
+
+def _save_results(


It feels to me like this logic shouldn't be in the cli package. Is there a reason we don't have it in rasa.core.evaluation.marker?

Yeah, we were torn - at first it was going to just be json.dumps into a file and it seemed too trivial to worry about, but then we changed the format and now it probably should live with the marker package. I'll move it

joejuzl · 2021-10-28T12:53:01Z

tests/conftest.py

@@ -638,6 +638,16 @@ def write_endpoint_config_to_yaml(
    return endpoints_path


+def write_markers_config_to_yaml(


If this is only used in one test file it can probably live there.

Co-authored-by: Aciel Eshky <a.eshky@rasa.com>

Also removes otherwise-unused import to stop circularity

rasa/core/evaluation/marker_base.py

ka-bu

Looks good :) Just some tiny suggestions on the tests

rasa/core/evaluation/marker_base.py

ka-bu · 2021-10-29T07:55:21Z

rasa/core/evaluation/marker_base.py

+                    )
+
+    @staticmethod
+    def _write_dialogue(


_write_relevant_event_information ?

ka-bu · 2021-10-29T08:03:44Z

rasa/core/evaluation/marker_base.py

@@ -485,6 +488,86 @@ def from_config(config: MarkerConfig, name: Optional[Text] = None) -> Marker:
            marker.name = name
        return marker

+    def export_markers(
+        self,
+        tracker_loader: MarkerTrackerLoader,


How about...

Suggested change

tracker_loader: MarkerTrackerLoader,

tracker_loader: Iterator[Optional[DialogueStateTracker]]:

and then we just pass tracker_loader.load() (which for some reason has the "optional", so we also need to check for tracker == None below, don't we?) -- which makes it a tiny bit easier to add an isolated test for this :) (and no worries, I think we can just reuse/move the test you already wrote)

Yeah good catch on the optional - I remember why I did that but I forgot about it. It has to be optional because retrieve on the tracker store interface returns an optional (which I assume is to handle things like your database falls over and your request times out, or something has been deleted in between you requesting the list of keys and requesting the value for a specific key)

ka-bu · 2021-10-29T08:05:24Z

tests/cli/test_rasa_evaluate_markers.py

+        None, 10, endpoints_path, "first_n", markers_path, results_path, None
+    )
+
+    with open(results_path, "r") as results:


Since we have the new export now inside Marker along with all the logic of what is exported -- How about we just check here that the file can be loaded with the csv reader, and move the below (and copy the above example, without the tracker loader/store to a test_export in the marker tests?

hmm, I'm not sure how we check it can be loaded with the CSV reader because I'm pretty sure it processes the CSV lazily - so we can't know it's a valid CSV without checking the whole file. I guess just iterate all rows and assert that it doesn't raise?

yep, I would just iterate over it 👍

rasa/cli/arguments/evaluate.py

Co-authored-by: Kathrin Bujna <kathrin.bujna@gmail.com>

aeshky

Nice work 🚀
I think export_markers should be unit tested in the test_marker.py as kathrin suggested.

aeshky · 2021-10-29T10:34:48Z

rasa/core/evaluation/marker_base.py

@@ -485,6 +488,86 @@ def from_config(config: MarkerConfig, name: Optional[Text] = None) -> Marker:
            marker.name = name
        return marker

+    def export_markers(


Should we add explicit unit tests for these new functions outside of the CLI tests?

aeshky · 2021-10-29T10:42:38Z

rasa/core/evaluation/marker_base.py

+            tracker_result = self.evaluate_events(tracker.events)
+            processed_trackers[tracker.sender_id] = tracker_result
+
+        Marker._save_results(output_file, processed_trackers)


I get various type warnings in pycharm in this file. Should we fix these?

👍 working on those as part of the config validation task -- we can ignore them for now

joejuzl

LGTM 👍

ka-bu

🥳

Had to move WriteRow protocol from stats to utils since it was needed in more than one place and direct import caused circularity. Had to also address linting errors

Protocol doesn't exist in Py3.7

aeshky · 2021-11-01T14:20:34Z

🎉

* Create stubs to start filling out Aciel's snippet Co-authored-by: Aciel Eshky <a.eshky@rasa.com> * Sketch out basics of CLI arguments * Start changing over args to new format * Make CLI match description from design pass * Add tests * Switch over timestamps to event IDs * Fix tests now event IDs are supported * Start work on doc comments * Switch over event_id fix Also removes otherwise-unused import to stop circularity * Address PR comments * Fix incorrect file extension * Codeclimate * Update rasa/core/evaluation/marker_base.py Co-authored-by: Kathrin Bujna <kathrin.bujna@gmail.com> * Address PR comments * Move test to marker * Rename dialogue to session Co-authored-by: Aciel Eshky <a.eshky@rasa.com> Co-authored-by: Kathrin Bujna <kathrin.bujna@gmail.com>

usc-m requested review from aeshky and TyDunn October 7, 2021 16:51

usc-m force-pushed the 9658/success-markers-cli branch from 15041ec to 165082d Compare October 27, 2021 16:58

ka-bu reviewed Oct 28, 2021

View reviewed changes

rasa/core/evaluation/marker_base.py Outdated Show resolved Hide resolved

rasa/cli/evaluate.py Outdated Show resolved Hide resolved

rasa/cli/arguments/evaluate.py Outdated Show resolved Hide resolved

rasa/cli/evaluate.py Outdated Show resolved Hide resolved

usc-m requested a review from joejuzl October 28, 2021 10:18

usc-m marked this pull request as ready for review October 28, 2021 10:18

usc-m requested a review from a team as a code owner October 28, 2021 10:18

joejuzl suggested changes Oct 28, 2021

View reviewed changes

usc-m and others added 10 commits October 28, 2021 14:13

Create stubs to start filling out Aciel's snippet

778a69b

Co-authored-by: Aciel Eshky <a.eshky@rasa.com>

Sketch out basics of CLI arguments

186a1e4

Start changing over args to new format

6a90fc2

Make CLI match description from design pass

a6e29e9

Add tests

7ed690e

Switch over timestamps to event IDs

e4dd4fc

Fix tests now event IDs are supported

bf4030b

Start work on doc comments

ab8d3f2

Switch over event_id fix

a70d06a

Also removes otherwise-unused import to stop circularity

Address PR comments

1d45b90

usc-m force-pushed the 9658/success-markers-cli branch from 165082d to 1d45b90 Compare October 28, 2021 16:24

usc-m requested review from joejuzl and ka-bu October 28, 2021 16:30

usc-m changed the title ~~(WIP) Success Markers CLI~~ Success Markers CLI Oct 28, 2021

usc-m commented Oct 28, 2021

View reviewed changes

rasa/core/evaluation/marker_base.py Show resolved Hide resolved

usc-m added 2 commits October 28, 2021 17:39

Fix incorrect file extension

3aa624d

Codeclimate

441ccc5

ka-bu suggested changes Oct 29, 2021

View reviewed changes

aeshky reviewed Oct 29, 2021

View reviewed changes

rasa/cli/arguments/evaluate.py Show resolved Hide resolved

Update rasa/core/evaluation/marker_base.py

10d8360

Co-authored-by: Kathrin Bujna <kathrin.bujna@gmail.com>

Address PR comments

65fad80

aeshky suggested changes Oct 29, 2021

View reviewed changes

joejuzl approved these changes Oct 29, 2021

View reviewed changes

usc-m added 2 commits October 29, 2021 16:23

Move test to marker

f630721

Merge branch 'main' into 9658/success-markers-cli

60cedb4

usc-m requested review from aeshky and ka-bu October 29, 2021 15:25

ka-bu approved these changes Oct 29, 2021

View reviewed changes

Rename dialogue to session

d6d1321

usc-m enabled auto-merge (squash) October 29, 2021 15:32

aeshky approved these changes Nov 1, 2021

View reviewed changes

aeshky and others added 3 commits November 1, 2021 10:55

Merge branch 'main' into 9658/success-markers-cli

e686a3c

Fix types, fix CLI test

86ddf76

Had to move WriteRow protocol from stats to utils since it was needed in more than one place and direct import caused circularity. Had to also address linting errors

Fix types again

036a0a1

Protocol doesn't exist in Py3.7

usc-m merged commit 18ca40c into main Nov 1, 2021

usc-m deleted the 9658/success-markers-cli branch November 1, 2021 13:27

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Success Markers CLI #9824

Success Markers CLI #9824

usc-m commented Oct 7, 2021 •

edited

Loading

usc-m commented Oct 7, 2021 •

edited

Loading

usc-m commented Oct 14, 2021 •

edited

Loading

ka-bu left a comment

joejuzl left a comment

joejuzl Oct 28, 2021

usc-m Oct 28, 2021

joejuzl Oct 28, 2021

joejuzl Oct 28, 2021

usc-m Oct 28, 2021

joejuzl Oct 28, 2021

usc-m Oct 28, 2021

usc-m Oct 28, 2021

joejuzl Oct 28, 2021

joejuzl Oct 28, 2021

ka-bu Oct 28, 2021

joejuzl Oct 28, 2021

usc-m Oct 28, 2021

joejuzl Oct 28, 2021

ka-bu left a comment

ka-bu Oct 29, 2021

ka-bu Oct 29, 2021

usc-m Oct 29, 2021

ka-bu Oct 29, 2021

usc-m Oct 29, 2021

ka-bu Oct 29, 2021

aeshky left a comment

aeshky Oct 29, 2021

aeshky Oct 29, 2021

ka-bu Oct 29, 2021

joejuzl left a comment

ka-bu left a comment

aeshky commented Nov 1, 2021


		markers_subparser = marker_parser.add_subparsers(dest="strategy")

		markers_first_n_subparser = markers_subparser.add_parser(

		@@ -638,6 +638,16 @@ def write_endpoint_config_to_yaml(
		return endpoints_path


		def write_markers_config_to_yaml(

	tracker_loader: MarkerTrackerLoader,
	tracker_loader: Iterator[Optional[DialogueStateTracker]]:

Success Markers CLI #9824

Success Markers CLI #9824

Conversation

usc-m commented Oct 7, 2021 • edited Loading

usc-m commented Oct 7, 2021 • edited Loading

usc-m commented Oct 14, 2021 • edited Loading

ka-bu left a comment

Choose a reason for hiding this comment

joejuzl left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

ka-bu left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

aeshky left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

joejuzl left a comment

Choose a reason for hiding this comment

ka-bu left a comment

Choose a reason for hiding this comment

aeshky commented Nov 1, 2021

usc-m commented Oct 7, 2021 •

edited

Loading

usc-m commented Oct 7, 2021 •

edited

Loading

usc-m commented Oct 14, 2021 •

edited

Loading