-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make it possible to use or
with slot values
#9317
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great 💯 I've only left a few nit-picking comments, nothing too major.
I was also wondering if the task regarding rasa.shared.core.training_data.visualization
modification in the Definition of Done was set aside? or something you have to follow-up on?
rasa/shared/core/training_data/story_reader/story_step_builder.py
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for this!
I think it would be really cool if we could test this through the TrainingDataGenerator
.
So we can see that the actual trackers are created as we expect (test_common_story_reader.py
is probably the most similar).
Also we need to make sure that story visualisation still works correctly (it's used by Rasa X).
rasa/shared/core/training_data/story_reader/yaml_story_reader.py
Outdated
Show resolved
Hide resolved
tests/shared/core/training_data/story_reader/test_yaml_story_reader.py
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking good 👍 I think the visualisation module also needs updating.
}, | ||
) | ||
assert tracker.events[2] == ActionExecuted("utter_default") | ||
assert tracker.events[3] == SlotSet(key="name", value="joe") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's a missing newline at the end here.
Also, would index 4 in tracker.events
also be a SlotSet
event, the one with value = bob
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a good assumption but no — however, I am not sure why. My assumption was that there should be two trackers and trackers[0].events[3] == SlotSet(key="name", value="joe")
and trackers[1].events[3] == SlotSet(key="name", value="bob")
but that doesn't happen.
I was thinking that's an issue but it doesn't happen with or expressions when the intents (not SlotSet
s) are used, and it was like that before all these changes — so I now wonder if it's the bug or I just got that all wrong.
Maybe somebody can clarify? @joejuzl or @wochinge or maybe even someone from research
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very good catch @ancalita ! It should be two trackers - one with "joe" and one with "bob". If you do the or
with intents you also get two trackers. This might require some changes in the TrackerGenerator
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This still needs to be handled right @alwx ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should be good to go now — I rebased the PR: there is no common_story_reader.py
anymore, and the test to check if story steps get copied or not has already been added to test_yaml_story_reader.py
(e.g. test_or_statement_with_slot_was_set
).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the rebasing just deleted this test. We still need a test which asserts that the generator creates two trackers out of the story with the or
slot. So you basically need a test which asserts that training.load_data
causes 2 trackers
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Finally added a test for this! (test_or_statement_story_with_or_slot_was_set
)
Just checked it: |
}, | ||
) | ||
assert tracker.events[2] == ActionExecuted("utter_default") | ||
assert tracker.events[3] == SlotSet(key="name", value="joe") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This still needs to be handled right @alwx ?
6f77e0f
to
bac6dc9
Compare
tests/core/test_tracker_stores.py
Outdated
@@ -398,8 +398,8 @@ def test_db_url_with_query_from_endpoint_config(tmp_path: Path): | |||
driver: my-driver | |||
another: query | |||
""" | |||
f = tmp_path / "tmp_config_file.yml" | |||
f.write_text(endpoint_config) | |||
stories_path = tmpdir / "stories.yml" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure why these modifications were needed? In addition to the code quality failings, the test also fails.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess it was some post-merge stuff, sorry 🙈
Removed it!
steps[0].as_story_string() | ||
== """ | ||
## hello world | ||
* slot{"slot_was_set": [{"name": "joe"}]} OR slot{"slot_was_set": [{"name": "bob"}]} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems that actually the CI expects this:
slot{"name": "joe"} OR slot{"name": "bob"}
🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed this
- action: some_action | ||
""" | ||
|
||
reader = YAMLStoryReader(is_used_for_training=False) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need a test here when the is_used_for_training
flag is True
too?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's tested in another test (test_or_statement_with_slot_was_set_is_used_for_training
) — in this case it makes no sense 'cause it will break the logic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you mean by it will break the logic? You mean the method as_story_string()
can't be called for each of the 2 steps?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because when you specify that is_used_for_training=True
then checkpoints are getting added and trackers are getting copied (see add_stories
method in StoryStepBuilder
, line 110). As a result, reader.read_from_parsed_yaml(yaml_content)
will create multiple trackers and calling .as_story_string
on them will lead to results like that which are meaningless:
## hello world
- slot{"name": "joe"}
> GENR_OR_b2ae7
However, this result is important when these trackers are used for training, and that case is already tested in test_or_statement_with_slot_was_set_is_used_for_training
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great, thanks for the deep-dive!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM 🙌🏼
|
||
Args: | ||
events: Events that need to be added. | ||
is_used_for_training: Identifies if it's a part of OR statement. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need this anymore if we no longer support MD?
def _generate_checkpoint_name_for_or_statement( | ||
self, messages: List[UserUttered] | ||
) -> str: | ||
def _generate_checkpoint_name_for_or_statement(self, messages: List[Event]) -> str: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This method refers to messages_texts_or_intents
- does this work for all events?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The __str__
method is definied for all the events so it does because it requires only that. However, the names should be changed.
), | ||
], | ||
) | ||
async def test_story_with_retrieval_intent_warns( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we not expect this warning any more?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Returned it back — in this case we can actually just remove the optional argument for YAMLStoryReader
to test exactly the same warning.
assert len(steps) == 1 | ||
|
||
assert ( | ||
steps[0].as_story_string() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this no longer used?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It isn't. is_used_for_training=False
can no longer be used so the tracker(s) generated by StoryReader
can only be used for training.
It makes sense because previously the ability to generate trackers not for training purposes was linked to conversion between Markdown and Yaml.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So can we delete as_story_string()
as part of removing mark down support?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It still seems to be used by the interactive learning so I would skip that for now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So should we keep the test then?
steps[0].as_story_string() | ||
== """ | ||
## hello world | ||
* slot{"name": "joe"} OR slot{"name": "bob"} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💯
Proposed changes:
Status (please check what you already did):
black
(please check Readme for instructions)