-
Notifications
You must be signed in to change notification settings - Fork 4.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
markers / validation #10012
markers / validation #10012
Conversation
return marker | ||
|
||
|
||
class CompoundMarker(Marker, ABC): | ||
class OperatorMarker(Marker, ABC): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 prefer this naming, gives us scope to break out validation into true Binary/Unary(/n-ary???) operators down the line by subclassing and makes it clearer what's going on
""" | ||
if name == Marker.ANY_MARKER: | ||
if name == Marker.ANY_MARKER or name in MarkerRegistry.all_tags: | ||
raise InvalidMarkerConfig( | ||
f"You must not use the special marker name {Marker.ANY_MARKER}. " |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Need to modify this in the case they used a reserved name in the registry rather than ANY_MARKER
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
think there was another case where I wanted to add the name == Marker.ANY_MARKER
to some name in MarkerRegistry.all_tag
check but forgot
return combined_configs | ||
|
||
@staticmethod | ||
def from_config_dict(config: Dict[Text, MarkerConfig]) -> Marker: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Think this will break one of the CLI tests (specifically the one testing the export_marker
function on this class)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yep - we lose that but we win nicer debug messages
If domain checking needs done that needs hooked up to the CLI as well (the CLI parameter exists but is unused). |
656e284
to
41a8ad4
Compare
Need to work out why set union over a generator expression wasn't working
tests/core/evaluation/test_marker.py
Outdated
def test_all_operators_in_schema(): | ||
operators_in_schema = rasa.shared.utils.schemas.markers.OPERATOR_SCHEMA["enum"] | ||
operators_in_schema = {tag.lower() for tag in operators_in_schema} | ||
def collect_slots(marker: Marker) -> Set[Text]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we can merge these 3 functions into one :)
rasa/cli/evaluate.py
Outdated
markers = Marker.from_path(config) | ||
|
||
if not markers.validate_against_domain(domain): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice idea to just "bubble up" the warnings and then decide to break it here to gather all warnings first
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was hoping to do it more like a compiler where you can track line number and also collect warnings a bit more elegantly, but it seemed like adding that would involve a lot of complicated stuff around adding information about where markers were parsed from to the markers and I didn't want to do too much in the time we have left
rasa/core/evaluation/marker.py
Outdated
if not valid: | ||
rasa.shared.utils.io.raise_warning( | ||
f"Referenced action '{self.text}' does not exist in the domain" | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The extraction process isn't managed here - this goes over each marker and decides if they are valid, and if not emits a message, and then passes back True/False if it's valid or not. In the CLI we catch that boolean and error and exit based on it, but in these I wanted a way to collect all of the issues without stopping first, and then stop later (since reporting one at a time would be frustrating for users). The reason it's a warning is mostly because that was my first pass on it. Ideally it would go out as an error without terminating (and then get caught in the CLI to wrap it up and actually terminate) but we don't have a helper function for that so it's a little more involved - definitely something I want to fix before this merges.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The CLI also terminates before the extraction process begins. It loads the marker definitions and the domain, does the validation first, and terminates if issues are found. It will only establish a connection to the tracker store (and by extension perform extraction) if there are no issues
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
or in short: #10012 (comment) :D
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looked at how DialogueStateTracker manages this and have replicated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like that it's consistent but that approach is also a bit confusing, isn't it? I would've expected a "raise..." if there's a logger error (and usually, I would not explicitly log the error but just have a decorator to pass on the error message)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
... on the other hand, getting an error as a result of "only warnings" is also a bit unexpected maybe 😅
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I figured it was consistent from the perspective of the user, but it might be worth re-evaluating how we do error management across the codebase at some point
* validation; name vs tag; specify expected number of sub-markers;... * Delete old Marker test fixtures and associated files. * Fix tests after rebasing main * First pass on domain validation * Add some tests around domain validation Need to work out why set union over a generator expression wasn't working * Remove unused markers test data * Config validation tests * Fix lint * add tests for from_path; simplify other tests * make domain path optional for _run_markers * Codeclimate fixes * Codeclimate fixes * Swap over warnings for errors Co-authored-by: aeshky <aciel.eshky@gmail.com> Co-authored-by: Matthew Summers <m.summers@rasa.com>
Proposed changes:
Overview
TODO
Status (please check what you already did):
black
(please check Readme for instructions)