Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

flow tag in import and export #1887

Merged
merged 5 commits into from
Apr 1, 2025
Merged

Conversation

hasinaxp
Copy link
Contributor

@hasinaxp hasinaxp commented Mar 31, 2025

Summary by CodeRabbit

  • New Features

    • Added schema-based validation for training rules to ensure data adheres to defined standards.
    • Integrated flow tags into conversation flow processing, enriching story design.
    • Introduced functionality to download updated training rules for streamlined data management.
  • Enhancements

    • Improved the training data import process with optimized configuration handling and robust error management.
    • Updated internal data handling for clearer slot and rule extraction.
  • Tests

    • Expanded test coverage for new components related to story and rule processing, including flow tag handling and YAML reading/writing functionalities.
    • Adjusted test cases to verify the new metadata structure and behavior of rule validations.

Copy link

coderabbitai bot commented Mar 31, 2025

Walkthrough

The changes introduce a new static method for rule validation in the TrainingDataValidator and update the training file importer to KRasaFileImporter. A new module (model_data_imporer.py) adds several custom classes for enhancing rule and story handling, including flow tag support. Additionally, modifications in the data processing and utility modules refine method signatures and import updates. Test cases have been updated to reflect the new metadata structure and importer usage. These changes collectively enhance training data validation, import functionality, and rule processing.

Changes

File(s) Change Summary
kairon/importer/validator/file_validator.py Added static method validate_rules for YAML rule validation; updated from_training_files method to utilize KRasaFileImporter; adjusted import statements.
kairon/shared/data/model_data_imporer.py Introduced new classes (CustomRuleStep, CustomStoryStepBuilder, KRuleParser, CustomStoryGraph, KYAMLStoryReader, KRasaFileImporter, KYAMLStoryWriter) to enhance story/rule processing with flow tags.
kairon/shared/data/processor.py Replaced RasaFileImporter with KRasaFileImporter, updated method signatures (add_slot, __extract_rules, __extract_rule_events), added get_rules_for_download, and introduced a flow_tags attribute.
kairon/shared/utils.py Updated YAML writer by switching to KYAMLStoryWriter; modified register_telegram_webhook method signature with explicit type annotations.
tests/unit_test/data_processor/data_processor_test.py,
tests/unit_test/events/events_test.py
Adjusted test assertions to include new metadata with flow_tags; replaced RasaFileImporter with KRasaFileImporter in the event test.

Sequence Diagram(s)

sequenceDiagram
    participant YAMLLoader
    participant TrainingDataValidator
    participant Validator
    TrainingDataValidator->>YAMLLoader: Load schema from rules.yml
    YAMLLoader-->>TrainingDataValidator: Return schema
    TrainingDataValidator->>YAMLLoader: Load rules from provided file
    YAMLLoader-->>TrainingDataValidator: Return rules
    TrainingDataValidator->>Validator: Validate rules against schema
    alt Validation errors exist
        Validator-->>TrainingDataValidator: Return errors
        TrainingDataValidator->>Caller: Raise ValueError
    else No validation errors
        Validator-->>TrainingDataValidator: Return success
        TrainingDataValidator-->>Caller: Return validated rules
    end
Loading
sequenceDiagram
    participant TrainingDataValidator
    participant KRasaFileImporter
    participant ConfigLoader
    TrainingDataValidator->>KRasaFileImporter: Call from_training_files(...)
    KRasaFileImporter->>ConfigLoader: Load training/config files
    ConfigLoader-->>KRasaFileImporter: Return file data
    KRasaFileImporter-->>TrainingDataValidator: Return processed training data
Loading

Possibly related PRs

  • flow tagging and training data preparation changes #1869: The changes in the main PR are related to the modifications in the KRasaFileImporter class, which is also referenced in the retrieved PR, indicating a direct connection in how training data is imported and processed.
  • fixed python-multipart and multipart #1619: The changes in the main PR, specifically the introduction of the KRasaFileImporter class and its usage in the TrainingDataValidator, are directly related to the modifications in the retrieved PR, which also involves replacing RasaFileImporter with KRasaFileImporter in the context of data processing.
  • Action serialization #1570: The changes in the main PR, specifically the introduction of the validate_rules method and the modifications to the from_training_files method in the TrainingDataValidator class, are related to the updates in the retrieved PR that also involve the KRasaFileImporter class, which is used in the context of action handling and validation.

Suggested reviewers

  • hiteshghuge

Poem

I'm a rabbit hopping through the code,
Validating rules down every node.
With KRasa leading the data parade,
Flow tags now in the stories we’ve made.
A nibble of changes in this code delight,
Hop on board and celebrate! 🐇✨


📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 50799af and a085053.

📒 Files selected for processing (1)
  • tests/unit_test/validator/data_importer_test.py (2 hunks)
🧰 Additional context used
🧬 Code Definitions (1)
tests/unit_test/validator/data_importer_test.py (1)
kairon/shared/data/processor.py (1)
  • MongoProcessor (178-8986)
⏰ Context from checks skipped due to timeout of 90000ms (3)
  • GitHub Check: Codacy Static Code Analysis
  • GitHub Check: Analyze (python)
  • GitHub Check: Python CI
🔇 Additional comments (16)
tests/unit_test/validator/data_importer_test.py (16)

5-5: Well-structured imports for new components and dependencies.

The import changes appropriately reflect the new components being tested, including imports for mock objects and new custom classes like KRasaFileImporter, CustomRuleStep, and various other components that support flow tags.

Also applies to: 9-11, 17-18, 20-20


283-285: LGTM: CustomRuleStep test verifies default flow tag.

Simple and effective test ensuring the CustomRuleStep initializes with the expected default flow tag.


288-293: LGTM: CustomStoryStepBuilder test validates rule step creation.

Well-structured test that verifies the builder creates story steps with the correct flow tags when marked as a rule.


296-299: LGTM: KRuleParser test confirms custom flow tag assignment.

The test effectively verifies that the parser correctly assigns custom flow tags to new rule parts.


302-316: LGTM: CustomStoryGraph test validates flow tag identification.

Good test structure using mocks to simulate a story step and verify that the graph correctly identifies and stores flow tags.


319-325: LGTM: KYAMLStoryReader test verifies metadata parsing.

Test confirms that the reader correctly parses flow tags from metadata.


328-336: LGTM: KRasaFileImporter test with proper mocking.

Effective use of monkeypatching to mock file reading and validation, testing the importer's ability to get stories.


338-346: LGTM: KYAMLStoryWriter test verifies rule step processing.

The test properly validates that the writer processes rule steps and correctly includes flow tags in the output metadata.


348-351: LGTM: CustomRuleStep custom flow tags test.

Test confirms that custom flow tags are correctly assigned when provided, complementing the default flow tag test.


355-357: LGTM: CustomStoryStepBuilder default flow tags test.

Simple and direct test that verifies the builder initializes with the correct default flow tags.


360-390: LGTM: Detailed KRuleParser test for the new_part method.

Well-structured test that:

  1. Captures and verifies rule conditions
  2. Confirms snippet action calls
  3. Validates flow tag assignment from metadata

Good use of local functions for capturing method calls.


394-417: LGTM: CustomStoryGraph test with multiple steps.

Thorough test that verifies the graph correctly handles multiple story steps and combines their flow tags.


420-434: LGTM: KYAMLStoryReader unexpected input handling test.

Good test for error handling, verifying that the reader raises appropriate warnings when encountering unexpected step strings.


437-472: LGTM: KYAMLStoryReader YAML parsing test.

Comprehensive test that:

  1. Mocks the StoryParser and KRuleParser classes
  2. Validates that the reader processes both stories and rules
  3. Confirms steps are correctly combined from multiple sources

Excellent coverage of complex parsing logic.


476-487: LGTM: KYAMLStoryWriter file output test.

Good test using a temporary path to verify that the writer correctly dumps story steps to a file with the expected content.


491-515: LGTM: KRasaFileImporter exclusion test verifies sampling logic.

This test effectively verifies that the importer correctly handles exclusion percentages when loading data from multiple story files. Good use of mocking to simulate file reading.

✨ Finishing Touches
  • 📝 Generate Docstrings

🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai plan to trigger planning for file edits and PR creation.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (9)
kairon/importer/validator/file_validator.py (1)

1115-1130: Added new validate_rules static method

This method validates rules against a predefined schema, but has some potential improvements:

  1. The error handling should be consistent with the rest of the codebase (using AppException instead of ValueError)
  2. Missing docstring documentation
  3. Schema file path is hardcoded and not configurable

Consider enhancing the method with proper error handling and documentation:

@staticmethod
def validate_rules(rules_path: str):
+    """
+    Validates rules defined in a YAML file against a schema.
+    
+    Args:
+        rules_path: Path to the rules YAML file
+        
+    Raises:
+        AppException: If rules validation fails
+    """
    current_dir = os.path.dirname(os.path.realpath(__file__))
    schema_file_path = os.path.join(current_dir, "..", "shared", "schemas", "rules.yml")

    with open(schema_file_path, 'r') as schema_file:
        schema = yaml.safe_load(schema_file)

    validator = Validator(schema)

    with open(rules_path, 'r') as rules_file:
        rules = yaml.safe_load(rules_file)

    if not validator.validate(rules):
-        raise ValueError(f"Validation errors: {validator.errors}")
+        raise AppException(f"Rules validation failed: {validator.errors}")
kairon/shared/data/model_data_imporer.py (5)

35-39: Provide a docstring for clarity.

This constructor looks good overall, but it lacks a docstring. Adding one can improve maintainability by clarifying the purpose and usage of CustomRuleStep.


41-47: Document the constructor parameters.

While the implementation is straightforward, consider adding a docstring to describe the parameters and how flow_tags is used. It will make the class more comprehensible and maintainable.


81-90: Replace debug print with logging.

Using a direct print for the flow_tags is likely a leftover debug statement. Consider replacing it with a logging call if you need to keep it, or remove it to avoid cluttering the output.

-        print("CustomStoryGraph created with flow_tags:", self.flow_tags)
+        import logging
+        logging.debug(f"CustomStoryGraph created with flow_tags: {self.flow_tags}")

91-133: Use “key in dict” instead of “key in dict.keys()”.

To address the linter warnings and Python best practices, remove .keys() altogether when checking membership for lines 110, 112, 114, 116, 118, 120, 122, and 124.

-        elif KEY_USER_INTENT in step.keys():
+        elif KEY_USER_INTENT in step:

-        elif KEY_OR in step.keys():
+        elif KEY_OR in step:

... (repeat for each occurrence) ...
🧰 Tools
🪛 Ruff (0.8.2)

110-110: Use key in dict instead of key in dict.keys()

Remove .keys()

(SIM118)


110-110: Use key in dict instead of key in dict.keys()

Remove .keys()

(SIM118)


112-112: Use key in dict instead of key in dict.keys()

Remove .keys()

(SIM118)


114-114: Use key in dict instead of key in dict.keys()

Remove .keys()

(SIM118)


116-116: Use key in dict instead of key in dict.keys()

Remove .keys()

(SIM118)


118-118: Use key in dict instead of key in dict.keys()

Remove .keys()

(SIM118)


120-120: Use key in dict instead of key in dict.keys()

Remove .keys()

(SIM118)


122-122: Use key in dict instead of key in dict.keys()

Remove .keys()

(SIM118)


124-124: Use key in dict instead of key in dict.keys()

Remove .keys()

(SIM118)


168-193: Remove unused variable flow_tags.

The method _extract_flow_tags is never utilized after its return value is assigned at line 170, resulting in an unused variable.

 def get_stories(self, exclusion_percentage: Optional[int] = None) -> CustomStoryGraph:
     story_graph = self._get_stories(exclusion_percentage)
-    flow_tags = self._extract_flow_tags(story_graph)
     return CustomStoryGraph(story_graph.story_steps)
🧰 Tools
🪛 Ruff (0.8.2)

170-170: Local variable flow_tags is assigned to but never used

Remove assignment to unused variable flow_tags

(F841)

kairon/shared/data/processor.py (3)

4989-4998: Consider optimizing the isinstance check.

The current implementation uses multiple isinstance calls which can be merged for better readability and performance.

def __extract_rules(self, story_steps, bot: str, user: str):
    saved_rules = self.fetch_rule_block_names(bot)

    for story_step in story_steps:
        if (
-                (isinstance(story_step, CustomRuleStep) or isinstance(story_step, RuleStep))
+                isinstance(story_step, (CustomRuleStep, RuleStep))
                and story_step.block_name.strip().lower() not in saved_rules
        ):
            rule = self.__extract_rule_events(story_step, bot, user)
            yield rule
🧰 Tools
🪛 Ruff (0.8.2)

4994-4994: Multiple isinstance calls for story_step, merge into a single call

Merge isinstance calls for story_step

(SIM101)


5092-5092: Small typo in return type annotation.

There's a missing space between the colon and return type annotation.

- def get_rules_for_download(self, bot: Text) ->StoryGraph:
+ def get_rules_for_download(self, bot: Text) -> StoryGraph:

164-5118: Consider adding documentation for flow tag functionality.

The implementation of flow tags looks good, but adding documentation comments to explain the purpose and behavior of flow tags would help future developers understand the feature better. Consider adding documentation to describe how flow tags work and the difference between "agentic_flow" and "chatbot_flow".

🧰 Tools
🪛 Ruff (0.8.2)

311-314: Return the condition config.get("dynamic_url_slot_name") == slot directly

Replace with return config.get("dynamic_url_slot_name") == slot

(SIM103)


324-328: Use return any(set_slot.get("name") == slot for set_slot in set_slots) instead of for loop

Replace with return any(set_slot.get("name") == slot for set_slot in set_slots)

(SIM110)


338-341: Return the condition directly

Inline condition

(SIM103)


352-356: Use return any(MongoProcessor.is_slot_in_field(note, slot) for note in notes) instead of for loop

Replace with return any(MongoProcessor.is_slot_in_field(note, slot) for note in notes)

(SIM110)


361-364: Return the condition set_slot == slot directly

Replace with return set_slot == slot

(SIM103)


378-382: Use return any(set_slot.get("name") == slot for set_slot in set_slots) instead of for loop

Replace with return any(set_slot.get("name") == slot for set_slot in set_slots)

(SIM110)


397-401: Use return any(value == slot for value in metadata.values()) instead of for loop

Replace with return any(value == slot for value in metadata.values())

(SIM110)


410-413: Return the condition config.get("set_slot") == slot directly

Replace with return config.get("set_slot") == slot

(SIM103)


418-421: Use return any(set_slot.get("name") == slot for set_slot in set_slots) instead of for loop

Replace with return any(set_slot.get("name") == slot for set_slot in set_slots)

(SIM110)


453-457: Use return any(set_slot.get("name") == slot for set_slot in set_slots) instead of for loop

Replace with return any(set_slot.get("name") == slot for set_slot in set_slots)

(SIM110)


498-498: kairon.importer.validator.file_validator.TrainingDataValidator imported but unused

Remove unused import: kairon.importer.validator.file_validator.TrainingDataValidator

(F401)


553-553: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


569-569: Do not perform function call REQUIREMENTS.copy in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable

(B008)


679-682: Use ternary operator type_value = "text" if not metadata else "json" instead of if-else-block

Replace if-else-block with type_value = "text" if not metadata else "json"

(SIM108)


725-725: Do not perform function call REQUIREMENTS.copy in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable

(B008)


806-806: Local variable actions is assigned to but never used

Remove assignment to unused variable actions

(F841)


1274-1274: Use bool(...) instead of True if ... else False

Replace with `bool(...)

(SIM210)


1347-1347: Use key in dict instead of key in dict.keys()

Remove .keys()

(SIM118)


1384-1385: Use a single if statement instead of nested if statements

(SIM102)


1467-1468: Use a single if statement instead of nested if statements

(SIM102)


1525-1525: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


1528-1528: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


1790-1790: Comparison to None should be cond is None

Replace with cond is None

(E711)


2181-2181: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


2348-2351: Use a single if statement instead of nested if statements

(SIM102)


2762-2762: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


2833-2833: Use example.get("entities", None) instead of an if block

Replace with example.get("entities", None)

(SIM401)


2921-2921: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


2924-2924: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


2972-2972: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


3206-3206: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


3298-3302: Use a single if statement instead of nested if statements

Combine if statements using and

(SIM102)


3549-3549: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


3621-3621: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


3645-3645: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


3893-3893: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


3942-3942: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


3946-3949: Use contextlib.suppress(AppException) instead of try-except-pass

Replace with contextlib.suppress(AppException)

(SIM105)


3997-3997: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


4096-4098: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


4100-4101: Use a single if statement instead of nested if statements

Combine if statements using and

(SIM102)


4113-4113: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


4136-4136: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


4159-4159: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


4192-4192: Local variable response is assigned to but never used

Remove assignment to unused variable response

(F841)


4311-4313: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


4527-4527: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


4587-4587: Use key not in dict instead of key not in dict.keys()

Remove .keys()

(SIM118)


4714-4714: Use action_config.get("website") instead of action_config.get("website", None)

Replace action_config.get("website", None) with action_config.get("website")

(SIM910)


4755-4755: Use action_config.get("website") instead of action_config.get("website", None)

Replace action_config.get("website", None) with action_config.get("website")

(SIM910)


4917-4917: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


4955-4955: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


4994-4994: Multiple isinstance calls for story_step, merge into a single call

Merge isinstance calls for story_step

(SIM101)

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7bfabdf and d4b653c.

📒 Files selected for processing (6)
  • kairon/importer/validator/file_validator.py (4 hunks)
  • kairon/shared/data/model_data_imporer.py (1 hunks)
  • kairon/shared/data/processor.py (8 hunks)
  • kairon/shared/utils.py (3 hunks)
  • tests/unit_test/data_processor/data_processor_test.py (2 hunks)
  • tests/unit_test/events/events_test.py (2 hunks)
🧰 Additional context used
🧬 Code Definitions (4)
tests/unit_test/events/events_test.py (1)
kairon/shared/data/model_data_imporer.py (1)
  • KRasaFileImporter (167-199)
kairon/shared/data/model_data_imporer.py (1)
kairon/shared/data/processor.py (1)
  • get_stories (3652-3763)
kairon/shared/utils.py (1)
kairon/shared/data/model_data_imporer.py (2)
  • KYAMLStoryWriter (202-284)
  • dump (262-284)
kairon/shared/data/processor.py (3)
kairon/shared/data/model_data_imporer.py (2)
  • KRasaFileImporter (167-199)
  • CustomRuleStep (35-38)
kairon/shared/models.py (1)
  • FlowTagType (127-129)
kairon/shared/data/data_objects.py (1)
  • Rules (716-748)
🪛 Ruff (0.8.2)
kairon/shared/data/model_data_imporer.py

110-110: Use key in dict instead of key in dict.keys()

Remove .keys()

(SIM118)


110-110: Use key in dict instead of key in dict.keys()

Remove .keys()

(SIM118)


112-112: Use key in dict instead of key in dict.keys()

Remove .keys()

(SIM118)


114-114: Use key in dict instead of key in dict.keys()

Remove .keys()

(SIM118)


116-116: Use key in dict instead of key in dict.keys()

Remove .keys()

(SIM118)


118-118: Use key in dict instead of key in dict.keys()

Remove .keys()

(SIM118)


120-120: Use key in dict instead of key in dict.keys()

Remove .keys()

(SIM118)


122-122: Use key in dict instead of key in dict.keys()

Remove .keys()

(SIM118)


124-124: Use key in dict instead of key in dict.keys()

Remove .keys()

(SIM118)


170-170: Local variable flow_tags is assigned to but never used

Remove assignment to unused variable flow_tags

(F841)

kairon/shared/data/processor.py

4994-4994: Multiple isinstance calls for story_step, merge into a single call

Merge isinstance calls for story_step

(SIM101)

⏰ Context from checks skipped due to timeout of 90000ms (3)
  • GitHub Check: Codacy Static Code Analysis
  • GitHub Check: Python CI
  • GitHub Check: Analyze (python)
🔇 Additional comments (20)
tests/unit_test/events/events_test.py (2)

23-23: Import change looks good

The change from using RasaFileImporter to using KRasaFileImporter aligns with the PR objectives for supporting flow tag in import and export.


76-78: Implementation correctly uses the new importer class

The test fixture now correctly uses KRasaFileImporter instead of RasaFileImporter while maintaining the same method signature. This should allow the tests to properly handle the new flow tag functionality in stories and rules.

tests/unit_test/data_processor/data_processor_test.py (2)

5281-5281: Additional metadata structure successfully implemented for rules.

The assertion now includes a metadata section with flow_tags for rules, which is consistent with the changes introduced to support flow tags in the data model.


5296-5296:

❓ Verification inconclusive

Verify if all rules should have metadata with flow_tags.

While the first rule includes metadata with flow_tags, the second rule (multiflow_story_story_download_data_files_1) doesn't have any metadata section. Please verify if this inconsistency is intentional or if all rules should have flow tags.


🏁 Script executed:

#!/bin/bash
# Check for consistency of metadata structure across rule definitions in other test files

# Look for patterns of rule definitions with and without metadata
echo "Rules with metadata:"
rg -A 3 "rule:" --glob "tests/**/*.py" | rg -A 3 "metadata:"

echo "Rules without metadata that might need updating:"
rg -A 5 "rule:" --glob "tests/**/*.py" | rg -v "metadata:" | rg "steps:"

Length of output: 3955


Verify consistency in metadata inclusion for rule definitions.

In tests/unit_test/data_processor/data_processor_test.py, the rule "rule for download" includes a metadata section with flow_tags, whereas the rule multiflow_story_story_download_data_files_1 does not include any metadata. The grep results also show that some tests (like those in utility_test.py) define rules without metadata, suggesting an inconsistency.

  • Location: tests/unit_test/data_processor/data_processor_test.py (line 5296) for the rule in question.
  • Observation: Mixed patterns across the codebase where some rules include metadata (flow_tags) and others do not.
  • Action: Please verify whether the omission of the metadata section for the rule multiflow_story_story_download_data_files_1 is intentional. If all rules are meant to include metadata with flow_tags, update the definitions accordingly to ensure consistency.
kairon/shared/utils.py (3)

86-86: New import for supporting flow tags in rule steps

This import is part of a broader change to support flow tags, with the KYAMLStoryWriter class handling the serialization of story steps that include flow tag metadata.


1206-1206: Changed YAMLStoryWriter to KYAMLStoryWriter for writing rule steps

The new KYAMLStoryWriter is now used to write rule steps, which will preserve flow tag metadata when dumping rules to YAML format.


1635-1635: Added type hints to method parameters

Adding type hints improves code clarity and enables better IDE support and static type checking.

kairon/importer/validator/file_validator.py (3)

18-18: Added ruamel.yaml import

Import of ruamel.yaml library is needed for the new rule validation functionality.


30-30: New import for KRasaFileImporter

This import enables the use of the custom file importer that supports flow tags in import and export operations.


81-83: Replaced RasaFileImporter with KRasaFileImporter

The custom KRasaFileImporter class is now used instead of the standard RasaFileImporter to support flow tag metadata during import.

kairon/shared/data/model_data_imporer.py (4)

48-62: Verify single-step return logic.

_next_story_steps currently returns a single-step list, which may be correct if each rule or story block is meant to contain exactly one step. Otherwise, this might limit multi-step usage. Review the call sites to verify that we only ever need a single step here.


65-78: Class initialization looks consistent.

The usage of CustomStoryStepBuilder to incorporate flow_tags in rule parsing is well-structured. No issues found here.


134-137: Confirm overwriting of existing flow tags.

_parse_metadata assigns flow_tags to self.current_step_builder.flow_tags, overwriting any previously set tags. If you require merging or preserving earlier tags, add the relevant logic here.


201-289: Overall structure appears sound.

The YAML writer correctly processes rule steps, ensuring flow tags are serialized under metadata. This aligns well with the custom story/rule parsing approach.

kairon/shared/data/processor.py (6)

164-164: Good addition of new classes for enhanced functionality.

The import now includes KRasaFileImporter and CustomRuleStep from the custom importer module, which will enable flow tag support.


233-233: Well implemented switch to use the flow tag-aware method.

You've correctly updated to use get_rules_for_download() instead of the previous get_rules_for_training() when downloading files, ensuring flow tags are properly included.


5001-5025: Good implementation of flow tag support in rule events extraction.

This method has been properly updated to support flow tags:

  1. Type annotation now correctly accepts either CustomRuleStep or RuleStep
  2. Default flow tag is set to [FlowTagType.chatbot_flow.value]
  3. Flow tags are conditionally extracted from the rule step when available
  4. The flow tags are properly passed to the Rules object

5089-5091: Good addition of return type annotation.

Adding the -> StoryGraph return type annotation enhances type safety and code clarity.


5092-5118: Well implemented method for downloading rules with flow tags.

This new method appropriately:

  1. Creates CustomRuleStep objects that include flow tags for each rule
  2. Transfers all rule properties including flow tags
  3. Returns a StoryGraph with the complete rule information

Note that while get_rules_for_training() only includes rules with the chatbot flow tag, this method includes all rules with their respective flow tags, which makes it ideal for export/download functionality.


4801-4801: Good improvement of method signature with stronger typing.

Adding proper type annotations for bot: str and user: str improves code clarity and type safety.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (8)
kairon/shared/data/model_data_imporer.py (8)

81-81: Remove debug print statement before production

This print statement should be removed or converted to a proper logging statement in production code.

-        print("CustomStoryGraph created with flow_tags:", self.flow_tags)
+        # Use logger instead if debugging information is needed
+        # logger.debug(f"CustomStoryGraph created with flow_tags: {self.flow_tags}")

102-116: Optimize dictionary key lookups

Using key in dict.keys() is less efficient than directly using key in dict. This pattern appears multiple times in the code.

-        elif KEY_USER_INTENT in step.keys() or KEY_USER_MESSAGE in step.keys():
+        elif KEY_USER_INTENT in step or KEY_USER_MESSAGE in step:
            self._parse_user_utterance(step)
-        elif KEY_OR in step.keys():
+        elif KEY_OR in step:
            self._parse_or_statement(step)
-        elif KEY_ACTION in step.keys():
+        elif KEY_ACTION in step:
            self._parse_action(step)
-        elif KEY_BOT_END_TO_END_MESSAGE in step.keys():
+        elif KEY_BOT_END_TO_END_MESSAGE in step:
            self._parse_bot_message(step)
-        elif KEY_CHECKPOINT in step.keys():
+        elif KEY_CHECKPOINT in step:
            self._parse_checkpoint(step)
-        elif KEY_SLOT_NAME in step.keys():
+        elif KEY_SLOT_NAME in step:
            self._parse_slot(step)
-        elif KEY_ACTIVE_LOOP in step.keys():
+        elif KEY_ACTIVE_LOOP in step:
            self._parse_active_loop(step[KEY_ACTIVE_LOOP])
-        elif KEY_METADATA in step.keys():
+        elif KEY_METADATA in step:
            self._parse_metadata(step)
🧰 Tools
🪛 Ruff (0.8.2)

102-102: Use key in dict instead of key in dict.keys()

Remove .keys()

(SIM118)


102-102: Use key in dict instead of key in dict.keys()

Remove .keys()

(SIM118)


104-104: Use key in dict instead of key in dict.keys()

Remove .keys()

(SIM118)


106-106: Use key in dict instead of key in dict.keys()

Remove .keys()

(SIM118)


108-108: Use key in dict instead of key in dict.keys()

Remove .keys()

(SIM118)


110-110: Use key in dict instead of key in dict.keys()

Remove .keys()

(SIM118)


112-112: Use key in dict instead of key in dict.keys()

Remove .keys()

(SIM118)


114-114: Use key in dict instead of key in dict.keys()

Remove .keys()

(SIM118)


116-116: Use key in dict instead of key in dict.keys()

Remove .keys()

(SIM118)


191-194: Consider documenting the purpose of the custom YAML representer

The custom YAML representer for lists is added but its purpose is not documented. Consider adding a comment explaining why this custom representation is needed.

        self.yaml = YAML()
+        # Custom representer to ensure lists are properly formatted in the YAML output
        self.yaml.representer.add_representer(list, self.represent_list)

    def represent_list(self, dumper, data):
        return dumper.represent_sequence('tag:yaml.org,2002:seq', data)

1-2: File has a typo in its name

The filename contains a typo: "imporer" should be "importer". Consider renaming the file and updating all imports accordingly.

This typo should be fixed to maintain code clarity and consistency.


35-36: Inconsistent argument formatting in CustomStoryStepBuilder

The flow_tags parameter has a default value but doesn't use type hints, unlike the parent class constructor. Consider adding type hints for consistency.

    def __init__(
-        self, flow_tags = None, *args, **kwargs
+        self, flow_tags: Optional[List[str]] = None, *args, **kwargs
    ) -> None:

67-67: Missing type hint for flow_tags parameter

The flow_tags parameter is missing a type hint. For consistency with the rest of the code, add the appropriate type hint.

-    def _new_rule_part(self, name: Text, source_name: Optional[Text], flow_tags = None) -> None:
+    def _new_rule_part(self, name: Text, source_name: Optional[Text], flow_tags: Optional[List[str]] = None) -> None:

242-243: Consider handling potential type conversion issues

The code converts flow tags to strings using str(tag) which might not handle complex objects gracefully. Consider adding validation or error handling for unexpected tag types.

        if hasattr(rule_step, "flow_tags"):
-            result["metadata"] = {"flow_tags": [str(tag) for tag in rule_step.flow_tags]}
+            try:
+                result["metadata"] = {"flow_tags": [str(tag) for tag in rule_step.flow_tags]}
+            except Exception as e:
+                # Log the error and use a default or skip the problematic tag
+                result["metadata"] = {"flow_tags": [str(tag) for tag in rule_step.flow_tags if isinstance(tag, (str, int, float, bool))]}

159-168: Consider adding docstrings to the KRasaFileImporter class

The KRasaFileImporter class and its methods lack docstrings. Consider adding documentation to explain the purpose and behavior of these methods.

class KRasaFileImporter(RasaFileImporter):
+    """Custom implementation of RasaFileImporter that creates and returns CustomStoryGraph objects."""
+
    def get_stories(self, exclusion_percentage: Optional[int] = None) -> CustomStoryGraph:
+        """Override the get_stories method to return a CustomStoryGraph instead of StoryGraph.
+        
+        Args:
+            exclusion_percentage: Optional percentage of stories to exclude
+            
+        Returns:
+            CustomStoryGraph object containing parsed story steps with flow tags
+        """
        story_graph = self._get_stories(exclusion_percentage)
        return CustomStoryGraph(story_graph.story_steps)

    def _get_stories(self, exclusion_percentage: Optional[int] = None) -> CustomStoryGraph:
+        """Load stories from files and create a CustomStoryGraph.
+        
+        Args:
+            exclusion_percentage: Optional percentage of stories to exclude
+            
+        Returns:
+            CustomStoryGraph object containing parsed story steps with flow tags
+        """
        return CustomStoryGraph(self.load_data_from_files(
            self._story_files, self.get_domain(), exclusion_percentage
        ))
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d4b653c and 5ac307d.

📒 Files selected for processing (1)
  • kairon/shared/data/model_data_imporer.py (1 hunks)
🧰 Additional context used
🧬 Code Definitions (1)
kairon/shared/data/model_data_imporer.py (1)
kairon/shared/data/processor.py (1)
  • get_stories (3652-3763)
🪛 Ruff (0.8.2)
kairon/shared/data/model_data_imporer.py

102-102: Use key in dict instead of key in dict.keys()

Remove .keys()

(SIM118)


102-102: Use key in dict instead of key in dict.keys()

Remove .keys()

(SIM118)


104-104: Use key in dict instead of key in dict.keys()

Remove .keys()

(SIM118)


106-106: Use key in dict instead of key in dict.keys()

Remove .keys()

(SIM118)


108-108: Use key in dict instead of key in dict.keys()

Remove .keys()

(SIM118)


110-110: Use key in dict instead of key in dict.keys()

Remove .keys()

(SIM118)


112-112: Use key in dict instead of key in dict.keys()

Remove .keys()

(SIM118)


114-114: Use key in dict instead of key in dict.keys()

Remove .keys()

(SIM118)


116-116: Use key in dict instead of key in dict.keys()

Remove .keys()

(SIM118)

⏰ Context from checks skipped due to timeout of 90000ms (3)
  • GitHub Check: Codacy Static Code Analysis
  • GitHub Check: Analyze (python)
  • GitHub Check: Python CI

Copy link
Collaborator

@hiteshghuge hiteshghuge left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

reviewed

@hiteshghuge hiteshghuge merged commit cf1b5e2 into digiteinfotech:master Apr 1, 2025
7 of 8 checks passed
rexa0310 pushed a commit to rexa0310/kairon that referenced this pull request Apr 1, 2025
* flow tag in import and export

* flow tag in import and export

* flow tag in import and export

* flow tag in import and export

* flow tag in import and export

---------

Co-authored-by: spandan_mondal <spandan.mondal@nimblework.com>
@coderabbitai coderabbitai bot mentioned this pull request Apr 7, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants