fix: Missing parameters such as character length when creating a knowledge base #2827

shaohuzhang1 · 2025-04-08T09:24:16Z

fix: Missing parameters such as character length when creating a knowledge base

…ledge base

f2c-ci-robot · 2025-04-08T09:24:21Z

Adding the "do-not-merge/release-note-label-needed" label because no release-note block was detected, please follow our release note process to remove it.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

f2c-ci-robot · 2025-04-08T09:24:25Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

shaohuzhang1 · 2025-04-08T09:25:12Z

apps/dataset/serializers/dataset_serializers.py

+                    'document_list': document_model_list,
+                    "document_count": len(document_model_list),
+                    "char_length": reduce(lambda x, y: x + y, [d.char_length for d in document_model_list],
+                                           0)}, dataset_id

        @staticmethod
        def get_last_url_path(url):


Certainly! Here are some observations and improvements for the provided code:

Code Structure: The function save has multiple responsibilities—saving, handling associations, and constructing the response—but it's split into two separate functions save_with_association and get_response. This can be refactored to merge these functionalities where appropriate.

Bulk Insertion Logic: You're correctly using QuerySet.bulk_create, but consider adding a try-except block around it to handle any potential errors during bulk insertion, such as database constraint violations or other exceptions.

Response Data:

Add checks to ensure that document_model_list is not empty before accessing its attributes like char_length.

Use type hints (Dict) explicitly where possible to clarify the expected types of inputs and outputs.

User ID Handling: Ensure that user_id is properly set when returning the response.

Here’s an improved version of the code based on the above points:

from functools import reduce def save(self, instance: Dict, with_valid=True): # Assume DataSetSerializers and ProblemParagraphMapping exist elsewhere in the code dataset_serializers = DataSetSerializers(instance) dataset_responses = { 'id': instance.get('id') } problems = None if with_valid: problems = self._create_problems_with_relation(instance) user_id = None # Define this variable somewhere in your class or pass it via arguments document_mappings = self._generate_document_mappings(instance, with_valid) if problems: dataset_responses.update({ 'problems': problems, 'document_relations': document_mappings, 'dataset_list': DocumentSerializers.Query(data={'dataset_id': instance['id']}).list(with_valid=True)} ) return dataset_responses @staticmethod def _create_problems_with_relation(dataset_data: dict) -> list: problems = [] # Implement logic to create new problems and their mappings here return problems @staticmethod def _generate_document_mappings(dataset_data: dict, with_valid=True) -> List[dict]: document_models = [] # Fetch all document models related to the dataset if with_valid: mapping_list = [] for doc in document_models: mapping_dict = {'doc_id': doc.id, 'dataset_id': doc.dataset_id} mapping_list.append(mapping_dict) QuerySet(ProblemParagraphMapping).bulk_create(mapping_list) for doc in document_models: document_responses = { 'id': str(doc.id), 'name': doc.name, 'description': doc.description, 'user_id': user_id, # Assuming user_id is defined somewhere else 'paragraphs': doc.paragraphs } document_responses['size'] = sum(len(x) for x in [resp["description"] for resp in (document_responses)] * len([x.split("\n") for x in document_responses])) document_requests = ResponseDocumentsSerializer.create_instances( data=[docResponses] )[0] document_responses.pop("paragraphs") return document_requests @staticmethod def save_with_association(instance: Dict, with_valid=True): problem_paragraph_mapping_list = [ ProblemParagraphMapping(...) for ... in ... ] if len(problem_paragraph_mapping_list) > 0: BulkCreateManager.objects.bulk_create(problem_paragraph_mapping_list) @staticmethod def get_response(data: dict, dataset_id, with_valid=True) -> Dict[str, Any]: return { **DataSetSerializers(data).data, 'document_list': DocumentListSerializer.data(data), "document_count": len(DocumentListSerializer.data(data)), "char_length": sum(reduce(lambda x, y: x + y, [d.char_length for d in Data]), 0)}, dataset_id

These changes aim to improve clarity, maintainability, and robustness of the original code. Adjustments might depend heavily on actual context and classes used within the surrounding environment.

fix: Missing parameters such as character length when creating a know…

5ed2e60

…ledge base

f2c-ci-robot bot added the do-not-merge/release-note-label-needed label Apr 8, 2025

shaohuzhang1 merged commit 3557ea5 into main Apr 8, 2025
4 checks passed

shaohuzhang1 deleted the pr@main@fix_create_dataset branch April 8, 2025 09:24

shaohuzhang1 commented Apr 8, 2025

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fix: Missing parameters such as character length when creating a knowledge base #2827

fix: Missing parameters such as character length when creating a knowledge base #2827

Uh oh!

shaohuzhang1 commented Apr 8, 2025

Uh oh!

f2c-ci-robot bot commented Apr 8, 2025

Uh oh!

f2c-ci-robot bot commented Apr 8, 2025

Uh oh!

Uh oh!

shaohuzhang1 Apr 8, 2025

Uh oh!

Uh oh!

fix: Missing parameters such as character length when creating a knowledge base #2827

fix: Missing parameters such as character length when creating a knowledge base #2827

Uh oh!

Conversation

shaohuzhang1 commented Apr 8, 2025

Uh oh!

f2c-ci-robot bot commented Apr 8, 2025

Uh oh!

f2c-ci-robot bot commented Apr 8, 2025

Uh oh!

Uh oh!

shaohuzhang1 Apr 8, 2025

Choose a reason for hiding this comment

Uh oh!

Uh oh!