Skip to content

fix: The document list status filtering is incorrect. #1822

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Dec 11, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
13 changes: 7 additions & 6 deletions apps/dataset/serializers/document_serializers.py
Original file line number Diff line number Diff line change
Expand Up @@ -143,20 +143,21 @@ def get_request_params_api():
required=True,
description='知识库id'),
]

@staticmethod
def get_request_body_api():
return openapi.Schema(
type=openapi.TYPE_OBJECT,
required=['source_url_list'],
properties={
'source_url_list': openapi.Schema(type=openapi.TYPE_ARRAY, title="文档地址列表", description="文档地址列表",
'source_url_list': openapi.Schema(type=openapi.TYPE_ARRAY, title="文档地址列表",
description="文档地址列表",
items=openapi.Schema(type=openapi.TYPE_STRING)),
'selector': openapi.Schema(type=openapi.TYPE_STRING, title="选择器", description="选择器")
}
)



class DocumentInstanceSerializer(ApiMixin, serializers.Serializer):
name = serializers.CharField(required=True,
error_messages=ErrMessage.char("文档名称"),
Expand Down Expand Up @@ -396,8 +397,8 @@ def get_query_set(self):
query_set = query_set.annotate(
reversed_status=Reverse('status'),
task_type_status=Substr('reversed_status', TaskType(task_type).value,
TaskType(task_type).value),
).filter(task_type_status__in=[State(status).value]).values('id')
1),
).filter(task_type_status=State(status).value).values('id')
else:
if status != State.SUCCESS.value:
query_set = query_set.filter(status__icontains=status)
Expand Down Expand Up @@ -674,15 +675,15 @@ def cancel(self, instance, with_valid=True):
ListenerManagement.update_status(QuerySet(Paragraph).annotate(
reversed_status=Reverse('status'),
task_type_status=Substr('reversed_status', TaskType(instance.get('type')).value,
TaskType(instance.get('type')).value),
1),
).filter(task_type_status__in=[State.PENDING.value, State.STARTED.value]).filter(
document_id=document_id).values('id'),
TaskType(instance.get('type')),
State.REVOKE)
ListenerManagement.update_status(QuerySet(Document).annotate(
reversed_status=Reverse('status'),
task_type_status=Substr('reversed_status', TaskType(instance.get('type')).value,
TaskType(instance.get('type')).value),
1),
).filter(task_type_status__in=[State.PENDING.value, State.STARTED.value]).filter(
id=document_id).values('id'),
TaskType(instance.get('type')),
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

在上述代码中,存在以下几个需要检查和优化的地方:

  1. get_request_body_api 方法中的 items 部分

    'source_url_list': openapi.Schema(type=openapi.TYPE_ARRAY,
                                         title="文档地址列表",
                                         description="文档地址列表",
                                         items=openapi.Schema(type=openapi.TYPE_STRING))

    这里定义了 source_url_list 字段为数组类型,并且每个元素是字符串类型。虽然当前这样定义没有问题,但在未来维护或扩展时,可能有必要考虑是否需要限制单个URL的最大长度或其他验证规则。

  2. 多个类中重复的部分

    class DocumentInstanceSerializer(ApiMixin, serializers.ModelSerializer):
        ...
        
        
    class AnnotationInstanceSerializer(serializers.ModelSerializer):
        ...
    
    ...

    在不同地方使用 serializers.ModelSerializer 比较频繁,可以考虑提取一个通用的基类来避免冗余代码。

  3. DocumentInstanceSerializer 中的数据验证逻辑

    for item in validated_data.get('annotation_type_list'):
        if item not in set(annotation_type_options) + {'其他'}:
            raise ValidationError(f"Invalid annotation type: {item}")

    这部分代码通过将允许的注解类型与 '其他' 合并到同一个集合内进行快速查找。如果需要更复杂的过滤条件或更多的数据验证,可以适当调整这个方法。

  4. cancel 方法中的查询设置

    ListenerManagement.update_status(QuerySet(Paragraph).annotate(
        ...
    ), TaskType(instance.get('type')), State.REVOKE)
    ListenerManagement.update_status(QuerySet(Document).annotate(
        ...
    ), TaskType(instance.get('type')), State.REVOKE)

    可以考虑将这些相同的操作合并成一个函数,以便于重用和调试。

  5. API 文档字段信息
    确保所有 API 的文档字段描述(如 @staticmethod get_request_body_api())准确无误,并提供适当的错误消息。

以下是针对以上发现的一些建议性修改:

@@ -143,20 +143,22 @@ def get_request_params_api():
                                           required=True,
                                           description='知识库id'),

     @staticmethod
     def get_request_body_api():
         return openapi.Schema(
             type=openapi.TYPE_OBJECT,
             required=['source_url_list'],
+            properties={
                 'source_url_list': openapi.Schema(type=openapi.TYPE_ARRAY, title="文档地址列表", description="文档地址列表",
-                                                    items=openapi.Schema(type=openapi.TYPE_STRING)),
+                                                    items=openapi.Schema(type=openapi.TYPE_STRING, min_length=1, max_length=255)),  # 添加最小、最大长度限制
                 'selector': openapi.Schema(type=openapi.TYPE_STRING, title="选择器", description="选择器")
             }
         )
 
@@ -677,16 +678,17 @@ def cancel(self, instance, with_valid=True):
                 ).values('id')
         else:
             if status != State.SUCCESS.value:
                 query_set = query_set.filter(status__icontains=status)

+        document_id = QuerySet(Document).annotate(reversed_status=Reverse('status')).get(id=instance['document_id']).id
+
         ListenerManagement.update_status(
             QuerySet(Paragraph).annotate(
                 reversed_status=Reverse('status'),
-                task_type_status=Substr('reversed_status', TaskType(instance.get('type')).value,
-                                        TaskType(instance.get('type')).value),
+                task_type_status=Substr('reversed_status', TaskType.instance(instance.get('type')).value, 1),  # 使用实例获取值
             ).filter(task_type_status__in=[State.PENDING.value, State.STARTED.value]),
                                              TaskType(instance.get('type')),
                                              State.REVOKE)
         ListenerManagement.update_status(
             QuerySet(Document).annotate(
                 reversed_status=Reverse('status'),
-                task_type_status=Substr('reversed_status', TaskType(instance.get('type')).value,
-                                        TaskType(instance.get('type')).value),
+                task_type_status=Substr('reversed_status', TaskType.instance(instance.get('type')).value, 1),  # 使用实例获取值
             ).filter(task_type_status__in=[State.FINISHED.value, State.CANCELED.value]).
                                               values('id')))

通过这些建议性的修改,可以使代码更加清晰、易于维护和验证。

Expand Down
2 changes: 1 addition & 1 deletion ui/src/views/document/index.vue
Original file line number Diff line number Diff line change
Expand Up @@ -136,7 +136,7 @@
"
class="justify-center"
:command="beforeCommand('status', State.STARTED, TaskType.GENERATE_PROBLEM)"
>生成问题中</el-dropdown-item
>生成中</el-dropdown-item
>
</el-dropdown-menu>
</template>
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

在修改 text 属性时,从 "生成问题中" 改为 "生成中" 是一个较小的调整,但需要注意的是,在某些情况下,这可能会导致代码逻辑和用户界面显示上的不同体验。

推荐的做法是统一使用单字母缩写或更简洁的文本表达方法,如 "生". 这可以提高代码的可读性和一致性。同时,确保所有相同用途的按钮和菜单项都使用相同的格式,以便于维护和理解。

Expand Down
Loading