[Bug]: HTTP API Lacks Type and Valid Value Checks for all parameters under parser_config During Dataset Creation #5719

Closed
@asiroliu

Description

Is there an existing issue for the same bug?

  • I have checked the existing issues.

RAGFlow workspace code commit ID

x

RAGFlow image version

cb1febb7d8ae (infiniflow/ragflow:v0.17.0-slim)

Other environment information

Actual behavior

The API accepts and persists invalid values for parser_config.chunk_token_count during dataset creation, including negative numbers, non-integer types, and out-of-range values.

Expected behavior

The API should reject requests with:

  • Negative integers (e.g., -1)
  • Non-integer types (e.g., 3.14, "1024")
  • Values exceeding reasonable limits (e.g., >2048)
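
The expected behavior above could be enforced with a check along these lines. This is only a minimal sketch, not RAGFlow's actual implementation: the function name `validate_chunk_token_count` is hypothetical, the upper limit of 2048 is taken from the example in this report, and the lower bound of 1 is an assumption (the report only says negative values should be rejected).

```python
def validate_chunk_token_count(value, max_value=2048):
    """Return value if it is an int in [1, max_value]; raise ValueError otherwise.

    Hypothetical helper: the bounds are assumptions based on this report.
    """
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(value, bool) or not isinstance(value, int):
        raise ValueError(
            f"chunk_token_count must be an integer, got {type(value).__name__}"
        )
    # Only compare once the type is known to be int.
    if not 1 <= value <= max_value:
        raise ValueError(
            f"chunk_token_count must be between 1 and {max_value}, got {value}"
        )
    return value
```

Checking the type before the range means that -1, 3.14, and "1024" are all rejected with a readable message instead of being persisted.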

Steps to reproduce

1. Send a POST request with invalid chunk_token_count values (e.g., -1, 3.14, "1024")

import requests

response = requests.post(
    'http://127.0.0.1:9380/api/v1/datasets',
    json={
        "name": "test",
        "chunk_method": "naive",
        "parser_config": {"chunk_token_count": -1}  # Test with -1/3.14/"1024"
    }
)
print(response.json())

2. Observe the successful response:

{
    "code": 0,
    "data": {
        "parser_config": {"chunk_token_count": -1},
        "name": "test",
        // ...other fields
    }
}

Additional information

No response

Activity

asiroliu commented on Mar 6, 2025

Contributor, Author

Currently, all parameters under parser_config lack type checking and validation of valid values.

changed the title from "[Bug]: HTTP API Lacks Type and Valid Value Checks for parser_config.chunk_token_count During Dataset Creation" to "[Bug]: HTTP API Lacks Type and Valid Value Checks for all parameters under parser_config During Dataset Creation" on Mar 6, 2025
added the issue type on Mar 7, 2025
added a commit that references this issue on Mar 7, 2025

asiroliu commented on Mar 10, 2025

Contributor, Author

  1. When chunk_token_num, task_page_size, auto_keywords, or auto_questions are set to floating-point numbers (e.g., 3.14), no error is raised.
  2. When these fields are set to string types (e.g., "1234"), the error message is unclear and appears as:
{'code': 100, 'data': None, 'message': 'TypeError("\'<=\' not supported between instances of \'int\' and \'str\'")'}
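
The TypeError text above is what Python produces when an int is compared to a str with `<=`, which suggests a range comparison runs before any type check. A type-first check along the following lines would give a clearer message. This is only a sketch: `find_type_errors` and `INT_FIELDS` are hypothetical names, not RAGFlow code, and treating all four fields as integer-only is an assumption based on this comment.

```python
# Hypothetical list of parser_config fields expected to be integers.
INT_FIELDS = ("chunk_token_num", "task_page_size", "auto_keywords", "auto_questions")


def find_type_errors(parser_config, int_fields=INT_FIELDS):
    """Return a readable message for every listed field that is not an int."""
    errors = []
    for name in int_fields:
        if name not in parser_config:
            continue  # absent fields are not checked here
        value = parser_config[name]
        # bool is a subclass of int, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, int):
            errors.append(
                f"parser_config.{name} must be an integer, "
                f"got {type(value).__name__}: {value!r}"
            )
    return errors


# Reproduces the unclear error: comparing before checking the type.
try:
    1 <= "1234"
except TypeError as exc:
    print(exc)  # '<=' not supported between instances of 'int' and 'str'

print(find_type_errors({"chunk_token_num": 3.14, "task_page_size": "1234"}))
```

Collecting every message in a list lets the API report all offending fields in one response rather than failing on the first.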
added a commit that references this issue on Mar 18, 2025

Fix: infiniflow#5719 Added type check for parser_config

1946d73


Participants: @asiroliu, @KevinHuSh
