[Bug]: AttributeError Triggered via HTTP API When Setting document chunk_method to tag/table/one/email/picture Without parser_conf #6081

Closed
@asiroliu

Description

Self Checks

  • I have searched for existing issues, including closed ones.
  • I confirm that I am using English to submit this report (Language Policy).
  • Non-English title submissions will be closed directly (Language Policy).
  • Please do not modify this template :) and fill in all the required fields.

RAGFlow workspace code commit ID

x

RAGFlow image version

9a43ca28ab4d(infiniflow/ragflow:nightly)

Other environment information

Actual behavior

When a document is updated through the PUT endpoint /api/v1/datasets/{dataset_id}/documents/{document_id} with certain chunk_method values (e.g., "tag", "resume", "table") and parser_config is not set explicitly, the server returns an AttributeError.

Expected behavior

No response

Steps to reproduce

1. Send a PUT request with chunk_method but omit parser_config:

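import requests

# dataset_id and document_id must refer to an existing dataset and document.
# chunk_method may also be "table", "one", "email", or "picture"; parser_config is deliberately omitted.
payload = {"chunk_method": "tag"}
response = requests.put(
    f'http://127.0.0.1:9380/api/v1/datasets/{dataset_id}/documents/{document_id}',
    json=payload
)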

2. Observe the error response:

{
    "code": 100,
    "data": None,
    "message": "AttributeError(\"'NoneType' object has no attribute 'items'\")",
}

3. Check server logs for the traceback:

2025-03-14 13:32:14,354 ERROR    23 'NoneType' object has no attribute 'items'
Traceback (most recent call last):
  File "/ragflow/.venv/lib/python3.10/site-packages/flask/app.py", line 880, in full_dispatch_request
    rv = self.dispatch_request()
  File "/ragflow/.venv/lib/python3.10/site-packages/flask/app.py", line 865, in dispatch_request
    return self.ensure_sync(self.view_functions[rule.endpoint])(**view_args)  # type: ignore[no-any-return]
  File "/ragflow/api/utils/api_utils.py", line 303, in decorated_function
    return func(*args, **kwargs)
  File "/ragflow/api/apps/sdk/doc.py", line 312, in update_doc
    DocumentService.update_parser_config(doc.id, req["parser_config"])
  File "/ragflow/.venv/lib/python3.10/site-packages/peewee.py", line 3128, in inner
    return fn(*args, **kwargs)
  File "/ragflow/api/db/services/document_service.py", line 354, in update_parser_config
    dfs_update(d.parser_config, config)
  File "/ragflow/api/db/services/document_service.py", line 344, in dfs_update
    for k, v in new.items():
AttributeError: 'NoneType' object has no attribute 'items'

Additional information

The error originates from the default value of parser_config, which is None for these chunk methods in api_utils.py#L355 (see the mapping below). When parser_config is not explicitly provided in the request payload, this None default is used, and dfs_update then calls new.items() with new set to None, which raises the AttributeError.

    key_mapping = {
        "naive": {"chunk_token_num": 128, "delimiter": "\\n!?;。;!?", "html4excel": False, "layout_recognize": "DeepDOC",
                  "raptor": {"use_raptor": False}},
        "qa": {"raptor": {"use_raptor": False}},
        "tag": None,
        "resume": None,
        "manual": {"raptor": {"use_raptor": False}},
        "table": None,
        "paper": {"raptor": {"use_raptor": False}},
        "book": {"raptor": {"use_raptor": False}},
        "laws": {"raptor": {"use_raptor": False}},
        "presentation": {"raptor": {"use_raptor": False}},
        "one": None,
        "knowledge_graph": {"chunk_token_num": 8192, "delimiter": "\\n!?;。;!?",
                            "entity_types": ["organization", "person", "location", "event", "time"]},
        "email": None,
        "picture": None}
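
One possible way to avoid the crash is to treat a None config as "nothing to merge" before iterating. The snippet below is only a minimal sketch of that idea, not the project's actual fix: merge_parser_config is a hypothetical stand-in for dfs_update (only the failing `for k, v in new.items()` line is known from the traceback), and DEFAULTS is a shortened, illustrative copy of the mapping above.

```python
from typing import Any, Optional


def merge_parser_config(old: dict[str, Any], new: Optional[dict[str, Any]]) -> dict[str, Any]:
    """Recursively merge `new` into `old` in place; None or {} means no changes.

    Hypothetical stand-in for dfs_update, written for illustration only.
    """
    if not new:  # guard: covers both None and an empty dict
        return old
    for key, value in new.items():
        if isinstance(value, dict) and isinstance(old.get(key), dict):
            # Both sides are dicts: descend instead of overwriting.
            merge_parser_config(old[key], value)
        else:
            old[key] = value
    return old


# Shortened, illustrative excerpt of the defaults: some chunk methods map to None.
DEFAULTS = {"qa": {"raptor": {"use_raptor": False}}, "tag": None}

stored = {"raptor": {"use_raptor": True}, "pages": [[1, 10]]}
merge_parser_config(stored, DEFAULTS["tag"])  # no-op instead of AttributeError
merge_parser_config(stored, DEFAULTS["qa"])   # nested keys are merged
```

An alternative with the same effect would be to use {} instead of None as the default for "tag", "resume", "table", "one", "email", and "picture", so that callers never see None at all.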
