Skip to content

Chainlit Data Layer nullifies thread-level "metadata" column (sometimes) #2497

@DmitriyAlergant

Description

@DmitriyAlergant

Describe the bug

The data model has "metadata" JSON column on a thread level. The data layer's update_thread() method is called from different places i the backend, sometimes including session-level metadata (e.g. from socket.py persist_user_session() ), and sometimes without metadata.

The method was supposed to only update the attributes that were provided. But this clearly cannot work for metadata column that is defaulted to {} that is not None

        data = {
            "id": thread_id,
            "name": thread_name,
            "userId": user_id,
            "tags": tags,
            "metadata": json.dumps(metadata or {}),
            "updatedAt": datetime.now(),
        }

        # Remove None values
        data = {k: v for k, v in data.items() if v is not None}

        # Build the query dynamically based on available fields
        columns = [f'"{k}"' for k in data.keys()]
        placeholders = [f"${i + 1}" for i in range(len(data))]
        values = list(data.values())

As a result, i believe that with each invocation of update_thread() without providing metadata, nullifies it in the database while the other invocation (with providing metadata) sets it up. In practice in the database most of the threads do have metadata populated with session-level details, but that may be due to timing that persist_user_session() gets called last. Some threads in the database actually end up with blank metadata {}.

Impact: not sure entirely. I observed quite a few of subtle and evasive few thread persistence / thread resume related issues, some of which may be caused by this.

Proposed Fix Behavior

Improve the data layer's update_thread() method so that:

  • (at a minimum) when metadata argument is passed, no-op (do not update the column) - that was clearly an intended behavior, currently not working as such.

  • (ideally) when metadata argument is passed, do not replace the column entirely but rather merge by top-level keys. This would enable apps to persist and retrieve their own custom thread-level metadata without being overwritten by chainlit itself.

Willing to submit a PR with a fix. Especially if maintainers bless the second option (merge metadata by keys). This is actually how I came across this. Was trying to implement some custom thread-level persistence (letting the app directly invoke get_data_manager().get_thread() and get_data_manager().update_thread(), and it has not worked.

Metadata

Metadata

Assignees

No one assigned

    Labels

    bugSomething isn't workingdata layerPertains to data layers.needs-triagestaleIssue has not had recent activity or appears to be solved. Stale issues will be automatically closed

    Type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions