Exclude the ".txt" extension while adding text to vector DB via create_by_text API #6954

Closed
@gijigae

Self Checks

  • I have searched for existing issues, including closed ones.
  • I confirm that I am using English to submit this report (I have read and agree to the Language Policy).
  • [FOR CHINESE USERS] Please be sure to submit the issue in English, otherwise it will be closed. Thank you! :)
  • Please do not modify this template :) and fill in all the required fields.

1. Is this request related to a challenge you're experiencing? Tell me about your story.

When text is added to a vector DB via the create_by_text API, the document's file name is suffixed with the .txt extension, even when the document consists of text extracted from a web page or a YouTube transcript. For example, a YouTube URL is referenced as "https://youtu.be/Xkm3-thqgXc.txt". Because the file name is shown in "Citations and Attributions", displaying "https://youtu.be/Xkm3-thqgXc.txt" does not make sense. It should be referenced as "https://youtu.be/Xkm3-thqgXc", without the extension.
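
To make the requested behavior concrete, here is a minimal sketch in Python. The helper `citation_name` is hypothetical and not part of the project's code. It assumes the fix could be applied where the name is displayed, and that a name should lose its ".txt" suffix only when the remainder is an http(s) URL.

```python
from urllib.parse import urlparse


def citation_name(document_name: str) -> str:
    """Return the name to show in citations.

    Hypothetical helper (an assumption, not an existing API): drops a
    trailing ".txt" only when what remains is an http(s) URL.
    """
    suffix = ".txt"
    if not document_name.lower().endswith(suffix):
        return document_name

    stem = document_name[: -len(suffix)]
    parsed = urlparse(stem)
    # Only treat the name as a URL if it has an http(s) scheme and a host.
    if parsed.scheme in ("http", "https") and parsed.netloc:
        return stem

    # Ordinary file names such as "notes.txt" are left untouched.
    return document_name


print(citation_name("https://youtu.be/Xkm3-thqgXc.txt"))
print(citation_name("notes.txt"))
```

One limitation of this sketch: a URL that really does point to a .txt file would also lose its extension, so not appending the suffix when the document is created may be the cleaner fix.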

2. Additional context or comments

No response

3. Can you help us with this feature?

  • I am interested in contributing to this feature.

Metadata

Assignees

No one assigned

    Labels

    📚 feat:datasource (Data sources like web, Notion, Logseq, Lark, Docs)
