server : (webui) add support for .pdf file upload #11647

Open
wants to merge 2 commits into
base: master

@dannyl1u dannyl1u commented Feb 4, 2025

Closes #11611

Allows uploading .pdf files. The web UI uses pdf.js to parse each file into text and prepends the text of the uploaded PDF(s) to the prompt.

Uploaded PDF(s) can be deleted; deleted files are not included in the prompt.

Please let me know if any changes should be made (e.g. prompt structure)
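
To make the prompt-structure question concrete, here is a minimal sketch of one possible way to prepend extracted file text to the user message. It is not the code from this PR: the `UploadedFile` type, the `buildPrompt` and `removeFile` helpers, and the `--- File: ... ---` delimiters are all hypothetical, and the text is assumed to have been extracted already.

```typescript
// Hypothetical shape for an uploaded file whose text has already been extracted.
interface UploadedFile {
  name: string;
  text: string;
}

// Prepend the text of every attached file to the user's message.
// The delimiter format is an assumption, chosen only for illustration.
function buildPrompt(files: UploadedFile[], userMessage: string): string {
  if (files.length === 0) return userMessage;
  const blocks = files.map(
    (f) => `--- File: ${f.name} ---\n${f.text.trim()}\n--- End of file ---`
  );
  return `${blocks.join("\n\n")}\n\n${userMessage}`;
}

// Deleting a file before sending means its text never reaches the prompt.
function removeFile(files: UploadedFile[], name: string): UploadedFile[] {
  return files.filter((f) => f.name !== name);
}
```

Explicit per-file delimiters let the model tell where one document ends and the next begins, which matters once more than one file is attached.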

Demo video (apologies for the poor video quality; GitHub only allows uploads up to 10 MB):

fileupload.mp4

@ggerganov
Member

Does it also work with other plain-text formats? For example, .txt, .h/.cpp, etc.

@woof-dog
Contributor

woof-dog commented Feb 4, 2025

If there is an attachment button near the "Stop"/"Send" buttons, I'd really appreciate it if it were hidden by default but could be turned on in the settings. I have to press "Stop" manually all the time and would not like to click the file attachment button by accident.

You might consider also turning the textarea used for entering responses into a drop zone so you can drag and drop files there. That would really make the UX better for me since having to go through a file picker UI would probably take longer than opening the file and copying+pasting.

Also, it seems this is based on an old commit. There were several changes to the textarea in the last few days, so be careful when rebasing; we don't want to revert those other changes.

@ggerganov It appears the file input only accepts .pdf files on this PR
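
As a rough sketch of how the file input could be widened to plain-text formats, the snippet below classifies a file by its extension and builds a matching `accept` value. The extension list and the names `classifyFile`, `TEXT_EXTENSIONS`, and `ACCEPT_ATTRIBUTE` are assumptions for illustration, not part of this PR.

```typescript
type FileKind = "pdf" | "text" | "unsupported";

// Assumed list of plain-text extensions; not taken from the PR.
const TEXT_EXTENSIONS: string[] = [".txt", ".md", ".h", ".hpp", ".c", ".cpp"];

// Decide how a file should be handled based on its extension alone.
function classifyFile(fileName: string): FileKind {
  const dot = fileName.lastIndexOf(".");
  // No extension at all, or a dotfile such as ".gitignore".
  if (dot <= 0) return "unsupported";
  const ext = fileName.slice(dot).toLowerCase();
  if (ext === ".pdf") return "pdf";
  return TEXT_EXTENSIONS.indexOf(ext) !== -1 ? "text" : "unsupported";
}

// Value for the `accept` attribute of an <input type="file"> element.
const ACCEPT_ATTRIBUTE: string = [".pdf"].concat(TEXT_EXTENSIONS).join(",");
```

Files classified as "text" could be read directly as a string, so only the "pdf" branch would need pdf.js.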

@ngxson
Collaborator

ngxson commented Feb 4, 2025

This is a nice idea. But I'm still a bit hesitant about having this function built in. The problems are:

  1. pdf.js does not work well with PDFs containing tables and images
  2. The bundle size is quite big: +800 KB gzipped in this case

My speculation is that frontend-only PDF parsing is not that good in practice, so we probably should not add this as permanent functionality. Instead, hiding it behind a toggle in the "settings" page and using the pre-built CDN package seems to be a better solution. If users use it and really love it, we can bundle it inside llama-server later on.
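
A minimal sketch of the settings-toggle idea, assuming the web UI persists its settings as a JSON string (for example in `localStorage`). The `experimentalPdfUpload` flag and the helper names are hypothetical. The point is only that the feature stays off unless the user explicitly enables it, and that the attachment button (and any CDN download of pdf.js, not shown here) would be gated on that flag.

```typescript
// Hypothetical settings shape; the real web UI settings may differ.
interface WebUiSettings {
  experimentalPdfUpload: boolean;
}

// The feature is off by default.
const DEFAULT_SETTINGS: WebUiSettings = { experimentalPdfUpload: false };

// Parse settings persisted as JSON, falling back to the defaults
// when nothing is stored or the stored value is malformed.
function parseSettings(stored: string | null): WebUiSettings {
  if (stored === null) return { ...DEFAULT_SETTINGS };
  try {
    const parsed = JSON.parse(stored);
    const enabled =
      parsed !== null &&
      typeof parsed === "object" &&
      parsed.experimentalPdfUpload === true;
    return { experimentalPdfUpload: enabled };
  } catch {
    return { ...DEFAULT_SETTINGS };
  }
}

// The attachment button is rendered only when the user opted in.
function shouldShowAttachButton(settings: WebUiSettings): boolean {
  return settings.experimentalPdfUpload;
}
```

Accepting only a strict `true` keeps a corrupted or outdated settings blob from turning the feature on by accident.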

In the near future, what I'm thinking is to introduce a skeleton for "experimental" UI functionalities, so more things can be added in the future without the risk of breaking the UI/UX. Things already on my list are:

  • PDF parsing
  • Model context protocol (discussed in another PR)
  • Equivalent of "canvas" on Claude / ChatGPT
  • In-browser Python (Pyodide)
  • Or even a whole Linux emulator in the browser (WebVM)

Development

Successfully merging this pull request may close these issues.

server : add support for file upload to the Web UI