Media tracking implementation not compatible with customized org-attach-id-dir #1073

@bgutter

Please update gptel first -- errors are often fixed by the time they're reported.

  • I have updated gptel to the latest commit and tested that the issue still exists

Bug Description

I believe that gptel-send does not correctly locate media files referenced by org-mode "attachment" links when using org-attach's "ID based" storage support. Instead, gptel will gather an incorrect path to the image file, determine that the file is not readable, and then fail to encode and send it to the LLM.

I spent some time debugging this and have a fix that works for me, but I am not sure if it is appropriate.

First, in gptel--parse-media-links, call org-attach-expand on the link path when the org element is an attachment link. Without this, gptel tries to open a relative file path that does not exist.

        (when-let* ((link (org-element-context))
                    ((gptel-org--link-standalone-p link))
                    (raw-link (org-element-property :raw-link link))
                    (raw-path (org-element-property :path link))
                    (type (org-element-property :type link))
                    (path (if (string= type "attachment")
                              (org-attach-expand raw-path)
                            raw-path))

Second, in gptel--realize-query, set the current buffer appropriately before calling gptel--parse-buffer. Without this, org-attach-expand will not function correctly.

          (setq full-prompt (with-current-buffer (plist-get info :buffer)
                              (gptel--parse-buffer ;prompt from buffer or explicitly supplied
                               gptel-backend (and gptel--num-messages-to-send
                                                  (* 2 gptel--num-messages-to-send)))))
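
To illustrate why the current buffer matters: org-attach-expand looks up the attachment directory from the Org entry at point in the current buffer, so calling it from some other buffer either fails or resolves against the wrong entry. A minimal sketch, with a hypothetical helper name and arguments:

```elisp
(require 'org-attach)

;; Hypothetical helper, for illustration only.
(defun my/attachment-path-in-buffer (buffer pos file)
  "Expand attachment FILE as seen from position POS in BUFFER."
  (with-current-buffer buffer
    (save-excursion
      (goto-char pos)
      ;; Resolved relative to the entry containing POS.
      (org-attach-expand file))))
```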

If this issue report and proposed resolution pass your sniff test, I can put together a pull request.

Backend

None

Steps to Reproduce

  1. Customize org-attach-id-dir to some random location on your machine, not in the same working directory as your chat buffer
  2. Attach an image file and link to it
  3. Enable media tracking
  4. Try to send it, asking "What is in this image?"
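
For reference, the settings involved in steps 1 and 3 might look like the following in an init file. This is a sketch under assumptions: the directory is an arbitrary example path, and it assumes media tracking is controlled by the gptel-track-media option.

```elisp
;; Step 1: store attachments outside the chat buffer's directory.
;; The path is an arbitrary example.
(setq org-attach-id-dir "~/org-attachments/")

;; Step 3: let gptel send linked media (assumed option name).
(setq gptel-track-media t)

;; Optional: exposes the query inspection commands used in the
;; alternative reproduction.
(setq gptel-expert-commands t)
```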

Without the workaround, the LLM says that it cannot see images. With the workaround, it describes the image content.

Alternatively, you can set gptel-expert-commands to t and use the JSON query inspection support:

  1. Perform steps 1-3 above
  2. Inspect query JSON in the gptel transient

Without the workaround, the JSON payload shows that the attachment link was not detected as media: the link text simply appears in the preceding :text node. With the workaround, the media is represented as an encoded entry.

Additional Context

Reproduced on 2 machines

  • NixOS, Emacs 30.1
  • Windows 11, Emacs 29.something IIRC

Backtrace

Log Information
