Fix RPC extension UI session reconciliation for editor/newSession flows by jorgeavaldez · Pull Request #127 · dnouri/pi-coding-agent

jorgeavaldez · 2026-02-18T06:43:25Z

Fixes RPC-mode extension UX so extension-driven flows (for example the handoff extension) behave like the pi CLI.

Handoff flow:

generate a prompt
let user edit it
start a new session
insert drafted prompt into the main input buffer

There were three gaps affecting extension flows:

missing support for extension_ui_request method editor
no explicit session-switch event when extensions call ctx.newSession()
session reconciliation after editor responses could fail when callbacks ran outside the chat buffer (e.g. temporary editor buffer context)

What changed

Added full extension_ui_request editor support:
- temporary editor replaces the existing input area (same window)
- C-c C-c submit, C-c C-k cancel
- restores previous buffer and cleans up temporary editor buffer on completion
Fixed unsupported extension UI responses to always echo the original request :id (for correct RPC correlation).
Added extension session reconciliation after extension-driven UI actions:
- run get_state, compare previous/current session identity (session-file, fallback session-id)
- if changed, reload get_messages history and update usage/header
- otherwise refresh state/header only
Expanded reconciliation triggers:
- after sending extension UI responses (confirm / select / input / editor)
- after set_editor_text
- agent_end fallback sync for hook/tool-driven session changes
Added handling for editor callbacks from non-chat buffers:
- pi-coding-agent--extension-ui-send-response now accepts optional chat-buf
- pi-coding-agent--extension-ui-sync-session-state now accepts optional proc/chat-buf
- editor submit/cancel captures originating chat buffer and passes it through, so reconciliation runs against the correct session even when callback execution context is the temporary editor buffer

Validation

Added/updated unit tests for:

editor submit/cancel behavior
unsupported-method response correlation
session sync changed vs unchanged paths after set_editor_text
agent_end fallback sync
extension UI response paths triggering sync
editor submit sync when callback runs outside chat buffer
handoff-style flow simulation: editor submit → new session state reconcile → set_editor_text

Render test suite passes when run against the local checkout.

Disclaimer: This pull request was put together with assistance from an LLM. I reviewed and revised it to the best of my abilities, and I do believe I have a solid understanding of how it works and why this is a good fix 😄

- Handle thinking_start, thinking_delta, thinking_end events - Use ```thinking markdown fence which becomes #+begin_src thinking - Add pi-thinking face (italic, comment color) - Add font-lock to apply face to thinking block content - Simple streaming approach: just insert markers at start/end, no replacement

Add GitHub Actions workflows for automated testing: - Unit tests on Emacs 28.2/29.4 - Integration tests with Ollama (qwen3:1.7b) - GUI tests with real LLM streaming - Nightly tests against pi-mono main branch Key changes: - Split ci.yml into focused workflows (unit/integration/gui) - Add nightly matrix testing: 0.30.2, latest, main - Use qwen3:1.7b (supports tool calling) - Optimize GUI tests to 7 essential tests (4 LLM calls) - Add debug output for CI test failures - Add scripts/ollama.sh for local Docker testing - Support PI_BIN override for testing local pi builds Local usage: make test-gui # NPM version make test-gui PI_BIN=/path/to/cli.js # Local build GUI tests focus on behavior requiring real windows: - Scroll preservation during streaming - Auto-scroll when at buffer end - End-to-end tool invocation

* fix: make pending requests per-process to avoid cross-session interference Root cause: pi--pending-requests was a global hash table shared by ALL pi sessions. When any session's process exited (via pi-quit or process death), it cleared ALL pending request callbacks for ALL sessions. This caused a subtle bug: quitting session A would silently break session B's ability to receive responses. The user would send a prompt, see their message displayed, but never receive a response because the callback was cleared. Changes: - Remove global pi--pending-requests variable - Add pi--get-pending-requests to create/retrieve per-process hash table stored as a process property - Update pi--rpc-async to use per-process table - Update pi--dispatch-response to use per-process table - Update pi--handle-process-exit to only clear that process's requests - Add defensive checks in pi--send-prompt with helpful error messages when process is unavailable All 198 unit tests and 10 integration tests pass. * fix: bind q to pi-quit for consistent quit behavior Previously, q in chat buffer was bound to quit-window while C-c C-p q used pi-quit. This caused inconsistent behavior: - quit-window: might kill or bury buffer, but leaves input window orphaned (showing some other buffer) - pi-quit: properly collects windows, kills both buffers, then deletes the input window Now both q and C-c C-p q use pi-quit for clean, consistent cleanup.

Skip aborted messages when updating pi--last-usage for context percentage display. Aborted messages may have incomplete usage data (zeros if aborted before message_start event), which would reset the header to show 0%. Matches TUI footer.ts behavior: only exclude aborted, not error (errors still consume context and have valid usage data). Fixes dnouri#7

* build: Add package-lint to local linting checks Adds package-lint (MELPA's linter) to catch issues before submission: - New lint-package target in Makefile - Integrated into check.sh pre-commit script - Auto-installs package-lint from MELPA if not present This catches issues like: - Private functions that shouldn't be autoloaded - Package-Requires in non-main files - Various MELPA conventions * ci: Add melpazoid workflow for MELPA style checks Melpazoid catches additional style issues beyond package-lint: - Top-level define-key (should be in keymap definition) - Idiomatic patterns (bolp instead of point comparisons) - License checks - And more Runs on push/PR to main branches. Warnings are informational. * ci: Separate lint workflow from unit tests - Unit Tests workflow now runs 'make test-unit' - New Lint workflow runs 'make lint' (checkdoc + package-lint) - Melpazoid workflow remains separate for style checks - Nightly still runs full 'make check' Added 'test-unit' make target for consistency - each workflow now maps to exactly one make target: test-unit.yml -> make test-unit lint.yml -> make lint test-integration.yml -> make test-integration-ci test-gui.yml -> make test-gui-ci nightly.yml -> make check melpazoid.yml -> (external)

- Remove autoload from private function pi--setup-session - Remove Package-Requires from pi-core.el (only effective in main file) - Use (not (bolp)) instead of point/line-beginning-position comparison - Move define-key calls inside pi-input-mode-map defvar - Add test for slash-capf bolp edge case All issues reported by package-lint and melpazoid are now resolved.

2026-01-02 - Change pi--make-separator to output org heading format (* prefix) - Update pi--post-process-org to demote headings by one level - Content headings now become sub-headings of You/Assistant separators - Enables org-mode folding and structure navigation

- Display HH:MM for today, YYYY-MM-DD HH:MM for older messages - Timestamp right-aligned, eats into right dashes, label stays centered - New pi-timestamp face (inherits shadow) for subtle styling - Extract timestamp from message :timestamp field (ms since epoch) - Capture current-time for live messages

Closes dnouri#5 - Show 'Compacting...' message while compaction runs - Display compaction result with token count and summary - Allow canceling auto-compaction with C-c C-k - Update context usage % after compaction

2026-01-02 Add support for auto-retry events, hook errors, and streaming errors: - Handle auto_retry_start/end events with visual retry notices - Display hook_error events with hook name and error details - Handle error type in message_update events - Show "no model available" warning when API key missing - Prevent duplicate Assistant headers during retry sequences - Add pi-retry-notice and pi-error-notice faces - Track retry state (is-retrying, retry-attempt, last-error) - Clear error/retry state on agent_start/end events - Add comprehensive tests for all new event types

When an API returns an error (e.g., 400 'Could not process image'), the error arrives in message_end with stopReason='error' and the error details in errorMessage. This was being silently ignored. Now we call pi--display-error for these cases, matching the TUI behavior in assistant-message.ts.

Error handling and session recovery

Previously, calling (pi "session-name") while already in a pi buffer would ignore the session argument and reuse the current session. Now we only reuse the current session when no session name is provided.

…ffer Create new session when name provided from pi buffer

Simplify the rendering pipeline from: Agent → Markdown → [pandoc] → Org → Display to: Agent → Markdown → Display Key changes: - Derive pi-chat-mode from gfm-mode instead of org-mode - Replace pandoc conversion with native markdown rendering - Replace drawer syntax (:BASH:...:END:) with overlay-based tool blocks - Tool blocks use background colors matching CLI (pending/success/error) - Align markdown tables after streaming completes - Update thinking block regexp for markdown fence syntax Dependencies: - Add: markdown-mode (pure Elisp, from MELPA) - Remove: pandoc (external binary) Net result: -16 lines of code, simpler architecture, no external tools.

pi-gui-test-streaming-p was only checking for 'streaming' status, but after pi-send the status is 'sending' (waiting for response). This caused a race condition where tests would complete before the LLM response arrived.

Small LLMs like qwen3:1.7b sometimes skip tool calls unless explicitly instructed. Changed prompt from 'Read the file' to 'Use the read tool to show the contents of' for reliability.

Migrate from org-mode to markdown-mode Simplify the rendering pipeline from: Agent → Markdown → [pandoc] → Org → Display to: Agent → Markdown → Display Key changes: - Derive pi-chat-mode from gfm-mode instead of org-mode - Replace pandoc conversion with native markdown rendering - Replace drawer syntax (:BASH:...:END:) with overlay-based tool blocks - Tool blocks use background colors matching CLI (pending/success/error) - Align markdown tables after streaming completes - Update thinking block regexp for markdown fence syntax Dependencies: - Add: markdown-mode (pure Elisp, from MELPA) - Remove: pandoc (external binary)

- test-gui.yml: 15 minute timeout - test-integration.yml: 15 minute timeout - nightly.yml: 10 min for unit, 15 min for integration/GUI Prevents hung tests from consuming 6 hours of CI time. Normal runs complete in 3-5 minutes.

Small LLMs sometimes skip tool calls with ambiguous prompts. Changed 'Use the read tool to show...' to 'Call the read tool on X and show me its contents...' which makes the tool call requirement explicit.

…le (dnouri#21) Previously, session activity state was tracked in two places: - pi--status (buffer-local symbol: idle/sending/streaming/compacting) - :is-streaming in pi--state plist (boolean) This duplication was error-prone and confusing. Now: - pi--status is the single source of truth, defined in pi-core.el - pi--update-state-from-event sets status on agent_start/agent_end - pi--extract-state-from-response returns :status for initialization - :is-streaming and :is-compacting removed from pi--state plist Benefits: - Clearer code: one variable to check, not two - Status set atomically in event handler, not scattered in display functions - Tests are simpler and more focused

* refactor: Rename files from pi-*.el to pi-coding-agent-*.el Per MELPA maintainer request, renaming the package from 'pi' to 'pi-coding-agent' to avoid namespace collision. * refactor: Update provide/require to pi-coding-agent - (provide 'pi) → (provide 'pi-coding-agent) - (provide 'pi-core) → (provide 'pi-coding-agent-core) - All test file provide/require statements updated similarly * refactor: Update package headers and footers - ;;; pi.el → ;;; pi-coding-agent.el - ;;; pi-core.el → ;;; pi-coding-agent-core.el - All test file headers similarly updated - Docstrings updated to reference pi-coding-agent * refactor: Update defgroup and :group references to pi-coding-agent - (defgroup pi) → (defgroup pi-coding-agent) - :group 'pi → :group 'pi-coding-agent (27 occurrences) - Kept :prefix "pi-" for ergonomic function naming * refactor: Update internal documentation strings - "Version of pi.el" → "Version of pi-coding-agent" - "Core functionality for pi.el" → "Core functionality for pi-coding-agent" * refactor: Update test file docstrings - "core pi.el functionality" → "core pi-coding-agent functionality" - "pi.el test files" → "pi-coding-agent test files" - "tests for pi.el" → "tests for pi-coding-agent" * refactor: Update test data paths Update sample directory path in session-dir-name test * refactor: Update Makefile file references - Update byte-compile to use new filenames - Update test loading to use new feature names - Update lint commands for new filenames - PI_BIN and CLI references unchanged (refers to external tool) * refactor: Rename all pi- symbols to pi-coding-agent- Full MELPA compliance: all public and private symbols now use the pi-coding-agent- prefix to match the package name. - Renamed ~200 functions, variables, faces, and constants - Main entry point is now M-x pi-coding-agent - Mode names: pi-coding-agent-chat-mode, pi-coding-agent-input-mode - Face names: pi-coding-agent-separator, etc. - Customization: pi-coding-agent-rpc-timeout, etc. - Fixed docstrings that exceeded 80 chars due to longer names - Updated all test calls from (pi) to (pi-coding-agent) The short 'pi' alias was removed as package-lint doesn't allow aliases with non-conforming prefixes. Users can add their own: (defalias 'pi 'pi-coding-agent) * fix: Update test scripts for renamed functions - Update run-gui-tests.sh to use pi-coding-agent-gui-test- prefix - Fix (pi) call in gui-test-utils to (pi-coding-agent) * docs: Update README for pi-coding-agent package name - Update title and all package references - Update installation instructions (package-install, use-package) - Update configuration variable names - Update M-x command from 'pi' to 'pi-coding-agent' - Add tip for creating short alias - Update all GitHub and MELPA URLs to new repo name * ci: Update MELPA recipe in melpazoid workflow Update package name from 'pi' to 'pi-coding-agent' and repo from 'dnouri/pi.el' to 'dnouri/pi-coding-agent' * docs: Add 'pi' alias to all use-package examples Users who copy the example config will get M-x pi automatically. Updated Usage section to reference M-x pi since users have the alias. * fix: Quote gfm-mode symbol in docstring Checkdoc requires Lisp symbols to be quoted in docstrings.

…dnouri#34) Root cause: Overlay created with rear-advance=t for streaming, but move-overlay doesn't clear the rear-advance property. The overlay kept extending whenever text was inserted at or after its end. Fix: Replace overlay with a new one (rear-advance=nil) when finalizing. This affects both normal tool completion and abort scenarios. Changes: - Add pi-coding-agent--tool-overlay-finalize helper that replaces the overlay with a non-rear-advance version - Use finalize helper in display-tool-end (success/error cases) - Use finalize helper in display-agent-end (abort case) - Update toggle-tool-output to adjust overlay bounds after content change Closes dnouri#31

* Update for pi-coding-agent 0.37 API changes - reset -> new_session command - hook_error -> extension_error event - hookPath -> extensionPath field - queuedMessageCount -> pendingMessageCount - Update CI to test against 0.37.5 * Add integration test for new_session command Catches API breaking changes like reset -> new_session * Update branch API: entryIndex -> entryId The branch command now uses entryId (UUID) instead of entryIndex (number). Format function now takes optional index parameter for display. * Improve integration tests for API compatibility - Simplify new_session test to not require LLM response - Add get_branch_messages test to verify entryId field - Both tests catch API breaking changes without slow LLM calls * Auto-reinstall pi when cached version differs from requested Prevents stale cache from causing test failures after version upgrades. Skips check for 'latest' since we can't compare version strings.

After the rename to pi-coding-agent, the check.sh script still referenced the old file names (pi.el, pi-core.el, etc.). Also added package initialization for markdown-mode dependency, matching the Makefile's approach.

The async callback in pi-coding-agent-new-session relied on current buffer context to find the chat buffer, but async callbacks run in arbitrary buffer contexts. This caused pi-coding-agent--clear-chat-buffer to silently do nothing when the callback was executed in a non-pi buffer. Fix: Capture chat-buf before the async call and use with-current-buffer in the callback, consistent with other commands like pi-coding-agent-cycle-thinking. Also add buffer-live-p check to handle case where buffer is killed while waiting for RPC response.

* fix: Handle empty vector and nil text in branch selector Two issues fixed: 1. Empty vector detection: JSON arrays are parsed as vectors, and (null []) is nil (false). Changed to use seq-empty-p to properly detect empty messages list. 2. Nil text handling: Added (or text "") to handle case where message text might be nil, preventing wrong-type-argument error from truncate-string-to-width. Added tests for both edge cases. * fix: Async callback context issues in branch and resume Multiple functions used --get-chat-buffer and --get-input-buffer inside async callbacks, but callbacks run in arbitrary buffer contexts where these functions return nil. Changes: - --load-session-history: Accept optional chat-buf parameter - --display-session-history: Accept optional chat-buf parameter - --show-branch-selector: Capture chat-buf and input-buf before async call, pass to --load-session-history - pi-coding-agent-resume-session: Capture chat-buf before async call Also includes previous fixes: - Empty vector detection: Use seq-empty-p instead of null - Nil text handling: Default to empty string in format-branch-message Added test for load-session-history buffer parameter. * refactor: Address code review feedback - Consistent naming: Use chat-buf instead of target-buf in --load-session-history (matches parameter name) - Defensive coding: Add buffer-live-p check in --display-session-history - Documentation: Add note about async callback usage, fix docstring (MESSAGES is a vector, not a list) - Clarity: Add comment explaining why text nil check is needed (RPC may return null)

… (dnouri#39) Add font-lock-ensure call after toggling tool output. JIT font-lock only fontifies visible portions lazily, so expanding collapsed content left it unhighlighted until the user scrolled around. The fix follows the established pattern from --finalize-streaming-message and --render-history-text which both call font-lock-ensure after inserting content.

…uri#24) (dnouri#40) The --spinner-stop function now accepts an optional buffer parameter. When called from async callbacks (which may run in arbitrary buffer context like process sentinel), the buffer must be passed explicitly. Without this fix, if the process died during manual compaction: 1. Callback runs in sentinel context 2. --spinner-stop calls --get-chat-buffer which returns nil 3. (delq nil spinning-sessions) doesn't remove actual buffer 4. Spinner continues spinning forever This follows the same pattern used to fix async callback context issues in branch, resume, and new-session.

* fix: Pin all CI workflows to minimum supported pi version (0.51.3) - test-gui.yml: 0.50.1 → 0.51.3 - test-integration.yml: 0.50.4 → 0.51.3 - nightly.yml: 0.50.1 → 0.51.3 in version matrix - README.org: update nightly description (0.48.0 → 0.51.3) - Remove dead get_commands skip guards (0.50.1 compat no longer needed) - Makefile: set load-prefer-newer to avoid stale .elc in make test * refactor: Centralize PI_VERSION — Makefile is single source of truth Workflows now extract PI_VERSION from Makefile instead of hardcoding it. Updating the pinned version only requires changing one line in the Makefile. Nightly simplified to test just two versions: pinned + latest. Removed pi-mono main branch builds (noise — latest covers forward compat). * fix: Session-name integration test race condition The test called accept-process-output once for 1 second, which returns as soon as ANY output arrives. If residual output from the prompt phase was buffered, it returned immediately — before the set_session_name RPC response arrived. The file was then read before the write happened. Fix: poll for the RPC callback flag with a proper timeout loop, matching the pattern used by all other integration tests. * fix: Sync pi-coding-agent-core.el version header to 1.2.6 Was stuck at 0.1.0 while the main file had 1.2.6. Both files are part of the same package and should have matching versions.

…ri#109) - Change stats key from S to i to avoid conflict with Skills S submenu - Switch submenu run keys from digits (1-9) to letters (a-z) to support more than 9 commands without 'Key sequence 1 0' crash - Switch submenu edit keys from shift-numbers (!@#) to uppercase letters (A-Z) so a/A always refer to the same command - Extract shared ordering helper (submenu-commands-ordered) so run and edit key positions always correspond - Cap both key sets at 26 entries

Split the 4,300-line monolith into a strict dependency chain: pi-coding-agent.el (entry point, 165 lines) ├── menu.el (transient menu, session management) ├── input.el (history, completion, send/abort, queuing) └── render.el (chat rendering, tool output, fontification) └── ui.el (shared state, faces, modes, display primitives) └── core.el (JSON/RPC protocol — unchanged) Test files mirror the source structure (6 test files, 454 tests). Per-module make targets for fast feedback during development. Purely structural — no behavioral changes.

…ates (dnouri#113) Fontify buffers (" *pi-fontify:<lang>*") were globally shared by language. When two sessions wrote files in the same language, they corrupted each other's buffer — producing garbled syntax highlighting and triggering full O(n) re-fontification on every token instead of incremental O(1) appends. This caused multi-second freezes during write tool streaming in multi-session workflows. Changes: - Each chat buffer now tracks its own fontify buffers via a buffer-local hash table (pi-coding-agent--fontify-buffers). Lookup is by hash table, not by buffer name. Buffers are killed on session end. - Skip the delete+reinsert cycle in display-tool-streaming-text when the visible tail content is unchanged. Most LLM tokens extend a partial line not shown in the rolling preview, making ~90% of deltas no-ops for the buffer mutation path. fontify-sync still runs on every delta to keep the fontify buffer current. - Remove handle-message-update, which accumulated text deltas via O(n²) concat into a field never read during streaming. - Surface fontify-sync errors via message instead of swallowing silently with (condition-case nil ... (error nil)). Benchmarks (Emacs 30.1, 500-delta write tool, xvfb real windows): Two-session victim (mean): 36,145µs → 1,537µs (23.5x faster) Two-session victim (p95): 84,656µs → 2,753µs (30.7x faster) Single-session write (p50): 641µs → 594µs (stable) Bash streaming (mean): 79µs → 28µs (2.8x faster) Fontify buffer corruption: 246 shrink events → 0 Fontify buffer leaks: confirmed → fixed Realistic partial-token benchmark (1,623 tokens, 171 lines of Python): Display updates: 171 out of 1,623 deltas (89.5% skip rate) Mean per-delta: 1,063µs (includes 89.5% near-free no-ops)

* fix: reduce test model maxTokens from 8192 to 500 qwen3:1.7b generates ~100-200 thinking tokens per response that cannot be disabled via Ollama's OpenAI-compatible API. With maxTokens 8192, the model can ramble for 600+ tokens (21s locally, ~105s on CI). Reducing to 500 caps worst-case output while still allowing: - Tool calls (~160 tokens including thinking overhead) - Short text responses (~300 tokens after thinking) This eliminates the long-tail variance that pushes CI runs past timeouts. * fix: increase test timeouts for CI reliability Integration: 180s → 300s (steer test needs 2 LLM turns, ~170s on CI) GUI: 90s → 180s (tool calls take 50-65s on CI, 90s gave only 38% margin) Added docstrings explaining the rationale for each value. * fix: add Ollama model warmup to all CI workflows First inference after model pull pays a 10-30s cold start penalty for loading weights into memory. Add a minimal warmup call (think:false, num_predict:1) after pull to absorb this cost during setup rather than during the first test. Applied to: nightly, test-integration, test-gui workflows. * fix: reduce test model maxTokens from 500 to 300 500 tokens was still too much for the steer test (needs 2 LLM turns). On a loaded CI runner, one turn took 104s at maxTokens=500. Two turns exceeded the 300s timeout. At maxTokens=300: - Tool calls still work (use 123-155 tokens including thinking overhead) - Text responses complete in ~9s locally (vs ~17s at 500) - Two turns estimated: 90s on CI (5x), 124s under heavy load (10x) - Both well within the 300s timeout Text content is empty at 300 tokens (thinking consumes the budget), but integration tests only check events and RPC mechanics, not content. * fix: improve GUI test reliability — drop /no_think, poll for content Two root causes identified via spike testing (240+ trials): 1. /no_think prefix hurts tool calling: Without it, 'Read the file' achieves 85% tool call rate (vs 75% with /no_think). The prefix is just noise that wastes the model's thinking budget. 2. ~2% tool re-call rate: The model occasionally calls read again after getting results (following the tool description's 'continue with offset' instruction). This loops indefinitely in the agent. Fixes: - Drop /no_think from all GUI test prompts - content-tool-output-shown: poll for expected content instead of waiting for idle, making the test immune to re-call loops - scroll-preserved-tool-use: same pattern — wait for tool output before checking scroll position - Bump maxTokens 300→400 for better tool call success rate while keeping steer test timing reasonable (~200s estimated for 2 turns) * fix: tool-overlay-bounded — poll for content, avoid idle race The test failed in 0.007s because of a race condition: the user message is displayed (satisfying wait-for-send-start), but the model hasn't started streaming yet, so wait-for-idle returns immediately. The test then checks for 'BEFORE' which isn't in the chat yet. Fix: send with no-wait and poll for expected content, same pattern as content-tool-output-shown. This also makes the test immune to the ~2% tool re-call loop. * fix: tool-overlay-bounded — split into two simple prompts The combined prompt ('Read file and say ENDMARKER') has ~15-30% tool call failure rate because it's too complex for qwen3:1.7b. Split into two prompts: 1. 'Read the file PATH' — simple, ~85% tool call rate 2. 'Say ENDMARKER' — no tool needed, text only The test still verifies the overlay boundary: ENDMARKER text (from the second turn) must not be inside the tool block overlay (from the first turn). The overlay is per-tool-block, so text from any subsequent turn exercises the same boundary logic. * fix: set maxTokens=500 + integration timeout 600s CI runs revealed two competing constraints: - Tool calling needs maxTokens≥500 (at 300-400, ~15-30% no-tool-call rate) - Steer test needs low maxTokens (2 turns × 150s/turn at maxTokens=500) Resolution: use maxTokens=500 for reliable tool calls (run 1 had 13/13 GUI pass at 500), and increase integration timeout to 600s to accommodate the steer test. Workflow job timeout bumped to 20min for integration jobs (nightly + PR/push). * fix: tool-overlay-bounded — use fresh session to avoid race Tests 7-11 share a session created after extension tests killed the original. This new session has a race condition where wait-for-idle returns before the model starts streaming, causing 0.007s completions. Use with-fresh-session for the overlay test to guarantee a properly initialized session, matching how content-tool-output-shown succeeds (it uses the original session from test 1, which works reliably).

* refactor: remove npm cache from CI workflows The cache saved ~5-10s per job but caused a critical bug: the 'latest' cache key (deps-Linux-pi-latest) never busted, so nightly pi@latest tests were running pi 0.30.2 instead of the actual latest (0.52.9). npm install takes ~7-15s — not worth the complexity. * feat: bump pinned pi version 0.51.3 → 0.52.9 All RPC commands we use (get_commands, set_session_name, etc.) work correctly with 0.52.9. The 0.52.x series adds image support for steer/follow_up and auto-compaction resume — backward-compatible changes that don't affect our test suite. * trigger CI

…up (dnouri#116) Copying from the chat buffer preserves raw markdown by default — useful for pasting into docs, Slack, or any markdown-aware context. For plain text without **, backticks, or code fences: - Defcustom pi-coding-agent-copy-visible-text: set to t to make all copy commands (M-w, C-w) strip hidden markup automatically - Command pi-coding-agent-copy-visible: explicit one-off, bindable Uses filter-buffer-substring-function to strip invisible and display-empty text. Delegates to buffer-substring--filter when disabled.

Derive pi-coding-agent-input-mode from gfm-mode instead of text-mode. This gives the prompt buffer markdown highlighting (bold, italic, code spans, fenced blocks, headings) while keeping markup visible. All pi keybindings (C-c C-c, C-c C-k, TAB, M-p, M-n, etc.) override their gfm-mode counterparts via the derived mode keymap.

The pi CLI sends ANSI color codes in extension_ui_request setStatus messages (e.g. plan-mode sends \e[38;5;226m⏸ plan\e[39m for yellow text). Terminal UIs handle these natively, but Emacs header-lines render them literally, producing garbled output like: │ [38;5;226m⏸ plan[39m Apply ansi-color-filter-apply in --extension-ui-set-status to strip escape sequences at the ingestion boundary. The built-in ansi-color library is already required by render.el. Add test for ANSI stripping alongside existing setStatus tests.

Strip ANSI escape codes from extension status text

…ut (dnouri#119) Register a jit-lock function that runs after font-lock and strips markdown text properties (display, invisible, font-lock-multiline, face) from the pending tool overlay, restoring intended faces for both header and content regions.

* test: add thinking whitespace normalization coverage * fix: normalize thinking block whitespace while streaming * fix: harden thinking streaming normalization and state reset

Press f on any user turn in the chat buffer to fork the conversation from that point. Uses the get_tree RPC to map visible buffer position to entry ID, handling compaction (compacted-away turns aren't rendered, so the last-N visible headings are aligned to the active branch). Also fixes chat navigation (n/p): the old pattern searched for the never-present "You:" instead of setext headings, and now recenters the heading to the top of the window. Changes: - Setext heading detection: regex + underline predicate, shared across navigation and fork-at-point - Turn detection: collect headings, map point to 0-based turn index - Tree helpers: flatten-tree (hash index) and active-branch-user-ids (leaf-to-root walk) - execute-fork: extracted shared fork logic (RPC + state refresh + history reload + input prefill), used by both fork-at-point and the menu fork selector - resolve-fork-entry: tree-to-entryId mapping, independently testable - build-tree test helper: flat specs replace deeply nested tree construction in tests 49 new tests covering heading detection, navigation, turn indexing, tree walk scenarios, and fork-at-point integration.

* feat: DWIM toggle for pi-coding-agent entry point Add agent-shell-style DWIM behavior to the main `pi-coding-agent' command and a new `pi-coding-agent-toggle' command for toggling session window visibility. DWIM behavior for M-x pi-coding-agent (no prefix arg): - From a pi buffer: toggle visibility (hide windows) - From a non-pi buffer with existing project session: reuse it - No session for project: create a new one - C-u prefix or Lisp string arg: named session (unchanged) New command `pi-coding-agent-toggle': - Pi windows visible -> hide them - Pi windows hidden but session exists -> show them - No session -> user-error New functions in pi-coding-agent-ui.el: - `pi-coding-agent-project-buffers': find all pi chat buffers for the current project directory, ordered by buffer-list recency - `pi-coding-agent--hide-buffers': hide pi windows using a hybrid strategy: delete-window when the frame has other windows, bury-buffer for sole-window frames. Second pass handles the case where bury-buffer lands on the paired pi buffer. Bug fix in test-common.el: - Mock session cleanup called nonexistent functions `pi-coding-agent--chat-buffer-name' and `pi-coding-agent--input-buffer-name' (swallowed by ignore-errors, causing buffer leaks). Fixed to use `pi-coding-agent--buffer-name' with :chat/:input type argument. Tests: 5 new tests (502 total, all passing): - dwim-reuses-existing-session - new-session-with-prefix-arg - project-buffers-finds-session - project-buffers-excludes-other-projects - toggle-no-session-errors * refactor: simplify DWIM hide logic, input-first window deletion - Rename --hide-buffers → --hide-session-windows for clarity - Process input windows before chat windows: input windows are always child splits so delete-window succeeds cleanly, then chat windows are handled. Eliminates the second-pass hack that was needed when chat was processed first. - Bind project-buffers result once instead of calling twice - Move side effects out of cond conditions for readability * fix: preserve window layout when hiding session Don't delete-window the chat window — that collapses vertical splits. Instead, follow quit's pattern: delete input windows (child splits collapse into parent), then bury the chat buffer in its window (switches to previous buffer, window stays). Before: [code | pi-chat] → hide → [code] (split gone) After: [code | pi-chat] → hide → [code | prev-buf] (split kept) * fix: review cleanup — docs, toggle deps, test hygiene Address issues from code review: - Remove 'how we got there' paragraph from --hide-session-windows docstring (comments should explain what/why, not history) - Remove redundant negative comment ('don't delete the window') - Add --check-dependencies to pi-coding-agent-toggle (was missing) - Add toggle to Usage header in Commentary - Rewrite prefix-arg test: no longer nests inside with-mock-session macro unnecessarily; proper unwind-protect structure - Add test for DWIM reuse of named sessions (explicit coverage for the behavior change where project-buffers finds named sessions) --------- Co-authored-by: Conor Nash <conor@nbs.consulting> Co-authored-by: Daniel Nouri <daniel.nouri@gmail.com>

dnouri and others added 30 commits December 31, 2025 13:05

refactor: Use split-string-shell-command for arg parsing (dnouri#12)

75f8496

feat: Add visual feedback for compaction (dnouri#16)

a5754a0

Closes dnouri#5 - Show 'Compacting...' message while compaction runs - Display compaction result with token count and summary - Allow canceling auto-compaction with C-c C-k - Update context usage % after compaction

Merge pull request dnouri#17 from dnouri/feature/error-handling

05d2750

Error handling and session recovery

fix: Create new session when name provided from pi buffer

110103d

Previously, calling (pi "session-name") while already in a pi buffer would ignore the session argument and reuse the current session. Now we only reuse the current session when no session name is provided.

Merge pull request dnouri#18 from dnouri/fix-named-session-from-pi-bu…

71fc80c

…ffer Create new session when name provided from pi buffer

fix: Wait for 'sending' status in GUI tests

99fce68

pi-gui-test-streaming-p was only checking for 'streaming' status, but after pi-send the status is 'sending' (waiting for response). This caused a race condition where tests would complete before the LLM response arrived.

fix: Make tool-overlay test prompt more explicit

b4f283c

Small LLMs like qwen3:1.7b sometimes skip tool calls unless explicitly instructed. Changed prompt from 'Read the file' to 'Use the read tool to show the contents of' for reliability.

fix: Add timeout-minutes to CI workflows (dnouri#20)

3880fc9

- test-gui.yml: 15 minute timeout - test-integration.yml: 15 minute timeout - nightly.yml: 10 min for unit, 15 min for integration/GUI Prevents hung tests from consuming 6 hours of CI time. Normal runs complete in 3-5 minutes.

fix: clarify tool-use prompt in flaky GUI test (dnouri#22)

a35146a

Small LLMs sometimes skip tool calls with ambiguous prompts. Changed 'Use the read tool to show...' to 'Call the read tool on X and show me its contents...' which makes the tool call requirement explicit.

Update check.sh to use renamed file names (dnouri#37)

ea78df4

After the rename to pi-coding-agent, the check.sh script still referenced the old file names (pi.el, pi-core.el, etc.). Also added package initialization for markdown-mode dependency, matching the Makefile's approach.

dnouri and others added 28 commits February 8, 2026 01:57

chore: Bump version to 1.3.0

43706c7

feat: Show args and details for generic (extension) tools (dnouri#112)

0cbe484

chore: bump version to 1.3.1

c44120f

Merge pull request dnouri#118 from conornash/ansi-code-stripping

0dd2a33

Strip ANSI escape codes from extension status text

feat: Better copy defaults, opt-in GFM input highlighting (dnouri#120)

953c85a

fix: stabilize read and thinking markdown rendering (dnouri#122)

aedd1a0

chore: bump version to 1.3.2

88e20cd

fix: stabilize thinking block rendering while streaming (dnouri#123)

17536e4

* test: add thinking whitespace normalization coverage * fix: normalize thinking block whitespace while streaming * fix: harden thinking streaming normalization and state reset

support editor requests from pi

f1d0cd9

add debug logs

d32faf9

detect and resolve extensions creating a new session

4cd96a2

remove debug logs

86e8117

sync state between extension controlled interactions

a7ecf18

test for edge-case: no get_state call from non-chat callback buffer

4ffb05d

sync state from chat buffer

8d86d97

fix lint issues

7a2cf41

jorgeavaldez force-pushed the feat/rpc/support-editor-resolve-session-changes branch from 55001f1 to 7a2cf41 Compare February 18, 2026 20:27

dnouri force-pushed the master branch from 7311a02 to 1d38a20 Compare February 26, 2026 13:07

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix RPC extension UI session reconciliation for editor/newSession flows#127

Fix RPC extension UI session reconciliation for editor/newSession flows#127
jorgeavaldez wants to merge 135 commits intodnouri:masterfrom
jorgeavaldez:feat/rpc/support-editor-resolve-session-changes

jorgeavaldez commented Feb 18, 2026 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Conversation

jorgeavaldez commented Feb 18, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

What changed

Validation

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

jorgeavaldez commented Feb 18, 2026 •

edited

Loading