Quick LLM scratchpad inside your Zsh — ask, get an answer, move on.
Toggle into a prompt where you can draft multi-line queries, send them to an LLM, and stream replies directly in your shell.
- Toggle between shell and LLM mode with Ctrl-Q
- Multi-line editing in LLM mode (Ctrl-J inserts a newline)
- Carries whatever you have already typed at the shell prompt into the LLM prompt, and hands it back when you return to the shell
- The LLM prompt stays buffered until you submit it or toggle back to the command line
- Cancel with Esc or Ctrl-C
- Spinner while waiting for first response
- Streamed output from your LLM tool
- Works with any CLI LLM that supports stdin/stdout (Ollama, sgpt, AWS Q Chat, MLX LM, Google Gemini CLI, OpenAI Codex, etc.)
Interactive LLM scratchpad for Zsh — toggle into query mode and stream answers inline.
Clone and source manually:
git clone https://github.com/oleks-dev/zsh-llm-mode.git ~/.zsh-llm-mode
echo "source ~/.zsh-llm-mode/zsh-llm-mode.plugin.zsh" >> ~/.zshrc
Or with a plugin manager:
# zinit
zinit light oleks-dev/zsh-llm-mode
# antidote
antidote bundle oleks-dev/zsh-llm-mode
Set the backend command before loading the plugin:
export ZSH_LLM_MODE_CMD='ollama run phi4-mini:3.8b'
You can change it at any time:
export ZSH_LLM_MODE_CMD='ollama run qwen3:8b --think=false --hidethinking'
export ZSH_LLM_MODE_CMD='sgpt --model gpt-4'
export ZSH_LLM_MODE_CMD='q chat --no-interactive'
export ZSH_LLM_MODE_CMD='mlx_lm.generate --model /Users/guest/.cache/huggingface/hub/models--mlx-community--Mistral-7B-Instruct-v0.3-4bit/snapshots/a4b8f870474b0eb527f466a03fbc187830d271f5 --prompt -'
export ZSH_LLM_MODE_CMD='gemini'
export ZSH_LLM_MODE_CMD='codex exec --model gpt-5'
Make sure your backend command reads from stdin and writes to stdout; that is all the plugin needs. If your LLM CLI tool works when invoked like this:
echo "hello" | gemini
then it can be used without problems.
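One way to check a candidate backend before wiring it in is to pipe a short prompt through it and confirm that something comes back on stdout. The sketch below does that. `llm_backend_check` is a hypothetical helper name, not part of the plugin, and it assumes the command string is trusted, since it is run through `sh -c`.

```sh
# Hypothetical helper (not part of the plugin): succeeds if the given
# command accepts a prompt on stdin, exits cleanly, and prints something
# on stdout.
llm_backend_check() {
  local cmd prompt out
  cmd=$1
  prompt=${2:-hello}
  # Feed the prompt on stdin and capture stdout; stderr is discarded.
  out=$(printf '%s\n' "$prompt" | sh -c "$cmd" 2>/dev/null) || return 1
  # An empty reply means the command did not write to stdout.
  [ -n "$out" ]
}
```

For example, `llm_backend_check "$ZSH_LLM_MODE_CMD" && echo "backend looks usable"` runs the check against whatever backend is currently configured. Note that this sends a real prompt to the backend, so it may take a moment or consume API quota.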
By default, the plugin returns the LLM output in full once it has been received.
For line-by-line streaming you need stdbuf or gstdbuf, which is detected automatically.
On macOS you can install GNU coreutils and use gstdbuf:
brew install coreutils
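To confirm that line-by-line streaming will be available on your machine, the detection can be approximated by hand. This is only a sketch of the idea, not the plugin's own code: `pick_line_buffer_tool` and `llm_stream_tool` are made-up names, and the lookup order (stdbuf first, then gstdbuf) is an assumption.

```sh
# Hypothetical helper: prints the first line-buffering tool found on PATH
# and returns non-zero if neither is installed.
pick_line_buffer_tool() {
  local tool
  for tool in stdbuf gstdbuf; do
    if command -v "$tool" >/dev/null 2>&1; then
      printf '%s\n' "$tool"
      return 0
    fi
  done
  return 1
}

if llm_stream_tool=$(pick_line_buffer_tool); then
  echo "streaming available via $llm_stream_tool"
else
  echo "no stdbuf/gstdbuf found; output will arrive all at once"
fi
```

With GNU coreutils, `stdbuf -oL some-command` (or `gstdbuf -oL some-command` on macOS) runs a command with line-buffered stdout, which is what makes replies appear line by line instead of in one block.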
- Ctrl-Q → toggle LLM mode
- Ctrl-J → insert newline in LLM prompt
- Enter → send query
- Esc / Ctrl-C → cancel
Example:
LLM> (Enter=send, ^Q=switch, ^J=new line, Esc=cancel)
➜ ls -al what does this command do?
[LLM query]: ls -al what does this command do?
...
By default, toggle LLM mode with Ctrl-Q (mnemonic: Q for Question).
If Ctrl-Q is taken on your system (e.g. flow control), you can either:
- Disable flow control:
stty -ixon
- Or rebind to another key (example: Ctrl-G):
bindkey '^G' llm-toggle
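If you are unsure whether flow control is what is intercepting Ctrl-Q, you can inspect the terminal settings before choosing between the two options. The helper below is an illustrative sketch (`flow_control_state` is a hypothetical name, not something the plugin provides). It looks for the `-ixon` token in the output of `stty -a`.

```sh
# Hypothetical helper: prints "on", "off", or "no-tty" depending on
# whether XON/XOFF flow control is enabled for the current terminal.
flow_control_state() {
  # stty only makes sense when stdin is a terminal.
  if ! [ -t 0 ]; then
    echo "no-tty"
    return 0
  fi
  # Put each setting on its own line, then match the exact token "-ixon"
  # (so that "-ixoff" is not mistaken for it).
  if stty -a 2>/dev/null | tr ' \t;' '\n\n\n' | grep -qx -- '-ixon'; then
    echo "off"
  else
    echo "on"
  fi
}
```

If `flow_control_state` prints `on`, the terminal driver is likely consuming Ctrl-Q before Zsh sees it, so either `stty -ixon` or a different key binding is needed.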
MIT