
feat(cli): prompt completion #4691

Merged
jacob314 merged 10 commits into google-gemini:main from 3ks:feat/prompt-completion
Aug 21, 2025

Conversation

@3ks
Contributor

@3ks 3ks commented Jul 22, 2025

TL;DR

This PR introduces a real-time Prompt Completion feature for writing prompts. The goal is to dramatically improve the user experience by proactively assisting users in crafting detailed and effective prompts, much like how Google provides Search Suggestions or GitHub Copilot offers Code Completion.

The prototype uses the gemini-flash-2.5:nothinking model for fast, near-instant suggestions (noticeably faster than thinking models).

[Image: prompt_complete2]

Dive Deeper

Learning from Proven UX Patterns

We've seen how assistive input mechanisms have revolutionized major product categories:

  • Google Search provides Search Suggestions for billions of users, turning a vague idea into a precise query.

[Image: search_suggestion]

  • GitHub Copilot provides Code Completion for tens of millions of developers, transforming a comment into functional code.

[Image: code_completion]

  • Savvy users already ask LLMs to help them refine and optimize their prompts.
  • LLM agents automatically write prompts, break down requirements, and complete tasks.

This leads to a natural question: Why can't LLMs proactively provide Prompt Completion for billions of users? We should assist users in crafting effective prompts right from the input stage.

The Problem: The "Prompt-Writer's Block"

The parallel between these interactions is striking: a user provides partial input, and the system intelligently suggests the rest.

Currently, users often find themselves pausing while writing prompts, grappling with how to expand on their ideas or what details to include. Ironically, crafting the perfect prompt can take more time and mental effort than the LLM takes to generate a response. Prompt Completion directly solves this by making the prompt-writing process itself a creative and collaborative partnership with the AI.

A Proof-of-Concept

I have developed a working proof-of-concept for this feature within the gemini-cli, demonstrating its feasibility.

To ensure a seamless and fluid user experience, the prototype uses the gemini-flash-2.5:nothinking model, which prioritizes the rapid response times necessary to provide suggestions almost instantly after the user pauses typing.

[Image: prompt_complete]
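The pause-then-suggest behavior described above can be sketched as a small debounce-plus-cancel helper. All names here are hypothetical (the actual implementation lives in the usePromptCompletion React hook); this is only a minimal sketch of the timing logic.

```typescript
// Sketch of a debounced completion trigger (hypothetical helper names).
// After the user stops typing for `debounceMs`, fire one request; any
// new keystroke cancels both the pending timer and the in-flight request.
type Fetcher = (text: string, signal: AbortSignal) => Promise<string>;

function createCompletionTrigger(fetchSuggestion: Fetcher, debounceMs = 250) {
  let timer: ReturnType<typeof setTimeout> | undefined;
  let controller: AbortController | undefined;

  return {
    onInput(text: string, onSuggestion: (s: string) => void) {
      // A keystroke invalidates any pending or in-flight completion.
      if (timer !== undefined) clearTimeout(timer);
      controller?.abort();
      timer = setTimeout(() => {
        controller = new AbortController();
        fetchSuggestion(text, controller.signal)
          .then(onSuggestion)
          .catch(() => {
            /* aborted or failed: show no suggestion */
          });
      }, debounceMs);
    },
  };
}
```

The key property is that a suggestion only ever arrives for the text as it stood when the user paused; anything typed in the meantime discards it.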

As a meta-demonstration of AI assistance, all of the code, comments, and even this pull request description were crafted with the help of an LLM.

I am not a front-end developer, so parts of the implementation may not be ideal.

Prospect

I think Prompt Completion will become as natural and indispensable to all AI products as Search Suggestions are to search engines today. It represents the next logical step in enhancing the usability of AI for everyone.

The primary purpose of this PR is to introduce the concept of Prompt Completion and to spark a discussion about its potential.

Reviewer Test Plan

To validate this change, a reviewer should pull down the branch and test the core functionality in their local environment.

  1. Trigger Completion:

    • Start typing a prompt.
    • Pause for a moment (e.g., 1-2 seconds).
    • A suggestion for completing your prompt should appear inline.
  2. Test Various Scenarios: Please try a few different classes of prompts to test the quality of suggestions:

    • Simple Start: Write a short story about a robot who... (and pause)
    • Technical Query: Explain the difference between a virtual machine and a container, focusing on... (and pause)
    • Creative Task: Generate a marketing slogan for a new coffee brand that is... (and pause)
    • Chain of Thought: First, summarize the plot of Hamlet. Second, analyze the character of Ophelia. Third,... (and pause)

Testing Matrix

Legend: ✅ = Tested & Working, ❓ = Needs Testing, - = Not Applicable

|          | 🍏 macOS | 🪟 Windows | 🐧 Linux |
|----------|----------|------------|----------|
| npm run  | ❓       | ❓         | ❓       |
| npx      | ❓       | ❓         | ❓       |
| Docker   | ❓       | ❓         | ❓       |
| Podman   | -        | -          | ❓       |
| Seatbelt | ❓       | -          | -        |

Linked issues / bugs

No linked issues. This is a new feature proposal.

@3ks 3ks requested a review from a team as a code owner July 22, 2025 21:53
Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Summary of Changes

Hello @3ks, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request implements a significant new feature: real-time prompt completion within the CLI. The primary goal is to streamline the prompt-writing process for users by offering intelligent, AI-generated suggestions as they type, drawing inspiration from successful patterns like search suggestions and code completion. This enhancement aims to make crafting effective prompts more intuitive and less prone to 'writer's block.'

Highlights

  • New Feature: Prompt Completion: Introduces a real-time AI-powered prompt completion feature to the CLI, aiming to enhance user experience by proactively assisting users in crafting detailed and effective prompts.
  • AI Model Integration: Leverages the gemini-flash-2.5:nothinking model for fast, near-instant suggestions, integrated via a new usePromptCompletion React hook.
  • User Interface Enhancements: Adds UI elements to display prompt suggestions inline, appearing after a brief pause in typing, and disappearing upon prompt submission.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page; here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point in your pull request by creating an issue comment (i.e., a comment on the pull request page) using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands.

| Feature | Command | Description |
|---|---|---|
| Code Review | /gemini review | Performs a code review for the current pull request in its current state. |
| Pull Request Summary | /gemini summary | Provides a summary of the current pull request in its current state. |
| Comment | @gemini-code-assist | Responds in comments when explicitly tagged, both in issue comments and review comments. |
| Help | /gemini help | Displays a list of available commands. |

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by adding files to a .gemini/ folder at the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments or fill out our survey to provide feedback.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request introduces an excellent prompt completion feature. The implementation is a great start. I've provided feedback to address a critical issue in the prompt sent to the model, and a couple of high-priority suggestions to improve maintainability and efficiency by removing duplicated code and replacing an inefficient polling mechanism. These changes will make the feature more robust and easier to maintain.

@3ks 3ks force-pushed the feat/prompt-completion branch from dbb32f4 to 3283f14 Compare July 23, 2025 09:47
@3ks
Contributor Author

3ks commented Jul 24, 2025

/cc plz. @jacob314

@jacob314
Contributor

Fyi @miguelsolorio interested in your thoughts on what the UX for this in a CLI tool should be.

Some ideas:

  1. surface this as ghost text in the CLI that can be accepted with (tab)
  2. surface this aligned with how we surface general autocomplete.

Concerns:
How does this align with the important autocomplete we show for @ and / tool commands? I would imagine we should only surface this when there are no manual autocompletes, but we could also consider a world where both AI and other autocompletes are surfaced.

@miguelsolorio
Contributor

I'm not sure I understand what problem this feature is aiming to solve, as users tend to have a pretty good idea of what they want to do. If anything, this would work better as a "prompt enhancer". So I'd be more concerned about introducing this before addressing other pressing issues.

In terms of the UX, because there is a significant delay in retrieving suggestions, waiting for the results really slows down the user's workflow. So the latency needs to be improved.

I also think this needs to be in the input as ghost text instead of suggestions.

@jacob314
Contributor

Agree this needs to be in ghost text that only surfaces when other autocompletes are not present. Otherwise the fairly infrequently used suggestions will get in the way of navigating the input prompt and history. I tried this locally, and the latency also felt too high for me. It was also somewhat jarring that the suggestions did not feel grounded in my local project and GEMINI.md, unlike the rest of the experience with Gemini CLI.

I'd be open to landing this as an experiment but it needs to be behind a setting in settings.json as this isn't something that is ready to be on for most users. If it became suitably popular and suitably polished we could later consider enabling it by default.
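Ghost text in a terminal can be approximated with the ANSI "dim" attribute: render the typed text normally and the suggestion faintly after it. This is a hypothetical sketch, not the actual rendering code (Gemini CLI's Ink-based UI works through its rendering layer rather than raw escape codes):

```typescript
// Render the user's typed text plus a dimmed "ghost" suggestion.
const DIM = "\u001b[2m"; // ANSI SGR 2: faint/dim
const RESET = "\u001b[0m"; // reset all attributes

function renderWithGhost(typed: string, ghost: string): string {
  // The ghost portion reads as a faint continuation the user can
  // accept (e.g. with Tab) or ignore by continuing to type.
  return typed + DIM + ghost + RESET;
}
```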

@3ks
Contributor Author

3ks commented Jul 24, 2025

@miguelsolorio I agree with your view that users know what they want to do. However, users often need to pause and think about the details, especially when they need to input a very long prompt. (e.g. I didn’t type this paragraph fluently; I had to stop, think, and then continue).

The function of this feature is that the user provides a partial idea, and Gemini provides inspiration or even a complete prompt. This is similar to the role of search suggestions in a search engine.

This isn’t about the user needing to wait for Gemini’s response, but rather Gemini proactively providing suggestions while the user is pausing to think. The current logic is that if the user types anything while waiting for a Gemini suggestion, I assume they have their own idea and the request is cancelled.

I completely agree with using ghost text to implement Prompt Completion. I will probably implement it in the next one or two days. It’s not my working hours right now, and I need to go to sleep.

Regarding the technical issues, you’re right that it takes a few seconds to get suggestions from Gemini. This is limited by the model and the network. I used the Flash no-thinking model to give suggestions as quickly as possible. For further optimization, we could consider not sending the chat history in the request (it’s currently sent by default), but this would cause the suggestions to lose context and be based only on the content in the input box. If you have any other optimization ideas, I’d be happy to hear them.

@3ks
Contributor Author

3ks commented Jul 24, 2025

@jacob314 Thanks for the detailed feedback and for trying it out locally.

This is very much a prototype implementation, focused on exploring a new user experience for prompt input.

The quality of the suggestions isn’t great right now, and I see this as an area for code improvement. For example, we can improve the results by reading the directory’s GEMINI.md and optimizing the prompt we send to the Gemini AI.

Regarding the latency, I’d like to reiterate the core idea: this feature isn’t meant to interrupt a user’s flow, but to offer help when they naturally pause to think or get stuck. From that perspective, a delay of a few seconds can be acceptable, as those thinking pauses can often last from several seconds to even longer (I know mine certainly do!).

@3ks 3ks force-pushed the feat/prompt-completion branch from 3283f14 to 2f7c6fe Compare July 27, 2025 18:31
@3ks
Contributor Author

3ks commented Jul 27, 2025

I’ve made some changes to the code:

  1. Added a config option to control whether prompt completion is enabled. It’s turned off by default.

  2. Implemented prompt completion using ghost text. (Since this needs to handle things like rendering multi-line ghost text, the existing code couldn’t be fully reused).

  3. Optimized the parameters for the Gemini call. The temperature was changed from 0.67 to 0.3, and the prompt was adjusted from focusing on creativity to focusing on the user’s intent and providing more factual suggestions.
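The parameter changes in point 3 might look roughly like the following. This is a hedged sketch: the interface name, maxOutputTokens value, and instruction wording are assumptions for illustration; the real generateContent call's config shape may differ.

```typescript
// Hypothetical shape of the generation parameters described above.
interface CompletionConfig {
  temperature: number; // lowered from 0.67 to 0.3: favor predictable,
  //                      intent-following continuations over creativity
  maxOutputTokens: number; // assumed small cap to keep suggestions short
  systemInstruction: string;
}

const completionConfig: CompletionConfig = {
  temperature: 0.3,
  maxOutputTokens: 64,
  systemInstruction:
    "Continue the user's prompt. Stay factual and match the user's intent; " +
    "do not answer the prompt, only extend it.",
};
```

The design point is the instruction's framing: the model should extend the prompt, not answer it, which is what distinguishes prompt completion from a normal chat turn.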

/cc @miguelsolorio @jacob314

@3ks
Contributor Author

3ks commented Jul 28, 2025

it was also somewhat jarring that the suggestions did not feel grounded in my local project and GEMINI.md unlike the rest of the experience with Gemini CLI.

@jacob314 Regarding this issue, this use case is different from having the LLM answer a question using UserMemory. The purpose of prompt completion is to get an extension of the current prompt text, not an answer to a question based on UserMemory. It is more difficult to achieve the desired results.

I’ve checked the code, and geminiCli.generateContent() does include userMemory information by default. When the user provides a sufficiently descriptive prompt, UserMemory can still be utilized effectively:

[Image: gemini]

As you can see from this example, prompt completion correctly suggested the appropriate framework, test file format, and mocking methods.

In comparison, I am more worried about multi-line ghost text rendering; I am not sure whether the current implementation is good enough. As I said before, I am not a front-end developer, and the current code is LLM-generated.

Contributor

@jacob314 jacob314 left a comment


This is a lot better. Sorry for the slow review. I've now fixed my notification settings to be able to stay on top of the large number of pull requests more easily.
Some bugs:

  • I can get into this state when I move the cursor up to the location seen in this screenshot. I would have only expected completions when at the end of the file. [screenshot]
  • The cursor disappears when completions are shown. [screenshot]
  • The ghost text lacks word wrapping, resulting in the content jumping when completions are suggested. [screenshot]

Contributor


Does debounceMs really need to be configurable in settings? I would suggest just using a reasonable default; same for minLength. I'd suggest we hard-code those options, leaving just a simple boolean setting, which is the easiest to implement correctly. Otherwise you have to deal with merging and the complexity of surfacing this in the settings UI dialog, which has now landed.

@3ks 3ks force-pushed the feat/prompt-completion branch 2 times, most recently from f86ae35 to 53bd2a4 Compare August 15, 2025 17:21
@3ks
Contributor Author

3ks commented Aug 15, 2025

Thanks for the review! Since this branch was pretty far behind, I’ve rebased it onto the latest release branch and re-implemented the changes with your suggestions in mind:

  • Fixed: Completions are now only triggered when the cursor is at the very end of the file (last line, last column).
  • Fixed: The cursor no longer disappears when completions are shown.
  • The word-wrapping logic for ghost text has been improved to prevent the content from jumping.
  • As you suggested, I’ve simplified the configuration to a single boolean setting to enable/disable the feature. minLength and debounceMs are now hardcoded constants.
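With the simplified configuration, enabling the experiment would be a single boolean in settings.json. The key name below is hypothetical, shown only to illustrate the shape of the setting:

```json
{
  "enablePromptCompletion": true
}
```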

/cc @jacob314

@3ks 3ks force-pushed the feat/prompt-completion branch from 53bd2a4 to 305ce44 Compare August 15, 2025 17:59
@3ks
Contributor Author

3ks commented Aug 18, 2025

Just following up on this. Are there any further changes needed, or is this ready to be merged?

I’m hoping we can get this merged soon to avoid it falling behind main again, which might lead to more conflicts or another refactor.
/cc @jacob314

@jacob314
Contributor

People on our team have been testing this and feedback is positive. I'll get this reviewed so you can merge today if possible.

@jacob314
Contributor

I've pushed a commit with a couple minor tweaks.

  1. Switched this to require a restart, as it didn't work until I restarted. We can always switch that back if I got it wrong.
  2. Switched the model to FLASH_LITE to reduce latency; autocomplete is most useful when it is fast. Flash Lite is also cheap, so there is little risk in calling it more frequently. Also reduced the debounce to 250ms to further reduce perceived latency.
  3. Instructed the model to give only 15-word completions. This keeps the completions short and to the point, rather than suggesting a paragraph the user might need to delete after accepting.
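Since a model does not always obey a word limit in its instructions, a client-side clamp can serve as a backstop. This helper is a hypothetical sketch, not part of the actual patch:

```typescript
// Clamp a model completion to at most `maxWords` words, mirroring the
// "keep completions short and to the point" tweak described above.
function clampWords(completion: string, maxWords = 15): string {
  const words = completion.trim().split(/\s+/).filter(Boolean);
  return words.slice(0, maxWords).join(" ");
}
```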

@3ks
Contributor Author

3ks commented Aug 19, 2025

Thanks for pushing the tweaks. Should I pull them into my PR branch, or is no further action needed from me?

@3ks
Contributor Author

3ks commented Aug 19, 2025

I’m having trouble locating the commit with your tweaks—I’ve checked in both google-gemini/gemini-cli and jacob314/gemini-cli.

I’m not sure what action is needed from me at this point. If you could clarify the next steps, I’d appreciate it.

/cc @jacob314

@3ks 3ks force-pushed the feat/prompt-completion branch from 305ce44 to 7d8a8ee Compare August 20, 2025 12:34
@3ks
Contributor Author

3ks commented Aug 20, 2025

Since the branch ran into some minor configuration conflicts again and I still couldn’t locate your previous commit, I’ve pushed a new commit to address everything:

  • The merge conflicts have been resolved.
  • I’ve implemented your three suggestions. For the completion length, instead of a hardcoded limit of 15 words, I’ve instructed the model to aim for a 10-20 word range. This allows for a bit more flexibility in the suggestions.

Btw, you were right about gemini-2.5-flash-lite—it’s impressively fast.

Please let me know if you have any other feedback. I’m hoping we can get this merged soon to avoid the risk of more complex conflicts, which can be challenging to resolve for me. 😂

/cc plz @jacob314

@jacob314
Contributor

I've pushed some minor fixes to get the build going so this can be landed. Thank you again for this feature. Really excited by how this turned out overall.

```ts
  enabled: boolean;
}

const useDebounce = (
```
Contributor


This caused lint errors, so I had to remove it.

```ts
const charWidth = stringWidth(char);
if (ghostUsedWidth + charWidth > remainingWidth) {
  break;
if (stringWidth(firstLineRaw) <= remainingWidth) {
```
Contributor


The previous wrapping made it a bit hard to read when the completion wrapped; this is generally boilerplate word wrapping. We could refactor this in a follow-up to better leverage the word-wrap support already in text-buffer.ts.
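The wrapping problem being discussed is that the first ghost line must fit in the space remaining after the cursor, while subsequent lines get the full terminal width. A minimal sketch of that shape, with hypothetical names (the real code measures display width with the string-width package; this sketch assumes one column per character for simplicity):

```typescript
// Word-wrap ghost text: the first line fits the space left after the
// cursor; every following line may use the full terminal width.
function wrapGhostText(
  text: string,
  firstLineWidth: number,
  fullWidth: number,
): string[] {
  const words = text.split(/\s+/).filter(Boolean);
  const lines: string[] = [];
  let line = "";
  let limit = firstLineWidth; // remaining columns on the cursor's line
  for (const word of words) {
    const candidate = line === "" ? word : line + " " + word;
    if (candidate.length <= limit) {
      line = candidate;
    } else {
      if (line !== "") lines.push(line);
      line = word;
      limit = fullWidth; // subsequent lines get the full width
    }
  }
  if (line !== "") lines.push(line);
  return lines;
}
```

Pre-splitting the ghost text into lines this way avoids the layout jump: the UI reserves exactly the rows the suggestion will occupy instead of letting the terminal hard-wrap it.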

Contributor

@jacob314 jacob314 left a comment


lgtm

@3ks
Contributor Author

3ks commented Aug 21, 2025

As a potential follow-up, we could consider making the completion length more flexible. Exposing it as a setting or offering different tiers (e.g., short, medium, long) might be a good improvement, as I’ve noticed some edge cases where 20 words is still too short and can make the suggestion feel a bit unnatural.

Thanks again for all your continuous feedback and help, and especially for the valuable refactoring you did.

I’m glad I could contribute this feature and am excited to see how it performs in the wild.

@jacob314 jacob314 enabled auto-merge August 21, 2025 07:54
@jacob314 jacob314 added this pull request to the merge queue Aug 21, 2025
Merged via the queue into google-gemini:main with commit 589f5e6 Aug 21, 2025
18 checks passed
thacio added a commit to thacio/auditaria that referenced this pull request Aug 21, 2025
silviojr pushed a commit that referenced this pull request Aug 21, 2025
Co-authored-by: Jacob Richman <jacob314@gmail.com>
silviojr pushed a commit that referenced this pull request Aug 22, 2025
Co-authored-by: Jacob Richman <jacob314@gmail.com>
silviojr pushed a commit that referenced this pull request Aug 27, 2025
Co-authored-by: Jacob Richman <jacob314@gmail.com>
acoliver referenced this pull request in vybestack/llxprt-code Sep 11, 2025
Co-authored-by: Jacob Richman <jacob314@gmail.com>
involvex pushed a commit to involvex/gemini-cli that referenced this pull request Sep 11, 2025
Co-authored-by: Jacob Richman <jacob314@gmail.com>
reconsumeralization pushed a commit to reconsumeralization/gemini-cli that referenced this pull request Sep 19, 2025
Co-authored-by: Jacob Richman <jacob314@gmail.com>