feat(cli): prompt completion #4691
Conversation
Summary of Changes
Hello @3ks, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
This pull request implements a significant new feature: real-time prompt completion within the CLI. The primary goal is to streamline the prompt-writing process for users by offering intelligent, AI-generated suggestions as they type, drawing inspiration from successful patterns like search suggestions and code completion. This enhancement aims to make crafting effective prompts more intuitive and less prone to 'writer's block.'
Highlights
- New Feature: Prompt Completion: Introduces a real-time AI-powered prompt completion feature to the CLI, aiming to enhance user experience by proactively assisting users in crafting detailed and effective prompts.
- AI Model Integration: Leverages the `gemini-flash-2.5:nothinking` model for fast, near-instant suggestions, integrated via a new `usePromptCompletion` React hook.
- User Interface Enhancements: Adds UI elements to display prompt suggestions inline, appearing after a brief pause in typing and disappearing upon prompt submission.
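The suggestion lifecycle described in the highlights (suggest after a typing pause, abandon on a new keystroke, clear on submit) can be sketched as a small state machine. This is an illustrative model only; the names and states below are hypothetical and are not the PR's actual `usePromptCompletion` implementation:

```typescript
// Illustrative state machine for the prompt-completion lifecycle.
// States and events are hypothetical, not the PR's actual API.
type State = 'idle' | 'waiting' | 'fetching' | 'shown';
type Event = 'keystroke' | 'pause-elapsed' | 'response' | 'submit';

function next(state: State, event: Event): State {
  switch (event) {
    case 'keystroke':
      // Any new input restarts the debounce and aborts an in-flight request.
      return 'waiting';
    case 'pause-elapsed':
      // Only a pause that follows typing triggers a request.
      return state === 'waiting' ? 'fetching' : state;
    case 'response':
      // A response is only shown if it wasn't cancelled in the meantime.
      return state === 'fetching' ? 'shown' : state;
    case 'submit':
      // Submitting the prompt clears any suggestion.
      return 'idle';
  }
}
```

This matches the cancellation behavior the author describes later in the thread: typing while a request is pending implicitly discards it.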
Using Gemini Code Assist
The full guide for Gemini Code Assist can be found on our documentation page; here are some quick tips.
Invoking Gemini
You can request assistance from Gemini at any point in your pull request by creating an issue comment (i.e. a comment on the pull request page) using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands.
| Feature | Command | Description |
|---|---|---|
| Code Review | /gemini review | Performs a code review for the current pull request in its current state. |
| Pull Request Summary | /gemini summary | Provides a summary of the current pull request in its current state. |
| Comment | @gemini-code-assist | Responds in comments when explicitly tagged, both in issue comments and review comments. |
| Help | /gemini help | Displays a list of available commands. |
Customization
To customize Gemini Code Assist for your GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.
Limitations & Feedback
Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments or fill out our survey to provide feedback.
You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.
Footnotes
1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double-check it and use code with caution.
Code Review
This pull request introduces an excellent prompt completion feature. The implementation is a great start. I've provided feedback to address a critical issue in the prompt sent to the model, and a couple of high-priority suggestions to improve maintainability and efficiency by removing duplicated code and replacing an inefficient polling mechanism. These changes will make the feature more robust and easier to maintain.
(force-pushed from dbb32f4 to 3283f14)
/cc plz @jacob314
Fyi @miguelsolorio, interested in your thoughts on what the UX for this in a CLI tool should be. Some ideas:

Concerns:

I'm not sure I understand what problem this feature is aiming to solve, as users tend to have a pretty good idea of what they want to do. If anything, this would be better as a "prompt enhancer", so I'd be more concerned about introducing this before addressing other pressing issues.

In terms of the UX, because there is significant delay in retrieving suggestions, it really slows down the user's workflow to wait for the results, so the latency needs to be improved. I also think this needs to be in the input as ghost text instead of suggestions.
Agree this needs to be in ghost text that only surfaces when completions are not present. Otherwise the fairly infrequently used suggestions will get in the way of navigating the input prompt and history.

I tried this locally and the latency also felt too high for me, and it was also somewhat jarring that the suggestions did not feel grounded in my local project and GEMINI.md, unlike the rest of the experience with Gemini CLI.

I'd be open to landing this as an experiment, but it needs to be behind a setting in settings.json, as this isn't something that is ready to be on for most users. If it became suitably popular and suitably polished, we could later consider enabling it by default.
@miguelsolorio I agree with your view that users know what they want to do. However, users often need to pause and think about the details, especially when they need to input a very long prompt (e.g. I didn't type this paragraph fluently; I had to stop, think, and then continue). The function of this feature is that the user provides a partial idea, and Gemini provides inspiration or even a complete prompt. This is similar to the role of search suggestions in a search engine.

This isn't about the user needing to wait for Gemini's response, but rather Gemini proactively providing suggestions while the user is pausing and thinking. The current logic is that if the user types anything while waiting for a Gemini suggestion, I assume they have their own idea, and the request is cancelled.

I completely agree with using ghost text to implement Prompt Completion. I will probably implement it in the next one or two days. It's not my working hours right now, and I need to go to sleep.

Regarding the technical issues, you're right that it takes a few seconds to get suggestions from Gemini. This is limited by the model and the network. I used the flash:nothinking model to give suggestions as quickly as possible. For further optimization, we could consider not sending the chat history in the request (it's currently sent by default), but this would cause the suggestions to lose context and be based only on the content in the input box. If you have any other optimization ideas, I'd be happy to hear them.
@jacob314 Thanks for the detailed feedback and for trying it out locally. This is very much a prototype implementation focused on exploring a new user experience for prompt input. The quality of the suggestions isn't great right now, and I see this as an area for code improvement. For example, we can improve the results by reading the directory's GEMINI.md and optimizing the prompt we send to the Gemini AI.

Regarding the latency, I'd like to reiterate the core idea: this feature isn't meant to interrupt a user's flow, but to offer help when they naturally pause to think or get stuck. From that perspective, a delay of a few seconds can be acceptable, as those thinking pauses can often last from several seconds to even longer (I know mine certainly do!).
(force-pushed from 3283f14 to 2f7c6fe)
I've made some changes to the code:
@jacob314 Regarding this issue: this use case is different from having the LLM answer a question using UserMemory. The purpose of prompt completion is to get an extension of the current prompt text, not an answer to a question based on UserMemory, so it is more difficult to achieve the desired results.

I've checked the code, and geminiCli.generateContent() does include userMemory information by default. When the user provides a sufficiently descriptive prompt, UserMemory can still be utilized effectively: as you can see from this example, prompt completion correctly suggested the appropriate framework, test file format, and mocking methods.

In comparison, I am more worried about the problem of ghost-text multi-line rendering, and I am not sure whether the current implementation is good enough. As I said before, I am not a front-end developer; the current code is generated by an LLM.
jacob314 left a comment
This is a lot better. Sorry for the slow review. I've now fixed my notification settings to be able to stay on top of the large number of pull requests more easily.
Some bugs:

- I can get in this state when I move the cursor up to the location seen in this screenshot. I would have only expected completions when at the end of the file.
- The cursor disappears when completions are shown.
- The ghost text lacks word wrapping, resulting in the content jumping when completions are suggested.
packages/cli/src/config/settings.ts
Does debounceMs really need to be configurable in settings? I would suggest just using a reasonable default; same for minLength. I'd suggest we only have a hard-coded option, giving you just a simple boolean setting, which is the easiest to implement correctly. Otherwise you have to deal with merging and the complexity of surfacing this in the settings UI dialog, which has now landed.
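For illustration, the reduced settings surface suggested here might look roughly like this. The names and default values below are hypothetical, not the PR's actual schema:

```typescript
// Hypothetical reduced settings shape: a single user-facing boolean,
// with timing knobs hard-coded instead of exposed in settings.json.
interface PromptCompletionSettings {
  enabled: boolean; // the only configurable option
}

// Illustrative hard-coded defaults (not the PR's actual values).
const DEBOUNCE_MS = 250;
const MIN_PROMPT_LENGTH = 5;

// The feature activates only when enabled and the prompt is long enough
// to be worth completing.
function isCompletionActive(
  settings: PromptCompletionSettings,
  prompt: string,
): boolean {
  return settings.enabled && prompt.length >= MIN_PROMPT_LENGTH;
}
```

A boolean-only surface sidesteps the settings-merging and settings-dialog concerns raised in the comment, at the cost of users not being able to tune the debounce.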
(force-pushed from f86ae35 to 53bd2a4)
Thanks for the review! Since this branch was pretty far behind, I've rebased it onto the latest release branch and re-implemented the changes with your suggestions in mind:

/cc @jacob314
(force-pushed from 53bd2a4 to 305ce44)
Just following up on this. Are there any further changes needed, or is this ready to be merged? I'm hoping we can get this merged soon to avoid it falling behind main again, which might lead to more conflicts or another refactor.
People on our team have been testing this and feedback is positive. I'll get this reviewed so you can merge today if possible.
I've pushed a commit with a couple of minor tweaks.

Thanks for pushing the tweaks. Should I pull them into my PR branch, or is no further action needed from me?

I'm having trouble locating the commit with your tweaks; I've checked both google-gemini/gemini-cli and jacob314/gemini-cli. I'm not sure what action is needed from me at this point. If you could clarify the next steps, I'd appreciate it. /cc @jacob314
(force-pushed from 305ce44 to 7d8a8ee)
Since the branch ran into some minor configuration conflicts again and I still couldn't locate your previous commit, I've pushed a new commit to address everything:

Btw, you were right about gemini-2.5-flash-lite: it's impressively fast. Please let me know if you have any other feedback. I'm hoping we can get this merged soon to avoid the risk of more complex conflicts, which can be challenging for me to resolve. 😂 /cc plz @jacob314
I've pushed some minor fixes to get the build going so this can be landed. Thank you again for this feature. Really excited by how this turned out overall.
```ts
  enabled: boolean;
}

const useDebounce = (
```

This caused lint errors, so I had to remove it.
```diff
-        const charWidth = stringWidth(char);
-        if (ghostUsedWidth + charWidth > remainingWidth) {
-          break;
+        if (stringWidth(firstLineRaw) <= remainingWidth) {
```

The previous wrapping made it a bit hard to read when the completion wrapped. This is generally boilerplate word wrapping; we could refactor it in a follow-up to better leverage the word-wrap support already in text-buffer.ts.
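The width-fitting logic under discussion can be sketched roughly as below. This is a simplified stand-in, not the PR's code: `stringWidth` here is a naive character count, whereas a real terminal needs a wide-character-aware width (e.g. the `string-width` package), and `fitGhostText` is a hypothetical name:

```typescript
// Naive stand-in for a terminal-cell width function; real code must
// account for wide characters and emoji (e.g. via the string-width package).
const stringWidth = (s: string): number => s.length;

// Fit the first line of a ghost-text suggestion into the remaining
// terminal width, truncating with an ellipsis when it would overflow.
function fitGhostText(ghost: string, remainingWidth: number): string {
  const firstLine = ghost.split('\n')[0];
  if (stringWidth(firstLine) <= remainingWidth) return firstLine;

  // Truncate character by character, reserving one cell for the ellipsis.
  let used = 0;
  let out = '';
  for (const char of firstLine) {
    const w = stringWidth(char);
    if (used + w > remainingWidth - 1) break;
    out += char;
    used += w;
  }
  return out + '…';
}
```

The character-by-character loop mirrors the removed code's structure; the early `stringWidth(firstLine) <= remainingWidth` check mirrors the fast path added in the diff.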
As a potential follow-up, we could consider making the completion length more flexible. Exposing it as a setting or offering different tiers (e.g., short, medium, long) might be a good improvement, as I've noticed some edge cases where 20 words is still too short and can make the suggestion feel a bit unnatural.

Thanks again for all your continuous feedback and help, and especially for the valuable refactoring you did. I'm glad I could contribute this feature and am excited to see how it performs in the wild.
Co-authored-by: Jacob Richman <jacob314@gmail.com>


TL;DR
This PR introduces a real-time Prompt Completion feature for writing prompts. The goal is to dramatically improve the user experience by proactively assisting users in crafting detailed and effective prompts, much like how Google provides Search Suggestions or GitHub Copilot offers Code Completion.

It uses the gemini-flash-2.5:nothinking model for fast, near-instant suggestions (just faster than thinking models).

Dive Deeper
Learning from Proven UX Patterns
We've seen how assistive input mechanisms have revolutionized major product categories:

- Search Suggestions for billions of users, turning a vague idea into a precise query.
- Code Completion for tens of millions of developers, transforming a comment into functional code.

The Problem: The "Prompt-Writer's Block"

The parallel between these interactions is striking: a user provides partial input, and the system intelligently suggests the rest.

Currently, users often find themselves pausing while writing prompts, grappling with how to expand on their ideas or what details to include. Ironically, crafting the perfect prompt can take more time and mental effort than the LLM takes to generate a response. Prompt Completion directly solves this by making the prompt-writing process itself a creative and collaborative partnership with the AI.

A Proof-of-Concept
I have developed a working proof-of-concept for this feature within the gemini-cli, demonstrating its feasibility.

To ensure a seamless and fluid user experience, the prototype uses the gemini-flash-2.5:nothinking model, which prioritizes the rapid response times necessary to provide suggestions almost instantly after the user pauses typing.

As a meta-demonstration of AI assistance, all of the code, comments, and even this pull request description were crafted with the help of an LLM. I am not a front-end developer, so the implementation may not be entirely sound.
Prospect

I think Prompt Completion will become as natural and indispensable to all AI products as Search Suggestions are to search engines today. It represents the next logical step in enhancing the usability of AI for everyone.

The primary purpose of this PR is to introduce the concept of Prompt Completion and to spark a discussion about its potential.

Reviewer Test Plan
To validate this change, a reviewer should pull down the branch and test the core functionality in their local environment.
Trigger Completion:

Test Various Scenarios: Please try a few different classes of prompts to test the quality of suggestions:

- Write a short story about a robot who... (and pause)
- Explain the difference between a virtual machine and a container, focusing on... (and pause)
- Generate a marketing slogan for a new coffee brand that is... (and pause)
- First, summarize the plot of Hamlet. Second, analyze the character of Ophelia. Third, ... (and pause)

Testing Matrix
Legend: ✅ = Tested & Working, ❓ = Needs Testing, - = Not Applicable
Linked issues / bugs
No linked issues. This is a new feature proposal.