fix: Refactor of gemini-cli provider for structured output #987

Closed
ben-vargas wants to merge 4 commits into eyaltoledano:next from ben-vargas:fix/gemini-cli-refactor

@ben-vargas
Contributor

Fix: Refactor of Gemini CLI Provider for Structured Output

Summary

This PR completely refactors the gemini-cli provider to fix issue #983, where expand --all produced identical generic subtasks for every task. The refactor shrinks the implementation from 664 lines to 213 while making it more robust and maintainable.

Problem Statement

Issue #983: Gemini CLI expand --all produces identical generic subtasks

When using the Gemini CLI provider with the tm expand --all command, instead of generating unique, contextual subtasks for each task, it was producing the same generic subtasks repeatedly:

- "Gather requirements and specifications"
- "Design the solution architecture" 
- "Implement core functionality"
- "Test and debug the implementation"

Root Cause Analysis

After extensive debugging, we discovered the core issue:

  • Gemini CLI was correctly generating structured JSON output
  • However, the Vercel AI SDK's generateObject method expects objects at the root level, not arrays
  • Commands like analyze-complexity and update-tasks expect array responses: [{...}, {...}]
  • But generateObject validates against schemas expecting: {key: [{...}, {...}]}
  • This mismatch caused validation errors, leading to fallback behavior and generic responses
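
To make the mismatch concrete, here is a minimal sketch that does not use the AI SDK at all. `validateRootObject` is a hypothetical stand-in for a schema check that only accepts an object with a named array key; in the real code path the validation is performed by generateObject against the command's schema.

```javascript
// Hypothetical stand-in for a schema that expects { [key]: [...] } at the root.
function validateRootObject(value, key) {
  const isPlainObject =
    value !== null && typeof value === 'object' && !Array.isArray(value);
  if (!isPlainObject) {
    return { ok: false, reason: 'root must be an object, not an array or primitive' };
  }
  if (!Array.isArray(value[key])) {
    return { ok: false, reason: `missing array under "${key}"` };
  }
  return { ok: true };
}

// What the array-based prompts ask the model for.
const bareArray = [{ taskId: 1, complexityScore: 7 }];
// What an object-rooted schema accepts.
const wrapped = { analysis: bareArray };

console.log(validateRootObject(bareArray, 'analysis').ok); // false
console.log(validateRootObject(wrapped, 'analysis').ok); // true
```

The same payload passes or fails purely on its root shape, which is why the fix wraps the array rather than changing its contents.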

Solution Overview

1. Provider-Level Solution

Instead of modifying individual commands, we implemented a clean solution at the provider level:

// Override generateText to intercept JSON requests
async generateText(params) {
    // Detect if this is a JSON request
    if (isJsonRequest) {
        // Determine the expected structure and redirect to generateObject
        // with the appropriate schema (wrapped or unwrapped)
        return this.generateObject({...params, schema});
    }
    // Non-JSON requests use normal generateText
    return super.generateText(params);
}
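
A fuller, runnable version of that idea might look like the sketch below. Everything in it is illustrative: `BaseProvider` is a stub, and `isJsonRequest`, `pickSchema`, and the `rootKey` field are hypothetical names, not the PR's actual helpers. Serializing the object back to text is also an assumption about what generateText callers need; the snippet above returns the generateObject result directly.

```javascript
// Stub base class standing in for the real provider base (an assumption).
class BaseProvider {
  async generateText(params) {
    return { text: `plain:${params.prompt}` };
  }

  async generateObject(params) {
    // The real method would call the model and validate against params.schema.
    return { object: { [params.schema.rootKey]: [] } };
  }
}

class GeminiCliSketch extends BaseProvider {
  // Hypothetical heuristic: prompts that ask for valid JSON are structured requests.
  isJsonRequest(params) {
    return /valid JSON/i.test(params.prompt);
  }

  // Hypothetical schema choice; the PR derives it from the prompt content.
  pickSchema(params) {
    return { rootKey: params.prompt.includes('subtasks') ? 'subtasks' : 'tasks' };
  }

  async generateText(params) {
    if (this.isJsonRequest(params)) {
      const schema = this.pickSchema(params);
      const result = await this.generateObject({ ...params, schema });
      // Assumption: callers of generateText expect a text field.
      return { text: JSON.stringify(result.object) };
    }
    // Non-JSON requests keep the inherited behavior.
    return super.generateText(params);
  }
}

new GeminiCliSketch()
  .generateText({ prompt: 'Respond ONLY with valid JSON: subtasks' })
  .then((res) => console.log(res.text)); // {"subtasks":[]}
```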

2. Gemini-CLI Specific Prompt Variants

For commands that expect arrays, we created gemini-cli specific variants that request properly structured objects:

// analyze-complexity.json
"gemini-cli": {
    "user": "...Respond ONLY with a valid JSON object matching the schema:\n{\n  \"analysis\": [\n    {...}\n  ]\n}"
}

// update-tasks.json  
"gemini-cli": {
    "user": "...Return only the updated tasks as a valid JSON object with a single key \"tasks\" containing the array..."
}

3. Automatic Provider Detection

Commands now automatically detect Gemini CLI and use the appropriate variant:

const provider = useResearch ? getResearchProvider(session) : getMainProvider(session);
const isGeminiCli = provider && provider.name === 'gemini-cli';
const promptVariant = isGeminiCli ? 'gemini-cli' : 'default';

Detailed File Changes

1. src/ai-providers/gemini-cli.js

Major refactor - reduced from 664 to 213 lines

What Changed:

  • Removed: All complex JSON extraction logic, retry mechanisms, manual parsing
  • Removed: Methods like _extractSystemMessage, _detectJsonRequest, _getJsonEnforcementPrompt, _isValidJson, extractJson
  • Added: Simple generateText override that detects JSON requests and redirects to generateObject
  • Added: Schema detection logic based on prompt content
  • Simplified: Client initialization - removed dynamic module loading

Why:

  • The original implementation was trying to force generateText to return structured JSON through prompts and parsing
  • The new approach leverages generateObject which is designed for structured output
  • Removes ~450 lines of error-prone JSON extraction and validation code

2. src/prompts/analyze-complexity.json

Added gemini-cli variant

What Changed:

  • Added "gemini-cli" variant that wraps the expected array in an object: {"analysis": [...]}
  • System prompt emphasizes returning only valid JSON object

Why:

  • generateObject expects an object at root level, not an array
  • The variant ensures Gemini CLI returns {"analysis": [{task1}, {task2}]} instead of [{task1}, {task2}]

3. src/prompts/update-tasks.json

Added gemini-cli variant

What Changed:

  • Added "gemini-cli" variant that wraps the tasks array: {"tasks": [...]}
  • Clear format example in the prompt

Why:

  • Same reason as analyze-complexity - need object wrapper for array response
  • Ensures consistent structure for generateObject validation

4. src/prompts/expand-task.json (+gemini-cli and gemini-cli-complexity variants)

Added gemini-cli specific variants with improved dependency examples

What Changed:

  • Added new "gemini-cli" variant for standard expansion
  • Added new "gemini-cli-complexity" variant for expansion from complexity reports
  • Both variants use proper Handlebars syntax: {{add nextSubtaskId 1}} instead of confusing [{{nextSubtaskId}} + 1]
  • Added clear examples showing empty arrays for no dependencies: "dependencies": []
  • More explicit rules about dependency references
  • Important: Did NOT modify complexity-report, research, or default variants to preserve behavior for other providers

Why:

  • When using tm expand --all with gemini-cli, it needs the object wrapper structure
  • The gemini-cli-complexity variant handles expansions from complexity reports specifically for Gemini CLI
  • Original syntax in other variants remains unchanged to avoid affecting other providers
  • Handlebars {{add}} helper properly calculates the next ID
  • Clear examples prevent Gemini from generating invalid dependency formats
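
For context on the syntax change: Handlebars does not evaluate arithmetic around a mustache, so `{{nextSubtaskId}} + 1` renders as literal text such as `4 + 1`. A helper does the addition before rendering. The sketch below assumes the `add` helper has roughly this shape; the project's actual helper may differ.

```javascript
// Assumed shape of an `add` helper. With Handlebars it would be registered
// via Handlebars.registerHelper('add', add).
function add(a, b) {
  // Coerce to numbers because template values can arrive as strings.
  return Number(a) + Number(b);
}

// Dependencies for a third subtask when nextSubtaskId is 4 (IDs 4, 5, 6):
const nextSubtaskId = 4;
const exampleDependencies = [nextSubtaskId, add(nextSubtaskId, 1)];
console.log(JSON.stringify(exampleDependencies)); // [4,5]
```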

5. scripts/modules/task-manager/analyze-task-complexity.js

Added provider detection and variant selection

What Changed:

  • Added imports for getMainProvider and getResearchProvider
  • Added provider detection logic to determine if using gemini-cli
  • Modified JSON parsing to handle both array and object responses
  • Variable renamed: variant → promptVariant for clarity

Why:

  • Automatically selects the correct prompt variant based on provider
  • Handles unwrapping of object-wrapped responses from gemini-cli
  • Maintains backward compatibility with other providers
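
The dual-shape parsing described here can be sketched as a small unwrapping step. `unwrapArrayResponse` is a hypothetical name; the PR's actual parsing code is not shown on this page.

```javascript
// Accepts either a bare array or an object wrapping the array under `key`.
function unwrapArrayResponse(parsed, key) {
  if (Array.isArray(parsed)) {
    return parsed; // other providers: [{...}, {...}]
  }
  if (parsed !== null && typeof parsed === 'object' && Array.isArray(parsed[key])) {
    return parsed[key]; // gemini-cli variant: { analysis: [{...}, {...}] }
  }
  throw new Error(`Expected an array or an object with an array under "${key}"`);
}

const fromOtherProvider = JSON.parse('[{"taskId":1}]');
const fromGeminiCli = JSON.parse('{"analysis":[{"taskId":1}]}');

console.log(unwrapArrayResponse(fromOtherProvider, 'analysis').length); // 1
console.log(unwrapArrayResponse(fromGeminiCli, 'analysis').length); // 1
```

Downstream code then sees the same array regardless of which prompt variant produced it.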

6. scripts/modules/task-manager/update-tasks.js

Added provider detection and improved parsing

What Changed:

  • Added provider detection for gemini-cli
  • Enhanced parseUpdatedTasksFromText to handle object-wrapped responses
  • Variable renamed: variant → promptVariant
  • Improved error messages and debugging output

Why:

  • Same pattern as analyze-complexity for consistency
  • Better error handling for parsing failures
  • Clearer variable naming

7. scripts/modules/task-manager/expand-task.js

Added provider detection and fixed variant selection logic

What Changed:

  • Added imports for provider detection functions
  • Fixed variant selection logic to check for expansionPromptText && isGeminiCli first
  • Added new gemini-cli-complexity variant selection for complexity report expansions
  • Variable renamed: variantKey → promptVariant
  • Code formatting improvements per Biome linter

Why:

  • Critical fix: Without this, gemini-cli would use the array-based complexity-report variant and fail
  • Now correctly selects gemini-cli-complexity when expanding from complexity reports with gemini-cli
  • Consistency with other commands
  • Better code organization and readability
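
A minimal sketch of the selection order follows. The variant names come from this PR; `selectExpandVariant` is a hypothetical function, and only the first check (expansionPromptText && isGeminiCli) is confirmed by the description. The order of the remaining checks is an assumption.

```javascript
// Picks the expand-task prompt variant. The gemini-cli + complexity-report
// case must be checked first so it is not captured by 'complexity-report'.
function selectExpandVariant({ expansionPromptText, isGeminiCli, useResearch }) {
  if (expansionPromptText && isGeminiCli) return 'gemini-cli-complexity';
  if (expansionPromptText) return 'complexity-report';
  // Assumed order from here on; the PR text does not spell it out.
  if (isGeminiCli) return 'gemini-cli';
  if (useResearch) return 'research';
  return 'default';
}

console.log(
  selectExpandVariant({ expansionPromptText: 'Break down auth', isGeminiCli: true })
); // gemini-cli-complexity
console.log(
  selectExpandVariant({ expansionPromptText: 'Break down auth', isGeminiCli: false })
); // complexity-report
```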

8. tests/unit/ai-providers/gemini-cli.test.js

Updated tests to match new implementation

What Changed:

  • Removed tests for deleted methods (_extractSystemMessage, _detectJsonRequest, etc.)
  • Updated existing tests to focus on public API behavior
  • Added proper mocking for generateObject method
  • Fixed test isolation issues

Why:

  • Tests were checking internal implementation details that no longer exist
  • New tests verify the actual behavior users care about
  • Better test structure with proper setup/teardown

9. tests/unit/scripts/modules/task-manager/expand-task.test.js

Fixed missing mock functions

What Changed:

  • Added getMainProvider and getResearchProvider to config-manager mock

Why:

  • Tests were failing because these functions were called but not mocked
  • Ensures tests run successfully with the new provider detection code

10. tests/unit/scripts/modules/task-manager/update-tasks.test.js

Fixed missing mock functions

What Changed:

  • Added getMainProvider and getResearchProvider to config-manager mock

Why:

  • Same as expand-task.test.js - needed for provider detection logic

Testing Results

Before the Fix

tm expand --all
- Every task got the same 4 generic subtasks
- No dependencies between subtasks
- No context-aware content

After the Fix

tm expand --all
- Task 1: Database schema, Express backend, React frontend, Docker setup
- Task 2: Registration API, login UI, protected routes, profile page
- Task 3: Contact CRUD API, list page, detail page, forms
- Each with proper dependencies (e.g., 1.2 depends on 1.1, 1.4 depends on 1.2 & 1.3)

Test Suite Results

  • All tests pass: 803 passed, 11 skipped (814 total)
  • No regressions in existing functionality
  • Gemini CLI specific behavior properly tested
  • The 11 skipped tests are intentionally skipped and not failures

Technical Details

Schema Detection Logic

The provider now detects the expected response structure by analyzing the prompt content:

  • Looks for "complexityScore" → analyze-complexity schema
  • Looks for "subtasks" → expand-task schema
  • Looks for "Return only the updated tasks" → update-tasks schema
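
A sketch of this marker-based detection is below. `detectSchemaKind` and `SCHEMA_MARKERS` are hypothetical names. The check order is also an assumption: the analyze-complexity and update-tasks prompts both mention subtasks, so this sketch tests the more specific markers first; the order used in the PR is not shown here.

```javascript
// Marker strings are taken from the PR description; the order is an assumption.
const SCHEMA_MARKERS = [
  ['Return only the updated tasks', 'update-tasks'],
  ['complexityScore', 'analyze-complexity'],
  ['subtasks', 'expand-task'] // least specific, so checked last
];

// Returns the schema kind for a prompt, or null when no marker matches.
function detectSchemaKind(promptText) {
  for (const [marker, kind] of SCHEMA_MARKERS) {
    if (promptText.includes(marker)) {
      return kind;
    }
  }
  return null;
}

console.log(detectSchemaKind('... a single key "subtasks" whose value is an array')); // expand-task
console.log(detectSchemaKind('Summarize this file')); // null
```

Substring matching like this is brittle if prompt wording changes, which is one reason passing the schema explicitly may be preferable in the long run.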

Error Handling

  • Graceful fallback to regular generateText if generateObject fails
  • Preserved existing error messages and debugging capabilities
  • Cleaner error propagation without nested try-catch blocks
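
The fallback described above might look like the following sketch, using a single try/catch rather than nested ones. `generateWithFallback` and the stub provider are hypothetical; the real provider methods take richer parameters.

```javascript
// Try structured generation first; on failure, fall back to plain text once.
async function generateWithFallback(provider, params) {
  try {
    const result = await provider.generateObject(params);
    return { text: JSON.stringify(result.object), usedFallback: false };
  } catch (error) {
    // Surface the reason so the fallback is visible when debugging.
    console.warn(`generateObject failed, falling back: ${error.message}`);
    const result = await provider.generateText(params);
    return { text: result.text, usedFallback: true };
  }
}

// Stub provider whose structured call always fails (for illustration only).
const failingProvider = {
  generateObject: async () => {
    throw new Error('schema validation failed');
  },
  generateText: async () => ({ text: 'plain response' })
};

generateWithFallback(failingProvider, {}).then((res) => console.log(res.usedFallback)); // true
```

Inside a generateText override, the fallback would need to call the parent class's generateText rather than the override itself, otherwise the call would recurse.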

Compatibility

  • No breaking changes to the public API
  • Existing code continues to work without modifications
  • Other providers are unaffected by these changes

Impact

This refactor:

  1. Fixes the immediate issue completely (#983, "bug: expand --all generates identical subtasks for all tasks due to PromptManager cache issue")
  2. Makes the codebase more maintainable (roughly 68% fewer lines in the provider, 664 down to 213)
  3. Provides a pattern for handling similar provider quirks in the future
  4. Improves the overall user experience with Gemini CLI
  5. Establishes consistent naming conventions (promptVariant instead of variant)

Future Considerations

This pattern of provider-specific prompt variants could be extended to:

  • Handle other providers with unique requirements
  • Migrate structured-output prompts to generateObject, which is more appropriate than calling generateText and asking for JSON in the user prompt

The clean separation of concerns makes it easy to add new variants without touching core logic.

github-actions bot and others added 3 commits July 14, 2025 11:28
* fix: prevent CLAUDE.md overwrite by using imports

- Copy Task Master instructions to .taskmaster/CLAUDE.md
- Add import section to user's CLAUDE.md instead of overwriting
- Preserve existing user content
- Clean removal of Task Master content on uninstall

Closes eyaltoledano#929

* chore: add changeset for Claude import fix
…edano#968)

* feat: add task master (tm) custom slash commands

Add comprehensive task management system integration via custom slash commands.
Includes commands for:
- Project initialization and setup
- Task parsing from PRD documents
- Task creation, update, and removal
- Subtask management
- Dependency tracking and validation
- Complexity analysis and task expansion
- Project status and reporting
- Workflow automation

This provides a complete task management workflow directly within Claude Code.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>

* chore: add changeset

---------

Co-authored-by: neno-is-ooo <204701868+neno-is-ooo@users.noreply.github.com>
Co-authored-by: Claude <noreply@anthropic.com>
@changeset-bot

changeset-bot bot commented Jul 16, 2025

🦋 Changeset detected

Latest commit: 42fedf9

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 1 package:

Name            Type
task-master-ai  Minor

@coderabbitai
Contributor

coderabbitai bot commented Jul 16, 2025

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.


Major refactor to fix GitHub issue eyaltoledano#983 where gemini-cli's expand --all was
producing identical generic subtasks. The issue was that Gemini CLI returns
structured JSON correctly but generateObject expects objects at root level,
not arrays.

Changes:
- Simplified gemini-cli.js from 664 lines to 213 lines
- Implemented generateText override that detects JSON requests and redirects
  to generateObject with proper schema detection
- Added object-wrapper prompt variants for commands that expect arrays:
  - analyze-complexity: wraps array in {analysis: [...]}
  - update-tasks: wraps array in {tasks: [...]}
  - expand-task: already expects {subtasks: [...]}
- Updated command files to detect gemini-cli provider and use object-wrapper
  variants automatically
- Fixed expand-task prompts with clearer dependency examples using proper
  Handlebars syntax ({{add}} helper) instead of confusing [{{id}} + 1]
- Removed all the complex error handling, retries, and manual JSON parsing
  that was no longer needed

Result:
- Gemini CLI now works consistently across all commands
- expand --all generates unique, contextual subtasks with proper dependencies
- Clean architecture where provider-specific handling is encapsulated
- Commands remain provider-agnostic
@ben-vargas force-pushed the fix/gemini-cli-refactor branch from 0a80daf to 42fedf9 on July 16, 2025 03:06
Collaborator

@Crunchyman-ralph left a comment

Nice PR, but I don't think we should have a different prompt per model. I think it's worth looking into consolidating them, which reduces code complexity and overhead.

Comment on lines +36 to +39
},
"gemini-cli": {
"system": "You are an AI assistant helping to update software development tasks based on new context.\nYou will be given a set of tasks and a prompt describing changes or new implementation details.\nYour job is to update the tasks to reflect these changes, while preserving their basic structure.\n\nGuidelines:\n1. Maintain the same IDs, statuses, and dependencies unless specifically mentioned in the prompt\n2. Update titles, descriptions, details, and test strategies to reflect the new information\n3. Do not change anything unnecessarily - just adapt what needs to change based on the prompt\n4. You should return ALL the tasks in order, not just the modified ones\n5. Return a complete valid JSON object with the updated tasks array\n6. VERY IMPORTANT: Preserve all subtasks marked as \"done\" or \"completed\" - do not modify their content\n7. For tasks with completed subtasks, build upon what has already been done rather than rewriting everything\n8. If an existing completed subtask needs to be changed/undone based on the new context, DO NOT modify it directly\n9. Instead, add a new subtask that clearly indicates what needs to be changed or replaced\n10. Use the existence of completed subtasks as an opportunity to make new subtasks more specific and targeted\n\nThe changes described in the prompt should be applied to ALL tasks in the list.",
"user": "Here are the tasks to update:\n{{{json tasks}}}\n\nPlease update these tasks based on the following new context:\n{{updatePrompt}}\n\nIMPORTANT: In the tasks JSON above, any subtasks with \"status\": \"done\" or \"status\": \"completed\" should be preserved exactly as is. Build your changes around these completed items.{{#if projectContext}}\n\n# Project Context\n\n{{projectContext}}{{/if}}\n\nReturn only the updated tasks as a valid JSON object with a single key \"tasks\" containing the array of updated tasks.\n\nExpected format:\n{\n \"tasks\": [\n // ... updated task objects\n ]\n}"
Collaborator

don't like the idea of having separate prompts, think we can merge it ?

Comment on lines +72 to +73
"gemini-cli": {
"system": "You are an AI assistant helping with task breakdown for software development.\nYou need to break down a high-level task into {{#if (gt subtaskCount 0)}}{{subtaskCount}}{{else}}an appropriate number of{{/if}} specific subtasks that can be implemented one by one.\n\nSubtasks should:\n1. Be specific and actionable implementation steps\n2. Follow a logical sequence\n3. Each handle a distinct part of the parent task\n4. Include clear guidance on implementation approach\n5. Have appropriate dependency chains between subtasks (using the new sequential IDs)\n6. Collectively cover all aspects of the parent task\n\nFor each subtask, provide:\n- id: Sequential integer starting from the provided nextSubtaskId\n- title: Clear, specific title\n- description: Detailed description\n- dependencies: Array of prerequisite subtask IDs (use the new sequential IDs)\n- details: Implementation details, the output should be in string\n- testStrategy: Optional testing approach\n\nRespond ONLY with a valid JSON object containing a single key \"subtasks\" whose value is an array matching the structure described. Do not include any explanatory text, markdown formatting, or code block markers.",
Collaborator

same here, think we can merge these into 1 prompt ?

Comment on lines +62 to +64
"gemini-cli-complexity": {
"condition": "expansionPrompt && isGeminiCli",
"system": "You are an AI assistant helping with task breakdown. Generate {{#if (gt subtaskCount 0)}}exactly {{subtaskCount}}{{else}}an appropriate number of{{/if}} subtasks based on the provided prompt and context.\n\nRespond ONLY with a valid JSON object containing a single key \"subtasks\" whose value is an array of the generated subtask objects.\n\nEach subtask must follow this exact structure:\n- id: Sequential integer starting from {{nextSubtaskId}}\n- title: Clear, actionable subtask title\n- description: What this subtask accomplishes\n- dependencies: Array of IDs this subtask depends on (use [] for no dependencies)\n- details: Implementation guidance\n- status: Must be \"pending\"\n- testStrategy: (optional) How to test this subtask\n\nDependency rules:\n- First subtask should have dependencies: []\n- Later subtasks can depend on earlier ones: dependencies: [{{nextSubtaskId}}] or [{{nextSubtaskId}}, {{add nextSubtaskId 1}}]\n- Only reference IDs from THIS response\n\nDo not include any other text or explanation.",
Collaborator

same here, merge into one potentially

Comment on lines +39 to +42
},
"gemini-cli": {
"system": "You are an expert software architect and project manager analyzing task complexity. Respond only with the requested valid JSON object.",
"user": "Analyze the following tasks to determine their complexity (1-10 scale) and recommend the number of subtasks for expansion. Provide a brief reasoning and an initial expansion prompt for each.{{#if useResearch}} Consider current best practices, common implementation patterns, and industry standards in your analysis.{{/if}}\n\nTasks:\n{{{json tasks}}}\n{{#if gatheredContext}}\n\n# Project Context\n\n{{gatheredContext}}\n{{/if}}\n\nRespond ONLY with a valid JSON object matching the schema:\n{\n \"analysis\": [\n {\n \"taskId\": <number>,\n \"taskTitle\": \"<string>\",\n \"complexityScore\": <number 1-10>,\n \"recommendedSubtasks\": <number>,\n \"expansionPrompt\": \"<string>\",\n \"reasoning\": \"<string>\"\n },\n ...\n ]\n}\n\nDo not include any explanatory text, markdown formatting, or code block markers before or after the JSON object."
Collaborator

same here

Collaborator

if gemini cli needs to be wrapped, then why not wrap them all ? would allow us to have the same prompt, less overhead of prompt differences

  logFn,
- isMCP
+ isMCP,
+ promptVariant // Pass the promptVariant to handle gemini-cli
Collaborator

now we're adding more complexity to the code instead of resolving the core issue

Contributor Author

@ben-vargas Jul 16, 2025

yeah, core issue is that we use generateText too much... but with all the path changes and other refactors going on, not super hot on trying that refactor right now.

Collaborator

I'll merge the path changes stuff, and maybe you can tackle that ?

@Crunchyman-ralph
Collaborator

Closing this since we're going in the #1034 direction as discussed
