feat: Migrate Task Master to generateObject for structured AI responses #1065
ben-vargas wants to merge 8 commits into eyaltoledano:next
Fixed issue where AI was generating inconsistent subtask IDs (101-105, 601-603)
instead of sequential numbering (1, 2, 3...) after the generateObject migration.
Changes:
- Updated all expand-task prompt variants with forceful "CRITICAL" instructions
- Made ID requirements explicit with examples: id={{nextSubtaskId}}, id={{nextSubtaskId}}+1
- Added warning not to use parent task ID in subtask numbering
- Removed parseSubtasksFromText post-processing that was overwriting AI-generated IDs
This ensures subtasks display correctly as X.1, X.2, X.3 format and the
`tm show X.Y` command works as expected.
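The numbering rule described above can also be checked mechanically. The following is a minimal sketch, not code from this PR: `findMisnumberedSubtasks` and `formatSubtaskRef` are hypothetical helpers, and the only assumption taken from the commit message is that subtask IDs start at `nextSubtaskId` and increase by one.

```javascript
// Hypothetical helper (not part of the PR): returns the IDs that break the
// expected sequence nextSubtaskId, nextSubtaskId + 1, nextSubtaskId + 2, ...
function findMisnumberedSubtasks(subtasks, nextSubtaskId) {
  return subtasks
    .filter((subtask, index) => subtask.id !== nextSubtaskId + index)
    .map((subtask) => subtask.id);
}

// Hypothetical helper: builds the X.Y reference used by `tm show X.Y`.
function formatSubtaskRef(parentId, subtaskId) {
  return `${parentId}.${subtaskId}`;
}

// IDs derived from the parent task ID (the bug described above) are flagged.
const badIds = findMisnumberedSubtasks([{ id: 101 }, { id: 102 }], 1);

// Sequential IDs pass the check and render in the X.1, X.2, X.3 format.
const goodRefs = [{ id: 1 }, { id: 2 }, { id: 3 }].map((s) =>
  formatSubtaskRef(6, s.id)
);
```

A check like this could run after generation as a guard, instead of silently overwriting the IDs the model produced.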
Integration tests that make real API calls cannot run in CI without proper API keys. These tests should either be mocked or run in a separate test suite with appropriate infrastructure.
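One way to mock such tests is to pass the AI service in as a dependency and substitute a stub. This is a sketch under stated assumptions: `createStubObjectService` and `expandTaskWith` are hypothetical names, and the `{ object: ... }` response shape is invented for illustration rather than taken from the Task Master code.

```javascript
// Hypothetical stub factory (not from the PR): returns an async function that
// records its calls and resolves with a canned object instead of calling an API.
function createStubObjectService(cannedObject) {
  const calls = [];
  const stub = async (params) => {
    calls.push(params);
    return { object: cannedObject }; // assumed response shape, illustration only
  };
  stub.calls = calls;
  return stub;
}

// Hypothetical function under test: the AI service is a parameter, so a test
// can swap in the stub and run without API keys.
async function expandTaskWith(service, task) {
  const response = await service({
    commandName: 'expand-task',
    prompt: `Expand task ${task.id}`
  });
  return { ...task, subtasks: response.object.subtasks };
}
```

Tests written this way need no network access, so they can run in CI, while the real-API variants stay in a separate suite.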
Walkthrough
This change migrates all AI-driven task management operations from text-based output parsing to structured object generation using schemas and a unified AI service. It introduces centralized schema validation with Zod, updates prompt templates for explicit structured output, consolidates provider configuration, and removes manual JSON parsing logic. Extensive test updates accompany the migration.
Sequence Diagram(s)
sequenceDiagram
participant User
participant TaskManagerModule
participant UnifiedAIService
participant Provider
participant SchemaRegistry
User->>TaskManagerModule: Request task operation (add/update/expand/parse/analyze)
TaskManagerModule->>SchemaRegistry: Retrieve schema for command
TaskManagerModule->>UnifiedAIService: generateObjectService(prompt, schema, objectName)
UnifiedAIService->>Provider: generateObject(prompt, schema, mode, metadata)
Provider-->>UnifiedAIService: Structured JSON response
UnifiedAIService-->>TaskManagerModule: Validated object response
TaskManagerModule-->>User: Structured result (task/subtasks/analysis)
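The flow in the diagram can be sketched in a few lines. This is an illustrative, dependency-free approximation and not the PR's implementation: `SCHEMA_REGISTRY`, `generateObjectSketch`, and `stubProvider` are hypothetical names, and a plain predicate stands in for the Zod schemas the walkthrough describes.

```javascript
// Hypothetical registry: command name -> validator. The PR uses Zod schemas;
// a predicate keeps this sketch free of dependencies.
const SCHEMA_REGISTRY = {
  'expand-task': (value) =>
    Array.isArray(value?.subtasks) &&
    value.subtasks.every(
      (s) => Number.isInteger(s.id) && typeof s.title === 'string'
    )
};

// Hypothetical service: looks up the schema, asks the provider for an object,
// and only returns the response once it validates.
async function generateObjectSketch({ provider, commandName, prompt, objectName }) {
  const validate = SCHEMA_REGISTRY[commandName];
  if (!validate) {
    throw new Error(`No schema registered for command "${commandName}"`);
  }
  const raw = await provider.generateObject({ prompt, objectName });
  if (!validate(raw)) {
    throw new Error(`Provider response for "${objectName}" failed validation`);
  }
  return raw;
}

// Stub provider standing in for a real model call.
const stubProvider = {
  async generateObject() {
    return { subtasks: [{ id: 1, title: 'Set up schema registry' }] };
  }
};
```

The point of the design is that the calling module never sees unvalidated output: a malformed response fails at the service boundary instead of deeper in the task logic.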
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~40 minutes
Actionable comments posted: 7
♻️ Duplicate comments (2)
scripts/modules/task-manager/parse-prd.js (1)
22-22: Schema migration follows pattern but raises reliability concerns
The migration to centralized schemas is architecturally sound, and the change from objectName: 'tasks_data' to objectName: 'tasks' suggests better alignment with the schema structure.
However, the same reliability concerns with generateObjectService apply here as mentioned in the retrieved learnings. The parse-prd function is critical for task extraction from PRD documents, making reliability especially important.
Consider the same verification and fallback mechanisms mentioned for add-task.js, as this function is critical for the initial task generation workflow.
Also applies to: 180-181
scripts/modules/task-manager/expand-task.js (1)
18-19: Revert to generateTextService per coding guidelines.
This migration to generateObjectService contradicts the established coding guidelines for the same reasons mentioned in update-tasks.js. The guidelines explicitly warn about reliability issues with generateObjectService.
Also applies to: 285-300
| } | ||
|
|
||
| - let responseText = ''; | ||
| + const responseText = ''; |
🧹 Nitpick (assertive)
Remove unused variable.
The responseText variable is declared but never used. This appears to be leftover from the previous text-based implementation.
- const responseText = '';
  let aiServiceResponse = null;
🤖 Prompt for AI Agents
In scripts/modules/task-manager/expand-task.js at line 279, the variable
responseText is declared but never used. Remove this unused variable declaration
to clean up the code and avoid confusion.
| if (appendMode) { | ||
| // Append mode still uses generateTextService since it returns plain text | ||
| aiServiceResponse = await generateTextService({ | ||
| role: serviceRole, | ||
| session: session, | ||
| projectRoot: projectRoot, | ||
| systemPrompt: systemPrompt, | ||
| prompt: userPrompt, | ||
| commandName: 'update-task', | ||
| outputType: isMCP ? 'mcp' : 'cli' | ||
| }); | ||
| } else { | ||
| // Full update mode uses generateObjectService for structured output | ||
| aiServiceResponse = await generateObjectService({ | ||
| role: serviceRole, | ||
| session: session, | ||
| projectRoot: projectRoot, | ||
| systemPrompt: systemPrompt, | ||
| prompt: userPrompt, | ||
| schema: COMMAND_SCHEMAS['update-task-by-id'], | ||
| objectName: 'task', | ||
| commandName: 'update-task', | ||
| outputType: isMCP ? 'mcp' : 'cli' | ||
| }); | ||
| } |
Full update mode should also use generateTextService.
While the append mode correctly uses generateTextService, the full update mode uses generateObjectService which contradicts the coding guidelines about reliability issues. Both modes should use generateTextService with manual JSON parsing and Zod validation.
The conditional approach suggests awareness of potential issues, but the guidelines are clear that generateTextService should be used for all structured output needs, not just append mode.
🤖 Prompt for AI Agents
In scripts/modules/task-manager/update-task-by-id.js between lines 296 and 320,
the full update mode incorrectly uses generateObjectService instead of
generateTextService, which goes against the coding guidelines due to reliability
concerns. To fix this, replace the call to generateObjectService with
generateTextService in the full update mode branch, then manually parse the JSON
response and validate it using Zod as per the guidelines. This ensures both
append and full update modes consistently use generateTextService for structured
output.
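For reference, the manual approach the reviewer asks for might look roughly like the sketch below. It is an assumption-laden illustration: `extractJsonObject` and `assertTaskShape` are hypothetical helpers, and the hand-written shape check only stands in for the Zod validation the guidelines call for, to keep the example dependency-free.

```javascript
// Built at runtime so this sketch contains no literal code-fence marker.
const FENCE = '`'.repeat(3);

// Pulls the first JSON object out of a model response that may wrap it in
// prose or a fenced block, then parses it.
function extractJsonObject(text) {
  const fenced = text.match(
    new RegExp(FENCE + '(?:json)?\\s*([\\s\\S]*?)' + FENCE)
  );
  const candidate = fenced ? fenced[1] : text;
  const start = candidate.indexOf('{');
  const end = candidate.lastIndexOf('}');
  if (start === -1 || end < start) {
    throw new Error('No JSON object found in AI response');
  }
  return JSON.parse(candidate.slice(start, end + 1));
}

// Stand-in for the Zod validation step: checks a few of the task fields.
function assertTaskShape(task) {
  const problems = [];
  if (!Number.isInteger(task.id) || task.id <= 0) {
    problems.push('id must be a positive integer');
  }
  if (typeof task.title !== 'string' || task.title.length === 0) {
    problems.push('title must be a non-empty string');
  }
  if (!Array.isArray(task.dependencies)) {
    problems.push('dependencies must be an array');
  }
  if (problems.length > 0) {
    throw new Error('Invalid task: ' + problems.join('; '));
  }
  return task;
}
```

The trade-off under discussion is visible here: text generation works with any provider, but the extraction step is heuristic and has to be maintained by hand.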
| import { generateObjectService } from '../ai-services-unified.js'; | ||
| import { COMMAND_SCHEMAS } from '../../../src/schemas/registry.js'; |
💡 Verification agent
🧩 Analysis chain
Revert to generateTextService with manual parsing per coding guidelines.
The migration to generateObjectService contradicts the established coding guidelines that specifically warn against using it due to reliability issues across different providers and complex schemas. The guidelines explicitly state to "Use generateTextService and implement robust manual JSON parsing (with Zod validation after parsing) when structured output is needed."
Based on the learnings retrieved, generateObjectService has shown unreliability with some providers/schemas. Please verify if this migration has been thoroughly tested across all supported providers:
Also applies to: 209-227
🏁 Script executed:
#!/bin/bash
# Search for any documented issues or decisions about generateObjectService reliability
rg -A 10 "generateObjectService.*reliability|reliability.*generateObjectService" --type md
# Check if there are any recent updates to the coding guidelines
fd -e md -e mdc . .cursor/ | xargs rg -l "generateObjectService|generateTextService" | head -10
# Look for any test coverage specifically for generateObjectService
fd -e test.js -e spec.js . | xargs rg -l "generateObjectService.*mock|mock.*generateObjectService" | head -10
Length of output: 1027
🌐 Web query:
What are the known reliability issues with generateObject in AI providers like OpenAI, Anthropic, and Google when using complex JSON schemas?
💡 Result:
When utilizing AI providers like OpenAI, Anthropic, and Google for generating structured outputs based on complex JSON schemas, several reliability issues have been identified:
1. Inconsistent Adherence to JSON Schemas
- OpenAI: Users have reported that models like GPT-4o do not consistently respect JSON schemas. For instance, when instructed to return data in JSON format, the model sometimes outputs markdown or other formats, disregarding the specified schema. (community.openai.com)
- Anthropic: The Claude 3 Sonnet model has exhibited issues where responses deviate from the input schema outlined in the tool specification. This lack of strict schema enforcement can lead to unpredictable results and increased complexity in error handling for developers. (github.com)
- Google: The Gemini API has faced challenges with structured outputs, particularly when object properties are not returned as expected. Despite specifying required fields in the schema, certain properties may be omitted in the generated output. (discuss.ai.google.dev)
2. Difficulty in Handling Complex JSON Schemas
- OpenAI: While OpenAI's GPT-4o can integrate with Pydantic data models, it may struggle with complex schemas, leading to errors or incomplete outputs. (dsssolutions.com)
- Anthropic: Claude Sonnet 3.5 requires specific configurations, such as 'tool calls,' to reliably produce JSON outputs. However, it does not directly support Pydantic models, which can complicate the generation of complex JSON structures. (dsssolutions.com)
- Google: Gemini 1.5 Pro necessitates the use of the genai.protos.Schema class for reliable JSON production. Additionally, there is no straightforward way to guide Gemini's output using field descriptions, making it challenging to handle complex JSON schemas effectively. (dsssolutions.com)
3. Latency and Performance Issues
- OpenAI: The initial processing of complex JSON schemas can introduce significant latency, sometimes up to a minute for intricate schemas. This delay is due to the model's need to preprocess and understand the schema before generating outputs. (sophiabits.com)
4. Schema Validation Challenges
- Anthropic: There is a lack of strict enforcement of input schemas, leading to potential inconsistencies in the generated outputs. This absence of schema validation can result in outputs that do not conform to the specified structure, necessitating additional validation steps on the user's part. (github.com)
In summary, while AI providers like OpenAI, Anthropic, and Google offer capabilities for generating structured outputs based on JSON schemas, challenges persist in ensuring consistent adherence to complex schemas, handling intricate JSON structures, managing latency, and performing schema validation. Developers should be aware of these issues and implement additional validation and error-handling mechanisms to ensure the reliability of AI-generated structured data.
Citations:
- 1: https://community.openai.com/t/gpt-4o-doesnt-consistently-respect-json-schema-on-tool-use/751125?utm_source=openai
- 2: Model responses may violate input schema in tool specifications anthropics/anthropic-sdk-python#619
- 3: https://discuss.ai.google.dev/t/issue-with-structured-output/92283?utm_source=openai
- 4: https://dsssolutions.com/2024/08/28/enforcing-json-outputs-in-commercial-llms/?utm_source=openai
- 5: https://dsssolutions.com/2024/08/28/enforcing-json-outputs-in-commercial-llms/?utm_source=openai
- 6: https://dsssolutions.com/2024/08/28/enforcing-json-outputs-in-commercial-llms/?utm_source=openai
- 7: https://sophiabits.com/blog/openai-structured-outputs-deep-dive?utm_source=openai
- 8: Model responses may violate input schema in tool specifications anthropics/anthropic-sdk-python#619
Revert to generateTextService for structured output per coding guidelines
The change in scripts/modules/task-manager/update-tasks.js (imports at lines 24–25 and the service call at lines 209–227) replaces manual JSON parsing with generateObjectService, but our guidelines and real-world experience show that generateObjectService is prone to inconsistent schema adherence across providers (OpenAI, Anthropic, Google) and can break on complex JSON schemas. Please revert to using generateTextService with manual JSON.parse and Zod validation for all structured-output needs.
Locations to update:
- scripts/modules/task-manager/update-tasks.js
• Lines 24–25: swap generateObjectService import for generateTextService
• Lines 209–227: replace generateObjectService(...) call with generateTextService(...), then parse and validate the response manually
🤖 Prompt for AI Agents
In scripts/modules/task-manager/update-tasks.js at lines 24-25 and 209-227,
revert the import of generateObjectService back to generateTextService. Replace
the generateObjectService call with generateTextService, then manually parse the
returned text as JSON and validate it using Zod according to the coding
guidelines for structured output. This ensures consistent schema adherence and
robustness across providers.
|
|
||
| // For providers that don't support tool mode (like claude-code), | ||
| // we need to ensure the schema is properly communicated in the prompt | ||
| const needsExplicitSchema = this.name === 'Claude Code'; | ||
|
|
||
| const result = await generateObject({ | ||
| model: client(params.modelId), | ||
| messages: params.messages, | ||
| schema: params.schema, | ||
| - mode: 'auto', | ||
| + mode: needsExplicitSchema ? 'json' : 'auto', | ||
| schemaName: params.objectName, | ||
| schemaDescription: `Generate a valid JSON object for ${params.objectName}`, | ||
| maxTokens: params.maxTokens, | ||
| temperature: params.temperature | ||
| }); |
🧹 Nitpick (assertive)
Good adaptation for providers without tool mode support.
The logic to handle providers like "Claude Code" that don't support tool mode is well-implemented. Switching to JSON mode and adding explicit schema parameters ensures consistent structured output generation across different providers.
Consider making the provider-specific logic more extensible:
- const needsExplicitSchema = this.name === 'Claude Code';
+ const needsExplicitSchema = this.requiresExplicitSchema?.() || false;
Then providers can override a requiresExplicitSchema() method instead of hardcoding provider names in the base class.
🤖 Prompt for AI Agents
In src/ai-providers/base-provider.js around lines 206 to 220, the current code
hardcodes the check for 'Claude Code' to decide if explicit schema is needed.
Refactor this by adding a method requiresExplicitSchema() in the base provider
class that returns false by default, and override it in specific provider
subclasses like Claude Code to return true. Replace the hardcoded check with a
call to this method to make the logic extensible and maintainable.
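The suggested refactor can be sketched as follows. The class names `BaseAIProvider` and `ClaudeCodeProvider` and the `resolveObjectMode()` helper are hypothetical and chosen only for illustration; only `requiresExplicitSchema()` and the 'json' / 'auto' mode values come from the review comment above.

```javascript
// Hypothetical base class: providers support tool mode unless they say otherwise.
class BaseAIProvider {
  constructor(name) {
    this.name = name;
  }

  // Default: no explicit schema needed, so 'auto' mode is used.
  requiresExplicitSchema() {
    return false;
  }

  // Hypothetical helper showing where the flag would feed into generateObject's
  // `mode` option.
  resolveObjectMode() {
    return this.requiresExplicitSchema() ? 'json' : 'auto';
  }
}

// Hypothetical subclass for a provider without tool mode support.
class ClaudeCodeProvider extends BaseAIProvider {
  constructor() {
    super('Claude Code');
  }

  // Override instead of having the base class compare provider names.
  requiresExplicitSchema() {
    return true;
  }
}
```

With this shape, adding another provider that lacks tool mode means overriding one method in that provider, with no change to the base class.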
| export const BaseTaskSchema = z.object({ | ||
| id: z.number().int().positive(), | ||
| title: z.string().min(1).max(200), | ||
| description: z.string().min(1), | ||
| status: TaskStatusSchema, | ||
| dependencies: z.array(z.union([z.number().int(), z.string()])).default([]), | ||
| priority: z | ||
| .enum(['low', 'medium', 'high', 'critical']) | ||
| .nullable() | ||
| .default(null), | ||
| details: z.string().nullable().default(null), | ||
| testStrategy: z.string().nullable().default(null) | ||
| }); |
Missing required 'subtasks' field in BaseTaskSchema.
According to the retrieved learnings, "Each task object must include all required properties (id, title, description, status, dependencies, priority, details, testStrategy, subtasks)". The schema is missing the subtasks field.
Add the missing subtasks field:
export const BaseTaskSchema = z.object({
id: z.number().int().positive(),
title: z.string().min(1).max(200),
description: z.string().min(1),
status: TaskStatusSchema,
dependencies: z.array(z.union([z.number().int(), z.string()])).default([]),
priority: z
.enum(['low', 'medium', 'high', 'critical'])
.nullable()
.default(null),
details: z.string().nullable().default(null),
- testStrategy: z.string().nullable().default(null)
+ testStrategy: z.string().nullable().default(null),
+ subtasks: z.array(z.any()).default([])
});
🤖 Prompt for AI Agents
In src/schemas/base-schemas.js around lines 13 to 25, the BaseTaskSchema is
missing the required 'subtasks' field. Add a 'subtasks' property to the schema
as an array of task objects or IDs, depending on your data model, ensuring it
matches the expected type and includes a default value if necessary to maintain
schema completeness.
| export const SubtaskSchema = z.object({ | ||
| id: z.number().int().positive(), | ||
| title: z.string().min(5).max(200), | ||
| description: z.string().min(10), | ||
| dependencies: z.array(z.number().int()).default([]), | ||
| details: z.string().min(20), | ||
| status: z.enum(['pending', 'done', 'completed']).default('pending'), | ||
| testStrategy: z.string().nullable().default(null) | ||
| }); |
🛠️ Refactor suggestion
Inconsistencies between SubtaskSchema and BaseTaskSchema.
Several inconsistencies that could cause issues:
- Status enum mismatch: SubtaskSchema uses ['pending', 'done', 'completed'] while TaskStatusSchema includes 'in-progress', 'blocked', 'cancelled', 'deferred'
- Details field inconsistency: Required in SubtaskSchema but nullable in BaseTaskSchema
- Dependencies type mismatch: SubtaskSchema only accepts integers while BaseTaskSchema accepts both integers and strings
Align the schemas for consistency:
export const SubtaskSchema = z.object({
id: z.number().int().positive(),
title: z.string().min(5).max(200),
description: z.string().min(10),
- dependencies: z.array(z.number().int()).default([]),
- details: z.string().min(20),
- status: z.enum(['pending', 'done', 'completed']).default('pending'),
+ dependencies: z.array(z.union([z.number().int(), z.string()])).default([]),
+ details: z.string().nullable().default(null),
+ status: TaskStatusSchema.default('pending'),
testStrategy: z.string().nullable().default(null)
});
🤖 Prompt for AI Agents
In src/schemas/base-schemas.js around lines 27 to 35, align SubtaskSchema with
BaseTaskSchema by updating the status enum to include all values from
TaskStatusSchema, making the details field nullable instead of required, and
changing the dependencies array to accept both integers and strings to match the
base schema's type flexibility.
| export const UpdatedTaskSchema = BaseTaskSchema.extend({ | ||
| subtasks: z.array(z.any()).nullable().default(null) | ||
| }); |
🛠️ Refactor suggestion
Consider using SubtaskSchema instead of z.any() for type safety
The schema extension pattern is good, but the subtasks field uses z.array(z.any()) which is overly permissive. Since SubtaskSchema is available in base-schemas.js (as seen in the expand-task schema), consider using it for better type safety:
-import { BaseTaskSchema } from './base-schemas.js';
+import { BaseTaskSchema, SubtaskSchema } from './base-schemas.js';
export const UpdatedTaskSchema = BaseTaskSchema.extend({
- subtasks: z.array(z.any()).nullable().default(null)
+ subtasks: z.array(SubtaskSchema).nullable().default(null)
});
This would provide proper validation for subtask objects and maintain consistency with other schemas in the codebase.
🤖 Prompt for AI Agents
In src/schemas/update-tasks.js around lines 4 to 6, the subtasks field uses
z.array(z.any()), which is too permissive and lacks type safety. Replace z.any()
with SubtaskSchema imported from base-schemas.js to ensure proper validation of
each subtask object and maintain consistency with other schemas. Update the
import statements accordingly to include SubtaskSchema.
|
Closing as stale, work was done in mid July and never merged. Can be revisited in the future. |
|
Work is not lost! But since we're dying in technical debt, we shall slowly fix that and integrate generateObject back and use this PR as reference. Thanks for the amazing work you do for us @ben-vargas !!! |
|
@ben-vargas any chance we can revive this branch ? I'll take it over and bring it to the finish line. Here's what I need to do with it to get it merged:
|
|
@Crunchyman-ralph - I kept a local copy of it which I've re-pushed to github so it can be reopened. |
|
@ben-vargas just reviewed the code, it's looking good, let's re-open this! |
Summary
This PR migrates Task Master from generateText to generateObject for AI command processing, providing significant improvements in reliability, maintainability, and performance.
Note: This PR replaces the previously closed PR #1034, which was automatically closed during branch maintenance and worktree reorganization. The code and functionality remain identical.
Key Changes:
Additional Fixes:
Migration Details
Commands Migrated
- analyze-complexity: Uses structured ComplexityAnalysisResponseSchema
- update-task-by-id: Full update mode uses generateObject; append mode still uses generateText
- expand-task: Uses structured ExpandTaskResponseSchema
- update-tasks: Uses structured UpdatedTasksResponseSchema
- add-task: Already used generateObject with AiTaskDataSchema
- parse-prd: Already used generateObject with prdResponseSchema
Architecture Changes
Before:
After:
Benefits Realized
Test Plan
Breaking Changes
None - This is an internal refactor that maintains the same external API.
Summary by CodeRabbit
New Features
Bug Fixes
Refactor
Tests