PDD-CLI Generation Issues Discovered via CRM Test Case
This meta-issue tracks 26 systematic bugs discovered in the PDD-CLI code generation tool during a CRM (Customer Relationship Management) application generation test. These issues represent fundamental gaps in PDD's generation capabilities, not domain-specific CRM problems.
Purpose
The CRM application served as a comprehensive stress test of PDD-CLI's generation capabilities across:
- Backend (Python, Cloud Functions, Firestore)
- Frontend (React, TypeScript, Next.js)
- Testing (pytest, Playwright E2E)
- Infrastructure (Firebase, GitHub integration)
Every bug listed here is a PDD-CLI bug - something the tool got wrong when generating code. These issues apply to any codebase PDD generates, not just CRM applications.
Issue Categories
1. Schema Validation Failures (5 issues)
PDD generates code without validating against actual schemas/interfaces:
- XSS Vulnerability in Mermaid Diagram Tooltips #411 - Generates field references without schema validation
- Generated tests use incorrect sys.modules paths for mocking #412 - Constructor arguments not validated against model definitions
- Generated tests use fragile try/except ImportError pattern for mocking #413 - Function calls without checking actual exports
- Investigate: <include> tags in step outputs bypass preprocessing #414 - React component prop types misunderstood
- Add failing tests for #411: XSS in Mermaid tooltips #415 - Assumes OOP patterns when codebase uses functional patterns
Pattern: PDD infers names/structures rather than parsing actual definitions.
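As an illustration of parsing definitions instead of inferring names, the sketch below uses Python's standard `ast` module to compare the keyword arguments in a generated constructor call against the fields a model actually declares. The `Contact` model, the `email_address` typo, and the helper names `class_fields` and `unknown_kwargs` are hypothetical; this is not how PDD works internally, only one way such a check could look.

```python
import ast


def class_fields(source: str, class_name: str) -> set[str]:
    """Return the annotated field names declared directly in a class body."""
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.ClassDef) and node.name == class_name:
            return {
                stmt.target.id
                for stmt in node.body
                if isinstance(stmt, ast.AnnAssign)
                and isinstance(stmt.target, ast.Name)
            }
    raise LookupError(f"class {class_name!r} not found")


def unknown_kwargs(model_source: str, class_name: str, call_source: str) -> set[str]:
    """Return keyword arguments passed to class_name that the model does not declare."""
    declared = class_fields(model_source, class_name)
    used = set()
    for node in ast.walk(ast.parse(call_source)):
        is_target_call = (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id == class_name
        )
        if is_target_call:
            # kw.arg is None for **kwargs expansion, which cannot be checked statically.
            used.update(kw.arg for kw in node.keywords if kw.arg is not None)
    return used - declared


# Hypothetical model definition and generated call, for illustration only.
MODEL = '''
from dataclasses import dataclass

@dataclass
class Contact:
    name: str
    email: str
'''

GENERATED = 'c = Contact(name="Ada", email_address="ada@example.com")'

print(unknown_kwargs(MODEL, "Contact", GENERATED))  # {'email_address'}
```

A check along these lines only covers direct calls by class name with keyword arguments; positional arguments, aliases, and inherited fields would need more work.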
2. E2E Test Generation Issues (6 issues)
PDD generates brittle, assumption-based E2E tests:
- Add failing tests for issue #409: Environment Variable Pollution #416 - E2E tests with exact string matches before seeing actual UI
- Add additional test coverage for PDD tag brace escaping fix #417 - Doesn't account for dynamic UI text (counts, dates)
- Resource leak: Temp directories not cleaned up when git clone fails #418 - Assumes verbose button text when UIs use concise labels
- Bug: Agentic fix doesn't push commits when exiting early at Step 2 #419 - Uses fragile CSS utility classes for test selectors
- Additional suggestions for pdd connect UI improvements #420 - Uses text matching instead of semantic selectors
- Fix #418: Clean up temp directories when git clone fails #421 - Assumes component library structure without verification
Pattern: PDD generates tests based on assumptions rather than actual rendered DOM.
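One way to avoid exact string matches on dynamic UI text (counts, dates) is to assert against a pattern derived from the label's template. The sketch below is a minimal, stdlib-only illustration; the helper name `label_pattern` and the `Contacts ({count})` label are hypothetical. Playwright's text assertions accept a compiled regular expression, so a pattern like this could be passed to them, though that integration is not shown here.

```python
import re


def label_pattern(template: str) -> re.Pattern:
    """Turn a label template such as 'Contacts ({count})' into a regex.

    Each {placeholder} matches any non-empty run of characters; the
    literal parts of the label are escaped and must match exactly.
    """
    literal_parts = re.split(r"\{\w+\}", template)
    body = ".+?".join(re.escape(part) for part in literal_parts)
    return re.compile("^" + body + "$")


pattern = label_pattern("Contacts ({count})")

# The count may change between runs without breaking the assertion.
assert pattern.match("Contacts (12)")
assert pattern.match("Contacts (3,041)")

# Structural changes to the label are still caught.
assert not pattern.match("Contacts")
assert not pattern.match("All contacts (12)")
```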
3. Test Environment Issues (3 issues)
PDD doesn't consider test environment constraints:
- Add failing tests for #419: Unpushed commits in early exit #422 - Imports production dependencies at module level without considering test environment
- Architecture generation: Missing public/ directory for Next.js frontend causes Docker build failure #423 - E2E tests without async data loading waits
- pdd example created interactive file: blocking pdd sync #424 - Test data without proper CSV escaping
Pattern: PDD assumes all environments have the same capabilities.
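For the CSV escaping item, a minimal sketch of building test data with Python's standard `csv` module rather than joining fields by hand. The sample rows and the helper name `to_csv` are made up for illustration.

```python
import csv
import io


def to_csv(rows):
    """Serialize rows to CSV text, letting csv.writer handle quoting."""
    buffer = io.StringIO(newline="")
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical test data containing a comma and embedded double quotes.
rows = [
    ["name", "note"],
    ["Acme, Inc.", 'Said "call back"'],
]

text = to_csv(rows)

# A naive ",".join(...) would split "Acme, Inc." into two columns.
# Round-tripping through csv.reader confirms the fields survive intact.
assert list(csv.reader(io.StringIO(text))) == rows
```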
4. API Selection & Test Performance (3 issues)
PDD chooses APIs without considering non-functional requirements:
- get agentic default for python (pdd connect frontend) #425 - Chooses convenient APIs without considering consistency requirements
- Architecture generation: docker-compose.yml and Dockerfiles have dev/prod conflicts that prevent startup #426 - Uses fixed timeouts instead of polling for actual conditions
- Code generation ignores interface contracts: frontend/backend field and endpoint mismatches #427 - E2E assertions use default timeouts for async operations
Pattern: PDD optimizes for code simplicity over correctness/reliability.
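The fixed-timeout problem can be illustrated with a small polling helper that returns as soon as a condition holds and gives up only after a deadline. This is a generic sketch; the name `wait_until` and its default values are assumptions, not part of PDD or any test framework.

```python
import time
from typing import Callable


def wait_until(
    condition: Callable[[], bool],
    timeout: float = 5.0,
    interval: float = 0.05,
) -> bool:
    """Poll condition() until it returns True or timeout seconds elapse.

    Returns True as soon as the condition holds, False on timeout.
    The condition is always checked at least once.
    """
    deadline = time.monotonic() + timeout
    while True:
        if condition():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


# Stand-in for an async operation that completes on the third check.
checks = {"count": 0}


def data_loaded() -> bool:
    checks["count"] += 1
    return checks["count"] >= 3


assert wait_until(data_loaded, timeout=2.0, interval=0.01)
assert not wait_until(lambda: False, timeout=0.05, interval=0.01)
```

Compared with a fixed `time.sleep(2)`, this finishes early when the operation is fast and tolerates slow runs up to the deadline.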
5. Code Quality Issues (2 issues)
PDD generates code that violates language conventions:
- Add failing tests for #1: install_completion quiet parameter bug #428 - Generates code with PEP 8 style violations
- Basename sanitization inconsistency causes CLI mode failures for subdirectory modules #429 - React hooks without proper memoization
Pattern: PDD doesn't run linters/formatters on generated code.
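As a rough picture of what a post-generation check could look like, the sketch below rejects Python output that fails to parse or breaks two simple style rules. It is a stdlib-only stand-in for a real linter or formatter, and `check_generated_python` is a hypothetical name; the 79-character limit is PEP 8's default for code lines.

```python
import ast

MAX_LINE_LENGTH = 79  # PEP 8's default limit for code lines


def check_generated_python(source: str) -> list[str]:
    """Return a list of human-readable problems found in generated source."""
    problems = []
    try:
        ast.parse(source)
    except SyntaxError as exc:
        problems.append(f"line {exc.lineno}: syntax error: {exc.msg}")
    for number, line in enumerate(source.splitlines(), start=1):
        if len(line) > MAX_LINE_LENGTH:
            problems.append(
                f"line {number}: {len(line)} characters (limit {MAX_LINE_LENGTH})"
            )
        if line != line.rstrip():
            problems.append(f"line {number}: trailing whitespace")
    return problems


# Clean code passes; an overlong line is reported with its line number.
assert check_generated_python("x = 1\n") == []
long_line = "value = " + "1 + " * 30 + "1\n"
for problem in check_generated_python(long_line):
    print(problem)
```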
6. Incomplete Feature Implementation (7 issues)
PDD generates partial implementations:
- Auto-fix skips fingerprint save causing incomplete metadata (sync_orchestration.py:1350) #430 - Feature code without required environment configuration
- Add failing tests for basename sanitization bug (#429) #431 - Incorrectly applies HTML escaping to template output
- Add failing tests for #430: auto-fix fingerprint skip bug #432 - Data models without extraction/parsing logic
- Add failing tests for issue #392: pdd change KeyError at Step 5 #433 - Handlers not wired up to runtime
- Add failing tests for #412: incorrect sys.modules paths #434 - Dialogs without overflow handling for variable content
- Add failing tests for #393: format injection at step 5.5 #435 - Analytics modules with incomplete metric calculations
- feature: pdd connect - enable / support Makefile generation to create a turnkey experience #436 - Features without corresponding test coverage
Pattern: PDD implements visible code but forgets "glue" (config, wiring, tests).
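Two of the "glue" gaps above, missing environment configuration and handlers that are never wired up, lend themselves to simple completeness checks. The sketch below is illustrative only; the function names, setting names, and handler names are all hypothetical.

```python
import os
from typing import Iterable, Mapping


def missing_settings(
    required: Iterable[str],
    environ: Mapping[str, str] = os.environ,
) -> list[str]:
    """Return required setting names that are absent or empty in environ."""
    return sorted(name for name in required if not environ.get(name))


def unregistered_handlers(
    defined: Iterable[str],
    registered: Iterable[str],
) -> list[str]:
    """Return handler names that exist in code but were never registered."""
    return sorted(set(defined) - set(registered))


# Hypothetical feature requirements checked against a fake environment.
fake_env = {"CRM_API_URL": "https://example.invalid", "CRM_API_KEY": ""}
print(missing_settings(["CRM_API_URL", "CRM_API_KEY", "CRM_REGION"], fake_env))
# ['CRM_API_KEY', 'CRM_REGION']

# Hypothetical handler names: one was generated but never wired up.
print(unregistered_handlers(["on_create", "on_delete"], ["on_create"]))
# ['on_delete']
```

Running checks like these at startup or in a test turns a silent omission into an explicit failure.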
Root Causes
These 26 issues stem from five fundamental gaps in PDD-CLI:
- No Static Analysis: PDD doesn't parse existing code to understand schemas, exports, patterns
- Assumption-Based Generation: PDD infers based on names/context rather than verifying facts
- No Validation Phase: PDD doesn't run linters, type checkers, or tests on generated code
- Incomplete Domain Knowledge: PDD doesn't understand API consistency models, test best practices, etc.
- Partial Implementation: PDD generates obvious code but misses configuration, wiring, edge cases
Impact Summary
- P0 issues: 0
- P1 issues: 10 (runtime failures, build breaks, high-frequency bugs)
- P2 issues: 11 (flaky tests, partial functionality, medium-frequency bugs)
- P3 issues: 5 (code quality, UX issues, low-frequency bugs)
For Contributors
These issues were discovered during generation of a CRM test application for the PDD Cloud project. While examples reference CRM-specific files (admin_crm_actions.py, AnalyticsDashboard.tsx, etc.), the problems are generic PDD-CLI bugs that affect any codebase.
Each issue includes:
- Generic problem description (applicable to any domain)
- Concrete example showing what PDD generated versus what it should have generated
- Root cause in PDD's generation logic
- Prevention strategies for fixing PDD
All issues have been fixed in the CRM codebase (commit 34a651d5 and related commits), but the underlying PDD-CLI bugs remain and will affect future generations.
Related Issues
- test_cli_command_help fails on main: PROMPT_FILE now optional #400-403 - Missing features identified during CRM generation (not PDD bugs, but feature gaps)
- Bug: Keyring Operations Hang Indefinitely in CI/CD Environments (macOS) #404-410 - Additional CRM-specific issues (not included in this meta-issue)
Next Steps
- For PDD-CLI development: Use these issues as requirements for improving code generation
- For users: Be aware of these patterns when reviewing PDD-generated code
- For testing: The CRM test case serves as a regression suite for PDD improvements
Total Issues: 26 individual bugs across 6 categories
Test Case: CRM application generation for PDD Cloud
Branch: crm_generation
Main Fix Commit: 34a651d5 - "fix: Fix CRM analytics bugs and test failures"