You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Analysis of the Structural Gaps and Contradictions of the BMAD Method as Identified by BMAD Agents Themselves During an Epic Retrospective
Analysis Document for the Creators of the BMAD Method: Structural Gaps and Contradictions (Version v6 Stable)
Premise
This text was developed through an ongoing discussion with the various agents involved in the retrospective of an epic. I am therefore presenting the summary produced together with them, along with my final considerations after the Mermaid diagram.
This document summarizes the structural weaknesses, limitations, and intrinsic contradictions of the BMAD method that emerged during the retrospective of Epic 1. The analysis is the result of direct comparison between the user (Project Lead) and the AI agents on the team (Scrum Master, Senior Dev, Architect, QA, and Developer) designed to support the method itself.
1. The Fundamental Contradiction: Target Audience vs. Supervision Requirements
The most serious problem at the foundation of the BMAD method is a conceptual paradox. The method is born and promoted as a tool suitable for both technical users and inexperienced users. In reality, both a technical user and an inexperienced user would struggle with some parts of the process, because a technical user cannot make decisions on mountains of code not written directly by them and almost impossible to trace logically within chaotic and disorganized AI-written code, while for the non-technical user the already clear problems included remain fully present. However, the core principle on which the code review process is based (also cited by the user alexeyv in the official discussions) is "human filtering": it is assumed that the human being must filter and evaluate the problems found by the artificial intelligence.
In Step 4 of the review workflow, the system asks the user to choose between different options for handling the detected defects, such as having the review LLM automatically correct the code (Option 1) or creating follow-up tasks for the developer agent (Option 2). This request causes the entire system to collapse in on itself: an inexperienced or non-technical user does not have the skills to read and understand a complex mountain of code, nor to make architectural decisions about how to solve the problems. At the same time, even a technical user cannot realistically make reliable architectural decisions over large volumes of chaotic, AI-generated code they did not write and that are nearly impossible to trace logically. As highlighted by the Architect agent himself (Winston): the two premises cannot coexist. Either the method is for inexperienced/non-programmer users (and then the review workflow must be completely autonomous), or it requires technical supervision (and then it is not truly a tool accessible to non-technical users).
2. The Structural Gap in the Development Workflow: Lack of Safeguards
When the user selects Option 2 (delegating follow-ups to the original development agent, such as Opus), a serious structural limitation emerges. In Step 8 of the workflow dev-story/workflow.md, review tasks [AI-Review] are handled with a simple double-check system ("Reviewed finding" and "Implemented fix").
The gap lies in the fact that there is no safety mechanism (safeguard) that forces the developer agent to reread the original code, understand the real nature of the problem, or verify that the fix is actually effective. The developer agent often reads only the task title, producing code that appears to solve the error without understanding the real "why" behind it.
3. Practical Evidence: The Pattern of "Superficial Fixes"
During the development of Epic 1, the absence of safeguards generated a repeated and ineffective pattern of superficial implementations:
Story 1.2: Faced with a critical error on "wake recovery," the developer agent simply changed the name of an IPC command instead of implementing the necessary HTTP probe. The review LLM (Codex) had to reopen the finding as "High" in a second cycle.
Story 1.3: A missing test on "computed style" was resolved by inserting a useless assertion on CSS classes. The test passed, but it did not verify what was actually required.
Story 1.5: Several route loaders reported as problematic were "resolved" by the agent by inserting simple empty returns (stubs) and TODO comments.
In all these cases, the tasks were checked off as "resolved" by the agent, forcing multiple review cycles and delegating discovery of the superficiality to the review agent.
4. Limitations of the Stable Version (v6) and Inaccessible Solutions
The team's AI agents openly acknowledge that the stable version (v6) has a "design hole": it assumes a level of technical competence that the target user does not possess. The BMAD team seems to be aware of this limitation, to the point that in a more recent system ("Quick-Flow New Preview") an automatic triage with 5 categories was introduced (patch, intent_gap, bad_spec, defer, reject) so that only trivial fixes are resolved automatically, limiting human intervention. However, this system is not present in the stable version being used, leaving users in a dead end where the options provided are not concretely applicable by those who do not know how to program.
Conclusion
The BMAD method, in its current v6 Stable form, delegates critical decision-making responsibilities to users who are not trained to take them, while at the same time lacking the necessary quality controls (safeguards) to prevent development agents from bypassing the controls themselves. This invalidates the product's basic assumption as a "no-code/low-code" tool accessible to everyone.
flowchart TD
classDef gapNode fill:#FFEBEE,stroke:#C62828,color:#7a0000,font-weight:bold
classDef warnNode fill:#FFF8E1,stroke:#F57F17,color:#7a3800
classDef okNode fill:#E8F5E9,stroke:#2E7D32,color:#1B5E20
classDef normalNode fill:#F5F5F5,stroke:#888,color:#222
classDef patternNode fill:#FBE9E7,stroke:#BF360C,color:#7f2000
START(["👤 User / Project Lead\nnon-technical OR technical"]):::normalNode
START --> S1
subgraph FLOW["BMAD — Development Cycle"]
S1["📋 Story definition & planning"]:::normalNode
S2["🤖 Dev Agent — Code generation"]:::normalNode
S3["🔍 Review LLM — Defect analysis"]:::normalNode
S1 --> S2 --> S3
end
S3 --> STEP4
subgraph GAP1["⚠️ GAP 1 — Fundamental Contradiction · Step 4"]
STEP4{{"Step 4: Human must choose defect handling"}}:::warnNode
PARADOX["❌ Non-technical user: lacks skills to make architectural decisions\n───────────────────────────\n❌ Technical user: cannot trace chaotic AI-generated code"]:::gapNode
STEP4 -. exposes paradox .-> PARADOX
end
STEP4 -->|Option 1| OPT1["Auto-fix by Review LLM"]:::okNode
STEP4 -->|Option 2| OPT2["Delegate to Dev Agent"]:::normalNode
subgraph GAP2["⚠️ GAP 2 — No Safeguards · Step 8"]
S8["🤖 Dev Agent handles AI-Review tasks"]:::normalNode
CHECK["✓ Reviewed finding\n✓ Implemented fix\n— simple double-check only"]:::warnNode
NOSAFE["❌ No forced re-read of original code\n❌ No root-cause understanding\n❌ No fix-effectiveness check"]:::gapNode
S8 --> CHECK
CHECK -. structural gap .-> NOSAFE
end
OPT2 --> S8
subgraph GAP3["⚠️ GAP 3 — Evidence: Pattern of Superficial Fixes · Epic 1"]
P1["Story 1.2\nRenamed IPC command ≠ implemented HTTP probe\n→ Reopened as HIGH"]:::patternNode
P2["Story 1.3\nUseless CSS assertion ≠ verified real requirement\n→ Test passed, nothing verified"]:::patternNode
P3["Story 1.5\nEmpty stubs + TODO checked off as ✓ resolved"]:::patternNode
end
NOSAFE --> P1
NOSAFE --> P2
NOSAFE --> P3
P1 --> REOPEN["⟳ Review agent reopens findings\nForces new review cycle"]:::warnNode
P2 --> REOPEN
P3 --> REOPEN
REOPEN -->|"♻️ no exit condition → infinite loop"| S3
OPT1 --> DONE(["✅ Story Complete"]):::okNode
CHECK -->|"genuine fix — rare"| DONE
Loading
My personal considerations
I hope this view can be useful. Within my small team, I tried using BMAD in different versions to give people who are not explicitly technical in programming the chance to participate in development. After a great many attempts, first and foremost by me, as I am a programmer, I can say that in my opinion, at the moment, the BMAD method suffers from too many contradictions for the development of complex architectures in the context of programming.
The process is so cumbersome and extremely typified that, without appropriate guardrails, it makes the entire development of a complex solution far more difficult than simply using any memory-management tool such as claude-mem, together with a defined planning structure, without using the whole BMAD system. The human intervention required by the BMAD method is so deep that it makes it much more complex than it needs to be outside of it. A good CLAUDE.md file, starting prompt documentation, and structured step-by-step plans with checks on control files are definitely more functional.
Where BMAD is great, in my opinion.
BMAD is great when it comes to brainstorming, research, deeper analysis, advanced planning, and multi-domain support for a project, provided that it remains purely within the management of ideas. When it comes to practical execution, it becomes unnecessarily complex even for extremely small projects. A small MVP, now the usual kind seen in the various tutorials proposed online by unoriginal content creators, would perhaps require 10 to 15 times the time with BMAD compared to a normal traditional development process using Claude Code or Codex. Is that time worth the final result? For me, no. There is no such substantial difference, especially for non-technical users, who would not know how to grasp certain technical nuances.
I am a FAN of BMAD. It has helped me gather ideas, concepts, structured plans, research, and new perspectives in different areas of my work. I am certain that we are at the beginning of a fantastic journey. My post as an ISSUE is actually an attempt, perhaps clumsy and too critical, to offer a suggestion to the BMAD developers, who will probably not read this issue, but I hope it will be noticed by someone in the community who is deeply involved in BMAD’s development and willing to take it into consideration.
Analysis of the Structural Gaps and Contradictions of the BMAD Method as Identified by BMAD Agents Themselves During an Epic Retrospective
Analysis Document for the Creators of the BMAD Method: Structural Gaps and Contradictions (Version v6 Stable)
Premise
This document summarizes the structural weaknesses, limitations, and intrinsic contradictions of the BMAD method that emerged during the retrospective of Epic 1. The analysis is the result of direct comparison between the user (Project Lead) and the AI agents on the team (Scrum Master, Senior Dev, Architect, QA, and Developer) designed to support the method itself.
1. The Fundamental Contradiction: Target Audience vs. Supervision Requirements
The most serious problem at the foundation of the BMAD method is a conceptual paradox. The method is born and promoted as a tool suitable for both technical users and inexperienced users. In reality, both a technical user and an inexperienced user would struggle with some parts of the process, because a technical user cannot make decisions on mountains of code not written directly by them and almost impossible to trace logically within chaotic and disorganized AI-written code, while for the non-technical user the already clear problems included remain fully present. However, the core principle on which the code review process is based (also cited by the user alexeyv in the official discussions) is "human filtering": it is assumed that the human being must filter and evaluate the problems found by the artificial intelligence.
In Step 4 of the review workflow, the system asks the user to choose between different options for handling the detected defects, such as having the review LLM automatically correct the code (Option 1) or creating follow-up tasks for the developer agent (Option 2). This request causes the entire system to collapse in on itself: an inexperienced or non-technical user does not have the skills to read and understand a complex mountain of code, nor to make architectural decisions about how to solve the problems. At the same time, even a technical user cannot realistically make reliable architectural decisions over large volumes of chaotic, AI-generated code they did not write and that are nearly impossible to trace logically. As highlighted by the Architect agent himself (Winston): the two premises cannot coexist. Either the method is for inexperienced/non-programmer users (and then the review workflow must be completely autonomous), or it requires technical supervision (and then it is not truly a tool accessible to non-technical users).
2. The Structural Gap in the Development Workflow: Lack of Safeguards
When the user selects Option 2 (delegating follow-ups to the original development agent, such as Opus), a serious structural limitation emerges. In Step 8 of the
workflow dev-story/workflow.md, review tasks[AI-Review]are handled with a simple double-check system ("Reviewed finding" and "Implemented fix").The gap lies in the fact that there is no safety mechanism (safeguard) that forces the developer agent to reread the original code, understand the real nature of the problem, or verify that the fix is actually effective. The developer agent often reads only the task title, producing code that appears to solve the error without understanding the real "why" behind it.
3. Practical Evidence: The Pattern of "Superficial Fixes"
During the development of Epic 1, the absence of safeguards generated a repeated and ineffective pattern of superficial implementations:
In all these cases, the tasks were checked off as "resolved" by the agent, forcing multiple review cycles and delegating discovery of the superficiality to the review agent.
4. Limitations of the Stable Version (v6) and Inaccessible Solutions
The team's AI agents openly acknowledge that the stable version (v6) has a "design hole": it assumes a level of technical competence that the target user does not possess. The BMAD team seems to be aware of this limitation, to the point that in a more recent system ("Quick-Flow New Preview") an automatic triage with 5 categories was introduced (
patch,intent_gap,bad_spec,defer,reject) so that only trivial fixes are resolved automatically, limiting human intervention. However, this system is not present in the stable version being used, leaving users in a dead end where the options provided are not concretely applicable by those who do not know how to program.Conclusion
The BMAD method, in its current v6 Stable form, delegates critical decision-making responsibilities to users who are not trained to take them, while at the same time lacking the necessary quality controls (safeguards) to prevent development agents from bypassing the controls themselves. This invalidates the product's basic assumption as a "no-code/low-code" tool accessible to everyone.
flowchart TD classDef gapNode fill:#FFEBEE,stroke:#C62828,color:#7a0000,font-weight:bold classDef warnNode fill:#FFF8E1,stroke:#F57F17,color:#7a3800 classDef okNode fill:#E8F5E9,stroke:#2E7D32,color:#1B5E20 classDef normalNode fill:#F5F5F5,stroke:#888,color:#222 classDef patternNode fill:#FBE9E7,stroke:#BF360C,color:#7f2000 START(["👤 User / Project Lead\nnon-technical OR technical"]):::normalNode START --> S1 subgraph FLOW["BMAD — Development Cycle"] S1["📋 Story definition & planning"]:::normalNode S2["🤖 Dev Agent — Code generation"]:::normalNode S3["🔍 Review LLM — Defect analysis"]:::normalNode S1 --> S2 --> S3 end S3 --> STEP4 subgraph GAP1["⚠️ GAP 1 — Fundamental Contradiction · Step 4"] STEP4{{"Step 4: Human must choose defect handling"}}:::warnNode PARADOX["❌ Non-technical user: lacks skills to make architectural decisions\n───────────────────────────\n❌ Technical user: cannot trace chaotic AI-generated code"]:::gapNode STEP4 -. exposes paradox .-> PARADOX end STEP4 -->|Option 1| OPT1["Auto-fix by Review LLM"]:::okNode STEP4 -->|Option 2| OPT2["Delegate to Dev Agent"]:::normalNode subgraph GAP2["⚠️ GAP 2 — No Safeguards · Step 8"] S8["🤖 Dev Agent handles AI-Review tasks"]:::normalNode CHECK["✓ Reviewed finding\n✓ Implemented fix\n— simple double-check only"]:::warnNode NOSAFE["❌ No forced re-read of original code\n❌ No root-cause understanding\n❌ No fix-effectiveness check"]:::gapNode S8 --> CHECK CHECK -. structural gap .-> NOSAFE end OPT2 --> S8 subgraph GAP3["⚠️ GAP 3 — Evidence: Pattern of Superficial Fixes · Epic 1"] P1["Story 1.2\nRenamed IPC command ≠ implemented HTTP probe\n→ Reopened as HIGH"]:::patternNode P2["Story 1.3\nUseless CSS assertion ≠ verified real requirement\n→ Test passed, nothing verified"]:::patternNode P3["Story 1.5\nEmpty stubs + TODO checked off as ✓ resolved"]:::patternNode end NOSAFE --> P1 NOSAFE --> P2 NOSAFE --> P3 P1 --> REOPEN["⟳ Review agent reopens findings\nForces new review cycle"]:::warnNode P2 --> REOPEN P3 --> REOPEN REOPEN -->|"♻️ no exit condition → infinite loop"| S3 OPT1 --> DONE(["✅ Story Complete"]):::okNode CHECK -->|"genuine fix — rare"| DONEMy personal considerations
I hope this view can be useful. Within my small team, I tried using BMAD in different versions to give people who are not explicitly technical in programming the chance to participate in development. After a great many attempts, first and foremost by me, as I am a programmer, I can say that in my opinion, at the moment, the BMAD method suffers from too many contradictions for the development of complex architectures in the context of programming.
The process is so cumbersome and extremely typified that, without appropriate guardrails, it makes the entire development of a complex solution far more difficult than simply using any memory-management tool such as claude-mem, together with a defined planning structure, without using the whole BMAD system. The human intervention required by the BMAD method is so deep that it makes it much more complex than it needs to be outside of it. A good
CLAUDE.mdfile, starting prompt documentation, and structured step-by-step plans with checks on control files are definitely more functional.Where BMAD is great, in my opinion.
BMAD is great when it comes to brainstorming, research, deeper analysis, advanced planning, and multi-domain support for a project, provided that it remains purely within the management of ideas. When it comes to practical execution, it becomes unnecessarily complex even for extremely small projects. A small MVP, now the usual kind seen in the various tutorials proposed online by unoriginal content creators, would perhaps require 10 to 15 times the time with BMAD compared to a normal traditional development process using Claude Code or Codex. Is that time worth the final result? For me, no. There is no such substantial difference, especially for non-technical users, who would not know how to grasp certain technical nuances.
I am a FAN of BMAD. It has helped me gather ideas, concepts, structured plans, research, and new perspectives in different areas of my work. I am certain that we are at the beginning of a fantastic journey. My post as an ISSUE is actually an attempt, perhaps clumsy and too critical, to offer a suggestion to the BMAD developers, who will probably not read this issue, but I hope it will be noticed by someone in the community who is deeply involved in BMAD’s development and willing to take it into consideration.
Warm regards to everyone, and GO BMAD.