A multi-agent backend that turns one topic into a fully illustrated, publish-ready article. Five specialized agents wired into Spring AI Alibaba's
StateGraph, with token-level streaming, parallel image generation across six providers, an explicit phase state machine you can intervene in, and atomic VIP-quota enforcement. Spring Boot 3 on the back, Vue 3 + Ant Design Vue on the front.
Sign up, or use one of the pre-seeded demo accounts (all share password 12345678):
| Role | Account | What it can do |
|---|---|---|
| Admin | admin | Everything: user management, statistics dashboard, unlimited generations |
| VIP | vip | Unlimited generations + AI image generation + LLM-authored SVG diagrams |
| User | user | 5 free generations · photo / icon / mermaid / meme images |
| Test | test | Same as user, kept clean for fresh demos |
Want to test the upgrade flow yourself? Sign up as a regular user, hit VIP in the nav, and pay with the Stripe test card 4242 4242 4242 4242 (any future expiry, any CVC). The webhook flips your role to vip.
The pipeline is three independent StateGraphs, one per phase, built and compiled per request inside ArticleAgentOrchestrator. Phases 1 and 2 are single-node graphs; Phase 3 is the four-node sequential graph above. Splitting it this way is what lets the user interrupt between phases — pick from the title candidates, edit or re-prompt the outline, then commit to the body. Every phase that calls an LLM streams tokens back to the browser over SSE; image generation streams per-image events as they finish.
| # | Agent | Implementation notes | I/O |
|---|---|---|---|
| 1 | TitleGeneratorAgent | Prompt asks for 3–5 distinct angles, ≤30 words, with numbers / emotional hooks. Style suffix (TECH / EMOTIONAL / EDUCATIONAL / HUMOROUS) appended to steer tone. JSON parsed via GsonUtils.unwrapJson to tolerate code-fence wrappers. | ChatModel.call() → list of {mainTitle, subTitle} |
| 2 | OutlineGeneratorAgent | Tokens flow through StreamHandlerContext and out as AGENT2_STREAMING: SSE frames. Optional userDescription is interpolated into the prompt; that's how the user steers tone or angle without prompt-engineering. | ChatModel.stream() → OutlineResult of 3–5 sections |
| 3 | ContentGeneratorAgent | Receives the full outline as JSON to keep section boundaries; emits AGENT3_STREAMING: SSE frames so the UI renders text as it's being written. | ChatModel.stream() → Markdown body with [image_position_N] placeholders |
| 4 | ImageAnalyzerAgent | The model decides what kind of image each spot wants (photo / AI-render / mermaid / icon / meme / SVG). Output is filtered against the article's enabledImageMethods; any disallowed kind is rewritten to the first allowed alternative, so a non-VIP can't get VIP image kinds even if the LLM picked one. | ChatModel.call() → {contentWithPlaceholders, imageRequirements[]} |
| - | ParallelImageGenerator | Groups requirements by imageSource, runs one CompletableFuture per provider, joins with allOf().join(). Each successful image emits an IMAGE_COMPLETE SSE frame so the UI can render images progressively. Failures are isolated per image; a thread-safe CopyOnWriteArrayList collects whatever succeeded. | Pure code → List<ImageResult> with R2 URLs |
| 5 | ContentMergerAgent | Defensive placeholder substitution: warns on missing slots and tolerates three different upstream result shapes (ArticleState.ImageResult, ImageGenerationTool.ImageGenerationResult, raw Map). | Pure code → fullContent (final Markdown) |
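The fan-out described in the ParallelImageGenerator row can be sketched in plain Java. This is a minimal sketch under assumptions, not the project's code: ImageRequirement, ImageResult, and the fetch function are hypothetical stand-ins for the real types and provider calls, and SSE emission and the R2 upload are left out.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical stand-ins for the project's requirement and result types.
record ImageRequirement(int position, String imageSource, String keywords) {}
record ImageResult(int position, String url) {}

class ParallelImageSketch {

    // Groups requirements by source, runs one future per source, and returns
    // whatever succeeded. `fetch` is a placeholder for the real provider call.
    static List<ImageResult> generateAll(List<ImageRequirement> requirements,
                                         Function<ImageRequirement, String> fetch) {
        Map<String, List<ImageRequirement>> bySource = requirements.stream()
                .collect(Collectors.groupingBy(ImageRequirement::imageSource));

        // Thread-safe list: several futures append to it concurrently.
        List<ImageResult> results = new CopyOnWriteArrayList<>();

        List<CompletableFuture<Void>> futures = bySource.values().stream()
                .map(group -> CompletableFuture.runAsync(() -> {
                    for (ImageRequirement req : group) {
                        try {
                            results.add(new ImageResult(req.position(), fetch.apply(req)));
                            // The real generator would emit an IMAGE_COMPLETE frame here.
                        } catch (RuntimeException e) {
                            // Failure is isolated: skip this image, keep the rest.
                        }
                    }
                }))
                .toList();

        // Block until every provider group has finished.
        CompletableFuture.allOf(futures.toArray(new CompletableFuture[0])).join();
        return results;
    }
}
```

Because the try/catch sits inside the per-image loop, one failing image does not cancel its siblings in the same provider group. Results arrive in completion order, so a caller that needs document order would sort by position.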
Every LLM-touching agent is annotated with @AgentExecution(...). An AOP aspect (AgentExecutionAspect) intercepts every call and writes a row into agent_log (taskId, prompt, duration, status, error message). The save is fired async via AgentLogService.saveLogAsync, so logging never sits on the hot path.
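The shape of that around-advice can be illustrated without Spring. The sketch below is a plain-Java wrapper, not the project's AgentExecutionAspect: AgentLogRow, the logged helper, and the sink callback are hypothetical names, and CompletableFuture.runAsync stands in for the asynchronous save.

```java
import java.util.concurrent.CompletableFuture;
import java.util.function.Consumer;
import java.util.function.Supplier;

// Hypothetical row shape, loosely following the columns listed above.
record AgentLogRow(String taskId, String agent, long durationMs, String status, String error) {}

class AgentLoggingSketch {

    // Runs the agent call, measures it, and hands the log row to `sink`
    // on another thread so the caller never waits for the write.
    static <T> T logged(String taskId, String agent, Supplier<T> call, Consumer<AgentLogRow> sink) {
        long start = System.nanoTime();
        String status = "SUCCESS";
        String error = null;
        try {
            return call.get();
        } catch (RuntimeException e) {
            status = "FAILED";
            error = e.getMessage();
            throw e; // the failure still reaches the caller
        } finally {
            long durationMs = (System.nanoTime() - start) / 1_000_000;
            AgentLogRow row = new AgentLogRow(taskId, agent, durationMs, status, error);
            CompletableFuture.runAsync(() -> sink.accept(row));
        }
    }
}
```

A call site would look like `AgentLoggingSketch.logged(taskId, "TitleGeneratorAgent", () -> agent.run(topic), logService::save)`, where agent and logService are placeholders for whatever the application wires in.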
ImageServiceStrategy auto-discovers all ImageSearchService beans at @PostConstruct and registers them in an EnumMap<ImageMethodEnum, ImageSearchService>. For each requirement it:
- Resolves the chosen provider (or falls back to getDefaultSearchMethod() if the source is unknown).
- Calls the service. If it returns nothing usable, hands off to the strategy-defined alternative.
- If that fails too, drops to Picsum — random photo, but always a valid image.
- Uploads the bytes to Cloudflare R2 (S3-compatible) and returns the public URL.
Net effect: an article never ships with broken images, only with degraded ones.
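The resolve, alternative, Picsum chain above might look roughly like this. It is a sketch under assumptions: the ImageMethod values, the ImageProvider interface, and the choice of PEXELS as the default are all hypothetical, and the R2 upload step is omitted.

```java
import java.util.EnumMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical subset of the project's image methods.
enum ImageMethod { PEXELS, MERMAID, PICSUM }

// Hypothetical provider contract: an empty Optional means "nothing usable".
interface ImageProvider {
    Optional<String> find(String keywords);
}

class FallbackSketch {
    private final Map<ImageMethod, ImageProvider> providers = new EnumMap<>(ImageMethod.class);
    private final Map<ImageMethod, ImageMethod> alternatives = new EnumMap<>(ImageMethod.class);

    void register(ImageMethod method, ImageProvider provider) { providers.put(method, provider); }

    void alternative(ImageMethod from, ImageMethod to) { alternatives.put(from, to); }

    // Tries the requested provider, then its alternative, then Picsum.
    String resolve(ImageMethod requested, String keywords) {
        // Unknown or unregistered source: fall back to the assumed default.
        ImageMethod primary = requested != null && providers.containsKey(requested)
                ? requested
                : ImageMethod.PEXELS;
        return tryProvider(primary, keywords)
                .or(() -> Optional.ofNullable(alternatives.get(primary))
                        .flatMap(alt -> tryProvider(alt, keywords)))
                .or(() -> tryProvider(ImageMethod.PICSUM, keywords))
                .orElseThrow(() -> new IllegalStateException("no provider produced an image"));
    }

    // A missing provider or a thrown exception both count as "nothing usable".
    private Optional<String> tryProvider(ImageMethod method, String keywords) {
        ImageProvider provider = providers.get(method);
        if (provider == null) {
            return Optional.empty();
        }
        try {
            return provider.find(keywords);
        } catch (RuntimeException e) {
            return Optional.empty();
        }
    }
}
```

Each `.or(...)` stage is only evaluated when the previous one came back empty, so the cheaper or preferred provider always gets the first attempt.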
The article isn't a fire-and-forget async job — it's a stateful conversation the user can step through, abandon, or resume.
ArticlePhaseEnum.canTransitionTo(...) validates every move in code — illegal transitions throw a BusinessException instead of silently corrupting state. A separate ArticleStatusEnum (PENDING / PROCESSING / COMPLETED / FAILED) tracks orthogonal lifecycle health for list views and admin dashboards.
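A transition guard of this kind can be sketched as an enum that knows its own legal successors. The phase names and the allowed edges below are hypothetical (the real ArticlePhaseEnum may differ), and IllegalStateException stands in for the project's BusinessException.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical phase names; only canTransitionTo(...) comes from the text above.
enum Phase {
    PENDING, TITLE_SELECTING, OUTLINE_EDITING, CONTENT_GENERATING, DONE;

    // Legal next phases for this phase.
    Set<Phase> next() {
        return switch (this) {
            case PENDING -> EnumSet.of(TITLE_SELECTING);
            case TITLE_SELECTING -> EnumSet.of(OUTLINE_EDITING);
            // Re-prompting the outline keeps the article in the same phase.
            case OUTLINE_EDITING -> EnumSet.of(OUTLINE_EDITING, CONTENT_GENERATING);
            case CONTENT_GENERATING -> EnumSet.of(DONE);
            case DONE -> EnumSet.noneOf(Phase.class);
        };
    }

    boolean canTransitionTo(Phase target) {
        return next().contains(target);
    }
}

class PhaseGuard {
    // Returns the new phase, or throws if the move is not allowed.
    static Phase advance(Phase current, Phase target) {
        if (!current.canTransitionTo(target)) {
            throw new IllegalStateException("Illegal transition " + current + " -> " + target);
        }
        return target;
    }
}
```

Keeping the edge table inside the enum means a new phase cannot be added without the compiler forcing a decision about its successors, since the switch expression must stay exhaustive.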
Streaming an LLM response through a state graph is awkward — graph state gets serialized between nodes, and Consumer<String> is not serializable. The fix:
- StreamHandlerContext holds the per-request callback in a ThreadLocal<Consumer<String>>.
- The orchestrator binds it before graph.invoke(...) and clears it in a finally block.
- Agents pull it via StreamHandlerContext.send(token), with no graph-state coupling.
- SseEmitterManager (a ConcurrentHashMap<taskId, SseEmitter>) handles the wire side with timeout / completion / error callbacks that auto-evict the emitter.
Event types: AGENT1_COMPLETE, TITLES_GENERATED, AGENT2_STREAMING, AGENT2_COMPLETE, OUTLINE_GENERATED, AGENT3_STREAMING, AGENT3_COMPLETE, AGENT4_COMPLETE, IMAGE_COMPLETE, AGENT5_COMPLETE, MERGE_COMPLETE, ALL_COMPLETE, ERROR.
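The ThreadLocal bridge described above is small enough to sketch in full. This is an illustration, not the project's class: only send(token) is named in the text, so bind, clear, and the OrchestratorSketch wrapper are assumed names, and a Runnable stands in for graph.invoke(...).

```java
import java.util.function.Consumer;

// Sketch of a ThreadLocal callback holder; bind/clear are assumed method names.
final class StreamHandlerContextSketch {
    private static final ThreadLocal<Consumer<String>> HANDLER = new ThreadLocal<>();

    private StreamHandlerContextSketch() {}

    static void bind(Consumer<String> handler) {
        HANDLER.set(handler);
    }

    static void clear() {
        HANDLER.remove();
    }

    // Called by agents for each token; a no-op when nothing is bound.
    static void send(String token) {
        Consumer<String> handler = HANDLER.get();
        if (handler != null) {
            handler.accept(token);
        }
    }
}

class OrchestratorSketch {
    // `graphInvocation` stands in for graph.invoke(...).
    static void runPhase(Consumer<String> onToken, Runnable graphInvocation) {
        StreamHandlerContextSketch.bind(onToken);
        try {
            graphInvocation.run();
        } finally {
            // Always clear, so a pooled thread never leaks a stale callback.
            StreamHandlerContextSketch.clear();
        }
    }
}
```

The callback never enters graph state, so nothing non-serializable has to survive between nodes. The trade-off is that a ThreadLocal is only visible on the thread that bound it, so this approach assumes the graph nodes run on the orchestrator's thread.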
- Atomic quota deduction: UPDATE user SET quota = quota - 1 WHERE id = ? AND quota > 0 inside a @Transactional boundary. Affected-rows = 0 ⇒ BusinessException("Out of quota"). No read-then-write window, no need for a distributed lock on the hot path.
- VIP / admin bypass: role check skips the deduction entirely.
- Stripe checkout: PaymentService.createVipPaymentSession() issues a Checkout session; StripeWebhookController verifies signatures with Webhook.constructEvent(...) before flipping the user to VIP and recording a payment_record row.
- Refunds: reverse the VIP flag and refund through Stripe in one call.
- Image-method gating: ArticleServiceImpl.validateImageMethods rejects the request up front if a non-VIP asks for NANO_BANANA or SVG_DIAGRAM. Inside the pipeline the agent's choices are filtered again, as defense in depth.
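The second, in-pipeline filter from the last bullet can be pictured as a one-pass rewrite. This is a minimal sketch under assumptions: image kinds are modeled as plain strings, the enforce helper is a hypothetical name, and PEXELS and MERMAID are assumed identifiers (only NANO_BANANA and SVG_DIAGRAM appear in the text above).

```java
import java.util.List;
import java.util.Set;

class ImageMethodFilterSketch {

    // Rewrites any kind that is not enabled to the first enabled kind,
    // leaving allowed choices untouched.
    static List<String> enforce(List<String> chosen, List<String> enabled) {
        if (enabled.isEmpty()) {
            throw new IllegalArgumentException("no image methods enabled");
        }
        Set<String> allowed = Set.copyOf(enabled);
        String fallback = enabled.get(0);
        return chosen.stream()
                .map(kind -> allowed.contains(kind) ? kind : fallback)
                .toList();
    }
}
```

For a non-VIP article with PEXELS and MERMAID enabled, a model that picks NANO_BANANA for one slot gets that slot rewritten to PEXELS, while its MERMAID picks pass through unchanged.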
| Layer | Stack |
|---|---|
| Backend | Java 21 · Spring Boot 3.5.9 · Spring AI Alibaba 1.1.0 (StateGraph) · Spring AI OpenAI 1.0.1 · MyBatis-Flex · Stripe Java · AWS SDK v2 (S3) · OkHttp · Jsoup · Knife4j · Hutool · Lombok |
| LLM | Gemini 2.5 Flash (text) and Gemini 2.5 Flash Image / Nano Banana (images), called via the OpenAI-compatible endpoint and the Google Gen AI Java SDK |
| Storage | MySQL 8 · Redis (sessions + Redisson distributed locks) · Cloudflare R2 (images) |
| Frontend | Vue 3.5 · TypeScript 5.8 · Vite 7 · Pinia · Vue Router · Ant Design Vue · ECharts · Axios |
| Infra | Docker Compose (backend · frontend behind nginx · MySQL · Redis) · GitHub Actions deploy workflow |
- Docker Desktop (or Docker Engine + Compose v2)
- A Gemini API key — free, get one at https://aistudio.google.com/apikey
- A Pexels API key — free, get one at https://www.pexels.com/api/
git clone https://github.com/zxuhan/folio-writer.git
cd folio-writer
cp .env.example .env
# open .env, set GEMINI_API_KEY and PEXELS_API_KEY (everything else has a default)
docker compose up -d --build

That's it. First boot takes ~2 minutes (MySQL initialises five SQL migrations and the backend pulls Maven deps).
| Service | URL |
|---|---|
| Frontend | http://localhost:8080 |
| Backend API | http://localhost:8123/api |
| API docs | http://localhost:8123/api/doc.html |
Log in with one of the demo accounts above (admin / vip / user / test, password 12345678).
Optional keys. R2 (R2_ACCESS_KEY_ID, R2_SECRET_ACCESS_KEY, R2_ACCOUNT_ID, R2_BUCKET, R2_PUBLIC_URL) makes generated images persist to Cloudflare R2; without them, image uploads silently fail and only Picsum/Pexels URLs survive. Stripe (STRIPE_API_KEY, STRIPE_WEBHOOK_SECRET) is only needed if you want the VIP upgrade flow.

MySQL and Redis stay on the internal Docker network; uncomment the ports: block in docker-compose.yml if you want to attach a client.
You'll need JDK 21, Maven 3.9+, and Node 20+, plus MySQL 8 and Redis running locally.
# backend
cp src/main/resources/application-local.yml.example src/main/resources/application-local.yml
# edit it: API keys, MySQL URL, Redis host
mvn spring-boot:run
# frontend (separate shell)
cd frontend
npm install
npm run dev

Backend listens on :8123/api, frontend on :5173.
src/main/java/com/zxuhan/template/
├── agent/
│ ├── agents/ # 5 agents (Title / Outline / Content / Image / Merger)
│ ├── parallel/ParallelImageGenerator.java # CompletableFuture image fan-out
│ ├── tools/ImageGenerationTool.java # @Tool wrapper callable from agents
│ ├── context/StreamHandlerContext.java # ThreadLocal SSE bridge
│ ├── config/AgentConfig.java
│ └── ArticleAgentOrchestrator.java # builds + invokes the 3 phase StateGraphs
├── annotation/AgentExecution.java # AOP marker
├── aop/AgentExecutionAspect.java # auto-logs every agent call
├── service/
│ ├── ImageServiceStrategy.java # provider selection + fallback
│ ├── {Pexels,Mermaid,Iconify,EmojiPack,NanoBanana,SvgDiagram}Service.java
│ ├── R2Service.java # S3-compatible upload
│ ├── ArticleAgentService.java # legacy synchronous fallback path
│ └── ArticleAsyncService.java # @Async entry into the orchestrator
├── controller/ # Article · User · Payment · StripeWebhook · Statistics · Health
├── manager/SseEmitterManager.java
├── model/{entity,dto,vo,enums}/
└── config/ # CORS · JSON · Async · per-provider config
sql/ # base schema + incremental migrations
frontend/ # Vue 3 + Vite SPA
docs/ # D2 diagram sources + rendered SVGs