
**@BYK** (Member) commented Dec 29, 2025

## Summary

Adds AI-powered summarization for changelog sections. Uses GitHub Models API by default, with a local fallback when no token is available.

## Features

- **Neutral, factual summaries:** avoids promotional language ("enhanced", "improved") and states only what changed
- **Expandable details:** original items are preserved in a collapsible `<details>` block
- **Section summaries:** condenses verbose bullet lists into readable prose (58-64% compression)
- **Top-level summary:** optional executive summary paragraph for the entire changelog
- **GitHub Models API:** uses GPT-4o-mini by default
- **Local fallback:** uses Falconsai/text_summarization (~60MB) when no token is available
- **Zero config:** works with the existing `GITHUB_TOKEN` or the `gh` CLI
- **Threshold-based:** only summarizes sections and releases that exceed the configured threshold

## Examples

### Example 1: Craft 2.16.0 (Small Release)

Input (6 items):

```markdown
### New Features
- Strip commit patterns from changelog entries
- Add support for custom changelog entries from PR descriptions
- Support for multiple entries and nested items
- Add changelog preview action and CLI command
- Make release workflow reusable for external repos
- Add version templating for layer names
```

Output (with expandable details):

```markdown
### New Features
Changelog entries now support custom descriptions, multiple items, previews, reusable workflows, and version templating for layers.

<details>
<summary>Show 6 items</summary>

- Strip commit patterns from changelog entries
- Add support for custom changelog entries from PR descriptions
- Support for multiple entries and nested items
- Add changelog preview action and CLI command
- Make release workflow reusable for external repos
- Add version templating for layer names

</details>
```

Top-level summary (with `topLevel: "always"`):

"The software release includes several new features: the ability to strip commit patterns from changelog entries, support for custom changelog entries derived from pull request descriptions, and support for multiple entries and nested items. Additionally, a changelog preview action and CLI command have been added."

### Example 2: Sentry 25.12.0 (Large Release)

Real-world test with Sentry 25.12.0 (31 items across 3 sections):

#### Section Summaries

| Section | Items | Words In → Out | Compression |
| --- | --- | --- | --- |
| ACI | 11 | 98 → 41 | 58% |
| Agents | 8 | 58 → 24 | 59% |
| Seer & Triage | 12 | 80 → 31 | 61% |
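The compression figures in the table are word-count ratios. A small sketch of how such a ratio can be computed (helper names are illustrative, not Craft's actual API):

```typescript
// Count whitespace-separated words in a summary or section body.
function wordCount(text: string): number {
  return text.trim().split(/\s+/).filter(Boolean).length;
}

// Percentage of words removed by summarization, e.g. 98 → 41 words ≈ 58%.
function compressionPercent(input: string, summary: string): number {
  return Math.round((1 - wordCount(summary) / wordCount(input)) * 100);
}
```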

ACI Section (neutral tone):

"The metric monitor form now defaults to errors, alerts have a disabled option, test notification errors are displayed, navigation improvements were made, and new features include issue details, detector configurations, and direct log sending to Sentry."

Agents Section:

"Added markdown rendering with raw value switching, error icon preservation, browser JS onboarding, relocated analytics events, Seer feature tracking, anomaly thresholds for metric monitors, parallelized stats queries, and restored SPA auth page."

#### Top-Level Summary (106 words)

"The latest software release includes several updates across three main areas: ACI, Agents, and Seer & Triage. In the ACI section, the metric monitor form now defaults to the number of errors, and alerts have been updated to include a disabled status and display test notification errors in the UI. The Agents section introduces markdown rendering, the ability to switch to raw values, and a new onboarding process for browser JavaScript. Additionally, the Seer & Triage updates involve changes to support repo type checks, column renaming for broader applicability, and the removal of unnecessary calls."

Sections with ≤5 items are left unchanged.
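That per-section rule can be sketched as a one-line gate (hypothetical helper name; the actual check lives in `src/utils/ai-summary.ts`):

```typescript
// Summarize a section only when it has more than kickInThreshold items;
// sections with <= 5 items (the default threshold) pass through unchanged.
function shouldSummarizeSection(itemCount: number, kickInThreshold = 5): boolean {
  return itemCount > kickInThreshold;
}
```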

## Configuration

```yaml
aiSummaries:
  enabled: true
  kickInThreshold: 5  # only summarize sections with >5 items
  model: "openai/gpt-4o-mini"  # default
  topLevel: "threshold"  # "always" | "never" | "threshold" | true | false
```
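The options above map naturally onto a small config type. A minimal TypeScript sketch (field names follow the YAML; the real schema lives in `src/schemas/projectConfig.schema.ts`, and `withDefaults` is an assumed helper for illustration):

```typescript
// Illustrative shape of the aiSummaries config block; not the actual schema.
interface AiSummariesConfig {
  enabled: boolean;
  /** Only summarize sections with more than this many items. */
  kickInThreshold?: number;
  /** GitHub Models id, or "local:<model>" for the local fallback. */
  model?: string;
  topLevel?: 'always' | 'never' | 'threshold' | boolean;
}

// Fill in the documented defaults (assumed helper, for illustration).
function withDefaults(cfg: AiSummariesConfig) {
  return {
    enabled: cfg.enabled,
    kickInThreshold: cfg.kickInThreshold ?? 5,
    model: cfg.model ?? 'openai/gpt-4o-mini',
    topLevel: cfg.topLevel ?? 'threshold',
  };
}
```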

## Top-Level Summary Options

| Value | Behavior |
| --- | --- |
| `"always"` or `true` | Always generate a top-level summary paragraph |
| `"never"` or `false` | Never generate a top-level summary |
| `"threshold"` (default) | Only generate if total items > `kickInThreshold` |
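The PR mentions a `shouldGenerateTopLevel()` helper; the table above translates into logic along these lines (an illustrative reconstruction, not the exact implementation):

```typescript
type TopLevelMode = 'always' | 'never' | 'threshold' | boolean;

// Decide whether to generate the executive summary paragraph for the
// whole changelog, per the documented topLevel modes.
function shouldGenerateTopLevel(
  mode: TopLevelMode,
  totalItems: number,
  kickInThreshold: number,
): boolean {
  if (mode === 'always' || mode === true) return true;
  if (mode === 'never' || mode === false) return false;
  // 'threshold' (default): only for releases larger than the threshold
  return totalItems > kickInThreshold;
}
```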

## Available Models

GitHub Models (requires `GITHUB_TOKEN`):

```yaml
model: "openai/gpt-4o-mini"        # default
model: "openai/gpt-4o"             # most capable
model: "mistral-ai/ministral-3b"   # more aggressive compression
```

Local (no token needed):

```yaml
model: "local:Falconsai/text_summarization"  # ~60MB, extractive
```
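A plausible sketch of how the `local:` prefix can be dispatched (names are assumptions; the actual parsing lives in `src/utils/ai-summary.ts`):

```typescript
interface ModelChoice {
  backend: 'local' | 'github-models';
  model: string;
}

// A "local:" prefix selects the on-device model; anything else is
// treated as a GitHub Models API model id.
function parseModelOption(model: string): ModelChoice {
  const LOCAL_PREFIX = 'local:';
  if (model.startsWith(LOCAL_PREFIX)) {
    return { backend: 'local', model: model.slice(LOCAL_PREFIX.length) };
  }
  return { backend: 'github-models', model };
}
```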

## Details Block

When AI summarization is applied, the original items are preserved in an expandable `<details>` block:

```markdown
Summary text here.

<details>
<summary>Show 6 items</summary>

- Original item 1
- Original item 2
- ...

</details>
```

This allows users to expand the block and see the full list when needed.
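The PR adds `formatSummaryWithDetails()` for this wrapping; a sketch of what such a function might look like (the exact signature is an assumption):

```typescript
// Wrap a summary and its original bullet items in the <details> layout
// shown above. Signature is illustrative, not Craft's actual API.
function formatSummaryWithDetails(summary: string, items: string[]): string {
  const bullets = items.map(item => `- ${item}`).join('\n');
  return [
    summary,
    '',
    '<details>',
    `<summary>Show ${items.length} items</summary>`,
    '',
    bullets,
    '',
    '</details>',
  ].join('\n');
}
```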

## Authentication

Uses your GitHub token automatically:

- from the `GITHUB_TOKEN` environment variable, or
- from `gh auth token` (GitHub CLI)

Falls back to the local model if no token is available.
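The lookup order can be sketched as follows (function name and error handling are assumptions, not Craft's actual code):

```typescript
import { execSync } from 'node:child_process';
import process from 'node:process';

// Resolve a GitHub token: env var first, then the gh CLI; undefined
// means "no token", which triggers the local-model fallback.
function resolveGitHubToken(): string | undefined {
  if (process.env.GITHUB_TOKEN) {
    return process.env.GITHUB_TOKEN;
  }
  try {
    const token = execSync('gh auth token', { encoding: 'utf8' }).trim();
    return token || undefined;
  } catch {
    // gh CLI not installed or not authenticated
    return undefined;
  }
}
```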

## Files Changed

| File | Description |
| --- | --- |
| `src/utils/ai-summary.ts` | Dual-mode summarization with neutral prompts and details formatting |
| `src/__tests__/ai-summary.test.ts` | Unit tests (40 tests) |
| `src/__tests__/ai-summary.integration.test.ts` | Integration tests (16 tests with Sentry 25.12.0 data) |
| `src/__tests__/ai-summary.eval.ts` | Quality evals with vitest-evals |
| `src/schemas/projectConfig.schema.ts` | Updated config schema with `topLevel` |
| `README.md` | Documentation with before/after examples |

## Commands

```shell
yarn test                         # Unit tests (706 tests)
GITHUB_TOKEN=... yarn test:evals  # AI quality evals
```

## Model Selection Journey

We tested several local models before settling on the current hybrid approach:

| Model | Size | Result |
| --- | --- | --- |
| SmolLM2-360M | 360MB | Hallucinations, off-topic filler |
| SmolLM2-1.7B | 1.7GB | Refused to summarize |
| Qwen2-0.5B | 500MB | ⚠️ Concatenation, not summarization |
| Qwen2.5-1.5B | 1GB | Requires cmake to compile |
| Flan-T5-Large | 1.5GB | ONNX parsing errors |
| Falconsai/text_summarization | 60MB | Works well for extractive summaries |
| GitHub Models API | n/a | Best abstractive quality |

**Conclusion:** small local LLMs (<2GB) struggle with true abstractive summarization. The GitHub Models API provides superior quality, and the local model serves as a reasonable fallback.

**@github-actions** (bot) commented Dec 29, 2025

### Semver Impact of This PR

🟡 Minor (new features)

### 📋 Changelog Preview

This is how your changes will appear in the changelog.
Entries from this PR are highlighted with a left border (blockquote style).


### New Features ✨

- (changelog) Add AI-powered summaries for verbose changelog sections by BYK in #688

### Bug Fixes 🐛

- (changelog) Disable author mentions in PR preview comments by BYK in #684
- (github) Clean up orphaned draft releases on publish failure by BYK in #681
- (publish) Fail early on dirty git repository by BYK in #683

🤖 This preview updates automatically when you update the PR.

**@BYK** force-pushed the byk/feat/changelog-ai-summary branch 3 times, most recently from 54efab4 to 2d363db on December 29, 2025 at 21:20:

Add optional AI-powered summarization for changelog sections using GitHub
Models API. Uses your existing GitHub token—no additional API keys required.

Features:
- Summarizes sections with >5 items into concise prose (40-60% compression)
- Uses GPT-4o-mini by default via GitHub Models API
- Configurable model selection (GPT-4o, Llama, etc.)
- Graceful degradation if token unavailable
- Eval tests using vitest-evals for quality validation

Configuration:
  aiSummaries:
    enabled: true
    kickInThreshold: 5
    model: openai/gpt-4o-mini
**@BYK** force-pushed the byk/feat/changelog-ai-summary branch from 2d363db to 0a7d479 on December 29, 2025 at 21:59.
BYK added 9 commits December 30, 2025 02:12

- Change default model to mistral-ai/ministral-3b (71-87% compression)
- Add local fallback using Falconsai/text_summarization (~60MB)
- Fallback activates when no GITHUB_TOKEN available
- Support local: prefix for explicit local model selection
- Update README with before/after example and model options
- Update tests to cover both API and local paths (20 tests)

- Move manual test script to proper Vitest integration test
- Tests real changelog sections from Sentry 25.12.0 release
- Validates compression ratio and threshold behavior
- Skips automatically if GITHUB_TOKEN not available

GPT-4o-mini produces higher quality summaries with better readability compared to Ministral-3b.

Adds topLevel config option to control executive summary generation:
- 'always' or true: Always generate top-level summary
- 'never' or false: Never generate top-level summary
- 'threshold' (default): Only generate if total items > kickInThreshold

The top-level summary creates a single paragraph (up to 5 sentences) summarizing the entire release, ideal for large releases.

Also adds summarizeChangelog() and shouldGenerateTopLevel() functions with full test coverage (34 tests).

- Add tests for summarizeChangelog with Craft 2.16.0 and Sentry 25.12.0
- Add tests for shouldGenerateTopLevel with all mode combinations
- Update README with both section and top-level summary examples
- Total: 16 integration tests, 34 unit tests (700 tests overall)

- Update prompts to avoid promotional language (no 'enhanced', 'improved', etc.)
- Add formatSummaryWithDetails() for wrapping original items in <details>
- Add 6 new unit tests for formatSummaryWithDetails
- Update README with neutral tone examples and details block demo
- Total: 40 unit tests, 16 integration tests (706 tests overall)