feat: add thumbnail A/B testing pipeline (t207) by marcusquinn · Pull Request #896 · marcusquinn/aidevops

marcusquinn · 2026-02-10T03:23:19Z

Summary

Add thumbnail-factory-helper.sh CLI tool for generating, scoring, and A/B testing multiple thumbnail variants per video
Add youtube/thumbnail-ab-testing.md subagent documenting the full 5-phase pipeline
Wire into existing YouTube pipeline as Worker 5 (thumbnails)

What's New

`thumbnail-factory-helper.sh` (new script)

SQLite-backed CLI with 10 commands: brief, generate, score, record-score, batch-score, competitors, compare, ab-status, history, report.

Key features:

Generate design briefs from video metadata via YouTube API
Create thumbnail variants via DALL-E 3 (or prompt files without API key)
Score thumbnails against weighted quality rubric (face 25%, contrast 20%, text space 15%, brand 15%, emotion 15%, clarity 10%)
Track A/B test results with pass/fail threshold (7.5/10)
Download and analyse competitor thumbnails
Generate performance reports with score distribution
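
The rubric weights and the 7.5 pass/fail threshold listed above can be sketched as a small weighted sum. This is a minimal illustration, not the script's actual code: the function names `weighted_total` and `passes_threshold` are hypothetical, and it assumes each criterion is scored on a 1-10 scale.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; names are illustrative and not taken from the PR.
# Weights follow the rubric: face 25%, contrast 20%, text space 15%,
# brand 15%, emotion 15%, clarity 10%.
weighted_total() {
	local face="$1" contrast="$2" text="$3" brand="$4" emotion="$5" clarity="$6"
	awk -v f="$face" -v c="$contrast" -v t="$text" \
		-v b="$brand" -v e="$emotion" -v cl="$clarity" \
		'BEGIN { printf "%.2f\n", f*0.25 + c*0.20 + t*0.15 + b*0.15 + e*0.15 + cl*0.10 }'
}

# Exit status 0 when the total meets the 7.5 threshold, 1 otherwise.
passes_threshold() {
	awk -v total="$1" 'BEGIN { exit !(total >= 7.5) }'
}

# Example: score one variant and report pass/fail.
total="$(weighted_total 9 7 6 8 7 8)"
if passes_threshold "$total"; then
	echo "PASS $total"
else
	echo "FAIL $total"
fi
```

Using awk keeps the arithmetic in one process and avoids a dependency on node or bc for floating-point math.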

`youtube/thumbnail-ab-testing.md` (new subagent)

Full pipeline documentation covering:

Brief generation from video metadata
Variant creation (10 concept types, multiple generation backends)
Rubric-based scoring (6 criteria, weighted, 7.5 threshold)
YouTube Studio A/B testing integration
Pattern storage via memory system

Updated files

youtube.md — Added subagent, updated architecture diagram and workflow
youtube/optimizer.md — Cross-reference to thumbnail pipeline
youtube/pipeline.md — Added Worker 5 (thumbnails) with instructions and supervisor config
content/optimization.md — Updated script reference from planned to implemented
content/production/image.md — Cross-reference to thumbnail pipeline
content/distribution/youtube/README.md — Added subagent to table
subagent-index.toon — Registered new subagent and script

Quality

ShellCheck: zero violations (SC1091 info-only for source path, expected)
Script tested: help command runs successfully
Follows existing patterns: sources shared-constants.sh, uses print_* functions, SQLite state management
Graceful degradation: generates prompt files when no OPENAI_API_KEY is set
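
The graceful-degradation behaviour described above might look roughly like the following. This is a sketch under assumptions: the function name `generate_or_write_prompt`, the `.prompt.txt` suffix, and the directory layout are hypothetical, and the API branch is deliberately left as a stub.

```shell
#!/usr/bin/env bash
# Hypothetical sketch; function and file names are not from the PR.
generate_or_write_prompt() {
	local prompt="$1" out_dir="$2" label="$3"
	mkdir -p "$out_dir" || return 1
	if [ -z "${OPENAI_API_KEY:-}" ]; then
		# No key available: save the prompt so it can be used with another tool.
		printf '%s\n' "$prompt" >"$out_dir/$label.prompt.txt"
		echo "prompt-file"
		return 0
	fi
	# Key present: the real script would call the image API here (omitted).
	echo "api"
}
```

Returning a mode string lets the caller decide whether a scoring step should run immediately or wait for a manually produced image.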

Task

Closes t207: Thumbnail A/B testing pipeline

Summary by CodeRabbit

New Features
- YouTube thumbnail A/B testing pipeline for generating, scoring, and comparing thumbnail variants
- Competitor analysis to discover similar video thumbnails for reference
- Automated scoring system with customizable assessment rubrics
- Batch processing capabilities and comprehensive history tracking
- Detailed reporting and analysis of A/B test results

Add thumbnail-factory-helper.sh CLI tool and youtube/thumbnail-ab-testing.md subagent for generating, scoring, and A/B testing multiple thumbnail variants per video.

New files:
- scripts/thumbnail-factory-helper.sh: SQLite-backed CLI with commands for brief generation, DALL-E 3 variant creation, rubric-based scoring (face, contrast, text space, brand, emotion, clarity), competitor analysis, A/B test tracking, and performance reporting
- youtube/thumbnail-ab-testing.md: Full pipeline documentation covering 5-phase workflow (brief -> generate -> score -> test -> analyse)

Updated files:
- youtube.md: Add thumbnail-ab-testing subagent and architecture diagram
- youtube/optimizer.md: Cross-reference thumbnail pipeline
- youtube/pipeline.md: Add Worker 5 (thumbnails) to automated pipeline
- content/optimization.md: Update script reference from planned to implemented
- content/production/image.md: Cross-reference thumbnail pipeline
- content/distribution/youtube/README.md: Add subagent to table
- subagent-index.toon: Register new subagent and script

coderabbitai · 2026-02-10T03:23:33Z

Walkthrough

This PR introduces a comprehensive thumbnail A/B testing pipeline for YouTube content optimization, consisting of a 1,110-line Bash helper script (thumbnail-factory-helper.sh) that orchestrates thumbnail generation, scoring, competitor analysis, and reporting, alongside detailed documentation integrating the feature into the YouTube agent framework.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Thumbnail A/B Testing Implementation `.agents/scripts/thumbnail-factory-helper.sh`, `.agents/youtube/thumbnail-ab-testing.md` | New comprehensive Bash script with 14+ public commands (brief, generate, score, batch-score, record-score, competitors, ab-status, history, report, compare) providing full thumbnail lifecycle management including OpenAI DALL-E 3 integration, SQLite persistence, and weighted scoring (face, contrast, text, brand, emotion, clarity); paired with 365-line reference documentation detailing workflow, data structures, scoring rubrics, and integration patterns. |
| YouTube Framework Integration `.agents/youtube.md`, `.agents/youtube/pipeline.md`, `.agents/youtube/optimizer.md` | Updated main YouTube agent orchestration to include new thumbnail-ab-testing subagent; added Worker 5 (Thumbnail A/B Testing) to pipeline with input/output flows and detailed instructions; added documentation cross-references to optimizer guide. |
| Service Registry & Index Updates `.agents/subagent-index.toon`, `.agents/content/distribution/youtube/README.md` | Registered thumbnail-factory-helper.sh as new public script entry; added thumbnail-ab-testing subagent to YouTube Distribution README subagents table; updated YouTube service description to include thumbnail A/B testing capability. |
| Documentation Cross-References `.agents/content/optimization.md`, `.agents/content/production/image.md` | Expanded thumbnail factory description from style-library reference to include scoring and A/B testing; added See Also and Cross-References entries linking to new youtube/thumbnail-ab-testing.md documentation. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~30 minutes

Possibly related issues

t207: Thumbnail A/B testing pipeline — generate and test multiple thumbnail variants per video #830: Directly implements the thumbnail A/B testing pipeline (thumbnail-ab-testing docs and thumbnail-factory-helper.sh) that fulfills the t207 feature requirement.

Possibly related PRs

feat(content): add distribution channel reference agents (t199.8) #880: Both PRs modify the YouTube distribution README's subagents list and registration, establishing parallel infrastructure updates.

Poem

Thumbnails born in factory light,
Scored by rubric, tested right,
Variants clash in A/B fight,
Winners crowned by click-through might! 📹✨

🚥 Pre-merge checks | ✅ 3

✅ Passed checks (3 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title 'feat: add thumbnail A/B testing pipeline (t207)' directly describes the main change: the introduction of a comprehensive thumbnail A/B testing pipeline with CLI script and documentation. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 85.71% which is sufficient. The required threshold is 80.00%. |

coderabbitai

Actionable comments posted: 8

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

.agents/subagent-index.toon (1)
87-87: ⚠️ Potential issue | 🟡 Minor

Scripts count not updated — header says [58] but there are now 59 entries.

Adding thumbnail-factory-helper.sh on Line 146 brings the total script count to 59. The TOON header on Line 87 still reads scripts[58]. This should be incremented to maintain accurate metadata, keeping that zero-technical-debt standard intact.
🔧 Proposed fix
-<!--TOON:scripts[58]{name,purpose}:
+<!--TOON:scripts[59]{name,purpose}:

🤖 Fix all issues with AI agents

In @.agents/scripts/thumbnail-factory-helper.sh:
- Around line 306-318: The script requests DALL·E images at 1792x1024 (see the
prompt and the curl body using "size": "1792x1024") but never resizes the
downloaded image to THUMB_WIDTH x THUMB_HEIGHT (1280x720); add a post-download
resize step after the response is saved that detects and uses ImageMagick's
convert or macOS sips to resize/crop to exactly ${THUMB_WIDTH}x${THUMB_HEIGHT}
while preserving aspect and center-cropping as needed, and wire this into the
same code path that handles the saved image (refer to the variables response,
prompt, THUMB_WIDTH, THUMB_HEIGHT and the download/save logic) so thumbnails are
produced at the target size.
- Around line 264-266: The credential extraction uses grep -oP (PCRE) which
isn't available on macOS; update the fallback block that assigns api_key (the
line using grep -oP 'OPENAI_API_KEY="\\K[^"]+') to use a POSIX-safe tool (sed or
awk) instead so it works on BSD grep systems—e.g., parse
$HOME/.config/aidevops/credentials.sh with awk or sed to extract the
OPENAI_API_KEY value and assign it to the api_key variable, keeping the same
fallback behavior (head -1 || true).
- Around line 44-106: The init_db function currently swallows sqlite3 errors
(the sqlite3 "$THUMB_DB" ... 2>/dev/null || true), which hides failures; modify
init_db (and the sqlite3 invocation) to stop discarding stderr, capture sqlite3
exit status, and on non-zero exit print a clear error to stderr including
$THUMB_DB and the sqlite3 error text (e.g., echo "ERROR: failed to initialize
$THUMB_DB: <stderr>" >&2) and exit or return a non-zero code instead of silently
succeeding; ensure the caller sees the non-zero return so downstream steps don't
run on silent failure.
- Around line 155-162: The assignments to thumb_url and title call node -e with
sed-escaped JSON embedded in the command string, which risks command injection
and broken parsing; change both to pipe the raw video_data into node via stdin
(the same safe stdin-piping pattern used earlier) and read the JSON from
process.stdin inside the Node snippet instead of embedding the string. Apply the
same stdin-piping fix to the other flagged node -e calls that assign
thumbnail/title values.
- Around line 518-535: In cmd_record_score, validate that each score argument
(score_face, score_contrast, score_text, score_brand, score_emotion,
score_clarity) is numeric and within 1–10 before calling init_db or running the
node total calculation; if any value is non-numeric or out of range, print a
clear error and exit non-zero. Implement the checks early in cmd_record_score
(e.g., using a numeric test and range comparison for each variable) and only
proceed to run the node snippet and insert into SQLite when all six values pass
validation, so invalid inputs never reach the DB and NaN entries cannot break
aggregates.
- Around line 842-851: The IFS='|' read into overall can produce empty variables
(videos, variants, avg_score, best_score, passing, winners, avg_ctr) if the
sqlite3 query fails, causing integer comparisons later (the -lt/-eq checks
referencing variants, passing, winners) to blow up; after the IFS='|' read,
coerce each numeric variable to a safe default like 0 (e.g. reset variants,
passing, winners, videos, avg_score, best_score, avg_ctr to ${var:-0} or
equivalent) so the later numeric comparisons and percent formatting always
operate on valid integers/floats; apply this to the variables parsed from
overall and any variables used in the subsequent -lt/-eq checks.
- Around line 236-239: The script interpolates unsanitized variables (video_id,
variant_path, variant_label, style, image_path, etc.) directly into sqlite3 SQL
which allows SQL injection; add a single helper function (e.g.,
escape_sql_literal) that performs the same single-quote escaping used for title
(sed "s/'/''/g") and call it before any SQL interpolation, then replace raw uses
in the INSERT/UPDATE calls for thumbnail_briefs and other DB operations
referenced by cmd_generate, _generate_prompt_files, cmd_record_score,
cmd_ab_status, cmd_history, and cmd_compare so all user-supplied fields are
escaped consistently.
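
Three of the fixes listed above (SQL literal escaping, score validation, and grep -P-free credential parsing) could be sketched as small helpers. This is one possible shape and not the code that was merged: `escape_sql_literal` is the name suggested in the review, while `is_valid_score` and `extract_api_key` are hypothetical names. The validation assumes whole-number scores, which the review does not specify.

```shell
#!/usr/bin/env bash
# Sketch only; is_valid_score and extract_api_key are hypothetical names.

# Double every single quote so the value can sit inside a SQL string literal.
escape_sql_literal() {
	printf '%s' "$1" | sed "s/'/''/g"
}

# Succeed only for whole numbers from 1 to 10 (integer scores are an assumption).
is_valid_score() {
	case "$1" in
	'' | *[!0-9]*) return 1 ;;
	esac
	[ "$1" -ge 1 ] && [ "$1" -le 10 ]
}

# Read OPENAI_API_KEY="..." from a credentials file using plain sed,
# which works with both GNU and BSD (macOS) tools, unlike grep -oP.
extract_api_key() {
	sed -n 's/^.*OPENAI_API_KEY="\([^"]*\)".*$/\1/p' "$1" 2>/dev/null | head -n 1
}

# Example: build an INSERT with an escaped title (table and column are illustrative).
title="$(escape_sql_literal "It's a test")"
echo "INSERT INTO example_table (title) VALUES ('$title');"
```

Escaping by hand only covers string literals; where the sqlite3 CLI's parameter binding is available, it may be a more robust choice than interpolation.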

In @.agents/youtube/thumbnail-ab-testing.md:
- Line 113: The table row for "DALL-E 3 (via script)" currently shows a price of
$0.08/img but HD 1792×1024 images cost $0.12/img; update that table entry to
reflect "$0.12/img" (or clarify "Standard $0.08/img, HD 1792×1024 $0.12/img") in
the .agents/youtube/thumbnail-ab-testing.md row that contains "DALL-E 3 (via
script)" so the pricing is accurate.

🧹 Nitpick comments (1)

.agents/scripts/thumbnail-factory-helper.sh (1)

38-38: Unused constant HELP_SHOW_MESSAGE.

HELP_SHOW_MESSAGE is declared but never referenced anywhere in the script. ShellCheck would flag this as SC2034 but it's globally suppressed on line 2.

coderabbitai · 2026-02-10T03:28:05Z