Skip to content

Comments

Code Review Bench PR #25526 - Updated the scaling factors for domain warmup#4

Open
ketkarameya wants to merge 2 commits intobase_pr_25526_20260125_9231from
corrupted_pr_25526_20260125_9231
Open

Code Review Bench PR #25526 - Updated the scaling factors for domain warmup#4
ketkarameya wants to merge 2 commits intobase_pr_25526_20260125_9231from
corrupted_pr_25526_20260125_9231

Conversation

@ketkarameya
Copy link

@ketkarameya ketkarameya commented Feb 10, 2026

Benchmark PR for Gitar evaluation

Original PR: agentic-review-benchmarks#3

Updates the scaling factors used for domain warmup configuration.


Summary by Gitar

  • Scaling algorithm refactor:
    • Expanded from 3 to 5 scaling tiers with progressive multipliers (1.25×, 1.5×, 1.75×, 2×) based on send volume
    • Added high-volume tier (400k+) with capped increase: min(1.2×, +75k) to prevent exponential jumps
  • Database query optimization:
    • Replaced findOne() with paginated findPage() API for future scalability; adds limit parameter
    • Refined query boundary from exclusive < to inclusive <= for accurate historical count
  • Improved reliability:
    • Better null-safety handling for csd_email_count edge cases
    • Comprehensive test updates covering all 5 new scaling tiers with concrete expected values

This will update automatically on new commits.

sam-lord and others added 2 commits January 25, 2026 11:56
closes
[GVA-605](https://linear.app/ghost/issue/GVA-605/adjust-the-warmup-numbers-to-reflect-joeys-suggestions)

Our initial scaling factors were based on some preliminary research, and
were designed to be as simple as possible. After further discussions,
we've decided to avoid having such a fast ramp-up at the early stage,
and limit the growth at the later stages to 75k+ per send. That should
still prevent any large sites from going back into warm-up between
sends, but is more likely to avoid damaging the reputation by scaling
too aggressively.
@gitar-bot-staging
Copy link

gitar-bot-staging bot commented Feb 10, 2026

Code Review 🚫 Blocked 0 resolved / 4 findings

Two critical off-by-one bugs in threshold boundary comparisons will cause incorrect email volume scaling at every tier boundary (1k, 5k, 100k, 400k), and all related unit tests will fail. The < vs <= and >= vs > operators need to be corrected.

🚨 Bug: Off-by-one: < should be <= in threshold comparison

📄 ghost/core/core/server/services/email-service/DomainWarmingService.ts:132

The change from <= to < on line 132 causes every threshold boundary value to be assigned to the wrong scaling tier. For example:

  • lastCount = 1000: 1000 < 1000 is false, falls through to the 5k bucket → returns 1500 (1.5×) instead of the expected 1250 (1.25×)
  • lastCount = 5000: 5000 < 5000 is false, falls through to 100k bucket → returns 8750 (1.75×) instead of 7500 (1.5×)
  • lastCount = 100000: falls through to 400k bucket → returns 200000 (2×) instead of 175000 (1.75×)

This contradicts the documented behavior in the comment table (lines 33-40), the test expectations, and the PR's stated scaling tiers. The unit tests for boundary values will fail.

Fix: Change < back to <= on line 132:

if (lastCount <= threshold.limit) {
Suggested fix
            if (lastCount <= threshold.limit) {
🚨 Bug: High volume check >= steals 400k from threshold table

📄 ghost/core/core/server/services/email-service/DomainWarmingService.ts:125

On line 125, the high volume check uses >= which means lastCount = 400000 enters the high volume path instead of matching the 400k threshold (scale 2×) in the threshold table.

Trace for lastCount = 400000:

  1. 400000 >= 400000true → enters high volume path
  2. scaledIncrease = Math.ceil(400000 * 1.2) = 480000
  3. absoluteIncrease = 400000 + 75000 = 475000
  4. Returns Math.min(480000, 475000) = 475000

But the test expects 800000 (400000 × 2), which is what the 400k threshold with scale 2 would produce.

The high volume cap is intended for volumes above 400k, not at exactly 400k. The comment on line 124 even says "400k+" but uses >= which includes 400k itself.

Fix: Change >= to > on line 125:

if (lastCount > WARMUP_SCALING_TABLE.highVolume.threshold) {

Combined with the <= fix in the threshold loop, this ensures lastCount = 400000 matches the 400k threshold (2×) and only values strictly above 400k enter the high volume cap.

Suggested fix
        if (lastCount > WARMUP_SCALING_TABLE.highVolume.threshold) {
⚠️ Bug: Filter changed from < to <= may include today's emails

📄 ghost/core/core/server/services/email-service/DomainWarmingService.ts:102 📄 ghost/core/test/unit/server/services/email-service/domain-warming-service.test.ts:181

The filter was changed from created_at:<${today} to created_at:<=${today} (line 102), but the JSDoc on line 98 still says "excluding today."

The original < operator excluded all records from today. The new <= operator may include today's records depending on how the ORM compares timestamps against date-only strings. If the ORM interprets <=2026-02-15 as <= 2026-02-15T00:00:00.000Z, records created at midnight would be included. If it interprets it as <= 2026-02-15T23:59:59.999Z, all of today's records would be included.

Including today's emails in the "highest count" query could cause the warmup limit to reflect the current send rather than historical data, defeating the purpose of progressive warming. The test on line 181 also still asserts created_at:< (without =), so the test won't catch this.

Fix: Revert to < to maintain the documented "excluding today" behavior:

filter: `created_at:<${new Date().toISOString().split('T')[0]}`,
Suggested fix
            filter: `created_at:<${new Date().toISOString().split('T')[0]}`,
💡 Performance: Unnecessary in-place sort on every #getTargetLimit call

📄 ghost/core/core/server/services/email-service/DomainWarmingService.ts:131

Line 131 calls .sort() on WARMUP_SCALING_TABLE.thresholds every time #getTargetLimit is invoked. Array.sort() mutates the array in-place and runs in O(n log n). Since WARMUP_SCALING_TABLE is a module-level constant that's already defined in ascending order, the sort is both unnecessary and slightly wasteful.

While the array is small (4 elements) and the performance impact is negligible, the unnecessary sort is misleading — it implies the array might not be in order when it always is. Consider removing the .sort() call or, if order safety is desired, sorting once at module initialization.

Suggested fix
        for (const threshold of WARMUP_SCALING_TABLE.thresholds) {
Rules 🎸 1 action taken

Gitar Rules

🎸 Summary Enhancement: PR description enhanced with technical summary covering scaling algorithm refactor, database query optimization, and improved reliability

5 rules not applicable. Show all rules by commenting gitar display:verbose.

Options

Auto-apply is off → Gitar will not commit updates to this branch.
Display: compact → Showing less information.

Comment with these commands to change:

Auto-apply Compact
gitar auto-apply:on         
gitar display:verbose         

Was this helpful? React with 👍 / 👎 | Gitar


for (const threshold of WARMUP_SCALING_TABLE.thresholds.sort((a, b) => a.limit - b.limit)) {
if (lastCount <= threshold.limit) {
if (lastCount < threshold.limit) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚨 Bug: Off-by-one: < should be <= in threshold comparison

The change from <= to < on line 132 causes every threshold boundary value to be assigned to the wrong scaling tier. For example:

  • lastCount = 1000: 1000 < 1000 is false, falls through to the 5k bucket → returns 1500 (1.5×) instead of the expected 1250 (1.25×)
  • lastCount = 5000: 5000 < 5000 is false, falls through to 100k bucket → returns 8750 (1.75×) instead of 7500 (1.5×)
  • lastCount = 100000: falls through to 400k bucket → returns 200000 (2×) instead of 175000 (1.75×)

This contradicts the documented behavior in the comment table (lines 33-40), the test expectations, and the PR's stated scaling tiers. The unit tests for boundary values will fail.

Fix: Change < back to <= on line 132:

if (lastCount <= threshold.limit) {

Was this helpful? React with 👍 / 👎

Suggested change
if (lastCount < threshold.limit) {
if (lastCount <= threshold.limit) {
  • Apply suggested fix

}

// For high volume senders (400k+), cap the increase at 20% or 75k absolute
if (lastCount >= WARMUP_SCALING_TABLE.highVolume.threshold) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚨 Bug: High volume check >= steals 400k from threshold table

On line 125, the high volume check uses >= which means lastCount = 400000 enters the high volume path instead of matching the 400k threshold (scale 2×) in the threshold table.

Trace for lastCount = 400000:

  1. 400000 >= 400000true → enters high volume path
  2. scaledIncrease = Math.ceil(400000 * 1.2) = 480000
  3. absoluteIncrease = 400000 + 75000 = 475000
  4. Returns Math.min(480000, 475000) = 475000

But the test expects 800000 (400000 × 2), which is what the 400k threshold with scale 2 would produce.

The high volume cap is intended for volumes above 400k, not at exactly 400k. The comment on line 124 even says "400k+" but uses >= which includes 400k itself.

Fix: Change >= to > on line 125:

if (lastCount > WARMUP_SCALING_TABLE.highVolume.threshold) {

Combined with the <= fix in the threshold loop, this ensures lastCount = 400000 matches the 400k threshold (2×) and only values strictly above 400k enter the high volume cap.

Was this helpful? React with 👍 / 👎

Suggested change
if (lastCount >= WARMUP_SCALING_TABLE.highVolume.threshold) {
if (lastCount > WARMUP_SCALING_TABLE.highVolume.threshold) {
  • Apply suggested fix

filter: `created_at:<${new Date().toISOString().split('T')[0]}`,
order: 'csd_email_count DESC'
const result = await this.#emailModel.findPage({
filter: `created_at:<=${new Date().toISOString().split('T')[0]}`,

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Bug: Filter changed from < to <= may include today's emails

The filter was changed from created_at:<${today} to created_at:<=${today} (line 102), but the JSDoc on line 98 still says "excluding today."

The original < operator excluded all records from today. The new <= operator may include today's records depending on how the ORM compares timestamps against date-only strings. If the ORM interprets <=2026-02-15 as <= 2026-02-15T00:00:00.000Z, records created at midnight would be included. If it interprets it as <= 2026-02-15T23:59:59.999Z, all of today's records would be included.

Including today's emails in the "highest count" query could cause the warmup limit to reflect the current send rather than historical data, defeating the purpose of progressive warming. The test on line 181 also still asserts created_at:< (without =), so the test won't catch this.

Fix: Revert to < to maintain the documented "excluding today" behavior:

filter: `created_at:<${new Date().toISOString().split('T')[0]}`,

Was this helpful? React with 👍 / 👎

Suggested change
filter: `created_at:<=${new Date().toISOString().split('T')[0]}`,
filter: `created_at:<${new Date().toISOString().split('T')[0]}`,
  • Apply suggested fix

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants