Skip to content

Conversation

@amikofalvy
Copy link
Collaborator

No description provided.

amikofalvy and others added 25 commits January 24, 2026 02:15
- Use Turbo's --filter='...[origin/main]' to only run checks on packages
  that have changed since origin/main
- Add fallback to run full check if origin/main doesn't exist (fresh clone)
- Add documentation comments explaining the affected-only behavior

This significantly speeds up pre-push validation when only a few packages
are modified, targeting < 30 seconds with cache.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Document the setup command: npx turbo login && npx turbo link
- Document expected behavior: FULL TURBO and cache hits
- Document how to verify caching: pnpm turbo run build --summarize
- Include CI/CD configuration for TURBO_TOKEN and TURBO_TEAM

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
…asks

- Add test:fast task with dependsOn=['build'], cache=true for fast unit tests
- Add test:slow task with dependsOn=['build'], cache=true for database-backed tests
- Add test:integration task with dependsOn=['build'], cache=false for integration tests

This enables the test pyramid optimization by separating tests into tiers.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add test:fast script: turbo test:fast
- Add test:slow script: turbo test:slow
- Add test:integration script: turbo test:integration

Enables running different test tiers from root with pnpm test:fast/slow/integration.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Create agents-api/vitest.fast.config.ts based on existing config
- Set include to ['**/*.test.ts', '**/*.test.tsx']
- Set exclude to include *.slow.test.ts and *.integration.test.ts
- Set testTimeout to 5000ms (5 seconds)
- Enable fileParallelism for parallel test execution
- Add test:fast script to agents-api/package.json

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Create agents-api/vitest.slow.config.ts for database-backed tests
- Set include to ['**/*.slow.test.ts']
- Set testTimeout to 60000ms (60 seconds)
- Keep setup files that initialize PGlite from original config
- Add test:slow script to agents-api/package.json

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Create packages/agents-core/vitest.fast.config.ts
- Set include to ['**/*.test.ts', '**/*.test.tsx']
- Set exclude for *.slow.test.ts, *.integration.test.ts, and integration/
- Set testTimeout to 5000ms (5 seconds)
- Add test:fast script to packages/agents-core/package.json

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Create packages/agents-core/vitest.slow.config.ts
- Set include to ['**/*.slow.test.ts']
- Set testTimeout to 60000ms (60 seconds)
- Add test:slow script to packages/agents-core/package.json

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Modify check:prepush task to depend on test:fast instead of test
- This makes pre-push checks faster by running only fast unit tests
- Slow and integration tests still run in CI for full coverage

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Create SLOW_TESTS_TO_RENAME.md listing 35 test files that use database
- Tests identified by searching for createTestClient and PGlite imports
- Document reason for classification (PGlite setup, migrations, timeouts)
- Include renaming instructions using git mv

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Rename 5 test files from *.test.ts to *.slow.test.ts:
- agent.test.ts -> agent.slow.test.ts
- agentFull.test.ts -> agentFull.slow.test.ts
- apiKeys.test.ts -> apiKeys.slow.test.ts
- artifactComponents.test.ts -> artifactComponents.slow.test.ts
- contextConfigs.test.ts -> contextConfigs.slow.test.ts

These tests use database operations and are now excluded from test:fast.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Rename 30 additional test files from *.test.ts to *.slow.test.ts:
- CRUD route tests (credentialReferences, credentialStores, dataComponents, etc.)
- Evaluation CRUD tests (datasets, evaluators, etc.)
- Data layer tests (agentFull, artifactComponents, conversations, etc.)
- Run domain tests (dataChat, delegationTaskCreation)

Delete SLOW_TESTS_TO_RENAME.md as all tests are now renamed.

Total: 35 tests moved to slow tier, significantly reducing test:fast time.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Rename 4 test files that use database operations to *.slow.test.ts:
- cascading-delete.test.ts -> cascading-delete.slow.test.ts
- projectLifecycle.test.ts -> projectLifecycle.slow.test.ts
- projectMetadata.test.ts -> projectMetadata.slow.test.ts
- client.test.ts -> client.slow.test.ts

These tests use PGlite/createTestClient and are now categorized properly.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add scripts/generate-test-db-snapshot.ts that creates pre-compiled PGlite
database snapshots with all migrations applied. This enables tests to skip
migration execution for faster initialization.

The script:
- Creates PGlite instances for both manage and runtime databases
- Applies all Drizzle migrations from packages/agents-core/drizzle/
- Exports database state using PGlite's dumpDataDir('gzip') method
- Saves snapshots to test-fixtures/ directory

Added `pnpm test:generate-snapshot` command to root package.json.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Modify test database client functions to automatically load pre-compiled
snapshots when available, significantly speeding up test initialization.

- Add snapshot loading support to createTestManageDatabaseClient
- Add snapshot loading support to createTestRuntimeDatabaseClient
- Add snapshot loading support to *NoMigrations variants
- Update agents-api setup.ts to skip migrations when snapshots exist
- Update agents-core setup.ts with improved logging
- Add test-fixtures/ to .gitignore

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add step to generate PGlite database snapshots before running tests.
This pre-compiles database migrations so tests don't need to run them
individually, speeding up test initialization.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Document analysis of why agents-cli tests run sequentially:
- Identified sequential execution settings in vitest.config.ts and vitest.config.ci.ts
- Analyzed 38 test files to understand parallelization blockers
- Identified key blockers: global mock pollution, file system race conditions,
  process state modifications, and child process execution
- Documented which tests need sequential execution (command tests with heavy mocking)
- Documented which tests could safely run in parallel (generator tests, pure utilities)
- Provided recommendations for potential parallelization strategy

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Mark US-017 as complete with notes about findings.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Created vitest.parallel.config.ts for agents-cli that enables parallel
execution for tests identified as safe in US-017 analysis:
- Generator tests (13 files) - pure code generation functions
- Pure utility tests (package-manager, url, json-comparator, templates, ci-environment)
- Component parser test

Config enables fileParallelism=true, singleThread=false, maxConcurrency=8
for up to ~6x speedup on generator tests while keeping existing sequential
config for tests with global mocks, file system operations, or process spawning.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
…ared setup

- Parallelize entity creation in createMultipleAgents helpers using Promise.all
  - agent.slow.test.ts: Two-phase parallel creation (agents, then sub-agents)
  - subAgents.slow.test.ts: Direct parallel sub-agent creation
- Add shared tenant/project setup using beforeAll in tools.slow.test.ts
- Convert 3 "not found" tests to use shared tenant infrastructure
- Add OPTIMIZATION comments documenting the changes

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add conditional logic to run affected-only checks on pull_request events
- Use Turbo's --filter='...[origin/main]' to detect affected packages
- Full check suite still runs on push to main/master for complete validation
- Fetch main branch for comparison when running affected-only checks
- Add comments explaining the affected-only testing strategy

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Configure Vitest to output JUnit XML test reports with per-test timing
data for identifying slow tests and tracking performance trends.

- Add JUnit reporter to vitest.config.ci.ts with output to test-results/
- Add CI step to upload test-results directory as GitHub Actions artifact
- Document how to access and analyze timing reports in AGENTS.md

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@changeset-bot
Copy link

changeset-bot bot commented Jan 24, 2026

⚠️ No Changeset found

Latest commit: 7974331

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types

Click here to learn what changesets are, and how to add one.

Click here if you're a maintainer who wants to add a changeset to this PR

@vercel
Copy link

vercel bot commented Jan 24, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Review Updated (UTC)
agents-api Ready Ready Preview, Comment Jan 24, 2026 6:15am
agents-docs Error Error Jan 24, 2026 6:15am
agents-manage-ui Ready Ready Preview, Comment Jan 24, 2026 6:15am

Request Review

@claude
Copy link

claude bot commented Jan 24, 2026

Claude finished @amikofalvy's task —— View job


TypeScript Staff Engineer & System Architect Review

Reviewing this comprehensive test infrastructure overhaul. This PR introduces significant changes to testing strategy, CI/CD, and build processes.

Todo List:

  • Analyze test separation strategy (slow vs fast)
  • Review CI/CD pipeline modifications
  • Examine database setup and snapshot generation
  • Evaluate parallel testing implementation
  • Check for architectural and design considerations
  • Assess potential impacts and risks
  • Provide comprehensive feedback and recommendations

Executive Summary

This PR implements a sophisticated testing infrastructure overhaul with three key innovations:

  1. Test categorization (fast vs slow) with separate configurations
  2. Database snapshot generation for faster test initialization
  3. Affected-only testing in CI to reduce build times

While the technical execution is solid, there are several architectural concerns and potential risks that warrant attention before merging.

🔥 Critical Issues

1. Database Snapshot Management & Race Conditions

Location: packages/agents-core/src/db/manage/test-manage-client.ts:34-82

Issue: The snapshot loading strategy has multiple concerning patterns:

if (snapshotPath) {
  const snapshotData = readFileSync(snapshotPath); // Synchronous file I/O
  const blob = new Blob([snapshotData], { type: 'application/gzip' });
  client = new PGlite({ loadDataDir: blob });
} else {
  client = new PGlite();
  // Falls back to migration
}

Problems:

  • Race condition risk: Multiple test files loading the same snapshot simultaneously could cause file contention
  • Memory overhead: Each test loads the entire compressed snapshot into memory (potentially 10-50MB per test instance)
  • Synchronous I/O: Blocking reads in test setup violates Node.js best practices
  • Snapshot versioning: No mechanism to detect stale snapshots after schema changes

Recommendation: Implement snapshot versioning with checksums and async loading:

// Add version checking
const snapshotVersion = await getSnapshotVersion(snapshotPath);
const currentSchemaVersion = await getCurrentSchemaVersion();
if (snapshotVersion !== currentSchemaVersion) {
  // Regenerate or fall back to migrations
}

2. Parallel Test Resource Management

Location: agents-api/vitest.fast.config.ts:24-27

Issue: Both fast and slow configs use identical threading parameters:

poolOptions: {
  threads: {
    maxThreads: 10,
    minThreads: 4,
  },
}

Problems:

  • Resource contention: Fast tests (5s timeout) and slow tests (60s timeout) competing for the same thread pool
  • Memory scaling: 10 threads × database snapshots × test isolation = potential 500MB-1GB memory usage
  • No resource bounds: No memory limits or CPU throttling for database operations

Recommendation: Differentiate resource allocation by test type:

// Fast tests - CPU bound, low memory
maxThreads: 8, minThreads: 2

// Slow tests - I/O bound, higher memory
maxThreads: 4, minThreads: 1

⚠️ High-Impact Concerns

3. CI Strategy Complexity vs. Value Trade-off

Location: .github/workflows/ci.yml:81-99

Issue: The affected-only testing strategy adds significant complexity:

if [ "${{ github.event_name }}" == "pull_request" ]; then
  git fetch origin main --depth=1 2>/dev/null || git fetch origin master --depth=1 2>/dev/null || true
  pnpm turbo run lint typecheck test --filter='...[origin/main]'

Concerns:

  • Silent failures: The || true could hide legitimate git fetch failures
  • Inconsistent behavior: Different execution paths for PRs vs pushes increases CI debugging complexity
  • False positives: Turbo's change detection might miss transitive dependencies (e.g., schema changes affecting tests in other packages)

Alternative approach: Consider GitHub's built-in changed files API:

- uses: dorny/paths-filter@v2
  with:
    filters: |
      agents-api: 'agents-api/**'
      agents-core: 'packages/agents-core/**'

4. Test File Naming Convention Inconsistency

Location: 39 renamed files from .test.ts to .slow.test.ts

Issue: The mass renaming creates maintenance overhead without clear categorization criteria.

Missing guidelines:

  • What qualifies as a "slow" test? (Database I/O? Duration? Setup complexity?)
  • How do developers know which config to use for new tests?
  • What's the migration strategy for existing tests that become slow over time?

Recommendation: Document explicit criteria in CONTRIBUTING.md:

## Test Categories
- **Fast tests** (`.test.ts`): < 5s, no database, pure logic
- **Slow tests** (`.slow.test.ts`): Database operations, external APIs, file I/O
- **Integration tests** (`.integration.test.ts`): End-to-end workflows

🧐 Architectural Design Concerns

5. Database Setup Strategy Anti-pattern

Location: agents-api/src/__tests__/setup.ts:127-167

Issue: Complex migration logic in global setup with path resolution heuristics:

const isInPackageDir = process.cwd().includes('agents-api');
const manageMigrationsPath = isInPackageDir
  ? '../packages/agents-core/drizzle/manage'
  : './packages/agents-core/drizzle/manage';

Problems:

  • Fragile path resolution: Breaks if directory structure changes or tests run from different contexts
  • Hidden dependencies: Migration paths are implicit, making the system harder to debug
  • Monorepo coupling: Tight coupling between test setup and repository structure

Better approach: Use explicit configuration or environment variables:

const MANAGE_MIGRATIONS_DIR = process.env.MANAGE_MIGRATIONS_DIR 
  || resolve(__dirname, '../../packages/agents-core/drizzle/manage');

6. Lack of Test Performance Monitoring

Location: CI configuration uploads test timing but no analysis

Missing capabilities:

  • No alerting for test performance regressions
  • No trending analysis of slow tests over time
  • No automatic promotion of fast tests that become slow

Recommendation: Implement performance budgets:

# In CI
- name: Check test performance budget
  run: |
    if [ "$(cat test-results/timing.json | jq '.maxDuration')" -gt "300000" ]; then
      echo "Test duration exceeded 5 minute budget"
      exit 1
    fi

🔍 Implementation Quality Issues

7. Error Handling Gaps

Location: Multiple files lack comprehensive error handling

Examples:

  • scripts/generate-test-db-snapshot.ts:42-47: Migration failures only log but don't provide recovery guidance
  • packages/agents-core/src/db/manage/test-manage-client.ts:110-111: Database cleanup errors silently ignored

8. Missing Integration Tests for New Infrastructure

Critical gap: No tests validate that:

  • Snapshot generation produces valid databases
  • Fast/slow test separation works correctly
  • Affected-only testing catches real issues

💡 Positive Aspects

The PR demonstrates solid engineering practices:

  1. Comprehensive documentation: The CLI parallelization analysis in TEST_PARALLELIZATION_NOTES.md shows thoughtful analysis
  2. Incremental adoption: Allowing fallback to migrations when snapshots aren't available
  3. CI optimization: Affected-only testing will significantly reduce CI times for focused changes
  4. Resource management: Thoughtful thread pool configuration for different test types

🎯 Recommendations

Immediate Actions (Before Merge):

  1. Add snapshot versioning to prevent stale snapshot issues
  2. Implement resource budgets for memory and CPU usage
  3. Add integration tests for the new testing infrastructure
  4. Document test categorization criteria for future maintainers

Follow-up Work (Post-merge):

  1. Monitor test performance and adjust resource allocation
  2. Implement test performance budgets in CI
  3. Consider test sharding if the test suite continues growing

Long-term Considerations:

  1. Database test strategy evolution: Consider containerized databases vs in-memory for better production parity
  2. Test infrastructure as code: Move configuration to a dedicated directory for better maintenance

Summary

This is a well-engineered solution to real performance problems, but the complexity introduced requires careful management. The database snapshot strategy is innovative but needs safeguards against race conditions and stale data. The affected-only CI testing is valuable but adds debugging complexity.

Recommendation: Approve with required changes - address the snapshot versioning and resource management issues before merge, then monitor closely in the first few weeks to ensure the new infrastructure performs as expected.


Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants