Conversation

@aaronsteers aaronsteers commented Oct 16, 2025

docs: Add comprehensive Mock Server testing documentation

Summary

Created a new comprehensive guide (docs/TESTING_CONNECTORS.md) for testing Airbyte connectors using the CDK's HttpMocker utilities. The documentation was requested to expand on the existing brief Mock Server section in CONTRIBUTING.md.

Key additions:

  • Quick start guide with minimal example
  • Detailed documentation of core components (HttpMocker, HttpRequest, HttpResponse, response builders)
  • Common testing patterns (full refresh, incremental, pagination, error handling, authentication, substreams)
  • Advanced techniques (request builders, time freezing, assertion helpers)
  • Real-world examples adapted from Salesforce and Hubspot connectors
  • Best practices and troubleshooting sections

The documentation references existing test examples in the CDK repo and production connectors to provide practical, realistic guidance for connector developers.

Review & Testing Checklist for Human

  • Verify code examples are accurate - Spot check a few examples to ensure they match the actual CDK API (particularly the Quick Start and testing patterns sections)
  • Check links in "Additional Resources" - Verify the relative paths and GitHub links work correctly
  • Review for clarity - Ensure the documentation flows well and would be helpful for connector developers who are new to the testing utilities

Test Plan

  1. Pick 2-3 simple examples from the doc (e.g., Quick Start, Full Refresh test)
  2. Try to run them against an actual connector to verify they work as documented
  3. Check that the response builder examples match the actual response_builder.py API

Notes

  • This is documentation-only with no code changes, so there's no risk of breaking functionality
  • Examples are intentionally simplified for clarity (they use placeholders such as MySource())
  • Based on real examples from source-salesforce and source-hubspot integration tests
  • All lint/format checks passed

Summary by CodeRabbit

  • Documentation
    • Added a comprehensive testing guide covering mock HTTP utilities, testing patterns for various scenarios, authentication flows, error handling, advanced techniques, real-world examples, and troubleshooting best practices.

- Created new TESTING_CONNECTORS.md guide covering HttpMocker utilities
- Included quick start examples and core component documentation
- Added testing patterns for full refresh, incremental, pagination, and error handling
- Documented advanced techniques like request builders and time freezing
- Included real-world examples from Salesforce and Hubspot connectors
- Added best practices and troubleshooting sections

Co-Authored-By: AJ Steers <aj@airbyte.io>
Copilot AI review requested due to automatic review settings October 16, 2025 19:26
@devin-ai-integration

Original prompt from AJ Steers
Received message in Slack channel #ask-devin-ai:

@Devin - Refer to the Mock Server testing in the Python CDK repo. We want to expand docs for that, specifically in a concise but clean manner. You may use any existing docs you find, and/or refer to existing examples. One good example reference is the Salesforce source. Hubspot may be decent also.
Thread URL: https://airbytehq-team.slack.com/archives/C08BHPUMEPJ/p1760642445436139?thread_ts=1760642445.436139

@devin-ai-integration

🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

  • Address comments on this PR. Add '(aside)' to your comment to have me ignore it.
  • Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.

@github-actions

👋 Greetings, Airbyte Team Member!

Here are some helpful tips and reminders for your convenience.

Testing This CDK Version

You can test this version of the CDK using the following:

# Run the CLI from this branch:
uvx 'git+https://github.com/airbytehq/airbyte-python-cdk.git@devin/1760642702-expand-mock-server-testing-docs#egg=airbyte-python-cdk[dev]' --help

# Update a connector to use the CDK from this branch ref:
cd airbyte-integrations/connectors/source-example
poe use-cdk-branch devin/1760642702-expand-mock-server-testing-docs

PR Slash Commands

Airbyte Maintainers can execute the following slash commands on your PR:

  • /autofix - Fixes most formatting and linting issues
  • /poetry-lock - Updates poetry.lock file
  • /test - Runs connector tests with the updated CDK
  • /poe build - Regenerate git-committed build artifacts, such as the pydantic models which are generated from the manifest JSON schema in YAML.
  • /poe <command> - Runs any poe command in the CDK environment

@github-actions github-actions bot added the docs label Oct 16, 2025
@coderabbitai

coderabbitai bot commented Oct 16, 2025

Walkthrough

A comprehensive testing guide for Airbyte connectors has been added, documenting mock HTTP utilities, request/response definitions, and testing patterns for streams, pagination, authentication, and error handling.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Connector Testing Documentation: docs/TESTING_CONNECTORS.md | New guide covering HttpMocker usage, HttpRequest/HttpResponse definitions, Response Builders, and testing patterns for full-refresh and incremental streams, pagination, error handling, authentication flows, and parent-child substreams, with real-world examples and best practices. |

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~3 minutes


A quick thought: Since this documentation establishes testing patterns for the CDK, would it be helpful to consider adding a reference link to this guide in the main Airbyte documentation or connector developer quickstart (if one exists)? That way, new contributors can easily discover these best practices, wdyt?

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title Check | ✅ Passed | The title clearly indicates that the PR adds comprehensive Mock Server testing documentation, accurately reflecting the main change of introducing the new docs/TESTING_CONNECTORS.md guide and using the standard "docs:" prefix for clarity. |
| Docstring Coverage | ✅ Passed | No functions found in the changes. Docstring coverage check skipped. |

Comment @coderabbitai help to get the list of available commands and usage tips.


Copilot AI left a comment

Pull Request Overview

This PR adds comprehensive documentation for testing Airbyte connectors using the CDK's HttpMocker utilities. The documentation expands on the existing brief Mock Server section in CONTRIBUTING.md by providing detailed guidance for connector developers.

Key changes:

  • Created a complete testing guide with examples ranging from basic to advanced patterns
  • Documented core components (HttpMocker, HttpRequest, HttpResponse, response builders) with practical code examples
  • Provided real-world examples adapted from production connectors (Salesforce and Hubspot)

## Additional Resources

- [CDK Test Utilities API Reference](../airbyte_cdk/test/)
- [Example Tests in CDK](../../unit_tests/sources/mock_server_tests/)

Copilot AI Oct 16, 2025

The relative paths in these links may be incorrect. The first link uses ../airbyte_cdk/test/ which assumes this doc is in a subdirectory of the CDK root, but the second link uses ../../unit_tests/ which suggests a different directory structure. Verify these paths work correctly from the docs/ directory.

Suggested change
-- [Example Tests in CDK](../../unit_tests/sources/mock_server_tests/)
+- [Example Tests in CDK](../unit_tests/sources/mock_server_tests/)
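
The path arithmetic behind this comment can be checked with a short standard-library sketch. It assumes the guide lives at docs/TESTING_CONNECTORS.md, as stated in the PR description, and only normalizes the link strings; it does not check that the target directories exist. The helper name `resolve_from_docs` is hypothetical.

```python
import posixpath

DOC_DIR = "docs"  # assumption: the guide is docs/TESTING_CONNECTORS.md


def resolve_from_docs(link: str) -> str:
    """Resolve a relative link against the directory that contains the guide."""
    return posixpath.normpath(posixpath.join(DOC_DIR, link))


links = [
    "../airbyte_cdk/test/",
    "../../unit_tests/sources/mock_server_tests/",
    "../unit_tests/sources/mock_server_tests/",
]
resolved = {link: resolve_from_docs(link) for link in links}

for link, target in resolved.items():
    # A target that still starts with ".." points outside the repository root.
    escapes_root = target.startswith("..")
    print(f"{link} -> {target} (escapes repo root: {escapes_root})")
```

Under that assumption, only the `../../unit_tests/...` form resolves to a location above the repository root, which is consistent with the suggested change.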

# Second page
http_mocker.get(
    HttpRequest("https://api.example.com/users", query_params={"page": "page2"}),
    HttpResponse('{"users": [{"id": 2}], "next_page": null}', 200)
)

Copilot AI Oct 16, 2025

The JSON body contains a bare null. An unquoted null is the valid JSON literal, so "next_page": null is correct as written inside the single-quoted Python string and needs no extra escaping. Quoting it as "null" would turn the value into a string instead.
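
As a quick check of the body quoted above, it can be parsed with the standard-library `json` module. This sketch does not use the CDK; it only shows how Python parses the string literal from the snippet.

```python
import json

# The response body exactly as it appears in the pagination example.
body = '{"users": [{"id": 2}], "next_page": null}'
parsed = json.loads(body)

# A bare JSON null becomes Python's None.
print(parsed["next_page"] is None)

# For contrast: a quoted "null" is an ordinary string, not a null value.
quoted = json.loads('{"next_page": "null"}')
print(repr(quoted["next_page"]))
```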

self._http_mocker.get(
    HttpRequest(
        "https://instance.salesforce.com/services/data/v57.0/queryAll",
        query_params={"q": f"SELECT * FROM Account WHERE SystemModstamp >= {cursor_value}"}

Copilot AI Oct 16, 2025

The f-string formatting will not work correctly for datetime objects in SOQL queries. Salesforce SOQL requires specific datetime formatting (ISO 8601). The cursor_value should be formatted using .isoformat() or similar method to ensure proper SOQL syntax.

Suggested change
-query_params={"q": f"SELECT * FROM Account WHERE SystemModstamp >= {cursor_value}"}
+query_params={"q": f"SELECT * FROM Account WHERE SystemModstamp >= {cursor_value.isoformat()}"}
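
The difference the suggestion relies on can be seen with a small standard-library sketch. The PR excerpt does not show the type of `cursor_value`, so the `datetime` value below is an assumption made for illustration. If `cursor_value` is already a string, calling `.isoformat()` on it would raise `AttributeError`.

```python
from datetime import datetime, timezone

# Hypothetical cursor value; the real test may build it differently.
cursor_value = datetime(2024, 1, 1, tzinfo=timezone.utc)

# Default f-string formatting uses str(), which separates date and time with a space.
default_query = f"SELECT * FROM Account WHERE SystemModstamp >= {cursor_value}"

# isoformat() emits the ISO 8601 form with a "T" separator.
iso_query = f"SELECT * FROM Account WHERE SystemModstamp >= {cursor_value.isoformat()}"

print(default_query)  # ends with 2024-01-01 00:00:00+00:00
print(iso_query)      # ends with 2024-01-01T00:00:00+00:00
```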

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 4

🧹 Nitpick comments (2)
docs/TESTING_CONNECTORS.md (2)

203-216: Double-check HttpMocker pagination example structure

The pagination snippet calls read(source, config={}, catalog=catalog) without defining source or catalog, and it never sets up the mock responses to match a CatalogBuilder. Could we either show their setup inline or mention that they come from the quick-start example to avoid confusing readers, wdyt?


382-451: Validate real-world example against actual builders

In the Salesforce/Hubspot sections, the helper classes (ConfigBuilder, ContactsResponseBuilder, etc.) look custom. Could we add a note that these builders come from their respective connector test suites (with paths) or swap in simplified inline builders so readers don’t hunt for missing definitions, wdyt?

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 55ea028 and eb3960a.

📒 Files selected for processing (1)
  • docs/TESTING_CONNECTORS.md (1 hunks)
🧰 Additional context used
🪛 LanguageTool
docs/TESTING_CONNECTORS.md

[grammar] ~83-~83: There might be a mistake here.
Context: ... the mocked response to return. You can specify body (string or bytes), status code, an...

(QB_NEW_EN)


[grammar] ~477-~477: There might be a mistake here.
Context: ...doesn't match any mocked request. Check: - URL matches exactly (including protocol)...

(QB_NEW_EN)


[grammar] ~489-~489: There might be a mistake here.
Context: .... ### Request Not Being Mocked Ensure: - You're using the @HttpMocker() decorat...

(QB_NEW_EN)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: Manifest Server Docker Image Build
  • GitHub Check: SDM Docker Image Build
  • GitHub Check: Pytest (Fast)

Comment on lines +224 to +237
@HttpMocker()
def test_retry_on_rate_limit(http_mocker):
    # Return rate limit error, then success
    http_mocker.get(
        HttpRequest("https://api.example.com/users"),
        [
            HttpResponse('{"error": "Rate limit exceeded"}', 429),
            HttpResponse('{"users": [{"id": 1}]}', 200)
        ]
    )

    output = read(source, config={}, catalog=catalog)
    assert len(output.records) == 1

⚠️ Potential issue | 🟠 Major

Clarify retry example response list usage

In the rate-limit example, http_mocker.get(..., [HttpResponse(...), HttpResponse(...)]) assumes the mocker accepts a list of responses. From HttpMocker's current API, the overload expects response_sequence=[...] as a keyword argument. To keep the example runnable, could we switch to http_mocker.get(request, response_sequence=[...]), wdyt?

🤖 Prompt for AI Agents
In docs/TESTING_CONNECTORS.md around lines 224 to 237 the rate-limit example
passes a list of HttpResponse objects positionally to http_mocker.get, but
HttpMocker expects the list via the keyword response_sequence; update the
example to call http_mocker.get(HttpRequest(...),
response_sequence=[HttpResponse(...), HttpResponse(...)]) so the mocker uses the
intended response sequence and the snippet is runnable.
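
The behavior under discussion here, a 429 followed by a 200 on the same request, can be illustrated without the CDK at all. The sketch below uses `unittest.mock.Mock` with `side_effect` from the standard library to return responses in order. It does not use HttpMocker and does not settle how HttpMocker expects the list to be passed. `fetch_with_retry` is a hypothetical helper written only for this example.

```python
from unittest.mock import Mock


def fetch_with_retry(send, max_attempts=3):
    """Call send() until it returns a non-429 status or attempts run out."""
    for _ in range(max_attempts):
        status, body = send()
        if status != 429:
            break
    return status, body


# side_effect with a list makes the mock return one item per call, in order.
send = Mock(side_effect=[
    (429, '{"error": "Rate limit exceeded"}'),
    (200, '{"users": [{"id": 1}]}'),
])

status, body = fetch_with_retry(send)
print(status, send.call_count)  # 200 2
```

A real retry policy would normally wait between attempts; the sketch omits backoff to keep the sequencing visible.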

Comment on lines +344 to +353
@HttpMocker()
def test_request_count(http_mocker):
    request = HttpRequest("https://api.example.com/users")
    http_mocker.get(request, HttpResponse('{"users": []}', 200))

    # Perform sync
    read(source, config={}, catalog=catalog)

    # Assert request was called exactly once
    http_mocker.assert_number_of_calls(request, 1)

⚠️ Potential issue | 🟠 Major

Verify assert_number_of_calls availability

http_mocker.assert_number_of_calls isn’t part of the public API in HttpMocker today (the helper lives on the matcher registry). Should we update the snippet to use the supported http_mocker.assert_called_once(request) helper instead—or add a short note if we’re introducing a new helper, wdyt?

🤖 Prompt for AI Agents
In docs/TESTING_CONNECTORS.md around lines 344 to 353, the example calls
http_mocker.assert_number_of_calls which is not part of the public HttpMocker
API; replace that call with the supported helper
http_mocker.assert_called_once(request) in the snippet (or if you intend to add
a new public helper, add a short note above the snippet documenting the new API
and its behavior), ensuring the example uses only public HttpMocker methods and
the text reflects whichever approach you choose.

Comment on lines +361 to +372
from airbyte_cdk.models import AirbyteStreamStatus

@HttpMocker()
def test_stream_status(http_mocker):
    http_mocker.get(request, response)

    output = read(source, config={}, catalog=catalog)
    statuses = output.get_stream_statuses("users")

    assert statuses[0] == AirbyteStreamStatus.STARTED
    assert statuses[1] == AirbyteStreamStatus.RUNNING
    assert statuses[2] == AirbyteStreamStatus.COMPLETE

⚠️ Potential issue | 🟠 Major

Confirm stream status helper exists

output.get_stream_statuses("users") doesn’t exist on the object returned by read (it’s a ReadOutput dataclass with records, logs, states). Maybe we can show how to filter output.logs for AirbyteLogMessage status entries instead, wdyt?

🤖 Prompt for AI Agents
In docs/TESTING_CONNECTORS.md around lines 361 to 372, the example calls
output.get_stream_statuses("users") which doesn't exist on the ReadOutput
dataclass; instead, filter output.logs for AirbyteMessage/AirbyteLogMessage
entries whose type is STREAM_STATUS and then extract the stream name and stream
status values for the "users" stream. Update the example to iterate over
output.logs, select only stream-status log messages, map them to their status
enums (AirbyteStreamStatus) and assert those values in order for the "users"
stream.
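
The filtering approach described in this comment could look roughly like the sketch below. It works on plain dictionaries and assumes that status updates arrive as trace-style messages with the nested shape shown. That shape, the helper name `stream_statuses`, and the sample data are assumptions for illustration, and the message classes and attributes exposed by the CDK's read output may differ.

```python
def stream_statuses(messages, stream_name):
    """Collect status values, in order, for one stream from dict-shaped messages."""
    statuses = []
    for message in messages:
        # Assumed shape: {"type": "TRACE", "trace": {"type": "STREAM_STATUS", ...}}
        if message.get("type") != "TRACE":
            continue
        trace = message.get("trace", {})
        if trace.get("type") != "STREAM_STATUS":
            continue
        stream_status = trace.get("stream_status", {})
        if stream_status.get("stream_descriptor", {}).get("name") == stream_name:
            statuses.append(stream_status.get("status"))
    return statuses


def status_message(name, status):
    """Build one hypothetical stream-status message for the sample data."""
    return {
        "type": "TRACE",
        "trace": {
            "type": "STREAM_STATUS",
            "stream_status": {"stream_descriptor": {"name": name}, "status": status},
        },
    }


messages = [
    status_message("users", "STARTED"),
    {"type": "LOG", "log": {"level": "INFO", "message": "syncing"}},
    status_message("orders", "STARTED"),
    status_message("users", "RUNNING"),
    status_message("users", "COMPLETE"),
]

print(stream_statuses(messages, "users"))  # ['STARTED', 'RUNNING', 'COMPLETE']
```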

Comment on lines +495 to +499

- [CDK Test Utilities API Reference](../airbyte_cdk/test/)
- [Example Tests in CDK](../../unit_tests/sources/mock_server_tests/)
- [Salesforce Connector Tests](https://github.com/airbytehq/airbyte/tree/master/airbyte-integrations/connectors/source-salesforce/unit_tests/integration/)
- [Hubspot Connector Tests](https://github.com/airbytehq/airbyte/tree/master/airbyte-integrations/connectors/source-hubspot/unit_tests/integrations/)

⚠️ Potential issue | 🟠 Major

Fix broken relative link for unit-test API reference

The link to ../airbyte_cdk/test/ 404s from docs/. Pointing directly to the rendered API docs (or the GitHub directory) would keep navigation smooth. Could we update it to https://github.com/airbytehq/airbyte/tree/master/airbyte-cdk/python/airbyte_cdk/test or similar, wdyt?

🤖 Prompt for AI Agents
In docs/TESTING_CONNECTORS.md around lines 495 to 499 the relative link
"../airbyte_cdk/test/" 404s from the docs site; replace that relative path with
a working absolute link such as
"https://github.com/airbytehq/airbyte/tree/master/airbyte-cdk/python/airbyte_cdk/test"
(or the rendered API docs URL) so the unit-test API reference points to the
correct GitHub directory or published docs.

@github-actions

PyTest Results (Fast)

3 802 tests  ±0   3 790 ✅ ±0   6m 30s ⏱️ -2s
    1 suites ±0      12 💤 ±0 
    1 files   ±0       0 ❌ ±0 

Results for commit eb3960a. ± Comparison against base commit 55ea028.
