Revamp QA checks into a battery included package #35322
Looks very promising to me, thanks @alafanechere! Not approving yet, because I'll take a closer look tomorrow to see the big picture of the checks themselves.
@@ -17,7 +17,7 @@ GitPython = "^3.1.29"
pydantic = "^1.9"
What's holding us back from switching to pydantic 2.0.0? Just curious.
I don't know, nothing I'm aware of, but it's a different package / project than the current one :)
I don't yet know the full surface area of us using Pydantic. If we're using 1.* everywhere, should we treat 2.0+ upgrade as a separate lane of work, or do you mostly expect things to work smoothly with such an upgrade?
Is it drop-in compatible syntax-wise?
If we're using it in the CDK itself and in airbyte-ci — I think airbyte-ci could be our guinea pig for such a migration? /cc @bazarnov
I think the only package that still requires the 1.* version is CAT. Even so, let airbyte-ci be the guide indeed.
)

expected_title = f"# {connector.name_from_metadata} Migration Guide"
expected_version_header_start = "## Upgrading to "
Do we have all the critical rules covered?
I ported over what currently exists in `qa_checks.py`. Let me know if you think additional checks should be implemented. I'm open to it, but in different PRs and with related new GH issues 😄
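For readers following the thread, here is a minimal sketch of how the two rules visible in the hunk above (the expected title and the version header prefix) could be applied to a migration guide. The function name `check_migration_guide` and the error messages are assumptions made for illustration; this is not necessarily how the check in this PR is implemented.

```python
from typing import List


def check_migration_guide(content: str, connector_name: str) -> List[str]:
    """Return a list of problems found in a migration guide (hypothetical helper)."""
    # These two strings mirror the ones shown in the diff hunk.
    expected_title = f"# {connector_name} Migration Guide"
    expected_version_header_start = "## Upgrading to "

    errors: List[str] = []
    lines = content.splitlines()

    # Rule 1: the first line must be the expected title.
    if not lines or lines[0].strip() != expected_title:
        errors.append(f"The first line should be '{expected_title}'.")

    # Rule 2: every level-2 header must start with the expected prefix.
    for line in lines:
        if line.startswith("## ") and not line.startswith(expected_version_header_start):
            errors.append(f"Unexpected version header: '{line}'.")

    return errors


good_guide = "# Example Migration Guide\n\n## Upgrading to 2.0.0\n\nSome steps.\n"
bad_guide = "# Migration\n\n## Version 2.0.0\n"

good_errors = check_migration_guide(good_guide, "Example")
bad_errors = check_migration_guide(bad_guide, "Example")
```

Returning a list of messages rather than raising on the first failure lets a report show every problem in one pass.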
This does look great to me — since the potential blast radius of just having the tool is close to zero, I'm comfy approving this.
Caveats: I want follow-up work of:
- Publishing the actual generated documentation
- Moving our doc guide to our doc pages instead of hackmd
- Actually switching to use `connectors-qa`.
Please wait for @bazarnov's review and work with him to get a go ahead from API Sources, but I'm personally happy.
@@ -17,7 +17,7 @@ GitPython = "^3.1.29"
pydantic = "^1.9"
PyGithub = "^1.58.0"
rich = "^13.0.0"
-pydash = "^7.0.4"
+pydash = "^6.0.2"
Why downgrade?
`connectors_qa` depends on two other internal packages which use `pydash`, but on different versions:
- `metadata_service/lib` uses v6
- `connector_ops` uses v7

I'm aligning to the `metadata_service/lib` version because this package is also a dependency of `metadata_service/orchestrator`, which also declares a dependency on `pydash`.
Aligning to v6 means I only have to modify a single package (`connector_ops`) instead of two (`metadata_service/lib` / `metadata_service/orchestrator`). Apart from that, I don't think there's a good reason to stay on v6.
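To illustrate why the two constraints cannot coexist in one environment, here is a small sketch that expands Poetry-style caret constraints into version ranges and tests whether they overlap. The helpers `caret_range` and `ranges_overlap` are written for this example only and assume plain numeric versions with at most three components; a real resolver handles many more cases.

```python
from typing import Tuple

Version = Tuple[int, ...]


def caret_range(spec: str) -> Tuple[Version, Version]:
    """Expand a caret constraint such as "^6.0.2" into [lower, upper) bounds."""
    if not spec.startswith("^"):
        raise ValueError(f"Not a caret constraint: {spec}")
    given = [int(part) for part in spec[1:].split(".")]
    lower = tuple(given + [0] * (3 - len(given)))
    # The upper bound bumps the leftmost non-zero component,
    # or the last given component if all of them are zero.
    bump_index = next((i for i, part in enumerate(given) if part != 0), len(given) - 1)
    upper = tuple(
        lower[i] + 1 if i == bump_index else (lower[i] if i < bump_index else 0)
        for i in range(3)
    )
    return lower, upper


def ranges_overlap(spec_a: str, spec_b: str) -> bool:
    """Return True if at least one version satisfies both caret constraints."""
    (a_low, a_high), (b_low, b_high) = caret_range(spec_a), caret_range(spec_b)
    return max(a_low, b_low) < min(a_high, b_high)


# The two pydash constraints discussed in this thread.
conflict = not ranges_overlap("^6.0.2", "^7.0.4")
```

Since `^6.0.2` stops below 7.0.0 and `^7.0.4` starts at 7.0.4, no single `pydash` version satisfies both, so one of the packages has to move.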
@@ -0,0 +1,87 @@
# Connectors QA
Two questions here:
- Does `connectors_qa` have to live in this particular directory, or is it just a logical spot for it in the monorepo? I.e. could we just put it in the root of the repository instead?
- If it's fully independent of the rest of the connector_ops stuff (I don't think so), is there a good overall readme that lists the directories and which project lives where? We have many things: registry service, airbyte-ci, etc.
@natikgadzhi I put it there because it's a package used in the CI context. But we could put it at the root of the repo, I don't mind 😄 .
It depends on a couple of other internal packages. I will explicitly list the dependencies, and why they exist, in this README.
from .documentation import ENABLED_CHECKS as DOCUMENTATION_CHECKS

ENABLED_CHECKS = (
    DOCUMENTATION_CHECKS
I was about to suggest that you could have a list of lists and then flatten, but then I remembered how flattening lists in Python is expressed kek.
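For reference, the list-of-lists alternative mentioned here could look like the sketch below, using `itertools.chain.from_iterable` from the standard library. The check names are placeholder strings and `METADATA_CHECKS` is a hypothetical second category; only the `DOCUMENTATION_CHECKS` and `ENABLED_CHECKS` names come from the hunk above.

```python
from itertools import chain

# Placeholder entries: the real lists would hold check instances.
DOCUMENTATION_CHECKS = ["documentation check A", "documentation check B"]
METADATA_CHECKS = ["metadata check A"]  # hypothetical second category

# Flatten one level of nesting while preserving order.
ENABLED_CHECKS = list(chain.from_iterable([DOCUMENTATION_CHECKS, METADATA_CHECKS]))
```

Plain concatenation with `+`, which the hunk appears to use, gives the same result for a handful of lists and is arguably easier to read.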
DOCKER_INDEX = "docker.io"
DOCKERFILE_NAME = "Dockerfile"
DOCUMENTATION_STANDARDS_URL = "https://hackmd.io/Bz75cgATSbm7DjrAqgl4rw"
GRADLE_FILE_NAME = "build.gradle"
How is Gradle used internally, and for what?
@natikgadzhi these checks also run on our Java connectors, which use Gradle.
This file is used in this program to determine whether a connector is a Java one.
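A minimal sketch of that detection, assuming it boils down to checking whether the connector directory contains a `build.gradle` file. The function name `is_java_connector` and the directory names are hypothetical; only `GRADLE_FILE_NAME` comes from the hunk above.

```python
import tempfile
from pathlib import Path

GRADLE_FILE_NAME = "build.gradle"


def is_java_connector(connector_directory: Path) -> bool:
    """Treat a connector as a Java one if it ships a Gradle build file."""
    return (connector_directory / GRADLE_FILE_NAME).is_file()


# Small demonstration on throwaway directories.
with tempfile.TemporaryDirectory() as tmp:
    java_dir = Path(tmp) / "source-java-example"
    python_dir = Path(tmp) / "source-python-example"
    java_dir.mkdir()
    python_dir.mkdir()
    (java_dir / GRADLE_FILE_NAME).touch()

    java_detected = is_java_connector(java_dir)
    python_detected = is_java_connector(python_dir)
```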
DOCKER_HUB_USERNAME_ENV_VAR_NAME = "DOCKER_HUB_USERNAME"
DOCKER_INDEX = "docker.io"
DOCKERFILE_NAME = "Dockerfile"
DOCUMENTATION_STANDARDS_URL = "https://hackmd.io/Bz75cgATSbm7DjrAqgl4rw"
Wow wow wow hold up, why are our standards in a 3rd party service and not a page on our very own docs site? Any specific reason? /cc @girarda @alafanechere
No idea. This is the first time I've seen this page. I think this was hacked together by dev-rel two years ago, but that's only based on this Slack thread.
@natikgadzhi @girarda this doc was written by our previous documentation team. I believe it was drafted when I was working on the original `qa_checks.py` and never got checked into our main repo. I believe the doc is good enough to join the `Resources` section of `Contributing to Airbyte`. I can do it in a follow-up PR.
![Screenshot 2024-02-16 at 08 09 30](https://private-user-images.githubusercontent.com/5551758/305332943-dbed8285-34a3-466d-aff4-95c5b5b50e8e.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3Mzk0MjU4ODMsIm5iZiI6MTczOTQyNTU4MywicGF0aCI6Ii81NTUxNzU4LzMwNTMzMjk0My1kYmVkODI4NS0zNGEzLTQ2NmQtYWZmNC05NWM1YjViNTBlOGUucG5nP1gtQW16LUFsZ29yaXRobT1BV1M0LUhNQUMtU0hBMjU2JlgtQW16LUNyZWRlbnRpYWw9QUtJQVZDT0RZTFNBNTNQUUs0WkElMkYyMDI1MDIxMyUyRnVzLWVhc3QtMSUyRnMzJTJGYXdzNF9yZXF1ZXN0JlgtQW16LURhdGU9MjAyNTAyMTNUMDU0NjIzWiZYLUFtei1FeHBpcmVzPTMwMCZYLUFtei1TaWduYXR1cmU9ZTI5YzUwODQ1MWNkZmMwZTE4MTdmYWU4ZDQzNTdhMWY1ZTg1ZDAyNTIwOWJkNDhkZGUzNTVmMjg4ZTIzMDFmNyZYLUFtei1TaWduZWRIZWFkZXJzPWhvc3QifQ.pzkoT4qpJ2WRmyrPSMxeFQSI4PNj1FWYxP8znNNr28k)
@natikgadzhi I generated the documentation and added an integration test to make sure it's kept up to date in #35324
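As a rough illustration of the idea (not the actual test from #35324), an "is the documentation up to date" test can regenerate the document from the check definitions and compare it with the committed text. The functions `render_checks_documentation` and `documentation_is_up_to_date`, the markdown layout, and the check names are all assumptions made for this sketch.

```python
from typing import Dict


def render_checks_documentation(checks: Dict[str, str]) -> str:
    """Render one markdown section per check, sorted by name (hypothetical layout)."""
    lines = ["# QA checks", ""]
    for name in sorted(checks):
        lines += [f"## {name}", checks[name], ""]
    return "\n".join(lines)


def documentation_is_up_to_date(committed_text: str, checks: Dict[str, str]) -> bool:
    """Return True if the committed documentation matches a fresh render."""
    return committed_text == render_checks_documentation(checks)


# Simulate committed documentation, then add a check without regenerating it.
checks = {"Example check A": "Describes what check A verifies."}
committed = render_checks_documentation(checks)
fresh = documentation_is_up_to_date(committed, checks)

checks["Example check B"] = "Describes what check B verifies."
stale = not documentation_is_up_to_date(committed, checks)
```

A test built this way fails as soon as a check is added or reworded without the documentation being regenerated.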
LGTM. Thanks @alafanechere for this improvement.
What
Relates to:
This PR introduces a new `connectors-qa` 🐍 package which can run static-analysis checks on our connectors and generate documentation.
It will help address the following problems we have:
- [...] `qa_checks.py` script in the `connector_ops` package (called by `airbyte-ci`)
- [...] `airbyte-ci` (e.g. `VersionFollowsSemverCheck` or `CheckPythonRegistryPublishConfiguration`)
🤔 Philosophy
This effort is driven by the following principles:
- [...] `airbyte-ci`. We should consider it an orchestrator calling external tools and packages.
- `airbyte-ci` will call a containerized `connectors-qa` via its CLI.
🎉 Net new features
Documentation generation
This will generate a markdown file documenting all the enabled checks.
The content is taken from the check classes' names and descriptions.
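As a rough illustration of the idea (not the package's actual implementation), documentation can be rendered from the name and description carried by each check class. The `Check` base class, its `name`, `description` and `enabled` attributes, the two example checks and `render_checks_markdown` are all hypothetical names chosen for this sketch:

```python
class Check:
    """Hypothetical base class; the real package's class layout may differ."""

    name: str = ""
    description: str = ""
    enabled: bool = True


class ExampleReadmeCheck(Check):
    # Illustrative check, not one of the real connectors-qa checks.
    name = "Connector must have a README"
    description = "Illustrative description only."


class ExampleDisabledCheck(Check):
    # Illustrative check that should be left out of the generated file.
    name = "Some disabled check"
    description = "Skipped because it is not enabled."
    enabled = False


def render_checks_markdown(check_classes: list[type[Check]]) -> str:
    """Render one markdown section per enabled check class."""
    lines = ["# Connectors QA checks", ""]
    for check_class in check_classes:
        if not check_class.enabled:
            continue
        lines.extend([f"## {check_class.name}", "", check_class.description, ""])
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_checks_markdown([ExampleReadmeCheck, ExampleDisabledCheck]))
```

Keeping the name and description on the class means the generated file cannot drift from the code that runs the check.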
Report generation
This will generate a JSON report of all the QA checks on all our connectors.
We could automate the generation of this report in our CD pipeline and use it to feed dashboards or other assets like connector READMEs.
And create cool badges like:
![Dynamic JSON Badge](https://camo.githubusercontent.com/7ada9a786c967b80d26b6d7ef9ffcd7dfc22dc9c18c07a7c31b0585cb9c3120e/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f64796e616d69632f6a736f6e3f75726c3d6874747073253341253246253246676973742e67697468756275736572636f6e74656e742e636f6d253246616c6166616e65636865726525324661346166333639613665343165653332643062376638376664666461386634302532467261772532463263646163366562656162613233396435633462613235313762356131393963303662373535313425324671615f7265706f72742e6a736f6e2671756572793d2532342e636f6e6e6563746f7273253542253232736f757263652d676f6f676c652d7368656574732532322535442e62616467655f74657874267374796c653d666c6174266c6f676f3d61697262797465266c6162656c3d436f6e6e6563746f72732532305141253230636865636b73)
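The badge URL above queries the report with `$.connectors["source-google-sheets"].badge_text`, so the report is keyed by connector name under a top-level `connectors` object and exposes a `badge_text` field. A minimal sketch of a report with that shape follows. The `build_report` function, the `checks` field, the input format and the `passed/total` badge text are assumptions made for illustration, not the package's real output:

```python
import json


def build_report(results: dict[str, dict[str, bool]]) -> dict:
    """Build a report from {connector name: {check name: passed}}.

    Only the `connectors -> <name> -> badge_text` path is taken from the
    badge query; everything else here is a hypothetical layout.
    """
    connectors = {}
    for connector_name, checks in sorted(results.items()):
        passed = sum(checks.values())  # True counts as 1, False as 0
        connectors[connector_name] = {
            "badge_text": f"{passed}/{len(checks)}",  # assumed format
            "checks": checks,
        }
    return {"connectors": connectors}


if __name__ == "__main__":
    example_results = {
        "source-google-sheets": {
            "example check A": True,
            "example check B": False,
        }
    }
    print(json.dumps(build_report(example_results), indent=2))
```

A report of this shape can be consumed directly by a dynamic JSON badge, a dashboard, or a README generator without any of them needing to know how the checks were run.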
Recommended reading order
1. `README.md` to install and try out the tool locally.
2. `checks/*.py` to understand which checks are running.
3. `cli.py` to grasp how the entrypoint and `asyncio` logic are implemented.
🚨 User Impact 🚨
None, as this PR only introduces a new package which is not yet used by `airbyte-ci`.
Follow up steps
- [...] `QAChecks` step of `airbyte-ci connectors test`
- [...] `connector_ops/qa_checks` and `airbyte-ci` steps that are now running inside this package (semver version check, PyPI publishing, etc.)