Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(source-apify-dataset): add option to specify dataset by its name #44857

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

vdusek
Copy link
Contributor

@vdusek vdusek commented Aug 28, 2024

What

How

Todo

  • This PR is still under development, and I'm seeking assistance with the following challenge:

Logic for dataset resolution in DeclarativeStreams

  • There should be two distinct branches for resolving the dataset path:
  1. Using dataset_id:
path: "datasets/{{ config['dataset_id'] }}"
  1. Using username and dataset_name:
path: "datasets/{{ config['username'] }}~{{ config['dataset_name'] }}"

If neither or both of these options are provided, the process should fail.

  • Question: How can this conditional logic be defined in the manifest? Is there a way to enforce this directly?

User Impact

  • This update introduces an alternative way to reference a dataset, but the existing method (using dataset_id) remains fully functional.

Can this PR be safely reverted and rolled back?

  • YES 💚

Copy link

vercel bot commented Aug 28, 2024

The latest updates on your projects. Learn more about Vercel for Git ↗︎

1 Skipped Deployment
Name Status Preview Comments Updated (UTC)
airbyte-docs ⬜️ Ignored (Inspect) Visit Preview Aug 28, 2024 1:15pm

Copy link
Member

@marcosmarxm marcosmarxm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vdusek, are you planning to continue working on this contribution? Today, you made a change in the spec, but it hasn't been applied to any part of the connector code.

@marcosmarxm marcosmarxm changed the title Source Apify Dataset: add option to specify dataset by its name feat(source-apify-dataset): add option to specify dataset by its name Sep 9, 2024
@vdusek
Copy link
Contributor Author

vdusek commented Sep 10, 2024

@marcosmarxm Hi, I'd love to. Have you read the "Logic for dataset resolution in DeclarativeStreams" section? I have a question there regarding the conditional evaluation in the manifest.

@omcknightkittl
Copy link

Hey, any update on this fix? Would be greatly appreciated :)

@vdusek
Copy link
Contributor Author

vdusek commented Oct 21, 2024

Well, I haven't found anything about the conditional logic in the manifests in docs.

Another approach would be to remove the dataset_id parameter and replace it with dataset_name. This change would be breaking, so it would require releasing a new major version. However, this solution wouldn't work for users who want to use a dataset ID instead of a name.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[source-apify-dataset] Reference dataset using its name rather than ID
4 participants