model executor for s3/gcs/azure to duckdb #6353

Open: wants to merge 4 commits into base: main

@k-anshul (Member) commented Jan 6, 2025

Subtask of https://github.com/rilldata/rill-private-issues/issues/854.

Adds a model executor that ingests data from S3, GCS, and Azure into DuckDB using DuckDB extensions (note: GCS works via S3 compatibility mode).
It also supports incremental ingestion.
Sample model YAML that ingests from S3:

connector: s3
path: s3://rill-developer.rilldata.io/AdBids.csv.gz

output:
  connector: duckdb

Note: Tested only with GCS and S3; Azure is untested.
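
For readers unfamiliar with DuckDB ingestion, here is a minimal Go sketch of the kind of SQL an executor could generate for the sample model above. It is not taken from this PR: the function names (`ingestSQL`, `quoteIdent`, `quoteString`), the choice of `read_csv`, and the idea that an incremental run appends with `INSERT INTO` are all assumptions made for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// quoteIdent is a hypothetical helper: it wraps an identifier in double
// quotes and doubles any embedded double quote.
func quoteIdent(s string) string {
	return `"` + strings.ReplaceAll(s, `"`, `""`) + `"`
}

// quoteString is a hypothetical helper: it wraps a value in single quotes
// and doubles any embedded single quote.
func quoteString(s string) string {
	return "'" + strings.ReplaceAll(s, "'", "''") + "'"
}

// ingestSQL builds a DuckDB statement that loads a CSV object into a table.
// Assumption: a full run replaces the table, an incremental run appends.
func ingestSQL(table, path string, incremental bool) string {
	src := fmt.Sprintf("SELECT * FROM read_csv(%s)", quoteString(path))
	if incremental {
		return fmt.Sprintf("INSERT INTO %s %s", quoteIdent(table), src)
	}
	return fmt.Sprintf("CREATE OR REPLACE TABLE %s AS %s", quoteIdent(table), src)
}

func main() {
	path := "s3://rill-developer.rilldata.io/AdBids.csv.gz"
	fmt.Println(ingestSQL("AdBids", path, false))
	fmt.Println(ingestSQL("AdBids", path, true))
}
```

The real executor may choose a different reader per file format and may handle incremental state differently; the sketch only shows the overall shape.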

@k-anshul k-anshul self-assigned this Jan 6, 2025
runtime/drivers/gcs/gcs.go (outdated, resolved)
runtime/drivers/duckdb/model_executor_objectstore_self.go (outdated, resolved)
runtime/drivers/duckdb/model_executor_objectstore_self.go (outdated, resolved)
@k-anshul k-anshul mentioned this pull request Jan 9, 2025
Comment on lines +125 to +127
if gcsConfig.KeyID != "" {
	fmt.Fprintf(&sb, ", KEY_ID %s, SECRET %s", safeSQLString(gcsConfig.KeyID), safeSQLString(gcsConfig.Secret))
}
Contributor commented:

  1. If gcsConfig.SecretJSON is set, but not gcsConfig.KeyID, should we perhaps return an error here?
  2. Can you also update rill env configure to request key_id and secret instead of google_application_credentials for GCS? I guess that would be more appropriate now, right?
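
The check suggested in point 1 could look roughly like the sketch below. It is an illustration under stated assumptions, not the PR's code: the `gcsConfig` struct here carries only the three fields named in this thread, `quoteSQLString` is a hypothetical stand-in for the PR's `safeSQLString`, and the secret name `gcs_secret` and the error text are made up.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// gcsConfig mirrors only the fields named in the review thread; the real
// struct in the GCS driver may have more fields or different types.
type gcsConfig struct {
	KeyID      string
	Secret     string
	SecretJSON string
}

// quoteSQLString is a hypothetical stand-in for safeSQLString: it wraps s
// in single quotes and doubles any embedded single quote.
func quoteSQLString(s string) string {
	return "'" + strings.ReplaceAll(s, "'", "''") + "'"
}

// gcsSecretSQL builds a DuckDB CREATE SECRET statement for GCS. It returns
// an error when JSON credentials are configured without a key ID, since the
// statement only uses KEY_ID and SECRET.
func gcsSecretSQL(cfg gcsConfig) (string, error) {
	if cfg.SecretJSON != "" && cfg.KeyID == "" {
		return "", errors.New("gcs: JSON credentials are set, but key_id and secret are required for DuckDB")
	}
	var sb strings.Builder
	sb.WriteString("CREATE OR REPLACE SECRET gcs_secret (TYPE GCS")
	if cfg.KeyID != "" {
		fmt.Fprintf(&sb, ", KEY_ID %s, SECRET %s", quoteSQLString(cfg.KeyID), quoteSQLString(cfg.Secret))
	}
	sb.WriteString(")")
	return sb.String(), nil
}

func main() {
	sql, err := gcsSecretSQL(gcsConfig{KeyID: "my-key", Secret: "my-secret"})
	fmt.Println(sql, err)

	_, err = gcsSecretSQL(gcsConfig{SecretJSON: "{}"})
	fmt.Println(err)
}
```

Returning the error early keeps a misconfigured connector from silently falling back to an unauthenticated secret.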

Member (Author) replied:

  1. Sure.
  2. I plan to do this after we migrate sources to run as models, because for now GCS-backed sources still take the usual code path.

Contributor replied:

  1. Okay, do you have an issue where you are tracking these follow-ups? If not, can you add them to the unification issue?
