
[Draft] Initial Redshift Driver Implementation #4


Draft · wants to merge 12 commits into base: main
Conversation

@eliasdefaria (Collaborator) commented Apr 17, 2025

Putting up a draft for what I was able to get done so far at the Seattle on-site!

Supported Types

(Redshift Type -> Arrow Type)
int2 -> int16
int/int4 -> int32
int8 -> int64
float4 -> float32
float/float8 -> float64
bool -> boolean
char/varchar/text -> string
date -> date32
time/timetz -> time64[us]
timestamp/timestamptz -> timestamp[us]
decimal/numeric -> decimal128

Things I would like to do but didn't get a chance to:

  • Additional testing to verify that all the data types are handled correctly
  • More robust casting of data types in the record reader when they're passed to the builders
  • Adjusting credentials handling to be easier to understand and set. This part is a challenge to get right, since different auth methods require different inputs and therefore different options.
  • Optimizing performance of the record reader, and cleaning up the code a bit as part of that effort. Because the schema comes with the first batch of records, we can use the lock-step style of synchronization that Felippe implemented really nicely for us in Databricks.
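The lock-step idea in the last bullet can be sketched with stdlib primitives: because the schema only arrives with the first batch, the reader blocks schema consumers until the fetch side has seen that batch. All names here are hypothetical, not the driver's or the Databricks implementation's actual API:

```python
import queue
import threading

class RecordReader:
    """Minimal sketch: schema() blocks until the first batch has arrived."""

    def __init__(self):
        self._batches = queue.Queue()
        self._schema = None
        self._schema_ready = threading.Event()

    def on_batch(self, schema, batch):
        # Called from the fetch thread; the first batch carries the schema.
        if not self._schema_ready.is_set():
            self._schema = schema
            self._schema_ready.set()
        self._batches.put(batch)

    def schema(self):
        # Consumers wait here until the fetch thread has published the schema.
        self._schema_ready.wait()
        return self._schema
```

The Event gives the one-shot "schema is known" signal while the queue carries batches independently, so readers never poll and never race the first fetch.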

Some interesting tidbits that were discovered:

Cheers,
-Jason :)
