Add maxLength to discovered schema #477

Closed
techtangents opened this issue Aug 7, 2024 · 3 comments

techtangents commented Aug 7, 2024

Add the JSON Schema "maxLength" property when encountering varchars with a max size. For example, a varchar(256) column would have "maxLength": 256 in the schema.
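To make the request concrete, here is a minimal sketch of the mapping being asked for. The helper name `varchar_to_property` and the nullable `["string", "null"]` type are assumptions for illustration only; this is not the tap's or the Singer SDK's actual code.

```python
import re

# Hypothetical helper for illustration; not part of the tap or the SDK.
_VARCHAR = re.compile(
    r"^\s*(?:varchar|character varying)\s*(?:\(\s*(\d+)\s*\))?\s*$",
    re.IGNORECASE,
)


def varchar_to_property(sql_type: str) -> dict:
    """Map a varchar type string to a JSON Schema property.

    "maxLength" is only added when the column declares a size.
    """
    match = _VARCHAR.match(sql_type)
    if match is None:
        raise ValueError(f"not a varchar type: {sql_type!r}")
    prop = {"type": ["string", "null"]}  # nullable string is an assumption
    if match.group(1) is not None:
        prop["maxLength"] = int(match.group(1))
    return prop


print(varchar_to_property("varchar(256)"))
# {'type': ['string', 'null'], 'maxLength': 256}
```

An unbounded `varchar` would produce the same property without the "maxLength" key.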

This would give the target information that's useful when creating fields. In the above example, a PostgreSQL or Redshift target could create a varchar(256) field.
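On the target side, the idea could look roughly like the sketch below. The function name `column_type_for` and the `fallback` value are hypothetical, and real targets will have their own type-mapping logic; this only shows how "maxLength" could drive the column size.

```python
def column_type_for(prop: dict, fallback: str = "varchar(65535)") -> str:
    """Pick a varchar column type for a JSON Schema string property.

    Hypothetical helper: uses "maxLength" when present, otherwise falls
    back to a wide column (the fallback value is an assumption).
    """
    types = prop.get("type", [])
    if isinstance(types, str):
        types = [types]
    if "string" not in types:
        raise ValueError("only string properties are handled in this sketch")

    max_length = prop.get("maxLength")
    # bool is a subclass of int, so exclude it explicitly.
    if isinstance(max_length, int) and not isinstance(max_length, bool) and max_length > 0:
        return f"varchar({max_length})"
    return fallback


print(column_type_for({"type": ["string", "null"], "maxLength": 256}))
# varchar(256)
```

Without "maxLength" in the schema, the target has nothing to go on and ends up with the wide fallback column.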

This is particularly important in Redshift. The Redshift docs advise using the smallest possible column size for data (see here and here).

In my case, the missing length information appears to be causing excessive disk usage in my Redshift warehouse.

Whether to emit maxLength values could be a config setting. I'd probably keep it off by default.
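A rough sketch of how such an opt-in setting might behave follows. The setting name `emit_max_length` and the function `apply_max_length_setting` are made up for illustration; no such setting is confirmed anywhere in this thread.

```python
def apply_max_length_setting(schema: dict, config: dict) -> dict:
    """Return the schema, dropping "maxLength" unless the setting is on.

    "emit_max_length" is a hypothetical setting name, off by default.
    """
    if config.get("emit_max_length", False) or "properties" not in schema:
        return dict(schema)

    properties = {
        name: {key: value for key, value in prop.items() if key != "maxLength"}
        for name, prop in schema["properties"].items()
    }
    return {**schema, "properties": properties}


schema = {
    "type": "object",
    "properties": {"name": {"type": ["string", "null"], "maxLength": 256}},
}
print(apply_max_length_setting(schema, {}))
print(apply_max_length_setting(schema, {"emit_max_length": True}))
```

With the default config the "maxLength" key is stripped, so existing pipelines would see no change unless they opt in.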

Note: corresponding target-redshift issue here: TicketSwap/target-redshift#105

edgarrmondragon (Member) commented

This should be supported once meltano/sdk#2618 ships.

edgarrmondragon (Member) commented

@techtangents this should have been addressed by:

  1. feat: Bump to singer-sdk v0.41.0 #503
  2. (upstream) feat(taps): SQL taps now emit schemas with maxLength when applicable meltano/sdk#2651

Those changes were shipped with v0.0.15. Let me know if you find any problems with it.

Thanks!
