Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add postgres to_date function #1799

Open
universalmind303 opened this issue Sep 21, 2023 · 5 comments
Open

Add postgres to_date function #1799

universalmind303 opened this issue Sep 21, 2023 · 5 comments
Labels
feat 🎇 New feature or request upstream 🌊 Issues caused by upstream dependencies

Comments

@universalmind303
Copy link
Contributor

Given the test data userdata1.parquet, I'd like to convert the birthdate' field to a Date` field.

> select birthdate from 'testdata/parquet/userdata1.parquet';
┌────────────┐
│ birthdate  │
│ ──         │
│ Utf8       │
╞════════════╡
│ 3/8/1971   │
│ 1/16/1968  │
│ 2/1/1960   │
│ 4/8/1997   │
│            │
│ 2/25/1983  │
│ 12/18/1987 │
│ 3/1/1962   │
│ 3/27/1992  │
│ 1/28/1997  │
│ …          │
│ 6/8/1970   │
│ 7/18/2000  │
│ 2/16/1960  │
│ 1/5/1988   │
│ 7/30/1987  │
│ 7/8/1965   │
│ 4/22/1975  │
│ 5/1/1979   │
│ 10/9/1991  │
│            │
└────────────┘
 1000 rows (20 shown)

I've tried

select birthdate::Date from  'testdata/parquet/userdata1.parquet';
select cast(birthdate as DATE) from 'testdata/parquet/userdata1.parquet';

The datafusion docs don't seem to have any builtin functions for parsing strings to dates

@universalmind303 universalmind303 added the question ❔ Community and user questions label Sep 21, 2023
@tychoish
Copy link
Collaborator

My understanding is that this ends up being pretty hard, and it looks like arrow/datafusion's date+timestamp formats play a little fast and loose with timezones/leapseconds/calendars/etc. (and it seems like you'd need to use/link chrono for this? it seems like datafusion uses this to create some timestamps internally but not for manipulating or parsing user data.

it seems reasonable that we'd need to write/provide function to perform this cast on our end, and probably ensure that users also pass a format (or a nice name for well known formats) and also use a library to handle the timezone things.)

@universalmind303
Copy link
Contributor Author

we should be able use chrono's strptime to handle this. I believe polars str.to_date and str.to_datetime both use this under the hood.

@universalmind303 universalmind303 changed the title How do i parse a string into a date. Add postgres to_date function Sep 21, 2023
@universalmind303 universalmind303 added feat 🎇 New feature or request and removed question ❔ Community and user questions labels Sep 21, 2023
@tychoish
Copy link
Collaborator

strpttime seems good (though the semantic differences between this and posix date might be weird to document? also We maybe should also have a couple of constants/named helpers for RFC3339 and various well defined timestamp formats)

@universalmind303 universalmind303 added this to the next milestone Sep 28, 2023
@greyscaled
Copy link
Contributor

I ran into this. A work around is casting to date, which oddly enough did not work as expected in the dashboard, but did in the CLI.

We are also missing to_char, as a workaround.

See also: https://www.postgresql.org/docs/current/functions-formatting.html

@universalmind303 universalmind303 removed this from the next milestone Jan 10, 2024
@universalmind303
Copy link
Contributor Author

Seems likely that this will be implemented upstream apache/datafusion#9147

@universalmind303 universalmind303 added the upstream 🌊 Issues caused by upstream dependencies label Feb 7, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feat 🎇 New feature or request upstream 🌊 Issues caused by upstream dependencies
Projects
None yet
Development

No branches or pull requests

3 participants