Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

confused result if ignore the timezone when cast timestamp to the date32 #9982

Closed
liukun4515 opened this issue Apr 7, 2024 · 11 comments
Closed
Labels
bug Something isn't working

Comments

@liukun4515
Copy link
Contributor

liukun4515 commented Apr 7, 2024

Describe the bug

add other timezone for the timestamp array for test

    fn test_cast_timestamp_to_date32() {
        let array =
            TimestampMillisecondArray::from(vec![Some(864000000005), Some(1545696000001), None])
                .with_timezone("UTC".to_string());
        let b = cast(&array, &DataType::Date32).unwrap();
        let c = b.as_primitive::<Date32Type>();
        assert_eq!(10000, c.value(0));
        assert_eq!(17890, c.value(1));
        assert!(c.is_null(2));

        // cast the timestamp using the different timezone to the date32
        let array =
            TimestampMillisecondArray::from(vec![Some(25201000), Some(111599000), None])
                .with_timezone("-07:00".to_string());

        let b = cast(&array, &DataType::Date32).unwrap();
        let c = b.as_primitive::<Date32Type>();
        println!("{:?}", array);
        println!("{:?}", c);
    }

And will get the result

PrimitiveArray<Timestamp(Millisecond, Some("-07:00"))>
[
  1970-01-01T00:00:01-07:00,
  **1970-01-01T23:59:59-07:00,**
  null,
]
PrimitiveArray<Date32>
[
  1970-01-01,
  **1970-01-02,**
  null,
]

The result is confused

To Reproduce

No response

Expected behavior

No response

Additional context

No response

@Lordworms
Copy link
Contributor

Lordworms commented Apr 7, 2024

I think it is a bug in arrow-rs instead of datafusion. this is due to ignoring timezone in cast_with_option, let me fix this in arrow-rs

@liukun4515
Copy link
Contributor Author

I think it is a bug in arrow-rs instead of datafusion. this is due to ignoring timezone in cast_with_option, let me fix this in arrow-rs

yes, i also see the code in the arrow-rs and use the same test case to verify the result

@liukun4515
Copy link
Contributor Author

I think it is a bug in arrow-rs instead of datafusion. this is due to ignoring timezone in cast_with_option, let me fix this in arrow-rs

@Lordworms you can refer this issue: apache/arrow-rs#5598

But i want to see more comments about this behavior before deciding if this is a bug

@liukun4515
Copy link
Contributor Author

cc @alamb @waitingkuo

@Lordworms
Copy link
Contributor

filed apache/arrow-rs#5605 (comment) to track

@Lordworms
Copy link
Contributor

I think it is a bug in arrow-rs instead of datafusion. this is due to ignoring timezone in cast_with_option, let me fix this in arrow-rs

@Lordworms you can refer this issue: apache/arrow-rs#5598

But i want to see more comments about this behavior before deciding if this is a bug

I think it is a bug since Date32 did not have a timezone parameter so I think it should be converted

@liukun4515
Copy link
Contributor Author

@Lordworms your fix is not right. The option of timezone in the timestamp is not used to do that, which is used to represent the value of timestamp is reference.
The definition of timestamp in the arrow is not same with spark or other SQL engine.

Please follow the schema doc of timestamp https://github.com/apache/arrow/blob/main/format/Schema.fbs#L276

@liukun4515
Copy link
Contributor Author

I think it is a bug in arrow-rs instead of datafusion. this is due to ignoring timezone in cast_with_option, let me fix this in arrow-rs

@Lordworms you can refer this issue: apache/arrow-rs#5598
But i want to see more comments about this behavior before deciding if this is a bug

I think it is a bug since Date32 did not have a timezone parameter so I think it should be converted

Yes, I think we should follow a timezone to convert the value of date, but we don't need to follow the timezone value of datatype timestamp.

@Lordworms
Copy link
Contributor

I think it is a bug in arrow-rs instead of datafusion. this is due to ignoring timezone in cast_with_option, let me fix this in arrow-rs

@Lordworms you can refer this issue: apache/arrow-rs#5598
But i want to see more comments about this behavior before deciding if this is a bug

I think it is a bug since Date32 did not have a timezone parameter so I think it should be converted

Yes, I think we should follow a timezone to convert the value of date, but we don't need to follow the timezone value of datatype timestamp.

got it

@Lordworms
Copy link
Contributor

closed due to apache/arrow-rs#5605

@liukun4515
Copy link
Contributor Author

I think it can be closed

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

2 participants