Minor: Document timestamp with/without cast behavior #5826
/// However, note that when casting from a timestamp with timezone BACK to a
/// timestamp without timezone the cast kernel does not adjust the values.
///
/// Thus round trip casting a timestamp without timezone to a timestamp with
I think the fact that round-trip casting a timestamp CHANGES the underlying timestamp value caused a lot of confusion (at least for me) in the context of apache/datafusion#10602
I filed #5827 to discuss changing the behavior
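To make the round-trip point concrete, here is a minimal sketch that models the behavior described in the doc comment above. It does not call `arrow-cast`; the `Ts` struct and the `add_timezone` / `drop_timezone` helpers are hypothetical names invented for illustration, and the `-05:00` offset is an arbitrary choice.

```rust
/// Hypothetical model of one timestamp value: epoch seconds plus an
/// optional fixed UTC offset in seconds (None = no timezone).
#[derive(Debug, Clone, Copy, PartialEq)]
struct Ts {
    secs: i64,
    offset_secs: Option<i64>,
}

/// Models "without timezone -> with timezone": the value is read as
/// wall-clock time in the target zone and shifted to UTC.
/// (This sketch assumes the input has no timezone.)
fn add_timezone(ts: Ts, offset_secs: i64) -> Ts {
    Ts { secs: ts.secs - offset_secs, offset_secs: Some(offset_secs) }
}

/// Models "with timezone -> without timezone": the zone is dropped and
/// the stored value is left untouched.
fn drop_timezone(ts: Ts) -> Ts {
    Ts { secs: ts.secs, offset_secs: None }
}

fn main() {
    let start = Ts { secs: 2_000_000_000, offset_secs: None };

    // Attach -05:00: the stored value moves forward by five hours.
    let with_tz = add_timezone(start, -5 * 3600);
    assert_eq!(with_tz.secs, 2_000_018_000);

    // Drop the zone again: the value is NOT shifted back.
    let back = drop_timezone(with_tz);
    assert_eq!(back.offset_secs, None);
    assert_ne!(back, start);
    assert_eq!(back.secs - start.secs, 5 * 3600);
}
```

Under these assumptions the round trip ends five hours away from where it started, which is the asymmetry the documentation is calling out.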
CI integration is failing due to #5815
Thanks for documenting this. Made the behavior much clearer. Minor remark
//! use arrow_array::types::Float64Type;
//! use arrow_array::cast::AsArray;
//!
//! # use arrow_array::*;
this just makes the existing example less verbose
arrow-cast/src/cast/mod.rs
Outdated
///
/// When casting from a timestamp without timezone to a timestamp with
/// timezone, the cast kernel treats the underlying timestamp values as being in
/// UTC and adjusts them to the provided timezone.
I don't think this is correct: it interprets the timestamp as being in the destination timezone and then adjusts the value to UTC as required.
From the docs on DataType
One possibility is to assume that the original timestamp values are relative to the epoch of the timezone being set; timestamp values should then adjusted to the Unix epoch (for example, changing the timezone from empty to “Europe/Paris” would require converting the timestamp values from “Europe/Paris” to “UTC”, which seems counter-intuitive but is nevertheless correct).
Thank you -- fixed
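For anyone following the thread, the direction of the adjustment in the "Europe/Paris" example from the DataType docs can be sketched with plain arithmetic. This is only an illustration: `local_to_utc` is a hypothetical helper, not part of arrow, and Europe/Paris is modelled as a fixed +01:00 offset with daylight saving ignored.

```rust
/// Hypothetical helper: read `wall_clock_secs` as local time in a zone
/// with the given fixed UTC offset and return the UTC epoch seconds.
fn local_to_utc(wall_clock_secs: i64, utc_offset_secs: i64) -> i64 {
    wall_clock_secs - utc_offset_secs
}

fn main() {
    // 1970-01-01T12:00:00 with no timezone attached.
    let naive = 12 * 3600;

    // Assumption: Europe/Paris treated as a fixed +01:00 offset.
    let paris_offset = 3600;

    // Setting the timezone converts "Europe/Paris" wall-clock time to UTC,
    // so the stored value becomes 11:00:00 UTC.
    let stored = local_to_utc(naive, paris_offset);
    assert_eq!(stored, 11 * 3600);

    // Rendering the stored value in Paris time recovers the original
    // wall-clock reading, which is why the adjustment is correct.
    assert_eq!(stored + paris_offset, naive);

    println!("naive = {naive}s, stored (UTC) = {stored}s");
}
```

The stored number changes while the wall-clock reading stays the same, which is the counter-intuitive part the quoted docs mention.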
Which issue does this PR close?
Closes #.
Rationale for this change
The behavior of casting timestamps to/from timezones is quite subtle, and I spent quite some time testing it out in the context of apache/datafusion#10602
Thus I thought it would be good to document this behavior in the arrow crate itself, so I don't have to do that next time and (hopefully) others can benefit from it as well.
What changes are included in this PR?
Document, with examples, what happens when one casts timestamp with timezone to/from timestamp without timezone
Are there any user-facing changes?
Documentation. No changes to code