feat: add UDF to_local_time() #11347

Merged
merged 21 commits into from
Jul 11, 2024

@appletreeisyellow appletreeisyellow commented Jul 9, 2024

Which issue does this PR close?

Help with #10602
Closes #11358

Rationale for this change

This PR adds a ScalarUDF function to_local_time():

  • this function converts a timezone-aware timestamp to local time (with no offset or timezone information). In other words, it strips the timezone from the timestamp while keeping the displayed value the same. See the examples below
  • accepts exactly one argument, of type Timestamp(..., *)
  • returns a value of type Timestamp(..., None)
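
A minimal sketch of the arithmetic behind the bullets above, not the code in this PR. It assumes the timestamp is stored as nanoseconds since the Unix epoch in UTC and that the UTC offset for that instant is already known; `to_local_nanos` and `offset_seconds` are hypothetical names, and looking the offset up from a named timezone (which varies with daylight saving time) is left out.

```rust
/// Hypothetical helper: shift a UTC timestamp (nanoseconds since the Unix
/// epoch) by a known UTC offset, so that the stored value reads as the local
/// wall-clock time once the timezone is dropped.
/// Returns None on arithmetic overflow instead of panicking.
fn to_local_nanos(utc_nanos: i64, offset_seconds: i32) -> Option<i64> {
    let offset_nanos = i64::from(offset_seconds).checked_mul(1_000_000_000)?;
    utc_nanos.checked_add(offset_nanos)
}
```

For instance, 2024-04-01T00:00:20+02:00 is the instant 2024-03-31T22:00:20Z (1711922420 seconds after the epoch). Calling `to_local_nanos(1_711_922_420_000_000_000, 7200)` yields the nanosecond value for 2024-04-01T00:00:20, which is the display value with the offset removed.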

Example

This is how to use it in datafusion-cli:

> select to_local_time('2024-04-01T00:00:20Z'::timestamp AT TIME ZONE 'Europe/Brussels');
+---------------------------------------------+
| to_local_time(Utf8("2024-04-01T00:00:20Z")) |
+---------------------------------------------+
| 2024-04-01T00:00:20                         |
+---------------------------------------------+
1 row(s) fetched.
Elapsed 0.010 seconds.

> select to_local_time('2024-04-01T00:00:20'::timestamp AT TIME ZONE 'Europe/Brussels');
+--------------------------------------------+
| to_local_time(Utf8("2024-04-01T00:00:20")) |
+--------------------------------------------+
| 2024-04-01T00:00:20                        |
+--------------------------------------------+
1 row(s) fetched.
Elapsed 0.008 seconds.

> select
  time,
  arrow_typeof(time) as type,
  to_local_time(time) as to_local_time,
  arrow_typeof(to_local_time(time)) as to_local_time_type
from (
  select '2024-04-01T00:00:20Z'::timestamp AT TIME ZONE 'Europe/Brussels' as time
);
+---------------------------+------------------------------------------------+---------------------+-----------------------------+
| time                      | type                                           | to_local_time       | to_local_time_type          |
+---------------------------+------------------------------------------------+---------------------+-----------------------------+
| 2024-04-01T00:00:20+02:00 | Timestamp(Nanosecond, Some("Europe/Brussels")) | 2024-04-01T00:00:20 | Timestamp(Nanosecond, None) |
+---------------------------+------------------------------------------------+---------------------+-----------------------------+
1 row(s) fetched.
Elapsed 0.017 seconds.

Example of using to_local_time() in date_bin()

Combining to_local_time() with date_bin() looks like this:

> select date_bin(interval '1 day', to_local_time('2024-04-01T00:00:20Z'::timestamp AT TIME ZONE 'Europe/Brussels'));
+----------------------------------------------------------------------------------------------------+
| date_bin(IntervalMonthDayNano("18446744073709551616"),to_local_time(Utf8("2024-04-01T00:00:20Z"))) |
+----------------------------------------------------------------------------------------------------+
| 2024-04-01T00:00:00                                                                                |
+----------------------------------------------------------------------------------------------------+


> select date_bin(interval '1 day', to_local_time('2024-04-01T00:00:20Z'::timestamp AT TIME ZONE 'Europe/Brussels')) AT TIME ZONE 'Europe/Brussels';
+----------------------------------------------------------------------------------------------------+
| date_bin(IntervalMonthDayNano("18446744073709551616"),to_local_time(Utf8("2024-04-01T00:00:20Z"))) |
+----------------------------------------------------------------------------------------------------+
| 2024-04-01T00:00:00+02:00                                                                          |
+----------------------------------------------------------------------------------------------------+
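
The two queries above bin in local time and then re-attach the timezone. A rough sketch of that idea using plain integer arithmetic follows; it assumes second-precision timestamps, 1-day bins aligned to the Unix epoch, and a known fixed offset. `bin_day` and `bin_day_local` are hypothetical names, not DataFusion functions.

```rust
const DAY: i64 = 86_400; // seconds per day

/// Floor a timestamp (in seconds) to the start of its 1-day bin, with bins
/// aligned to the Unix epoch. rem_euclid keeps pre-epoch values correct.
fn bin_day(ts_seconds: i64) -> i64 {
    ts_seconds - ts_seconds.rem_euclid(DAY)
}

/// Bin on the local wall-clock value, then map the bin start back to a UTC
/// instant by subtracting the same (assumed known) offset.
fn bin_day_local(utc_seconds: i64, offset_seconds: i64) -> i64 {
    bin_day(utc_seconds + offset_seconds) - offset_seconds
}
```

With the instant 2024-03-31T22:00:20Z and an offset of +02:00, `bin_day` returns 2024-03-31T00:00:00Z, while `bin_day_local` returns 2024-03-31T22:00:00Z, which is midnight on 2024-04-01 in local time. Subtracting the same offset is a simplification: near a daylight saving transition the offset at the bin start can differ from the offset at the input instant.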
More examples of applying to_local_time() to array values
  1. Write sample data
create or replace table t AS
VALUES
  ('2024-01-01T00:00:01Z'),
  ('2024-02-01T00:00:01Z'),
  ('2024-03-01T00:00:01Z'),
  ('2024-04-01T00:00:01Z'),
  ('2024-05-01T00:00:01Z'),
  ('2024-06-01T00:00:01Z'),
  ('2024-07-01T00:00:01Z'),
  ('2024-08-01T00:00:01Z'),
  ('2024-09-01T00:00:01Z'),
  ('2024-10-01T00:00:01Z'),
  ('2024-11-01T00:00:01Z'),
  ('2024-12-01T00:00:01Z')
;

create or replace view t_utc as
select column1::timestamp AT TIME ZONE 'UTC' as "column1"
from t;

create or replace view t_timezone 
as 
select column1::timestamp AT TIME ZONE 'Europe/Brussels' as "column1" 
from t;
  2. See what the tables look like
> select column1, arrow_typeof(column1) from t;
+----------------------+-------------------------+
| column1              | arrow_typeof(t.column1) |
+----------------------+-------------------------+
| 2024-01-01T00:00:01Z | Utf8                    |
| 2024-02-01T00:00:01Z | Utf8                    |
| 2024-03-01T00:00:01Z | Utf8                    |
| 2024-04-01T00:00:01Z | Utf8                    |
| 2024-05-01T00:00:01Z | Utf8                    |
| 2024-06-01T00:00:01Z | Utf8                    |
| 2024-07-01T00:00:01Z | Utf8                    |
| 2024-08-01T00:00:01Z | Utf8                    |
| 2024-09-01T00:00:01Z | Utf8                    |
| 2024-10-01T00:00:01Z | Utf8                    |
| 2024-11-01T00:00:01Z | Utf8                    |
| 2024-12-01T00:00:01Z | Utf8                    |
+----------------------+-------------------------+
12 row(s) fetched.
Elapsed 0.009 seconds.

> select column1, arrow_typeof(column1) from t_utc;
+----------------------+------------------------------------+
| column1              | arrow_typeof(t_utc.column1)        |
+----------------------+------------------------------------+
| 2024-01-01T00:00:01Z | Timestamp(Nanosecond, Some("UTC")) |
| 2024-02-01T00:00:01Z | Timestamp(Nanosecond, Some("UTC")) |
| 2024-03-01T00:00:01Z | Timestamp(Nanosecond, Some("UTC")) |
| 2024-04-01T00:00:01Z | Timestamp(Nanosecond, Some("UTC")) |
| 2024-05-01T00:00:01Z | Timestamp(Nanosecond, Some("UTC")) |
| 2024-06-01T00:00:01Z | Timestamp(Nanosecond, Some("UTC")) |
| 2024-07-01T00:00:01Z | Timestamp(Nanosecond, Some("UTC")) |
| 2024-08-01T00:00:01Z | Timestamp(Nanosecond, Some("UTC")) |
| 2024-09-01T00:00:01Z | Timestamp(Nanosecond, Some("UTC")) |
| 2024-10-01T00:00:01Z | Timestamp(Nanosecond, Some("UTC")) |
| 2024-11-01T00:00:01Z | Timestamp(Nanosecond, Some("UTC")) |
| 2024-12-01T00:00:01Z | Timestamp(Nanosecond, Some("UTC")) |
+----------------------+------------------------------------+
12 row(s) fetched.
Elapsed 0.011 seconds.

> select column1, arrow_typeof(column1) from t_timezone;
+---------------------------+------------------------------------------------+
| column1                   | arrow_typeof(t_timezone.column1)               |
+---------------------------+------------------------------------------------+
| 2024-01-01T00:00:01+01:00 | Timestamp(Nanosecond, Some("Europe/Brussels")) |
| 2024-02-01T00:00:01+01:00 | Timestamp(Nanosecond, Some("Europe/Brussels")) |
| 2024-03-01T00:00:01+01:00 | Timestamp(Nanosecond, Some("Europe/Brussels")) |
| 2024-04-01T00:00:01+02:00 | Timestamp(Nanosecond, Some("Europe/Brussels")) |
| 2024-05-01T00:00:01+02:00 | Timestamp(Nanosecond, Some("Europe/Brussels")) |
| 2024-06-01T00:00:01+02:00 | Timestamp(Nanosecond, Some("Europe/Brussels")) |
| 2024-07-01T00:00:01+02:00 | Timestamp(Nanosecond, Some("Europe/Brussels")) |
| 2024-08-01T00:00:01+02:00 | Timestamp(Nanosecond, Some("Europe/Brussels")) |
| 2024-09-01T00:00:01+02:00 | Timestamp(Nanosecond, Some("Europe/Brussels")) |
| 2024-10-01T00:00:01+02:00 | Timestamp(Nanosecond, Some("Europe/Brussels")) |
| 2024-11-01T00:00:01+01:00 | Timestamp(Nanosecond, Some("Europe/Brussels")) |
| 2024-12-01T00:00:01+01:00 | Timestamp(Nanosecond, Some("Europe/Brussels")) |
+---------------------------+------------------------------------------------+
12 row(s) fetched.
Elapsed 0.012 seconds.
  3. Query using to_local_time()
> select column1, to_local_time(column1), arrow_typeof(to_local_time(column1)) from t_utc;
+----------------------+------------------------------+--------------------------------------------+
| column1              | to_local_time(t_utc.column1) | arrow_typeof(to_local_time(t_utc.column1)) |
+----------------------+------------------------------+--------------------------------------------+
| 2024-01-01T00:00:01Z | 2024-01-01T00:00:01          | Timestamp(Nanosecond, None)                |
| 2024-02-01T00:00:01Z | 2024-02-01T00:00:01          | Timestamp(Nanosecond, None)                |
| 2024-03-01T00:00:01Z | 2024-03-01T00:00:01          | Timestamp(Nanosecond, None)                |
| 2024-04-01T00:00:01Z | 2024-04-01T00:00:01          | Timestamp(Nanosecond, None)                |
| 2024-05-01T00:00:01Z | 2024-05-01T00:00:01          | Timestamp(Nanosecond, None)                |
| 2024-06-01T00:00:01Z | 2024-06-01T00:00:01          | Timestamp(Nanosecond, None)                |
| 2024-07-01T00:00:01Z | 2024-07-01T00:00:01          | Timestamp(Nanosecond, None)                |
| 2024-08-01T00:00:01Z | 2024-08-01T00:00:01          | Timestamp(Nanosecond, None)                |
| 2024-09-01T00:00:01Z | 2024-09-01T00:00:01          | Timestamp(Nanosecond, None)                |
| 2024-10-01T00:00:01Z | 2024-10-01T00:00:01          | Timestamp(Nanosecond, None)                |
| 2024-11-01T00:00:01Z | 2024-11-01T00:00:01          | Timestamp(Nanosecond, None)                |
| 2024-12-01T00:00:01Z | 2024-12-01T00:00:01          | Timestamp(Nanosecond, None)                |
+----------------------+------------------------------+--------------------------------------------+
12 row(s) fetched.
Elapsed 0.015 seconds.

> select column1, to_local_time(column1), arrow_typeof(to_local_time(column1)) from t_timezone;
+---------------------------+-----------------------------------+-------------------------------------------------+
| column1                   | to_local_time(t_timezone.column1) | arrow_typeof(to_local_time(t_timezone.column1)) |
+---------------------------+-----------------------------------+-------------------------------------------------+
| 2024-01-01T00:00:01+01:00 | 2024-01-01T00:00:01               | Timestamp(Nanosecond, None)                     |
| 2024-02-01T00:00:01+01:00 | 2024-02-01T00:00:01               | Timestamp(Nanosecond, None)                     |
| 2024-03-01T00:00:01+01:00 | 2024-03-01T00:00:01               | Timestamp(Nanosecond, None)                     |
| 2024-04-01T00:00:01+02:00 | 2024-04-01T00:00:01               | Timestamp(Nanosecond, None)                     |
| 2024-05-01T00:00:01+02:00 | 2024-05-01T00:00:01               | Timestamp(Nanosecond, None)                     |
| 2024-06-01T00:00:01+02:00 | 2024-06-01T00:00:01               | Timestamp(Nanosecond, None)                     |
| 2024-07-01T00:00:01+02:00 | 2024-07-01T00:00:01               | Timestamp(Nanosecond, None)                     |
| 2024-08-01T00:00:01+02:00 | 2024-08-01T00:00:01               | Timestamp(Nanosecond, None)                     |
| 2024-09-01T00:00:01+02:00 | 2024-09-01T00:00:01               | Timestamp(Nanosecond, None)                     |
| 2024-10-01T00:00:01+02:00 | 2024-10-01T00:00:01               | Timestamp(Nanosecond, None)                     |
| 2024-11-01T00:00:01+01:00 | 2024-11-01T00:00:01               | Timestamp(Nanosecond, None)                     |
| 2024-12-01T00:00:01+01:00 | 2024-12-01T00:00:01               | Timestamp(Nanosecond, None)                     |
+---------------------------+-----------------------------------+-------------------------------------------------+
12 row(s) fetched.
Elapsed 0.016 seconds.
  4. Combine with date_bin()
> select date_bin(interval '1 day', to_local_time(column1)) AT TIME ZONE 'Europe/Brussels' as date_bin from t_utc;
+---------------------------+
| date_bin                  |
+---------------------------+
| 2024-01-01T00:00:00+01:00 |
| 2024-02-01T00:00:00+01:00 |
| 2024-03-01T00:00:00+01:00 |
| 2024-04-01T00:00:00+02:00 |
| 2024-05-01T00:00:00+02:00 |
| 2024-06-01T00:00:00+02:00 |
| 2024-07-01T00:00:00+02:00 |
| 2024-08-01T00:00:00+02:00 |
| 2024-09-01T00:00:00+02:00 |
| 2024-10-01T00:00:00+02:00 |
| 2024-11-01T00:00:00+01:00 |
| 2024-12-01T00:00:00+01:00 |
+---------------------------+
12 row(s) fetched.
Elapsed 0.023 seconds.

> select date_bin(interval '1 day', to_local_time(column1)) AT TIME ZONE 'Europe/Brussels' as date_bin from t_timezone;
+---------------------------+
| date_bin                  |
+---------------------------+
| 2024-01-01T00:00:00+01:00 |
| 2024-02-01T00:00:00+01:00 |
| 2024-03-01T00:00:00+01:00 |
| 2024-04-01T00:00:00+02:00 |
| 2024-05-01T00:00:00+02:00 |
| 2024-06-01T00:00:00+02:00 |
| 2024-07-01T00:00:00+02:00 |
| 2024-08-01T00:00:00+02:00 |
| 2024-09-01T00:00:00+02:00 |
| 2024-10-01T00:00:00+02:00 |
| 2024-11-01T00:00:00+01:00 |
| 2024-12-01T00:00:00+01:00 |
+---------------------------+
12 row(s) fetched.
Elapsed 0.011 seconds.

What changes are included in this PR?

New ScalarUDF function to_local_time() with tests

Are these changes tested?

Yes

Are there any user-facing changes?

No API changes.

Contributor

@alamb alamb left a comment

Thank you @appletreeisyellow -- I found this PR well tested and very well documented 🏆 . Really nice

I also filed #11358 to track this particular feature so it wasn't quite as entangled in various proposals

I think the PR needs a few things before it could be merged:

  1. slt (end to end) tests, as suggested by @jayzhan211
  2. Better error handling (don't panic if some part of the conversion doesn't succeed)

While not strictly required, I think it would also be good to avoid parsing the timezone on each row.

Also, finishing up the TODOs is probably good too
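
The two review points above (no panics, and no timezone parsing per row) can be illustrated with a std-only sketch. This is not the PR's code: it assumes fixed offsets such as "+02:00" rather than named zones, uses `String` as the error type, and `parse_fixed_offset` and `to_local_all` are hypothetical names.

```rust
/// Hypothetical parser for fixed offsets such as "+02:00" or "-05:30".
/// Named zones like "Europe/Brussels" need a timezone database and are
/// rejected by this sketch.
fn parse_fixed_offset(tz: &str) -> Result<i64, String> {
    let err = || format!("invalid fixed offset: {:?}", tz);
    let (sign, rest) = if let Some(rest) = tz.strip_prefix('+') {
        (1, rest)
    } else if let Some(rest) = tz.strip_prefix('-') {
        (-1, rest)
    } else {
        return Err(err());
    };
    let (hh, mm) = rest.split_once(':').ok_or_else(|| err())?;
    let two_digits = |s: &str| s.len() == 2 && s.bytes().all(|b| b.is_ascii_digit());
    if !two_digits(hh) || !two_digits(mm) {
        return Err(err());
    }
    let hours: i64 = hh.parse().map_err(|_| err())?;
    let minutes: i64 = mm.parse().map_err(|_| err())?;
    if hours > 23 || minutes > 59 {
        return Err(err());
    }
    Ok(sign * (hours * 3600 + minutes * 60))
}

/// Convert every row (seconds since the epoch). The offset is parsed once up
/// front, errors are propagated with `?` instead of unwrapping, and nulls
/// stay null.
fn to_local_all(utc_seconds: &[Option<i64>], tz: &str) -> Result<Vec<Option<i64>>, String> {
    let offset = parse_fixed_offset(tz)?; // parsed once, not per row
    utc_seconds
        .iter()
        .map(|ts| match ts {
            None => Ok(None),
            Some(ts) => ts
                .checked_add(offset)
                .map(Some)
                .ok_or_else(|| format!("timestamp {} overflows", ts)),
        })
        .collect()
}
```

Collecting into `Result<Vec<_>, _>` stops at the first failing row, so a bad value surfaces as an error to the caller rather than a panic.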

}
}

/// This function converts a timestamp with a timezone to a timestamp without a timezone.
Contributor

This is one of the clearest explanations of what a function does that I have read in a long time 💯 Nice work @appletreeisyellow

Contributor

@Abdullahsab3 Abdullahsab3 left a comment

Thank you for this @appletreeisyellow 🙏 Impressive work. Can't wait to try it out once it's in influxDB.

I left a couple of very minor remarks

@appletreeisyellow appletreeisyellow force-pushed the chunchun/udf-to-loal-time-origin branch from f0a1cf4 to db5b73e Compare July 9, 2024 20:12
@alamb
Contributor

alamb commented Jul 9, 2024

The clippy error was fixed in #11368

@alamb
Contributor

alamb commented Jul 9, 2024

I took the liberty of merging up from main to get the CI error fix

@github-actions github-actions bot added the sqllogictest SQL Logic Tests (.slt) label Jul 9, 2024
@appletreeisyellow
Contributor Author

@alamb @jayzhan211 @Abdullahsab3 -- Thank you for all your reviews and the many helpful suggestions ♥️ I have addressed all of them. I'm happy to make further updates if there are any additional suggestions.

@alamb -- Thank you for merging the clippy error fix!

Contributor

@alamb alamb left a comment

Thank you @appletreeisyellow and @Abdullahsab3. I think this PR is looking very nice now thanks to all your work and review

The other thing we should do is document this function in the function reference: https://datafusion.apache.org/user-guide/sql/scalar_functions.html#time-and-date-functions

However, we can do that as a follow on PR as this one is already quite large

I went through this PR carefully and I think it looks really nice and could be merged. I plan to leave it open for a few more hours to allow time for any additional comments that people might have

I also tested that I could use this function to get the desired answer from #10602

Input

DataFusion CLI v40.0.0
> create table t AS
VALUES
  ('2024-01-01T00:00:01Z'),
  ('2024-02-01T00:00:01Z'),
  ('2024-03-01T00:00:01Z'),
  ('2024-04-01T00:00:01Z'),
  ('2024-05-01T00:00:01Z'),
  ('2024-06-01T00:00:01Z'),
  ('2024-07-01T00:00:01Z'),
  ('2024-08-01T00:00:01Z'),
  ('2024-09-01T00:00:01Z'),
  ('2024-10-01T00:00:01Z'),
  ('2024-11-01T00:00:01Z'),
  ('2024-12-01T00:00:01Z')
;
> create view t_timezone as
select column1::timestamp AT TIME ZONE 'Europe/Brussels' as "column1"
from t;
0 row(s) fetched.
Elapsed 0.005 seconds.

> select column1 from t_timezone;
+---------------------------+
| column1                   |
+---------------------------+
| 2024-01-01T00:00:01+01:00 |
| 2024-02-01T00:00:01+01:00 |
| 2024-03-01T00:00:01+01:00 |
| 2024-04-01T00:00:01+02:00 |
| 2024-05-01T00:00:01+02:00 |
| 2024-06-01T00:00:01+02:00 |
| 2024-07-01T00:00:01+02:00 |
| 2024-08-01T00:00:01+02:00 |
| 2024-09-01T00:00:01+02:00 |
| 2024-10-01T00:00:01+02:00 |
| 2024-11-01T00:00:01+01:00 |
| 2024-12-01T00:00:01+01:00 |
+---------------------------+
12 row(s) fetched.
Elapsed 0.014 seconds.

(Bad) date_bin with timezone'd timestamps:

> select column1, date_bin(interval '1 month', column1) as month from t_timezone;
+---------------------------+---------------------------+
| column1                   | month                     |
+---------------------------+---------------------------+
| 2024-01-01T00:00:01+01:00 | 2023-12-01T01:00:00+01:00 | <-- in the wrong month!
| 2024-02-01T00:00:01+01:00 | 2024-01-01T01:00:00+01:00 |
| 2024-03-01T00:00:01+01:00 | 2024-02-01T01:00:00+01:00 |
| 2024-04-01T00:00:01+02:00 | 2024-03-01T01:00:00+01:00 |
| 2024-05-01T00:00:01+02:00 | 2024-04-01T02:00:00+02:00 |
| 2024-06-01T00:00:01+02:00 | 2024-05-01T02:00:00+02:00 |
| 2024-07-01T00:00:01+02:00 | 2024-06-01T02:00:00+02:00 |
| 2024-08-01T00:00:01+02:00 | 2024-07-01T02:00:00+02:00 |
| 2024-09-01T00:00:01+02:00 | 2024-08-01T02:00:00+02:00 |
| 2024-10-01T00:00:01+02:00 | 2024-09-01T02:00:00+02:00 |
| 2024-11-01T00:00:01+01:00 | 2024-10-01T02:00:00+02:00 |
| 2024-12-01T00:00:01+01:00 | 2024-11-01T01:00:00+01:00 |
+---------------------------+---------------------------+
12 row(s) fetched.
Elapsed 0.011 seconds.

(good) date_bin after using to_local_time:

> select column1, date_bin(interval '1 month', to_local_time(column1)) as month from t_timezone;
+---------------------------+---------------------+
| column1                   | month               |
+---------------------------+---------------------+
| 2024-01-01T00:00:01+01:00 | 2024-01-01T00:00:00 | <-- right month
| 2024-02-01T00:00:01+01:00 | 2024-02-01T00:00:00 |
| 2024-03-01T00:00:01+01:00 | 2024-03-01T00:00:00 |
| 2024-04-01T00:00:01+02:00 | 2024-04-01T00:00:00 |
| 2024-05-01T00:00:01+02:00 | 2024-05-01T00:00:00 |
| 2024-06-01T00:00:01+02:00 | 2024-06-01T00:00:00 |
| 2024-07-01T00:00:01+02:00 | 2024-07-01T00:00:00 |
| 2024-08-01T00:00:01+02:00 | 2024-08-01T00:00:00 |
| 2024-09-01T00:00:01+02:00 | 2024-09-01T00:00:00 |
| 2024-10-01T00:00:01+02:00 | 2024-10-01T00:00:00 |
| 2024-11-01T00:00:01+01:00 | 2024-11-01T00:00:00 |
| 2024-12-01T00:00:01+01:00 | 2024-12-01T00:00:00 |
+---------------------------+---------------------+
12 row(s) fetched.
Elapsed 0.008 seconds.
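
The "wrong month" row above is consistent with the bin being computed on the underlying UTC instant: 2024-01-01T00:00:01+01:00 is 2023-12-31T23:00:01Z, which falls in December. A small std-only sketch of that observation follows. `civil_from_days` is the widely used days-to-civil-date conversion, `year_month` is a hypothetical helper, and neither reflects how date_bin is implemented internally.

```rust
/// Convert days since 1970-01-01 to (year, month, day) in the proleptic
/// Gregorian calendar.
fn civil_from_days(days: i64) -> (i64, i64, i64) {
    let z = days + 719_468;
    let era = z.div_euclid(146_097);
    let doe = z.rem_euclid(146_097); // day of era, [0, 146096]
    let yoe = (doe - doe / 1_460 + doe / 36_524 - doe / 146_096) / 365; // [0, 399]
    let doy = doe - (365 * yoe + yoe / 4 - yoe / 100); // [0, 365]
    let mp = (5 * doy + 2) / 153; // month index starting at March, [0, 11]
    let day = doy - (153 * mp + 2) / 5 + 1; // [1, 31]
    let month = if mp < 10 { mp + 3 } else { mp - 9 }; // [1, 12]
    let year = yoe + era * 400 + if month <= 2 { 1 } else { 0 };
    (year, month, day)
}

/// Calendar (year, month) that a UTC instant (seconds since the epoch) falls
/// in once shifted by `offset_seconds`; pass 0 to read the month in UTC.
fn year_month(utc_seconds: i64, offset_seconds: i64) -> (i64, i64) {
    let (y, m, _) = civil_from_days((utc_seconds + offset_seconds).div_euclid(86_400));
    (y, m)
}
```

For the first row (1704063601 seconds after the epoch), `year_month(1_704_063_601, 0)` gives December 2023, while `year_month(1_704_063_601, 3_600)` gives January 2024, matching the bad and good outputs respectively.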

Comment on lines +149 to +163
let mut builder = PrimitiveBuilder::<T>::new();

let primitive_array = as_primitive_array::<T>(array)?;
for ts_opt in primitive_array.iter() {
    match ts_opt {
        None => builder.append_null(),
        Some(ts) => {
            let adjusted_ts: i64 =
                adjust_to_local_time::<T>(ts, tz)?;
            builder.append_value(adjusted_ts)
        }
    }
}

Ok(ColumnarValue::Array(Arc::new(builder.finish())))
Contributor

You could also use try_unary here (that basically does the same thing as what you have here)

                            let primitive_array = as_primitive_array::<T>(array)?;
                            let ts_array = try_unary(primitive_array, |ts| {
                                adjust_to_local_time::<T>(ts, tz)
                            })?;
                            Ok(ColumnarValue::Array(Arc::new(ts_array)))

I tried it locally, and using try_unary does require that adjust_to_local_time is changed to return ArrowError rather than DataFusion error

Contributor Author

Yeah, it would be neat to use try_unary. I got the same compiling error when I used try_unary, so I rewrote the code in a for loop with PrimitiveBuilder
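
To make the error-type mismatch in this thread concrete, here is a std-only sketch. `KernelError` and `EngineError` are hypothetical stand-ins for ArrowError and the DataFusion error, and `try_unary_sketch` is a simplified analogue over `Vec<Option<i64>>`, not the arrow kernel itself. The point is only that the per-row closure has to return the kernel's error type, and the caller converts afterwards.

```rust
/// Stand-in for the kernel-level error type (hypothetical).
#[derive(Debug, PartialEq)]
enum KernelError {
    Compute(String),
}

/// Stand-in for the engine-level error type (hypothetical).
#[derive(Debug, PartialEq)]
enum EngineError {
    Kernel(KernelError),
}

impl From<KernelError> for EngineError {
    fn from(e: KernelError) -> Self {
        EngineError::Kernel(e)
    }
}

/// Simplified analogue of a fallible unary kernel: apply `op` to each
/// non-null value, keep nulls, and stop at the first error.
fn try_unary_sketch<F>(values: &[Option<i64>], op: F) -> Result<Vec<Option<i64>>, KernelError>
where
    F: Fn(i64) -> Result<i64, KernelError>,
{
    values.iter().map(|v| v.map(&op).transpose()).collect()
}

/// Row-level conversion returning the kernel's error type, so it can be
/// passed to `try_unary_sketch` directly.
fn adjust(ts: i64, offset: i64) -> Result<i64, KernelError> {
    ts.checked_add(offset)
        .ok_or_else(|| KernelError::Compute(format!("timestamp {} overflows", ts)))
}

/// Engine-level caller: `?` converts KernelError into EngineError via `From`.
fn to_local(values: &[Option<i64>], offset: i64) -> Result<Vec<Option<i64>>, EngineError> {
    Ok(try_unary_sketch(values, |ts| adjust(ts, offset))?)
}
```

If `adjust` returned `EngineError` instead, the closure would no longer satisfy the bound on `try_unary_sketch`, which mirrors the compile error described above. The explicit loop with a builder avoids that constraint at the cost of a few more lines.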

{
match converter(ts) {
MappedLocalTime::Ambiguous(earliest, latest) => exec_err!(
"Ambiguous timestamp in microseconds. Do you mean {:?} or {:?}",
Contributor

❤️

Contributor Author

@appletreeisyellow appletreeisyellow Jul 10, 2024

I just realized that this is a general function -- the error message applies to microsecond, millisecond, and second😅 So I removed the phrase in microseconds in the error to avoid confusion.

Updated in 2c35025
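
A sketch of the unit-agnostic error handling discussed in this thread, under stated assumptions: `Mapped` is a hypothetical std-only mirror of a three-way local-time lookup result (one match, two candidates, or none), `resolve` is a hypothetical name, and the wording of the gap message is illustrative. Only the "Ambiguous timestamp" wording comes from the snippet under review.

```rust
/// Hypothetical mirror of a three-way local-time lookup result.
#[derive(Debug, PartialEq)]
enum Mapped<T> {
    /// Exactly one matching time.
    Single(T),
    /// Two candidates, e.g. when clocks are set back.
    Ambiguous(T, T),
    /// No match, e.g. when clocks skip forward.
    None,
}

/// Turn the lookup result into a Result. The message does not mention a time
/// unit, so the same function serves seconds through nanoseconds.
fn resolve<T: std::fmt::Debug>(m: Mapped<T>) -> Result<T, String> {
    match m {
        Mapped::Single(t) => Ok(t),
        Mapped::Ambiguous(earliest, latest) => Err(format!(
            "Ambiguous timestamp. Do you mean {:?} or {:?}",
            earliest, latest
        )),
        Mapped::None => Err("The local time does not exist (illustrative message)".to_string()),
    }
}
```

Returning `Err` for both the ambiguous and the missing case keeps the conversion free of panics and leaves the decision to the caller.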

@appletreeisyellow
Contributor Author

@alamb Thank you for the careful review!

The other thing we should do is document this function in the function reference: datafusion.apache.org/user-guide/sql/scalar_functions.html

However, we can do that as a follow on PR as this one is already quite large

I'll have a follow-up PR to document this new function

@alamb
Contributor

alamb commented Jul 10, 2024

I plan to merge this PR tomorrow morning Eastern time unless there are any other comments or anyone would like additional time to review

@appletreeisyellow
Contributor Author

I'll have a follow-up PR to document this new function

Here is the follow-up PR:

Contributor

@jayzhan211 jayzhan211 left a comment

🚀

@alamb alamb merged commit f284e3b into apache:main Jul 11, 2024
23 checks passed
@alamb
Contributor

alamb commented Jul 11, 2024

Thank you everyone! This is a great addition, I think

@appletreeisyellow appletreeisyellow deleted the chunchun/udf-to-loal-time-origin branch July 11, 2024 17:59
appletreeisyellow added a commit to influxdata/arrow-datafusion that referenced this pull request Jul 11, 2024
* feat: add UDF `to_local_time()`

* chore: support column value in array

* chore: lint

* chore: fix conversion for us, ms, and s

* chore: add more tests for daylight savings time

* chore: add function description

* refactor: update tests and add examples in description

* chore: add description and example

* chore: doc

chore: doc

chore: doc

chore: doc

chore: doc

* chore: stop copying

* chore: fix typo

* chore: mention that the offset varies based on daylight savings time

* refactor: parse timezone once and update examples in description

* refactor: replace map..concat with flat_map

* chore: add hard code timestamp value in test

chore: doc

chore: doc

* chore: handle errors and remove panics

* chore: move some test to slt

* chore: clone time_value

* chore: typo

---------

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
Lordworms pushed a commit to Lordworms/arrow-datafusion that referenced this pull request Jul 12, 2024
appletreeisyellow added a commit to influxdata/arrow-datafusion that referenced this pull request Jul 12, 2024
findepi pushed a commit to findepi/datafusion that referenced this pull request Jul 16, 2024
xinlifoobar pushed a commit to xinlifoobar/datafusion that referenced this pull request Jul 17, 2024
xinlifoobar pushed a commit to xinlifoobar/datafusion that referenced this pull request Jul 18, 2024
appletreeisyellow added a commit to influxdata/arrow-datafusion that referenced this pull request Jul 22, 2024
Labels
sqllogictest SQL Logic Tests (.slt)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add to_local_time function for converting timestamps with timezones to timestmaps without timezones
4 participants