docs: add Polars example to readme #68

Merged 2 commits on Jun 25, 2024
51 changes: 41 additions & 10 deletions README.md
@@ -2,7 +2,7 @@

This is an API client to get weather data from the [Open-Meteo Weather API](https://open-meteo.com) based on the Python library `requests`.

Instead of using JSON, the API client uses FlatBuffers to transfer data. Encoding data in FlatBuffers is more efficient for long time-series data. Data can be transferred to `numpy` or `pandas` using [Zero-Copy](https://en.wikipedia.org/wiki/Zero-copy) to analyze large amounts of data quickly. The schema definition files can be found on [GitHub open-meteo/sdk](https://github.com/open-meteo/sdk).
Instead of using JSON, the API client uses FlatBuffers to transfer data. Encoding data in FlatBuffers is more efficient for long time-series data. Data can be transferred to `numpy`, `pandas`, or `polars` using [Zero-Copy](https://en.wikipedia.org/wiki/Zero-copy) to analyze large amounts of data quickly. The schema definition files can be found on [GitHub open-meteo/sdk](https://github.com/open-meteo/sdk).
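
To make the zero-copy idea concrete, here is a minimal sketch using a plain byte buffer as a stand-in for a received binary payload. The buffer and its values are made up for illustration, and this is not the client's internal code; it only shows that `np.frombuffer` creates an array view over existing bytes instead of copying them.

```python
import numpy as np

# Hypothetical stand-in for a received payload: four float32 values as raw bytes.
payload = bytearray(np.array([17.4, 17.1, 16.8, 16.3], dtype=np.float32).tobytes())

# np.frombuffer builds an array that views the existing bytes; nothing is copied.
values = np.frombuffer(payload, dtype=np.float32)

# Because the memory is shared, overwriting the first four bytes of the buffer
# changes the first element of the array as well.
payload[0:4] = np.float32(20.0).tobytes()
print(values[0])
```

Since no per-element decoding takes place, the cost of creating the array does not grow with the length of the time series.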

This library is primarily designed for data scientists to process weather data. In combination with the [Open-Meteo Historical Weather API](https://open-meteo.com/en/docs/historical-weather-api), data from 1940 onwards can be analyzed quickly.

@@ -64,8 +64,7 @@ hourly_wind_speed_10m = next(filter(lambda x: x.Variable() == Variable.wind_spee

### Pandas

For `Pandas` you can prepare a data-frame from hourly data as follows:

After using `NumPy` to create arrays for hourly data, you can use `Pandas` to create a DataFrame from the hourly data as follows:

```python
import pandas as pd
@@ -80,13 +79,43 @@ hourly_data["temperature_2m"] = hourly_temperature_2m
hourly_data["precipitation"] = hourly_precipitation
hourly_data["wind_speed_10m"] = hourly_wind_speed_10m

hourly_dataframe = pd.DataFrame(data = hourly_data)
print(hourly_dataframe)
#date temperature_2m precipitation
#0 2023-08-01 00:00:00 16.945999 1.7
#1 2023-08-01 01:00:00 16.996000 2.1
#2 2023-08-01 02:00:00 16.996000 1.0
#3 2023-08-01 03:00:00 16.846001 0.2
hourly_dataframe_pd = pd.DataFrame(data = hourly_data)
print(hourly_dataframe_pd)
# date temperature_2m precipitation wind_speed_10m
# 0 2024-06-21 00:00:00 17.437000 0.0 6.569383
# 1 2024-06-21 01:00:00 17.087000 0.0 6.151683
# 2 2024-06-21 02:00:00 16.786999 0.0 7.421590
# 3 2024-06-21 03:00:00 16.337000 0.0 5.154416
```
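
The diff hides the lines that build the `date` column, so the following is a self-contained sketch of the same pattern that runs without an API call. The timestamps and temperature values are made up; `time_start`, `time_end` and `interval` are hypothetical stand-ins for `hourly.Time()`, `hourly.TimeEnd()` and `hourly.Interval()`.

```python
from datetime import datetime, timezone

import numpy as np
import pandas as pd

# Hypothetical stand-ins for hourly.Time(), hourly.Interval() and hourly.TimeEnd()
time_start = int(datetime(2024, 6, 21, tzinfo=timezone.utc).timestamp())
interval = 3600                        # seconds between samples
time_end = time_start + 4 * interval   # treated as exclusive below

# Made-up values standing in for hourly_temperature_2m
temperature_2m = np.array([17.437, 17.087, 16.787, 16.337], dtype=np.float32)

# One timestamp per sample; inclusive="left" drops the end timestamp itself.
date = pd.date_range(
    start=pd.to_datetime(time_start, unit="s", utc=True),
    end=pd.to_datetime(time_end, unit="s", utc=True),
    freq=pd.Timedelta(seconds=interval),
    inclusive="left",
)

sample_df = pd.DataFrame({"date": date, "temperature_2m": temperature_2m})
print(sample_df)
```

Excluding the end timestamp keeps the number of dates equal to the number of values, which `pd.DataFrame` requires when building columns of equal length.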

### Polars

`Polars` can also be used to create a DataFrame from hourly data, using the `NumPy` arrays created previously:

```python
import polars as pl
from datetime import datetime, timedelta, timezone

start = datetime.fromtimestamp(hourly.Time(), timezone.utc)
end = datetime.fromtimestamp(hourly.TimeEnd(), timezone.utc)
freq = timedelta(seconds = hourly.Interval())

hourly_dataframe_pl = pl.select(
    date = pl.datetime_range(start, end, freq, closed = "left"),
    temperature_2m = hourly_temperature_2m,
    precipitation = hourly_precipitation,
    wind_speed_10m = hourly_wind_speed_10m
)
print(hourly_dataframe_pl)
# ┌─────────────────────────┬────────────────┬───────────────┬────────────────┐
# │ date ┆ temperature_2m ┆ precipitation ┆ wind_speed_10m │
# │ --- ┆ --- ┆ --- ┆ --- │
# │ datetime[μs, UTC] ┆ f32 ┆ f32 ┆ f32 │
# ╞═════════════════════════╪════════════════╪═══════════════╪════════════════╡
# │ 2024-06-21 00:00:00 UTC ┆ 17.437 ┆ 0.0 ┆ 6.569383 │
# │ 2024-06-21 01:00:00 UTC ┆ 17.087 ┆ 0.0 ┆ 6.151683 │
# │ 2024-06-21 02:00:00 UTC ┆ 16.786999 ┆ 0.0 ┆ 7.42159 │
# │ 2024-06-21 03:00:00 UTC ┆ 16.337 ┆ 0.0 ┆ 5.154416 │
```

### Caching Data
@@ -114,6 +143,7 @@ om = openmeteo_requests.Client(session=retry_session)
```

# TODO

- Document multi-location/time-interval usage
- Document FlatBuffers data structure
- Document time start/end/interval
@@ -123,4 +153,5 @@ om = openmeteo_requests.Client(session=retry_session)
- Consider dedicated pandas library to convert responses quickly

# License

MIT