IanRFerguson/klondike

Klondike

Klondike offers a lightweight API for reading data from and writing data to Google BigQuery using Polars DataFrames.

Installation

Installing Klondike

Install from the command line:

pip install klondike

Installing Rust

Since Polars leverages Rust speedups under the hood, you also need Rust installed in your environment; see the official Rust installation guide.

Usage

In this demo we'll connect to Google BigQuery, read data, transform it, and write it back to the data warehouse.

First, connect to the BigQuery warehouse by passing the relative path to your service account credentials to the BigQueryConnector() object.

from klondike import BigQueryConnector

# Connect Klondike to Google BigQuery
bq = BigQueryConnector(
    app_creds="dev/nba-player-service.json"
)

Next, pass a SQL query to the read_dataframe() method to render a Polars DataFrame:

# Write some valid SQL
sql = """
SELECT
    *
FROM nba_dbt.cln_player__averages
ORDER BY avg_points DESC
"""


# Pull BigQuery data into a Polars DataFrame
nyk = bq.read_dataframe(sql=sql)

Now that your data is pulled into memory, you can clean and transform it using standard Polars functionality; see the Polars docs for more information.

# Perform some transformations
key_metrics = [
    "avg_offensive_rating",
    "avg_defensive_rating",
    "avg_effective_field_goal_percentage"
]

summary_stats = nyk[key_metrics].describe()

Finally, push your transformed data back to the BigQuery warehouse using the write_dataframe() function:

# Write back to BigQuery
bq.write_dataframe(
    df=summary_stats,
    table_name="nba_dbt.summary_statistics",
    if_exists="truncate"
)
