Releases: kaburia/filter-stations

Kieni weather Data and Aggregate Variables

07 Sep 02:43

Kieni Data

"""
Retrieves weather data from the Kieni API endpoint and returns it as a pandas DataFrame after processing.

    Parameters:
    -----------
    - start_date (str, optional): The start date for retrieving weather data in 'YYYY-MM-DD' format. Defaults to None, in which case data is returned from the beginning of the record.
    - end_date (str, optional): The end date for retrieving weather data in 'YYYY-MM-DD' format. Defaults to None, in which case data is returned up to the end of the record.
    - variable (str, optional): The weather variable to retrieve, given as a TAHMO weather shortcode, e.g., 'pr', 'ap', 'rh'.
    - method (str, optional): The aggregation method to apply to the data ('sum', 'mean', 'min', 'max' and custom functions). Defaults to 'sum'.
    - freq (str, optional): The frequency for data aggregation (e.g., '1D' for daily, '1H' for hourly). Defaults to '1D'.

    Returns:
    -----------
    - pandas.DataFrame: DataFrame containing the weather data for the specified parameters, with columns containing NaN values dropped.

    Usage:
    -----------
    To retrieve daily rainfall data from January 1, 2024, to January 31, 2024:
    ```python
    # Instantiate the Kieni class
    api_key, api_secret = '', ''  # Request the API key and secret from DSAIL
    kieni = Kieni(api_key, api_secret)

    kieni_weather_data = kieni.kieni_weather_data(start_date='2024-01-01', end_date='2024-01-31', variable='pr', freq='1D', method='sum')
    ```

    To retrieve hourly temperature data from February 1, 2024, to February 7, 2024:
    ```python
    kieni_weather_data = kieni.kieni_weather_data(start_date='2024-02-01', end_date='2024-02-07', variable='te', method='mean', freq='1H')
    ```
    """
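
The open-ended date defaults can be pictured with plain pandas label slicing. This is only a sketch under assumptions: `slice_by_date` is a hypothetical helper, not part of filter-stations, and the sample readings are made up.

```python
import pandas as pd

def slice_by_date(df, start_date=None, end_date=None):
    """Hypothetical helper: keep rows between two optional dates (inclusive)."""
    # Label slicing on a sorted DatetimeIndex treats None as "no bound".
    return df.loc[start_date:end_date]

# Made-up daily rainfall readings for illustration only.
index = pd.date_range('2024-01-01', periods=5, freq='D')
sample = pd.DataFrame({'pr': [0.0, 1.2, 0.4, 0.0, 3.1]}, index=index)

from_start = slice_by_date(sample, end_date='2024-01-03')  # beginning to 3 Jan
to_end = slice_by_date(sample, start_date='2024-01-04')    # 4 Jan to the end
everything = slice_by_date(sample)                         # no bounds at all
```

Leaving both bounds as None returns the whole table, which matches the documented default of returning data from the beginning to the end of the record.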

Aggregate Variables

"""
Aggregates a pandas DataFrame of weather variables by applying a specified method across a given frequency.

    Parameters:
    -----------
    - dataframe (pandas.DataFrame): DataFrame containing weather variable data.
    - freq (str, optional): Frequency to aggregate the data by. Defaults to '1D'. 
                            Examples include '1H' for hourly, '12H' for every 12 hours, '1D' for daily, '1W' for weekly, '1M' for monthly, etc.
    - method (str or callable, optional): Method to use for aggregation. Defaults to 'sum'.
                                        Acceptable string values are 'sum', 'mean', 'min', 'max'. 
                                        Alternatively, you can provide a custom aggregation function (callable).
                                        
                                        Example of a custom method:
                                        ```python
                                        import numpy as np

                                        def custom_median(x):
                                            # Return NaN when the interval has no valid readings
                                            return np.nan if x.isnull().all() else x.median()

                                        daily_median_data = aggregate_variables(dataframe, freq='1D', method=custom_median)
                                        ```

    Returns:
    -----------
    - pandas.DataFrame: DataFrame containing aggregated weather variable data according to the specified frequency and method.
        
    Usage:
    -----------
    Define the DataFrame containing the weather variable data:
    ```python
    dataframe = ret.get_measurements('TA00001', '2020-01-01', '2020-01-31', ['pr'])  # data arrives at 5-minute intervals
    ```
    To aggregate data hourly:
    ```python
    hourly_data = aggregate_variables(dataframe, freq='1H')
    ```
    To aggregate data by 12 hours:
    ```python
    half_day_data = aggregate_variables(dataframe, freq='12H')
    ```
    To aggregate data by day:
    ```python
    daily_data = aggregate_variables(dataframe, freq='1D')
    ```
    To aggregate data by week:
    ```python
    weekly_data = aggregate_variables(dataframe, freq='1W')
    ```
    To aggregate data by month:
    ```python
    monthly_data = aggregate_variables(dataframe, freq='1M')
    ```
    To use a custom aggregation method:
    ```python
    import numpy as np

    def custom_median(x):
        # Return NaN when the interval has no valid readings
        return np.nan if x.isnull().all() else x.median()

    daily_median_data = aggregate_variables(dataframe, freq='1D', method=custom_median)
    ```
    """
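
For readers who want to see the idea outside the library, the aggregation described above resembles a pandas `resample` followed by an aggregation call. The sketch below is an assumption about the general technique, not the library's actual implementation; `aggregate_sketch` and the sample readings are hypothetical.

```python
import numpy as np
import pandas as pd

def aggregate_sketch(dataframe, freq='1D', method='sum'):
    """Hypothetical stand-in: resample a time-indexed DataFrame and aggregate."""
    # `method` may be a string such as 'sum' or any callable taking a Series.
    return dataframe.resample(freq).agg(method)

# Made-up 5-minute rainfall readings covering two full days (2 x 288 rows).
index = pd.date_range('2020-01-01', periods=576, freq='5min')
readings = pd.DataFrame({'TA00001': np.full(576, 0.1)}, index=index)

daily_sum = aggregate_sketch(readings, freq='1D', method='sum')
daily_max = aggregate_sketch(readings, freq='1D', method='max')

def custom_median(x):
    # Return NaN when the interval has no valid readings
    return np.nan if x.isnull().all() else x.median()

daily_median = aggregate_sketch(readings, freq='1D', method=custom_median)
```

Note that pandas' built-in `sum` returns 0 for an interval with no valid readings, which is one reason a custom callable that returns NaN can be useful.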

Water Level Class

22 Jan 11:11

The Water_level class retrieves water level data and the coordinates of gauging stations.

Example:
Getting the coordinates

wl = Water_level()
coords = wl.coordinates('muringato')
print(coords)  # Output: (-0.406689, 36.96301)

Retrieving water level data

       from filter_stations import Water_level
       wl = Water_level()
       # get water level data for the muringato gauging station
       muringato_data = wl.water_level_data('muringato')
       # get water level data for the ewaso gauging station
       ewaso_data = wl.water_level_data('ewaso') 

Duplicate Coordinates for stations with multiple sensors

16 Jan 20:27
    Retrieves longitudes and latitudes for a list of station_sensor names. Coordinates are duplicated for stations with multiple sensors.

    Parameters:
    -----------
    - station_sensor (list): List of station_sensor names.
    - normalize (bool): If True, normalize the coordinates using MinMaxScaler to the range (0,1).

    Returns:
    -----------
    - pd.DataFrame: DataFrame containing longitude and latitude coordinates for each station_sensor.

    Usage:
    -----------
    To retrieve coordinates for the stations in a precipitation DataFrame:
    ```python
    start_date = '2023-01-01'
    end_date = '2023-12-31'
    country = 'Kenya'

    # get the precipitation data for the stations
    ke_pr = filt.filter_pr(start_date=start_date, end_date=end_date,
                           country=country).set_index('Date')

    # get the coordinates, normalized to the range (0, 1)
    xs = ret.get_coordinates(ke_pr.columns, normalize=True)
    ```
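
The `normalize=True` option refers to min-max scaling. As a rough illustration (not the library's code), the same (0, 1) rescaling can be written directly in pandas. The station_sensor names and coordinates below are made up, `min_max_normalize` is a hypothetical helper, and the row-per-sensor layout is an assumption.

```python
import pandas as pd

def min_max_normalize(coords):
    """Hypothetical helper: rescale each column to the range [0, 1]."""
    # Same formula MinMaxScaler applies by default: (x - min) / (max - min).
    return (coords - coords.min()) / (coords.max() - coords.min())

# Made-up coordinates; one station appears twice because it has two sensors.
coords = pd.DataFrame(
    {'longitude': [36.0, 36.0, 37.0, 38.0],
     'latitude': [-1.0, -1.0, 0.0, 1.0]},
    index=['STATION_A_sensor1', 'STATION_A_sensor2',
           'STATION_B_sensor1', 'STATION_C_sensor1'],
)

normalized = min_max_normalize(coords)
```

Both sensors of the first station keep identical coordinates after scaling, which is the duplication this release describes.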

Retrieve Data from BigQuery

19 Dec 00:24

"""
Retrieves precipitation data from BigQuery based on specified parameters.

    Parameters:
    -----------
    - start_date (str): Start date for data query.
    - end_date (str): End date for data query.
    - country (str): Country name for filtering stations.
    - region (str): Region name for filtering stations.
    - radius (str): Radius for stations within a specified region.
    - multiple_stations (str): Comma-separated list of station IDs.
    - station (str): Single station ID for data filtering.

    Returns:
    -----------
    - pd.DataFrame: A Pandas DataFrame containing the filtered precipitation data.
    
    Usage:
    -----------
    To get precipitation data for a specific date range:
    ```python
    fs = Filter(api_key, api_secret, maps_key)  # Create an instance of your class
    start_date = '2021-01-01'
    end_date = '2021-01-31'
    pr_data = fs.filter_pr(start_date, end_date)
    ```
    To get precipitation data for a specific date range and country:
    ```python
    fs = Filter(api_key, api_secret, maps_key)  # Create an instance of your class
    start_date = '2021-01-01'
    end_date = '2021-01-31'
    country = 'Kenya'
    pr_data = fs.filter_pr(start_date, end_date, country=country)
    ```
    To get precipitation data for a specific date range and region:
    ```python
    fs = Filter(api_key, api_secret, maps_key)  # Create an instance of your class
    start_date = '2021-01-01'
    end_date = '2021-01-31'
    region = 'Nairobi'
    pr_data = fs.filter_pr(start_date, end_date, region=region)
    ```
    To get precipitation data for a specific date range and region with a radius:
    ```python
    fs = Filter(api_key, api_secret, maps_key)  # Create an instance of your class
    start_date = '2021-01-01'
    end_date = '2021-01-31'
    region = 'Nairobi'
    radius = 100
    pr_data = fs.filter_pr(start_date, end_date, region=region, radius=radius)
    ```
    To get precipitation data for a specific date range and multiple stations:
    ```python
    fs = Filter(api_key, api_secret, maps_key)  # Create an instance of your class
    start_date = '2021-01-01'
    end_date = '2021-01-31'
    multiple_stations = ['TA00001', 'TA00002', 'TA00003']
    pr_data = fs.filter_pr(start_date, end_date, multiple_stations=multiple_stations)
    ```
    To get precipitation data for a specific date range and a single station:
    ```python
    fs = Filter(api_key, api_secret, maps_key)  # Create an instance of your class
    start_date = '2021-01-01'
    end_date = '2021-01-31'
    station = 'TA00001'
    pr_data = fs.filter_pr(start_date, end_date, station=station)
    ```

    """
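
One plausible reading of the `region` plus `radius` combination is a great-circle distance filter around a centre point. The sketch below illustrates that idea only; the kilometre unit, the `within_radius` helper, the station codes, and the coordinates are assumptions, not details taken from the library.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points in kilometres."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * earth_radius_km * math.asin(math.sqrt(a))

def within_radius(stations, centre, radius_km):
    """Hypothetical helper: station codes no farther than radius_km from centre."""
    return [code for code, (lat, lon) in stations.items()
            if haversine_km(centre[0], centre[1], lat, lon) <= radius_km]

# Made-up station locations (latitude, longitude) for illustration.
stations = {
    'STATION_A': (-1.30, 36.80),  # a few km from the centre
    'STATION_B': (-0.40, 36.96),  # roughly 100 km away
    'STATION_C': (-4.05, 39.67),  # several hundred km away
}
centre = (-1.29, 36.82)  # approximate centre of Nairobi

nearby = within_radius(stations, centre, radius_km=50)
wider = within_radius(stations, centre, radius_km=150)
```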

filter-stations v0.5.3

08 Nov 14:26

Includes the ground truth data and fixes the maps configuration path.

Ground truth data

11 Oct 08:58
    Retrieves ground truth data for a specified date range.

    Parameters:
    -----------
    - start_date (str): The start date for the report in 'yyyy-mm-dd' format.
    - end_date (str, optional): The end date for the report in 'yyyy-mm-dd' format.
                                If not provided, only data for the start_date is returned.

    Returns:
    -----------
    - pandas.DataFrame: A DataFrame containing ground truth data with columns 'startDate',
                        'station_sensor',  'description' and 'level'. The 'startDate' column is used as the index.

    Raises:
    -----------
    - Exception: If there's an issue with the API request.

    Usage:
    -----------
    To retrieve ground truth data for a specific date range:
    ```python
    start_date = '2023-01-01'
    end_date = '2023-01-31'
    report_data = ret.ground_truth(start_date, end_date)
    ```

    To retrieve ground truth data for a specific date:
    ```python
    start_date = '2023-01-01'
    report_data = ret.ground_truth(start_date)
    ```
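
The single-date behaviour (no `end_date` means only `start_date`) can be mimicked on a local table as follows. This is an illustrative sketch, not the API call itself: `select_reports` is a hypothetical helper and the rows are invented, reusing only the column names listed under Returns.

```python
import pandas as pd

def select_reports(reports, start_date, end_date=None):
    """Hypothetical helper: rows whose startDate falls in [start_date, end_date]."""
    # With no end_date, fall back to start_date so a single day is returned.
    end_date = start_date if end_date is None else end_date
    return reports.loc[start_date:end_date]

# Invented records using the column names from the Returns section.
reports = pd.DataFrame(
    {
        'startDate': pd.to_datetime(
            ['2023-01-01', '2023-01-01', '2023-01-15', '2023-02-02']),
        'station_sensor': ['sensor_a', 'sensor_b', 'sensor_a', 'sensor_c'],
        'description': ['example a', 'example b', 'example c', 'example d'],
        'level': [1, 2, 1, 3],
    }
).set_index('startDate').sort_index()

january = select_reports(reports, '2023-01-01', '2023-01-31')
single_day = select_reports(reports, '2023-01-01')
```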

stations by region

06 Oct 06:31
9374b9d

Subsets weather stations by a specific geographical region and optionally plots them on a map with a scale bar.

    Parameters:
    -----------
    - region (str): The name of the region to subset stations from (one of the 47 Kenyan counties).
    - plot (bool, optional): If True, a map with stations and a scale bar is plotted. Default is False.

    Returns:
    -----------
    - list or None: If plot is False, returns a list of station codes in the specified region. Otherwise, returns None.

    Usage:
    -----------
    To get a list of station codes in the 'Nairobi' region without plotting:
    ```python
    fs = Filter(api_key, api_secret, maps_key)  # Create an instance of your class
    station_list = fs.stations_region('Nairobi')
    ```

    To subset stations in the 'Nairobi' region and display them on a map with a scale bar:
    ```python
    fs = Filter(api_key, api_secret, maps_key)  # Create an instance of your class
    fs.stations_region('Nairobi', plot=True)
    ```
Nairobi Region Map