Commit

Merge pull request #7 from tgrandje/dev
Due to be released as 0.3.0
tgrandje authored Aug 1, 2024
2 parents 14f552d + 4c2c2f8 commit 080ebb2
Showing 19 changed files with 848 additions and 159 deletions.
2 changes: 1 addition & 1 deletion .github/workflows/publish.yml
@@ -65,7 +65,7 @@ jobs:
env:
insee_key: ${{secrets.INSEE_KEY}}
insee_secret: ${{secrets.INSEE_SECRET}}
-run: poetry run pytest --cov
+run: poetry run pytest --cov -W error

publish:
runs-on: ubuntu-latest
2 changes: 1 addition & 1 deletion .github/workflows/test.yml
@@ -68,4 +68,4 @@ jobs:
env:
insee_key: ${{secrets.INSEE_KEY}}
insee_secret: ${{secrets.INSEE_SECRET}}
-run: poetry run pytest --cov
+run: poetry run pytest --cov -W error
80 changes: 80 additions & 0 deletions README.md
@@ -6,6 +6,27 @@ This package is currently under active development.
Every API on [Hub'eau](https://hubeau.eaufrance.fr/) will be covered by this package in
due time.

At this stage, the following APIs are covered by cl-hubeau:
* [piezometry/piézométrie](https://hubeau.eaufrance.fr/page/api-piezometrie)
* [hydrometry/hydrométrie](https://hubeau.eaufrance.fr/page/api-hydrometrie)
* [drinking water quality/qualité de l'eau potable](https://hubeau.eaufrance.fr/page/api-qualite-eau-potable#/qualite_eau_potable/communes)

For help on the kwargs available for each endpoint, please refer
directly to the Hub'eau documentation (they are not covered
by the present documentation).

Assume that each function from cl-hubeau is consistent with
its Hub'eau counterpart, with the exception of the `size` and
`page` or `cursor` arguments (those are set automatically by
cl-hubeau to crawl along the results).
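
As a rough illustration of what such automatic crawling involves, here is a minimal sketch of page-based pagination. It is not cl-hubeau's actual implementation: `fetch_page`, `crawl` and the in-memory `FAKE_ROWS` dataset are hypothetical stand-ins for a paginated endpoint.

```python
from typing import Callable, Dict, List

# Hypothetical in-memory dataset standing in for a remote API.
FAKE_ROWS = [{"id": i} for i in range(45)]


def fetch_page(page: int, size: int) -> List[Dict]:
    """Return one page of results (pages are 1-indexed)."""
    start = (page - 1) * size
    return FAKE_ROWS[start:start + size]


def crawl(fetch: Callable[[int, int], List[Dict]], size: int = 20) -> List[Dict]:
    """Request successive pages until one comes back shorter than `size`."""
    results: List[Dict] = []
    page = 1
    while True:
        rows = fetch(page, size)
        results.extend(rows)
        if len(rows) < size:
            # A short (or empty) page means there is nothing left to fetch.
            break
        page += 1
    return results


all_rows = crawl(fetch_page, size=20)
print(len(all_rows))
```

The caller never handles `page` or `size` directly, which is the behaviour described above for cl-hubeau's functions.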

## Configuration

First of all, you will need API keys from INSEE to use some high-level operations,
which may loop over cities' official codes. Please refer to pynsee's
[API subscription tutorial](https://pynsee.readthedocs.io/en/latest/api_subscription.html)
for help.
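
The CI workflows of this repository pass the credentials as `insee_key` and `insee_secret` environment variables. Assuming the same names are expected on a local machine (an assumption, not something stated here), a quick check before calling high-level functions might look like the sketch below; `missing_insee_credentials` is a hypothetical helper, not part of cl-hubeau or pynsee.

```python
import os
from typing import List, Mapping

# Variable names taken from the CI workflows; assumed to apply locally too.
REQUIRED_VARS = ("insee_key", "insee_secret")


def missing_insee_credentials(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the names of required variables that are unset or empty."""
    return [name for name in REQUIRED_VARS if not env.get(name)]


missing = missing_insee_credentials()
if missing:
    print("Missing INSEE credentials:", ", ".join(missing))
```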

## Basic examples

### Clean cache
@@ -105,4 +126,63 @@ with hydrometry.HydrometrySession() as session:
    df = session.get_realtime_observations(code_entite="K437311001")
    df = session.get_observations(code_entite="K437311001")

```

### Drinking water quality

Two high-level functions are available (plus one class for low-level operations).


Get all water networks (UDI) (uses a 30-day cache):

```python
from cl_hubeau import drinking_water_quality
df = drinking_water_quality.get_all_water_networks()
```

Get the sanitary control results for nitrates on all networks of Paris, Lyon & Marseille
(uses a 30-day cache):

```python
networks = drinking_water_quality.get_all_water_networks()
networks = networks[
    networks.nom_commune.isin(["PARIS", "MARSEILLE", "LYON"])
]["code_reseau"].unique().tolist()

df = drinking_water_quality.get_control_results(
    codes_reseaux=networks,
    code_parametre="1340"
)
```

Note that this query is heavy, even though it is already restricted to nitrates.
In theory, you could also query the API without specifying the substance you're tracking,
but you may hit the API's 20k-row limit and trigger an exception.

As it is, the `get_control_results` function already implements a double loop:

* on network codes (20 codes maximum per request);
* on periods, requesting only yearly datasets (which should scale over time **and** work nicely with the cache algorithm).
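
The double loop described above can be sketched as follows. This is only an illustration of the batching idea, not the actual code of `get_control_results`: the `chunks` and `yearly_periods` helpers and the made-up network codes are hypothetical.

```python
from datetime import date
from itertools import product
from typing import Iterator, List, Sequence, Tuple


def chunks(codes: Sequence[str], size: int = 20) -> Iterator[List[str]]:
    """Yield successive batches of at most `size` codes."""
    for start in range(0, len(codes), size):
        yield list(codes[start:start + size])


def yearly_periods(first_year: int, last_year: int) -> List[Tuple[date, date]]:
    """Return one (January 1st, December 31st) pair per year, inclusive."""
    return [
        (date(year, 1, 1), date(year, 12, 31))
        for year in range(first_year, last_year + 1)
    ]


# Hypothetical inputs: 45 made-up network codes over three years.
codes = [f"{i:09d}" for i in range(45)]
periods = yearly_periods(2021, 2023)

# One query per (batch of codes, year) pair.
queries = list(product(chunks(codes), periods))
print(len(queries))
```

Because every past year maps to a fixed pair of dates, a request for a completed year is always identical from one run to the next, which is what makes it a good fit for an HTTP cache.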

You can also call the same function, using official city codes directly:
```python
df = drinking_water_quality.get_control_results(
    codes_communes=['59350'],
    code_parametre="1340"
)
```

Low-level class to perform the same tasks:

Note that:

* the API rejects any query returning more than 20k rows, so you may need inner loops
* cache handling will be your responsibility

```python
with drinking_water_quality.DrinkingWaterQualitySession() as session:
    df = session.get_cities_networks(nom_commune="LILLE")
    df = session.get_control_results(code_departement='02', code_parametre="1340")

```
3 changes: 2 additions & 1 deletion cl_hubeau/__init__.py
@@ -1,8 +1,9 @@
# -*- coding: utf-8 -*-

from dotenv import load_dotenv
from importlib_metadata import version

from .config import _config


load_dotenv(override=True)
__version__ = version(__package__)
105 changes: 0 additions & 105 deletions cl_hubeau/constants.py
@@ -8,108 +8,3 @@
APP_NAME = "cl-hubeau"
DIR_CACHE = platformdirs.user_cache_dir(APP_NAME, ensure_exists=True)
CACHE_NAME = "clhubeau_http_cache.sqlite"

DEPARTEMENTS = [
"01",
"02",
"03",
"04",
"05",
"06",
"07",
"08",
"09",
"10",
"11",
"12",
"13",
"14",
"15",
"16",
"17",
"18",
"19",
"20",
"21",
"22",
"23",
"24",
"25",
"26",
"27",
"28",
"29",
"2A",
"2B",
"30",
"31",
"32",
"33",
"34",
"35",
"36",
"37",
"38",
"39",
"40",
"41",
"42",
"43",
"44",
"45",
"46",
"47",
"48",
"49",
"50",
"51",
"52",
"53",
"54",
"55",
"56",
"57",
"58",
"59",
"60",
"61",
"62",
"63",
"64",
"65",
"66",
"67",
"68",
"69",
"70",
"71",
"72",
"73",
"74",
"75",
"76",
"77",
"78",
"79",
"80",
"81",
"82",
"83",
"84",
"85",
"86",
"87",
"88",
"89",
"90",
"91",
"92",
"93",
"94",
"95",
"971",
"972",
"973",
"974",
"976",
]
12 changes: 12 additions & 0 deletions cl_hubeau/drinking_water_quality/__init__.py
@@ -0,0 +1,12 @@
# -*- coding: utf-8 -*-

from .drinking_water_quality_scraper import DrinkingWaterQualitySession

from .utils import get_all_water_networks, get_control_results


__all__ = [
    "get_all_water_networks",
    "get_control_results",
    "DrinkingWaterQualitySession",
]