diff --git a/.gitignore b/.gitignore
index d8a06582..083d8970 100644
--- a/.gitignore
+++ b/.gitignore
@@ -50,4 +50,7 @@ terraform.rc
**/buildplan
**/destroyplan
+# Ignore all jupyter checkpoint folders
+.ipynb_checkpoints
+
Dev/env.yaml
\ No newline at end of file
diff --git a/Core/Manual_Workflows/1. Viz DB Service Development Overview.ipynb b/Core/Manual_Workflows/1. Viz DB Service Development Overview.ipynb
new file mode 100644
index 00000000..a7866a0a
--- /dev/null
+++ b/Core/Manual_Workflows/1. Viz DB Service Development Overview.ipynb
@@ -0,0 +1,616 @@
+{
+ "cells": [
+ {
+ "attachments": {
+ "IntroDiagram.JPG": {
+ "image/jpeg": ""
+ }
+ },
+ "cell_type": "markdown",
+ "id": "1be801a7",
+ "metadata": {
+ "deletable": false
+ },
+ "source": [
+ "
\n",
+ "
HydroVIS Visualization Workflow
\n",
+ " \n",
+ "
\n",
+ " Viz EGIS services (except inundation) are now stored in a PostgreSQL database. While a version of the on-prem loosa viz pipeline does still run on a Windows virtual machine in HydroVIS, we're moving in the direction of a serverless pipeline based on a central RDS database with processing taking place in on-demand lambda functions. In order to maintain the stability of the DB and lambda functions in the pipeline, we have developed a workflow to minimize the need to work directly with either the database or the automated lambda functions. Instead, services can be developed with structured query language (SQL, and more specifically PostgreSQL) inside of a Jupyter notebook like this one. A good tutorial resource can be found here to learn more about writing SQL. \n",
+ "
\n",
+ "
\n",
+ " In other words, this new framework replaces the process.py and product python files of the on-prem loosa library with SQL files that do the same work in less code (a SQL file is just an executable text file that stores SQL code with a .sql extension, similar to a .py file). That said, the entire pipeline is still python-based, and while we're encouraging the use of SQL wherever possible due to the reduction of code and built-in data validation / QA-QC... we do still have the ability to use Python for post-processing as well, and can continue to evaluate that on a service-by-service basis, depending on requirements.\n",
+ "
\n",
+ "
\n",
+ " There are 3 main categories of SQL files in the viz workflow (explained in detail through the rest of this notebook):\n",
+ "
\n",
+ "
Parent SQL Files (pre-processing, e.g. max flows)
\n",
+ "
Service SQL Files (processing, e.g. loosa product files)
\n",
+ "
Final SQL Files (post-processing, e.g. hucs hotspots)
\n",
+ " \n",
+ " \n",
+ "
Almost all development of vector services can/should be done in these SQL files (primarily #2 Service SQL Files and occasionally #3 Final SQL Files, such as HUC hotspot layers). We're still working out the right approach for raster services, and will be collaborating on that approach in the coming weeks.
The VizProcessing PostgreSQL database now serves as our primary source of authoritative and pipeline input data. Similar to the folder structure of the on-prem EGIS pipelines, we now use database schemas to categorize available database tables.
Several other schemas exist within the database:\n",
+ "
\n",
+ "
admin - this schema stores service metadata, as well as logs and tracking information about the pipelines.
\n",
+ "
cache - this holds the max flows tables and some interim tables used by some services.
\n",
+ "
fim - this schema stores data specific to FIM.
\n",
+ "
publish - this schema is where the attribute tables of the services themselves are written to, as a result of the SQL file scripts that will be explained in this notebook.
These SQL files are used for pre-processing NWM data. In most cases, the SQL code creates a table of maximum streamflows for a given forecast. Normally these files will not need to be altered or touched, unless new data pre-processing needs to be done.
\n",
+ "
"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "ae1327e4",
+ "metadata": {},
+ "source": [
+ "An example of some SQL that calculates the maximum streamflow for each feature id in the short range configuration (this is equivalent to the max_flows pipelines on-prem). \n",
+ "For development purposes, a pandas dataframe is returned by the run_sql_in_db function \n",
+ "and represents what the resulting DB output would look like."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "f2d26ebf",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "from helper_functions.shared_functions import *\n",
+ "sql = \"\"\"\n",
+ "SELECT\n",
+ " forecasts.feature_id,\n",
+ " round((max(forecasts.streamflow) * 35.315)::numeric, 2) AS maxflow_18hour\n",
+ "FROM ingest.nwm_channel_rt_srf forecasts\n",
+ "GROUP BY forecasts.feature_id;\n",
+ "\"\"\"\n",
+ "\n",
+ "run_sql_in_db(sql)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "a15c905c",
+ "metadata": {},
+ "source": [
+ "
\n",
+ "
*The actual automated parent SQL file that creates the srf max flows table in the database uses a SELECT ... INTO statement to write the output of the above query into a new table in the database on each run (as opposed to just returning the results of the query to the client, like we're doing in this example). We're leaving the INTO part out of this notebook for now, so as not to write any data as part of this tutorial, but that's how we're able to query this same data from cache.max_flows_srf in the following steps.\n",
+ "
The service SQL files do the main data transformations and processing to create the actual attribute tables for the map services (akin to the product files on-prem, although now we have one SQL file per service). The result of running this SQL is a table that is used as a data source for the pro project layer, just like the feature classes created by our on-prem process.py files. There are many SQL statements that can be used to get the desired table, including join, where, group by, etc. See the PostgreSQL tutorial mentioned in the intro for more help on these statements and queries. In the examples below I will slowly build the SQL code for the srf max high flow magnitude service, highlighting different SQL statements that will probably be used for most services.
\n",
+ "
"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "c713eeda",
+ "metadata": {},
+ "source": [
+ "An example of selecting fields from the table we just \"created\" in the last step."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "7a2ec878",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "sql = \"\"\"\n",
+ "SELECT \n",
+ " cache.max_flows_srf.feature_id,\n",
+ " cache.max_flows_srf.maxflow_18hour\n",
+ "FROM cache.max_flows_srf\n",
+ "\"\"\"\n",
+ "\n",
+ "run_sql_in_db(sql)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "029f2619",
+ "metadata": {},
+ "source": [
+ "An example of using AS to create aliases for table names and fields (AS is optional).\n",
+ "Notice that when I select the fields now, I am using the alias assigned to the table in the FROM clause. Aliases are useful to help write clean and readable SQL code."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "d7cc20ca",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "sql = \"\"\"\n",
+ "SELECT \n",
+ " maxflows.feature_id,\n",
+ " maxflows.maxflow_18hour AS max_flow\n",
+ "FROM cache.max_flows_srf AS maxflows\n",
+ "\"\"\"\n",
+ "\n",
+ "run_sql_in_db(sql)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "34ff8bc6",
+ "metadata": {},
+ "source": [
+ "An example of using JOIN to join two tables together, selecting fields from both tables.\n",
+ "Notice that on the join we first indicate what table we want to join to, and then which field to use as the joining key. There are multiple types of joins in SQL (inner, left outer, right outer, etc.), but we can primarily get away with the simple JOIN statement (short for inner join) which only returns rows in which the join key is present in both tables."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "07566a7a",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "sql = \"\"\"\n",
+ "SELECT \n",
+ " maxflows.feature_id,\n",
+ " maxflows.maxflow_18hour AS max_flow,\n",
+ " thresholds.high_water_threshold AS high_water_threshold,\n",
+ " thresholds.rf_2_0_17c AS flow_2yr,\n",
+ " thresholds.rf_5_0_17c AS flow_5yr,\n",
+ " thresholds.rf_10_0_17c AS flow_10yr,\n",
+ " thresholds.rf_25_0_17c AS flow_25yr,\n",
+ " thresholds.rf_50_0_17c AS flow_50yr\n",
+ "FROM cache.max_flows_srf AS maxflows\n",
+ "JOIN derived.recurrence_flows_conus AS thresholds ON maxflows.feature_id = thresholds.feature_id\n",
+ "\"\"\"\n",
+ "\n",
+ "run_sql_in_db(sql)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "3fe88fa0",
+ "metadata": {},
+ "source": [
+ "An example of using WHERE to select specific data (WHERE is basically a way to filter your results)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "ab03df6b",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "sql = \"\"\"\n",
+ "SELECT \n",
+ " maxflows.feature_id,\n",
+ " maxflows.maxflow_18hour AS max_flow,\n",
+ " thresholds.high_water_threshold AS high_water_threshold,\n",
+ " thresholds.rf_2_0_17c AS flow_2yr,\n",
+ " thresholds.rf_5_0_17c AS flow_5yr,\n",
+ " thresholds.rf_10_0_17c AS flow_10yr,\n",
+ " thresholds.rf_25_0_17c AS flow_25yr,\n",
+ " thresholds.rf_50_0_17c AS flow_50yr\n",
+ "FROM cache.max_flows_srf AS maxflows\n",
+ "JOIN derived.recurrence_flows_conus AS thresholds ON maxflows.feature_id = thresholds.feature_id\n",
+ "WHERE thresholds.high_water_threshold > 0 AND maxflows.maxflow_18hour >= thresholds.high_water_threshold\n",
+ "\"\"\"\n",
+ "\n",
+ "run_sql_in_db(sql)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "9884a473",
+ "metadata": {},
+ "source": [
+ "An example of using CASE to calculate a field using conditional statements. This is basically how you do an IF statement in SQL.\n",
+ "In addition, you can cast a value to a specific data type (like TEXT) with the \"::\" syntax, as shown below. This is needed in the example below because one of our recur_cat categories is \">50\", so every numeric value needs to be cast as text so that it doesn't throw a data type error (this is one example of SQL being a little more stringent than Python, but ultimately in a good way that forces an extra layer of validation)."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "4a86c264",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "sql = \"\"\"\n",
+ "SELECT \n",
+ " maxflows.feature_id,\n",
+ " maxflows.maxflow_18hour AS max_flow,\n",
+ " CASE\n",
+ " WHEN maxflows.maxflow_18hour >= thresholds.rf_50_0_17c THEN '2'::text\n",
+ " WHEN maxflows.maxflow_18hour >= thresholds.rf_25_0_17c THEN '4'::text\n",
+ " WHEN maxflows.maxflow_18hour >= thresholds.rf_10_0_17c THEN '10'::text\n",
+ " WHEN maxflows.maxflow_18hour >= thresholds.rf_5_0_17c THEN '20'::text\n",
+ " WHEN maxflows.maxflow_18hour >= thresholds.rf_2_0_17c THEN '50'::text\n",
+ " WHEN maxflows.maxflow_18hour >= thresholds.high_water_threshold THEN '>50'::text\n",
+ " ELSE NULL::text\n",
+ " END AS recur_cat,\n",
+ " thresholds.high_water_threshold AS high_water_threshold,\n",
+ " thresholds.rf_2_0_17c AS flow_2yr,\n",
+ " thresholds.rf_5_0_17c AS flow_5yr,\n",
+ " thresholds.rf_10_0_17c AS flow_10yr,\n",
+ " thresholds.rf_25_0_17c AS flow_25yr,\n",
+ " thresholds.rf_50_0_17c AS flow_50yr\n",
+ "FROM cache.max_flows_srf AS maxflows\n",
+ "JOIN derived.recurrence_flows_conus AS thresholds ON maxflows.feature_id = thresholds.feature_id\n",
+ "WHERE thresholds.high_water_threshold > 0 AND maxflows.maxflow_18hour >= thresholds.high_water_threshold\n",
+ "\"\"\"\n",
+ "\n",
+ "run_sql_in_db(sql)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "6c9881d8",
+ "metadata": {
+ "tags": []
+ },
+ "source": [
+ "
\n",
+ "
Adding Geometry
\n",
+ "
\n",
+ " Now that we have our main data table, we need to add the spatial component in order for ArcGIS Server to render directly from the output of our SQL (this also allows us to use geopandas to do some mapping in this notebook). We can do this by joining to the derived.channels table and adding the geometry column to our select statement, along with the other fields that we want. This step should be pretty similar for all the services.\n",
+ "
\n",
+ "
"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "c42f83a3",
+ "metadata": {},
+ "source": [
+ "This example also uses a temporary table by using the WITH statement. Sometimes it's nice to keep individual chunks of SQL / JOINs separate (instead of one long nasty SQL statement that's tough to troubleshoot), so instead of saving an intermediate table somewhere in the database and referencing it, we can just create a temporary sub-query using WITH and use that in the primary SELECT statement. In this case, the sub-query is the high flow magnitude query that we just ran in the last step, and we're joining that to the channels table so that the channels table gets to rightfully be the authoritative source for the channel data."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "b220c4b1",
+ "metadata": {},
+ "source": [
+ "You'll notice below that we don't currently have a reference time column anywhere, and that's because we automatically manage reference_time within the database infrastructure. We also need an oid column for ArcGIS Server, so when developing, we can just manually add those columns by adding the following lines to the SELECT statement:\n",
+ "\n",
+ "
\n",
+ "
'2022-03-25 00:00:00 UTC' as ref_time
\n",
+ "
row_number() over (order by channels.feature_id) as oid
\n",
+ "
\n",
+ ""
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "d540683e",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "sql = \"\"\"\n",
+ "WITH high_flow_mag AS (\n",
+ " SELECT \n",
+ " maxflows.feature_id,\n",
+ " maxflows.maxflow_18hour AS max_flow,\n",
+ " CASE\n",
+ " WHEN maxflows.maxflow_18hour >= thresholds.rf_50_0_17c THEN '2'::text\n",
+ " WHEN maxflows.maxflow_18hour >= thresholds.rf_25_0_17c THEN '4'::text\n",
+ " WHEN maxflows.maxflow_18hour >= thresholds.rf_10_0_17c THEN '10'::text\n",
+ " WHEN maxflows.maxflow_18hour >= thresholds.rf_5_0_17c THEN '20'::text\n",
+ " WHEN maxflows.maxflow_18hour >= thresholds.rf_2_0_17c THEN '50'::text\n",
+ " WHEN maxflows.maxflow_18hour >= thresholds.high_water_threshold THEN '>50'::text\n",
+ " ELSE NULL::text\n",
+ " END AS recur_cat,\n",
+ " thresholds.high_water_threshold AS high_water_threshold,\n",
+ " thresholds.rf_2_0_17c AS flow_2yr,\n",
+ " thresholds.rf_5_0_17c AS flow_5yr,\n",
+ " thresholds.rf_10_0_17c AS flow_10yr,\n",
+ " thresholds.rf_25_0_17c AS flow_25yr,\n",
+ " thresholds.rf_50_0_17c AS flow_50yr\n",
+ " FROM cache.max_flows_srf AS maxflows\n",
+ " JOIN derived.recurrence_flows_conus AS thresholds ON maxflows.feature_id = thresholds.feature_id\n",
+ " WHERE thresholds.high_water_threshold > 0::double precision AND maxflows.maxflow_18hour >= thresholds.high_water_threshold\n",
+ ")\n",
+ "SELECT \n",
+ " channels.feature_id,\n",
+ " channels.feature_id::TEXT AS feature_id_str,\n",
+ " channels.strm_order,\n",
+ " channels.name,\n",
+ " channels.huc6,\n",
+ " channels.nwm_vers,\n",
+ " high_flow_mag.max_flow,\n",
+ " high_flow_mag.recur_cat,\n",
+ " high_flow_mag.high_water_threshold,\n",
+ " high_flow_mag.flow_2yr,\n",
+ " high_flow_mag.flow_5yr,\n",
+ " high_flow_mag.flow_10yr,\n",
+ " high_flow_mag.flow_25yr,\n",
+ " high_flow_mag.flow_50yr,\n",
+ " channels.geom,\n",
+ " to_char(now()::timestamp without time zone, 'YYYY-MM-DD HH24:MI:SS UTC') AS update_time,\n",
+ " '2022-03-25 00:00:00 UTC' as ref_time,\n",
+ " row_number() over (order by channels.feature_id) as oid\n",
+ "FROM derived.channels_conus channels\n",
+ "JOIN high_flow_mag ON channels.feature_id = high_flow_mag.feature_id;\n",
+ "\"\"\"\n",
+ "\n",
+ "run_sql_in_db(sql)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "c8b73323",
+ "metadata": {},
+ "source": [
+ "With the geom column now present, we can return a geodataframe from the run_sql_in_db helper function, and map the output using the map_column helper function:"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "46c451ad",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "gdf = run_sql_in_db(sql, return_geodataframe=True)\n",
+ "sub = gdf[gdf['strm_order'] >= 4]\n",
+ "column = \"recur_cat\"\n",
+ "colormap = {\n",
+ " '2': '#cc33ff',\n",
+ " '4': '#e600a9',\n",
+ " '10': '#ff0000',\n",
+ " '20': '#ff9900',\n",
+ " '50': '#ffff00',\n",
+ " '>50': '#72afe8'\n",
+ "}\n",
+ "title = \"Short Range Max High Flow Magnitude\"\n",
+ "\n",
+ "ax = map_column(gdf, column, colormap, title=title)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "95535f43",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "sub = gdf[gdf['strm_order'] > 4]\n",
+ "\n",
+ "ax = map_column(sub, column, colormap, title=title)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "a595b1a9",
+ "metadata": {},
+ "source": [
+ "Once you're ready to map your output in ArcGIS Pro, just use the save_gdf_shapefile_to_s3 helper function to save and upload a shapefile that you can download and use in your pro project."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "a1f4add1",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "save_gdf_shapefile_to_s3(gdf, \"tyler_test\")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "aadaf7de",
+ "metadata": {},
+ "source": [
+ "
\n",
+ "
3. Final SQL Files
\n",
+ "
The Final SQL files are for processing / summary that happens after the main service table is created. HUC Hotspots are the only current example of this type of processing. Below is a theoretical example of a HUC Hotspot layer for the SRF Max High Flow Magnitude Service table that resides in the database (created using the same SQL that we just wrote above).
\n",
+ "
"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "0bf40cff",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "sql = \"\"\"\n",
+ "WITH srf_max_high_flow_magnitude AS (\n",
+ " SELECT maxflows.feature_id,\n",
+ " maxflows.maxflow_18hour AS max_flow,\n",
+ " CASE\n",
+ " WHEN maxflows.maxflow_18hour >= thresholds.rf_50_0_17c THEN '2'::text\n",
+ " WHEN maxflows.maxflow_18hour >= thresholds.rf_25_0_17c THEN '4'::text\n",
+ " WHEN maxflows.maxflow_18hour >= thresholds.rf_10_0_17c THEN '10'::text\n",
+ " WHEN maxflows.maxflow_18hour >= thresholds.rf_5_0_17c THEN '20'::text\n",
+ " WHEN maxflows.maxflow_18hour >= thresholds.rf_2_0_17c THEN '50'::text\n",
+ " WHEN maxflows.maxflow_18hour >= thresholds.high_water_threshold THEN '>50'::text\n",
+ " ELSE NULL::text\n",
+ " END AS recur_cat,\n",
+ " thresholds.high_water_threshold AS high_water_threshold,\n",
+ " thresholds.rf_2_0_17c AS flow_2yr,\n",
+ " thresholds.rf_5_0_17c AS flow_5yr,\n",
+ " thresholds.rf_10_0_17c AS flow_10yr,\n",
+ " thresholds.rf_25_0_17c AS flow_25yr,\n",
+ " thresholds.rf_50_0_17c AS flow_50yr\n",
+ " FROM cache.max_flows_srf maxflows\n",
+ " JOIN derived.recurrence_flows_conus thresholds ON maxflows.feature_id = thresholds.feature_id\n",
+ " WHERE thresholds.high_water_threshold > 0::double precision AND maxflows.maxflow_18hour >= thresholds.high_water_threshold\n",
+ " )\n",
+ "SELECT\n",
+ " hucs.huc10,\n",
+ " TO_CHAR(hucs.huc10, 'fm0000000000') AS huc10_str,\n",
+ " hucs.total_nwm_features,\n",
+ " count(hfm.feature_id) AS hfm_features,\n",
+ " count(hfm.feature_id)::numeric / hucs.total_nwm_features AS high_water_features_percent,\n",
+ " sum(CASE WHEN recur_cat = '2' THEN 1.0 ELSE 0 END) / hucs.total_nwm_features AS pct_2,\n",
+ " sum(CASE WHEN recur_cat = '4' THEN 1.0 ELSE 0 END) / hucs.total_nwm_features AS pct_4,\n",
+ " sum(CASE WHEN recur_cat = '10' THEN 1.0 ELSE 0 END) / hucs.total_nwm_features AS pct_10,\n",
+ " sum(CASE WHEN recur_cat = '20' THEN 1.0 ELSE 0 END) / hucs.total_nwm_features AS pct_20,\n",
+ " sum(CASE WHEN recur_cat = '50' THEN 1.0 ELSE 0 END) / hucs.total_nwm_features AS pct_50,\n",
+ " sum(CASE WHEN recur_cat = '>50' THEN 1.0 ELSE 0 END) / hucs.total_nwm_features AS pct_morethan50,\n",
+ " hucs.geom\n",
+ "FROM derived.huc10s_conus AS hucs\n",
+ "JOIN derived.featureid_huc_crosswalk AS crosswalk ON hucs.huc10 = crosswalk.huc10\n",
+ "JOIN srf_max_high_flow_magnitude AS hfm ON crosswalk.feature_id = hfm.feature_id\n",
+ "GROUP BY hucs.huc10, hucs.total_nwm_features, hucs.geom\n",
+ "order by count(hfm.feature_id) DESC\n",
+ "\"\"\"\n",
+ "\n",
+ "gdf = run_sql_in_db(sql, return_geodataframe=True)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "6ea8a1e6",
+ "metadata": {},
+ "source": [
+ "We can map polygons as well with the map_column helper function, as long as a geom column is present (currently available in the db for huc10 and huc8)."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "1e2db58d",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "column = \"high_water_features_percent\"\n",
+ "map_column(gdf, column, categorical=True)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "40ab019e",
+ "metadata": {},
+ "outputs": [],
+ "source": []
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "65b5a02f",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "from helper_functions.shared_functions import *"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "3b2c256d",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "from sqlalchemy_schemadisplay import create_schema_graph\n",
+ "from sqlalchemy import MetaData\n",
+ "\n",
+ "graph = create_schema_graph(metadata=MetaData('postgresql://user:pwd@host/database'))\n",
+ "graph.write_png('my_erd.png')"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "9a84834f",
+ "metadata": {},
+ "outputs": [],
+ "source": []
+ }
+ ],
+ "metadata": {
+ "kernelspec": {
+ "display_name": "conda_python3",
+ "language": "python",
+ "name": "conda_python3"
+ },
+ "language_info": {
+ "codemirror_mode": {
+ "name": "ipython",
+ "version": 3
+ },
+ "file_extension": ".py",
+ "mimetype": "text/x-python",
+ "name": "python",
+ "nbconvert_exporter": "python",
+ "pygments_lexer": "ipython3",
+ "version": "3.10.14"
+ }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+}
diff --git a/Core/Manual_Workflows/11. RASTER PRODUCT DEV WORKFLOW.ipynb b/Core/Manual_Workflows/11. RASTER PRODUCT DEV WORKFLOW.ipynb
new file mode 100644
index 00000000..6d241a08
--- /dev/null
+++ b/Core/Manual_Workflows/11. RASTER PRODUCT DEV WORKFLOW.ipynb
@@ -0,0 +1,228 @@
+{
+ "cells": [
+ {
+ "cell_type": "markdown",
+ "id": "4f3d72f8-df5f-4335-b81a-9a3996e27348",
+ "metadata": {
+ "tags": []
+ },
+ "source": [
+ "
Step 1: Create the raster product processing file in the raster_product_development/products folder
\n",
+ "\n",
+ "This is a jupyter notebook for developing visualization services. Create a file like this for each service that is being developed. This is where we will track progress as well as be able to debug code. Please refer to the \"Viz DB Service Development Tutorial\" notebook for practicing and learning how to interact with data stored in the viz database. To ease the work of development, helper functions have also been created to run SQL code, create maps, and upload shapefiles to an AWS S3 storage bucket. While these functions should help in most cases, you can run other python code inside this notebook as well to do more complex operations.\n",
+ "\n",
+ "In order to enable a few tools/extensions that we want to use for this notebook please do the following:\n",
+ "\n",
+ "
Run \"!pip install nodejs geopandas contextily\" on the next cell
\n",
+ "
Click on the \"Commands\" tab on the left or press (Ctrl+Shift+C)
\n",
+ "
Click on \"Enable Extension Manager\"
\n",
+ "
Click on \"Enable\"
\n",
+ "
Click on the \"Extension Manager\" tab on the left (looks like a puzzle piece)
\n",
+ "
Install the \"jupyter-widgets/jupyterlab-manager\" extension.
\n",
+ "
You will be prompted to rebuild the notebook. Click on \"Rebuild\". This may take a bit of time but you will be prompted when it finishes.
\n",
+ "\n",
+ "First we need to create some SQL that will give us the table with the fields that we want. This is similar to the product scripts in the on-prem workflow. If additional datasets are needed, please contact Corey or Tyler about getting those into the DB. "
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 1,
+ "id": "d145fe37",
+ "metadata": {},
+ "outputs": [
+ {
+ "ename": "NameError",
+ "evalue": "name 'run_sql_in_db' is not defined",
+ "output_type": "error",
+ "traceback": [
+ "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
+ "\u001b[0;31mNameError\u001b[0m Traceback (most recent call last)",
+ "\u001b[0;32m/tmp/ipykernel_31280/3700748656.py\u001b[0m in \u001b[0;36m\u001b[0;34m()\u001b[0m\n\u001b[1;32m 24\u001b[0m \"\"\"\n\u001b[1;32m 25\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m---> 26\u001b[0;31m \u001b[0mrun_sql_in_db\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0msql\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m",
+ "\u001b[0;31mNameError\u001b[0m: name 'run_sql_in_db' is not defined"
+ ]
+ }
+ ],
+ "source": [
+ "from helper_functions.shared_functions import *\n",
+ "sql = \"\"\"\n",
+ "WITH reference_time AS (\n",
+ " SELECT max(ingest_status.reference_time) AS reference_time\n",
+ " FROM admin.ingest_status\n",
+ " WHERE ingest_status.target::text = 'ingest.rnr_max_flows'::text\n",
+ " )\n",
+ "\t\n",
+ "SELECT ingest.rnr_max_flows.feature_id, \n",
+ "\tingest.rnr_max_flows.feature_id::TEXT AS feature_id_str,\n",
+ "\tName, \n",
+ "\tto_char(REFERENCE_TIME, 'YYYY-MM-DD HH24:MI:SS UTC') AS reference_time,\n",
+ "\tSTRING_AGG(FORECAST_NWS_LID || ' @ ' || FORECAST_ISSUE_TIME || ' (' || FORECAST_MAX_STATUS || ')', ', ') AS INHERITED_RFC_FORECASTS,\n",
+ "\tMAX(forecast_max_value) * 35.31467 AS MAX_FLOW,\n",
+ "\tINITCAP(MAX(REPLACE(VIZ_MAX_STATUS, '_', ' '))) AS MAX_STATUS,\n",
+ "\tINITCAP(MAX(WATERBODY_STATUS)) AS WATERBODY_STATUS,\n",
+ "\tMAX(VIZ_STATUS_LID) AS VIZ_STATUS_LID,\n",
+ "\tStrm_Order,\n",
+ "\thuc6,\n",
+ "\tto_char(now()::timestamp without time zone, 'YYYY-MM-DD HH24:MI:SS UTC') AS update_time,\n",
+ "\tgeom\n",
+ "FROM INGEST.RNR_MAX_FLOWS\n",
+ "left join derived.channels_conus ON INGEST.RNR_MAX_FLOWS.feature_id = derived.channels_conus.feature_id, reference_time\n",
+ "GROUP BY INGEST.RNR_MAX_FLOWS.FEATURE_ID, feature_id_str, Name, reference_time, Strm_Order, huc6, geom;\n",
+ "\"\"\"\n",
+ "\n",
+ "run_sql_in_db(sql)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "f95586e8",
+ "metadata": {},
+ "source": [
+ "
Check Outputs Through Pandas DataFrame
\n",
+ "\n",
+ "When we use the \"run_sql_in_db\" function, a pandas dataframe is returned. This allows us to inspect the results more closely and dig in to make sure the output is what we expect."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "7a621e82-a811-401d-8dff-341767d00109",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "sql = \"\"\"\n",
+ "SELECT \n",
+ " maxflows.feature_id,\n",
+ " maxflows.maxflow_18hour AS max_flow,\n",
+ " CASE\n",
+ " WHEN maxflows.maxflow_18hour >= thresholds.rf_50_0_17c THEN '2'::text\n",
+ " WHEN maxflows.maxflow_18hour >= thresholds.rf_25_0_17c THEN '4'::text\n",
+ " WHEN maxflows.maxflow_18hour >= thresholds.rf_10_0_17c THEN '10'::text\n",
+ " WHEN maxflows.maxflow_18hour >= thresholds.rf_5_0_17c THEN '20'::text\n",
+ " WHEN maxflows.maxflow_18hour >= thresholds.rf_2_0_17c THEN '50'::text\n",
+ " WHEN maxflows.maxflow_18hour >= thresholds.high_water_threshold THEN '>50'::text\n",
+ " ELSE NULL::text\n",
+ " END AS recur_cat,\n",
+ " thresholds.high_water_threshold AS high_water_threshold,\n",
+ " thresholds.rf_2_0_17c AS flow_2yr,\n",
+ " thresholds.rf_5_0_17c AS flow_5yr,\n",
+ " thresholds.rf_10_0_17c AS flow_10yr,\n",
+ " thresholds.rf_25_0_17c AS flow_25yr,\n",
+ " thresholds.rf_50_0_17c AS flow_50yr\n",
+ "FROM cache.max_flows_srf AS maxflows\n",
+ "JOIN derived.recurrence_flows_conus AS thresholds ON maxflows.feature_id = thresholds.feature_id\n",
+ "WHERE thresholds.high_water_threshold > 0 AND maxflows.maxflow_18hour >= thresholds.high_water_threshold\n",
+ "\"\"\"\n",
+ "\n",
+ "df = run_sql_in_db(sql)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "11d13321",
+ "metadata": {},
+ "source": [
+ "In this example, we can take the dataframe and inspect all the features that have a recurrence category of 2%. Interestingly, we will find that many of the features that have a recurrence category of 2% also have recurrence flow values of 0."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "682af5a3",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "df.loc[df['recur_cat'] == \"2\"]"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "0ede9df0",
+ "metadata": {},
+ "source": [
+ "Maybe we need to just remove any features that have a 2% flow of 0 or set the recurrence category to \"Not Available\". We can write some queries in the dataframe to pick out all the features with a 2% flow of 0 to see how extensive this issue is and what the range is for streamflows."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "d21ed782",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "df.loc[df['flow_50yr'] == 0].sort_values('max_flow')"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "affca7a8",
+ "metadata": {},
+ "source": [
+ "
Comparing Dataframe to Testing Datasets
\n",
+ "\n",
+ "We could also compare the table to a test dataset to ensure that what we are seeing is what we expect. In this case, the outputs are not the same because the two datasets come from different reference times."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "f4bdb322",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "test_dataset = pd.read_csv(\"test_datasets/srf_max_high_flow_magnitude_example.csv\")\n",
+ "\n",
+ "test_dataset.equals(df)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "921e5435",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "display(df)\n",
+ "display(test_dataset)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "d3d47161",
+ "metadata": {},
+ "source": [
+ "
Check Outputs Through Mapping
\n",
+ "\n",
+ "Previously, if we ever wanted to check outputs, we would have to run our python code, create the outputs, and then bring them into ArcGIS Pro to verify. With the \"run_sql_in_db\" function, we can add a keyword argument (return_geodataframe=True) to return a spatially aware dataframe. With this dataframe, we can now plot the outputs directly in the notebook through the plot method. To ease the development, we have created a simple function to map a column (map_column). More complex maps can be created directly through the plot method if desired.\n",
+ "\n",
+ "In the first example, we will retrieve the table that is used for srf max high flow magnitude. We can then use the map_column function to create a map with the same categories and symbology to make sure the outputs look how we expect."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 2,
+ "id": "e6aba7f8",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "sql = \"\"\"\n",
+ "WITH high_flow_mag AS (\n",
+ " SELECT \n",
+ " maxflows.feature_id,\n",
+ " maxflows.maxflow_18hour AS max_flow,\n",
+ " CASE\n",
+ " WHEN maxflows.maxflow_18hour >= thresholds.rf_50_0_17c THEN '2'::text\n",
+ " WHEN maxflows.maxflow_18hour >= thresholds.rf_25_0_17c THEN '4'::text\n",
+ " WHEN maxflows.maxflow_18hour >= thresholds.rf_10_0_17c THEN '10'::text\n",
+ " WHEN maxflows.maxflow_18hour >= thresholds.rf_5_0_17c THEN '20'::text\n",
+ " WHEN maxflows.maxflow_18hour >= thresholds.rf_2_0_17c THEN '50'::text\n",
+ " WHEN maxflows.maxflow_18hour >= thresholds.high_water_threshold THEN '>50'::text\n",
+ " ELSE NULL::text\n",
+ " END AS recur_cat,\n",
+ " thresholds.high_water_threshold AS high_water_threshold,\n",
+ " thresholds.rf_2_0_17c AS flow_2yr,\n",
+ " thresholds.rf_5_0_17c AS flow_5yr,\n",
+ " thresholds.rf_10_0_17c AS flow_10yr,\n",
+ " thresholds.rf_25_0_17c AS flow_25yr,\n",
+ " thresholds.rf_50_0_17c AS flow_50yr\n",
+ " FROM cache.max_flows_srf AS maxflows\n",
+ " JOIN derived.recurrence_flows_conus AS thresholds ON maxflows.feature_id = thresholds.feature_id\n",
+ " WHERE thresholds.high_water_threshold > 0::double precision AND maxflows.maxflow_18hour >= thresholds.high_water_threshold\n",
+ ")\n",
+ "SELECT \n",
+ " channels.feature_id,\n",
+ " channels.feature_id::TEXT AS feature_id_str,\n",
+ " channels.strm_order,\n",
+ " channels.name,\n",
+ " channels.huc6,\n",
+ " channels.nwm_vers,\n",
+ " high_flow_mag.max_flow,\n",
+ " high_flow_mag.recur_cat,\n",
+ " high_flow_mag.high_water_threshold,\n",
+ " high_flow_mag.flow_2yr,\n",
+ " high_flow_mag.flow_5yr,\n",
+ " high_flow_mag.flow_10yr,\n",
+ " high_flow_mag.flow_25yr,\n",
+ " high_flow_mag.flow_50yr,\n",
+ " channels.geom,\n",
+ " to_char(now()::timestamp without time zone, 'YYYY-MM-DD HH24:MI:SS UTC') AS update_time\n",
+ "FROM derived.channels_conus channels\n",
+ "JOIN high_flow_mag ON channels.feature_id = high_flow_mag.feature_id;\n",
+ "\"\"\"\n",
+ "\n",
+ "gdf = run_sql_in_db(sql, return_geodataframe=True)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "ca88af6e",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "column = \"recur_cat\"\n",
+ "\n",
+ "map_column(gdf, column)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 3,
+ "id": "b9411806",
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ ""
+ ]
+ },
+ "execution_count": 3,
+ "metadata": {},
+ "output_type": "execute_result"
+ },
+ {
+ "data": {
+ "image/png": "\n",
+ "text/plain": [
+ "This notebook is used to track and edit service metadata, including service data flows. For each service, create a markdown cell that names the service the code cell below it will edit."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 7,
+ "id": "6f7e1110",
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n"
+ ]
+ }
+ ],
+ "source": [
+ "service = 'ana_anomaly'\n",
+ "configuration = 'analysis_assim'\n",
+ "postprocess_max_flows = None\n",
+ "postprocess_service = None\n",
+ "postprocess_summary = None\n",
+ "summary = 'Streamflow Anomaly'\n",
+ "description = 'Depicts seasonal streamflow anomalies derived from the analysis and assimilation configuration of the National Water Model (NWM) over the contiguous U.S. Anomalies are based on 7-day and 14-day moving average streamflow percentiles for each reach and the current calendar day.'\n",
+ "tags = 'streamflow,anomaly,national,water,model,nwm'\n",
+ "credits = 'National Water Model, NOAA/NWS National Water Center'\n",
+ "egis_server = 'server'\n",
+ "egis_folder = 'nwm'\n",
+ "fim_service = False\n",
+ "feature_service = False\n",
+ "run = False\n",
+ "fim_configs = None\n",
+ "public_service = True\n",
+ "\n",
+ "update_service_metadata(service, configuration, summary, description, tags, credits, egis_server, egis_folder, max_flows_sql_name=postprocess_max_flows, service_sql_name=postprocess_service, summary_sql_name=postprocess_summary, fim_service=fim_service, feature_service=feature_service, run=run, fim_configs=fim_configs, public_service=public_service)\n",
+ "\n",
+ "flow_id = '1'\n",
+ "step = 'ingest'\n",
+ "file_format = 'common/data/model/com/nwm/prod/nwm.{{datetime:%Y%m%d}}/analysis_assim/nwm.t{{datetime:%H}}z.analysis_assim.channel_rt.tm00.conus.nc'\n",
+ "source_table = None\n",
+ "target_table = None\n",
+ "target_keys = None\n",
+ "file_step = '1H'\n",
+ "file_window = 'P14D'\n",
+ "\n",
+ "update_service_data_flows(service, flow_id, step, source_table, target_table, target_keys, file_step=file_step, file_window=file_window, file_format=file_format)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "90439600",
+ "metadata": {},
+ "source": [
+ "AnA High Flow Magnitude"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 8,
+ "id": "5d84a41f",
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n"
+ ]
+ }
+ ],
+ "source": [
+ "service = 'ana_high_flow_magnitude'\n",
+ "configuration = 'analysis_assim'\n",
+ "postprocess_max_flows = 'ana_max_flows'\n",
+ "postprocess_service = 'ana_high_flow_magnitude'\n",
+ "postprocess_summary = None\n",
+ "summary = 'High Flow Magnitude'\n",
+ "description = 'Depicts the magnitude of the National Water Model (NWM) streamflow forecast where the NWM is signaling high water. This service is derived from the analysis and assimilation configuration of the NWM over the contiguous U.S. Shown are reaches with flow at or above high water thresholds. Reaches are colored by the annual exceedance probability (AEP) of their current flow. High water thresholds (regionally varied) and AEPs were derived using the 40-year NWM v2.1 reanalysis simulation.'\n",
+ "tags = 'high flow, magnitude, national water model, nwm'\n",
+ "credits = 'National Water Model, NOAA/NWS National Water Center'\n",
+ "egis_server = 'server'\n",
+ "egis_folder = 'nwm'\n",
+ "fim_service = False\n",
+ "feature_service = False\n",
+ "run = True\n",
+ "fim_configs = None\n",
+ "public_service = True\n",
+ "\n",
+ "update_service_metadata(service, configuration, summary, description, tags, credits, egis_server, egis_folder, max_flows_sql_name=postprocess_max_flows, service_sql_name=postprocess_service, summary_sql_name=postprocess_summary, fim_service=fim_service, feature_service=feature_service, run=run, fim_configs=fim_configs, public_service=public_service)\n",
+ "\n",
+ "flow_id = '1'\n",
+ "step = 'ingest'\n",
+ "file_format = 'common/data/model/com/nwm/prod/nwm.{{datetime:%Y%m%d}}/analysis_assim/nwm.t{{datetime:%H}}z.analysis_assim.channel_rt.tm00.conus.nc'\n",
+ "source_table = None\n",
+ "target_table = 'ingest.nwm_channel_rt_ana'\n",
+ "target_keys = '(feature_id, streamflow)'\n",
+ "file_step = None\n",
+ "file_window = None\n",
+ "\n",
+ "update_service_data_flows(service, flow_id, step, source_table, target_table, target_keys, file_step=file_step, file_window=file_window, file_format=file_format)\n",
+ "\n",
+ "flow_id = '2'\n",
+ "step = 'max_flows'\n",
+ "file_format = None\n",
+ "source_table = 'ingest.nwm_channel_rt_ana'\n",
+ "target_table = 'cache.max_flows_ana'\n",
+ "target_keys = '(feature_id, streamflow)'\n",
+ "file_step = None\n",
+ "file_window = None\n",
+ "\n",
+ "update_service_data_flows(service, flow_id, step, source_table, target_table, target_keys, file_step=file_step, file_window=file_window, file_format=file_format)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "004ff078",
+ "metadata": {},
+ "source": [
+ "AnA High Flow Magnitude Hawaii"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 9,
+ "id": "2e96bb31",
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n"
+ ]
+ }
+ ],
+ "source": [
+ "service = 'ana_high_flow_magnitude_hi'\n",
+ "configuration = 'analysis_assim_hawaii'\n",
+ "postprocess_max_flows = 'ana_max_flows_hi'\n",
+ "postprocess_service = 'ana_high_flow_magnitude_hi'\n",
+ "postprocess_summary = None\n",
+ "summary = 'High Flow Magnitude Hawaii'\n",
+ "description = 'Depicts the magnitude of the National Water Model (NWM) streamflow forecast where the NWM is signaling high water. This service is derived from the analysis and assimilation configuration of the NWM over Hawaii. Shown are reaches with flow at or above high water thresholds. Reaches are colored by the annual exceedance probability (AEP) of their current flow. High water thresholds and AEPs were derived from USGS regression equations found at https://pubs.usgs.gov/sir/2010/5035/sir2010-5035_text.pdf.'\n",
+ "tags = 'high,flow,magnitude,national,water,model,nwm,hawaii,hi'\n",
+ "credits = 'National Water Model, NOAA/NWS National Water Center'\n",
+ "egis_server = 'server'\n",
+ "egis_folder = 'nwm'\n",
+ "fim_service = False\n",
+ "feature_service = False\n",
+ "run = True\n",
+ "fim_configs = None\n",
+ "public_service = True\n",
+ "\n",
+ "update_service_metadata(service, configuration, summary, description, tags, credits, egis_server, egis_folder, max_flows_sql_name=postprocess_max_flows, service_sql_name=postprocess_service, summary_sql_name=postprocess_summary, fim_service=fim_service, feature_service=feature_service, run=run, fim_configs=fim_configs, public_service=public_service)\n",
+ "\n",
+ "flow_id = '1'\n",
+ "step = 'ingest'\n",
+ "file_format = 'common/data/model/com/nwm/prod/nwm.{{datetime:%Y%m%d}}/analysis_assim_hawaii/nwm.t{{datetime:%H}}z.analysis_assim.channel_rt.tm0000.hawaii.nc'\n",
+ "source_table = None\n",
+ "target_table = 'ingest.nwm_channel_rt_ana_hi'\n",
+ "target_keys = '(feature_id, streamflow)'\n",
+ "file_step = None\n",
+ "file_window = None\n",
+ "\n",
+ "update_service_data_flows(service, flow_id, step, source_table, target_table, target_keys, file_step=file_step, file_window=file_window, file_format=file_format)\n",
+ "\n",
+ "flow_id = '2'\n",
+ "step = 'max_flows'\n",
+ "file_format = None\n",
+ "source_table = 'ingest.nwm_channel_rt_ana_hi'\n",
+ "target_table = 'cache.max_flows_ana_hi'\n",
+ "target_keys = '(feature_id, streamflow)'\n",
+ "file_step = None\n",
+ "file_window = None\n",
+ "\n",
+ "update_service_data_flows(service, flow_id, step, source_table, target_table, target_keys, file_step=file_step, file_window=file_window, file_format=file_format)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "afbce706",
+ "metadata": {},
+ "source": [
+ "AnA High Flow Magnitude Puerto Rico"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 10,
+ "id": "5502b0f2",
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n"
+ ]
+ }
+ ],
+ "source": [
+ "service = 'ana_high_flow_magnitude_prvi'\n",
+ "configuration = 'analysis_assim_puertorico'\n",
+ "postprocess_max_flows = 'ana_max_flows_prvi'\n",
+ "postprocess_service = 'ana_high_flow_magnitude_prvi'\n",
+ "postprocess_summary = None\n",
+ "summary = 'High Flow Magnitude Puerto Rico/Virgin Islands'\n",
+ "description = 'Depicts the magnitude of the National Water Model (NWM) streamflow forecast where the NWM is signaling high water. This service is derived from the analysis and assimilation configuration of the NWM over Puerto Rico and the U.S. Virgin Islands. Shown are reaches with flow at or above high water thresholds. Reaches are colored by the annual exceedance probability (AEP) of their current flow. High water thresholds and AEPs were derived from USGS regression equations found at https://pubs.usgs.gov/wri/wri994142/pdf/wri99-4142.pdf.'\n",
+ "tags = 'high,flow,magnitude,national,water,model,nwm,puerto,rico,virgin,islands,prvi'\n",
+ "credits = 'National Water Model, NOAA/NWS National Water Center'\n",
+ "egis_server = 'server'\n",
+ "egis_folder = 'nwm'\n",
+ "fim_service = False\n",
+ "feature_service = False\n",
+ "run = True\n",
+ "fim_configs = None\n",
+ "public_service = True\n",
+ "\n",
+ "update_service_metadata(service, configuration, summary, description, tags, credits, egis_server, egis_folder, max_flows_sql_name=postprocess_max_flows, service_sql_name=postprocess_service, summary_sql_name=postprocess_summary, fim_service=fim_service, feature_service=feature_service, run=run, fim_configs=fim_configs, public_service=public_service)\n",
+ "\n",
+ "flow_id = '1'\n",
+ "step = 'ingest'\n",
+ "file_format = 'common/data/model/com/nwm/prod/nwm.{{datetime:%Y%m%d}}/analysis_assim_puertorico/nwm.t{{datetime:%H}}z.analysis_assim.channel_rt.tm00.puertorico.nc'\n",
+ "source_table = None\n",
+ "target_table = 'ingest.nwm_channel_rt_ana_prvi'\n",
+ "target_keys = '(feature_id, streamflow)'\n",
+ "file_step = None\n",
+ "file_window = None\n",
+ "\n",
+ "update_service_data_flows(service, flow_id, step, source_table, target_table, target_keys, file_step=file_step, file_window=file_window, file_format=file_format)\n",
+ "\n",
+ "flow_id = '2'\n",
+ "step = 'max_flows'\n",
+ "file_format = None\n",
+ "source_table = 'ingest.nwm_channel_rt_ana_prvi'\n",
+ "target_table = 'cache.max_flows_ana_prvi'\n",
+ "target_keys = '(feature_id, streamflow)'\n",
+ "file_step = None\n",
+ "file_window = None\n",
+ "\n",
+ "update_service_data_flows(service, flow_id, step, source_table, target_table, target_keys, file_step=file_step, file_window=file_window, file_format=file_format)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "1c9f8781",
+ "metadata": {},
+ "source": [
+ "AnA Past 14-Day High Flow Magnitude Analysis"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 3,
+ "id": "bf515022",
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n"
+ ]
+ }
+ ],
+ "source": [
+ "service = 'ana_past_14day_max_high_flow_magnitude'\n",
+ "configuration = 'analysis_assim_14day'\n",
+ "postprocess_max_flows = 'ana_14day_max_flows'\n",
+ "postprocess_service = 'ana_past_14day_max_high_flow_magnitude'\n",
+ "postprocess_summary = None\n",
+ "summary = 'Past 14-Day High Flow Magnitude Analysis'\n",
+ "description = 'Depicts the magnitude of the peak NWM streamflow analysis over the past 14 days where the National Water Model (NWM) signaled high water. This service is derived from the analysis and assimilation configuration of the NWM over the contiguous U.S. Shown are reaches with flow at or above high water thresholds. Reaches are colored by the annual exceedance probability (AEP) of their maximum flow over the past 14 days. High water thresholds (regionally varied) and AEPs were derived using the 40-year NWM v2.1 reanalysis simulation.'\n",
+ "tags = 'high,flow,magnitude,national,water,model,nwm,14day'\n",
+ "credits = 'National Water Model, NOAA/NWS National Water Center'\n",
+ "egis_server = 'server'\n",
+ "egis_folder = 'nwm'\n",
+ "fim_service = False\n",
+ "feature_service = False\n",
+ "run = True\n",
+ "fim_configs = None\n",
+ "public_service = True\n",
+ "\n",
+ "update_service_metadata(service, configuration, summary, description, tags, credits, egis_server, egis_folder, max_flows_sql_name=postprocess_max_flows, service_sql_name=postprocess_service, summary_sql_name=postprocess_summary, fim_service=fim_service, feature_service=feature_service, run=run, fim_configs=fim_configs, public_service=public_service)\n",
+ "\n",
+ "flow_id = '1'\n",
+ "step = 'ingest'\n",
+ "file_format = 'max_flows/analysis_assim/{{datetime:%Y%m%d}}/ana_14day_00_max_flows.nc'\n",
+ "source_table = None\n",
+ "target_table = 'ingest.nwm_channel_rt_ana_14day_max'\n",
+ "target_keys = '(feature_id, streamflow)'\n",
+ "file_step = None\n",
+ "file_window = None\n",
+ "\n",
+ "update_service_data_flows(service, flow_id, step, source_table, target_table, target_keys, file_step=file_step, file_window=file_window, file_format=file_format)\n",
+ "\n",
+ "flow_id = '2'\n",
+ "step = 'ingest'\n",
+ "file_format = 'max_flows/analysis_assim/{{datetime:%Y%m%d}}/ana_7day_00_max_flows.nc'\n",
+ "source_table = None\n",
+ "target_table = 'ingest.nwm_channel_rt_ana_7day_max'\n",
+ "target_keys = '(feature_id, streamflow)'\n",
+ "file_step = None\n",
+ "file_window = None\n",
+ "\n",
+ "update_service_data_flows(service, flow_id, step, source_table, target_table, target_keys, file_step=file_step, file_window=file_window, file_format=file_format)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "6ae47b80",
+ "metadata": {},
+ "source": [
+ "AnA Streamflow"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 13,
+ "id": "bc99f1b4",
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n"
+ ]
+ }
+ ],
+ "source": [
+ "service = 'ana_streamflow'\n",
+ "configuration = 'analysis_assim'\n",
+ "postprocess_max_flows = 'ana_max_flows'\n",
+ "postprocess_service = 'ana_streamflow'\n",
+ "postprocess_summary = None\n",
+ "summary = 'Streamflow Analysis'\n",
+ "description = 'Depicts the streamflow output from the operational National Water Model (v2.1) analysis and assimilation for the contiguous United States. Updated hourly.'\n",
+ "tags = 'streamflow,national,water,model,nwm'\n",
+ "credits = 'National Water Model, NOAA/NWS National Water Center'\n",
+ "egis_server = 'server'\n",
+ "egis_folder = 'nwm'\n",
+ "fim_service = False\n",
+ "feature_service = False\n",
+ "run = False\n",
+ "fim_configs = None\n",
+ "public_service = False\n",
+ "\n",
+ "update_service_metadata(service, configuration, summary, description, tags, credits, egis_server, egis_folder, max_flows_sql_name=postprocess_max_flows, service_sql_name=postprocess_service, summary_sql_name=postprocess_summary, fim_service=fim_service, feature_service=feature_service, run=run, fim_configs=fim_configs, public_service=public_service)\n",
+ "\n",
+ "flow_id = '1'\n",
+ "step = 'ingest'\n",
+ "file_format = 'common/data/model/com/nwm/prod/nwm.{{datetime:%Y%m%d}}/analysis_assim/nwm.t{{datetime:%H}}z.analysis_assim.channel_rt.tm00.conus.nc'\n",
+ "source_table = None\n",
+ "target_table = 'ingest.nwm_channel_rt_ana'\n",
+ "target_keys = '(feature_id, streamflow)'\n",
+ "file_step = None\n",
+ "file_window = None\n",
+ "\n",
+ "update_service_data_flows(service, flow_id, step, source_table, target_table, target_keys, file_step=file_step, file_window=file_window, file_format=file_format)\n",
+ "\n",
+ "flow_id = '2'\n",
+ "step = 'max_flows'\n",
+ "file_format = None\n",
+ "source_table = 'ingest.nwm_channel_rt_ana'\n",
+ "target_table = 'cache.max_flows_ana'\n",
+ "target_keys = '(feature_id, streamflow)'\n",
+ "file_step = None\n",
+ "file_window = None\n",
+ "\n",
+ "update_service_data_flows(service, flow_id, step, source_table, target_table, target_keys, file_step=file_step, file_window=file_window, file_format=file_format)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "289863ea",
+ "metadata": {},
+ "source": [
+ "AnA Streamflow Hawaii"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 72,
+ "id": "5d3378cf",
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n"
+ ]
+ }
+ ],
+ "source": [
+ "service = 'ana_streamflow_hi'\n",
+ "configuration = 'analysis_assim_hawaii'\n",
+ "postprocess_max_flows = 'ana_max_flows_hi'\n",
+ "postprocess_service = 'ana_streamflow_hi'\n",
+ "postprocess_summary = None\n",
+ "summary = 'Streamflow Analysis for Hawaii'\n",
+ "description = 'Depicts the streamflow output from the operational National Water Model (v2.2) analysis and assimilation for the state of Hawaii. Updated hourly.'\n",
+ "tags = 'national water model, nwm, hawaii'\n",
+ "credits = 'National Water Model, NOAA/NWS National Water Center'\n",
+ "egis_server = 'server'\n",
+ "egis_folder = 'nwm'\n",
+ "fim_service = False\n",
+ "feature_service = False\n",
+ "public_service = False\n",
+ "run = False\n",
+ "fim_configs = None\n",
+ "\n",
+ "update_service_metadata(service, configuration, summary, description, tags, credits, egis_server, egis_folder, max_flows_sql_name=postprocess_max_flows, service_sql_name=postprocess_service, summary_sql_name=postprocess_summary, fim_service=fim_service, feature_service=feature_service, run=run, fim_configs=fim_configs, public_service=public_service)\n",
+ "\n",
+ "flow_id = '1'\n",
+ "step = 'ingest'\n",
+ "file_format = 'common/data/model/com/nwm/prod/nwm.{{datetime:%Y%m%d}}/analysis_assim_hawaii/nwm.t{{datetime:%H}}z.analysis_assim.channel_rt.tm0000.hawaii.nc'\n",
+ "source_table = None\n",
+ "target_table = 'ingest.nwm_channel_rt_ana_hi'\n",
+ "target_keys = '(feature_id, streamflow)'\n",
+ "file_step = None\n",
+ "file_window = None\n",
+ "\n",
+ "update_service_data_flows(service, flow_id, step, source_table, target_table, target_keys, file_step=file_step, file_window=file_window, file_format=file_format)\n",
+ "\n",
+ "flow_id = '2'\n",
+ "step = 'max_flows'\n",
+ "file_format = None\n",
+ "source_table = 'ingest.nwm_channel_rt_ana_hi'\n",
+ "target_table = 'cache.max_flows_ana_hi'\n",
+ "target_keys = '(feature_id, streamflow)'\n",
+ "file_step = None\n",
+ "file_window = None\n",
+ "\n",
+ "update_service_data_flows(service, flow_id, step, source_table, target_table, target_keys, file_step=file_step, file_window=file_window, file_format=file_format)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "205f2d34",
+ "metadata": {},
+ "source": [
+ "AnA Streamflow Puerto Rico"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 73,
+ "id": "5f27935f",
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n"
+ ]
+ }
+ ],
+ "source": [
+ "service = 'ana_streamflow_prvi'\n",
+ "configuration = 'analysis_assim_puertorico'\n",
+ "postprocess_max_flows = 'ana_max_flows_prvi'\n",
+ "postprocess_service = 'ana_streamflow_prvi'\n",
+ "postprocess_summary = None\n",
+ "summary = 'Streamflow Analysis for Puerto Rico and Virgin Islands'\n",
+ "description = 'Depicts the streamflow output from the operational National Water Model (v2.2) analysis and assimilation for Puerto Rico and the U.S. Virgin Islands. Updated hourly.'\n",
+ "tags = 'streamflow,national,water,model,nwm,puerto,rico,virgin,islands,prvi'\n",
+ "credits = 'National Water Model, NOAA/NWS National Water Center'\n",
+ "egis_server = 'server'\n",
+ "egis_folder = 'nwm'\n",
+ "fim_service = False\n",
+ "feature_service = False\n",
+ "public_service = False\n",
+ "run = False\n",
+ "fim_configs = None\n",
+ "\n",
+ "update_service_metadata(service, configuration, summary, description, tags, credits, egis_server, egis_folder, max_flows_sql_name=postprocess_max_flows, service_sql_name=postprocess_service, summary_sql_name=postprocess_summary, fim_service=fim_service, feature_service=feature_service, run=run, fim_configs=fim_configs, public_service=public_service)\n",
+ "\n",
+ "flow_id = '1'\n",
+ "step = 'ingest'\n",
+ "file_format = 'common/data/model/com/nwm/prod/nwm.{{datetime:%Y%m%d}}/analysis_assim_puertorico/nwm.t{{datetime:%H}}z.analysis_assim.channel_rt.tm00.puertorico.nc'\n",
+ "source_table = None\n",
+ "target_table = 'ingest.nwm_channel_rt_ana_prvi'\n",
+ "target_keys = '(feature_id, streamflow)'\n",
+ "file_step = None\n",
+ "file_window = None\n",
+ "\n",
+ "update_service_data_flows(service, flow_id, step, source_table, target_table, target_keys, file_step=file_step, file_window=file_window, file_format=file_format)\n",
+ "\n",
+ "flow_id = '2'\n",
+ "step = 'max_flows'\n",
+ "file_format = None\n",
+ "source_table = 'ingest.nwm_channel_rt_ana_prvi'\n",
+ "target_table = 'cache.max_flows_ana_prvi'\n",
+ "target_keys = '(feature_id, streamflow)'\n",
+ "file_step = None\n",
+ "file_window = None\n",
+ "\n",
+ "update_service_data_flows(service, flow_id, step, source_table, target_table, target_keys, file_step=file_step, file_window=file_window, file_format=file_format)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "73b70347",
+ "metadata": {},
+ "source": [
+ "Medium-Range High Water Arrival Time Forecast"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 13,
+ "id": "1feba880",
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n"
+ ]
+ }
+ ],
+ "source": [
+ "service = 'mrf_high_water_arrival_time'\n",
+ "configuration = 'medium_range_mem1'\n",
+ "postprocess_max_flows = None\n",
+ "postprocess_service = 'mrf_high_water_arrival_time'\n",
+ "postprocess_summary = None\n",
+ "summary = 'Medium-Range High Water Arrival Time Forecast'\n",
+ "description = 'Depicts the forecast arrival time of high water over the next 10 days. This service is derived from the medium-range configuration of the National Water Model (NWM) over the contiguous U.S. Shown are reaches that are expected to have flow at or above the high water threshold over the next 10 days. Reaches are colored by the time at which they are forecast to reach high water (calculated in 3 hour increments). High water thresholds (regionally varied) were derived using the 40-year NWM v2.1 reanalysis simulation.'\n",
+ "tags = 'medium range forecast, high water, arrival time, national water model, nwm'\n",
+ "credits = 'National Water Model, NOAA/NWS National Water Center'\n",
+ "egis_server = 'server'\n",
+ "egis_folder = 'nwm'\n",
+ "fim_service = False\n",
+ "feature_service = True\n",
+ "run = True\n",
+ "fim_configs = None\n",
+ "public_service = True\n",
+ "\n",
+ "update_service_metadata(service, configuration, summary, description, tags, credits, egis_server, egis_folder, max_flows_sql_name=postprocess_max_flows, service_sql_name=postprocess_service, summary_sql_name=postprocess_summary, fim_service=fim_service, feature_service=feature_service, run=run, fim_configs=fim_configs, public_service=public_service)\n",
+ "\n",
+ "flow_id = '1'\n",
+ "step = 'ingest'\n",
+ "file_format = \"common/data/model/com/nwm/prod/nwm.{{datetime:%Y%m%d}}/medium_range_mem1/nwm.t{{datetime:%H}}z.medium_range.channel_rt_1.f{{range:3,241,3,%03d}}.conus.nc\"\n",
+ "source_table = None\n",
+ "target_table = 'ingest.nwm_channel_rt_mrf_mem1'\n",
+ "target_keys = '(feature_id, streamflow)'\n",
+ "file_step = None\n",
+ "file_window = None\n",
+ "\n",
+ "update_service_data_flows(service, flow_id, step, source_table, target_table, target_keys, file_step=file_step, file_window=file_window, file_format=file_format)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "2cc68d4c",
+ "metadata": {},
+ "source": [
+ "Medium-Range High Water Probability Forecast"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 14,
+ "id": "4a576091",
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n"
+ ]
+ }
+ ],
+ "source": [
+ "service = 'mrf_high_water_probability'\n",
+ "configuration = 'medium_range_mem1'\n",
+ "postprocess_max_flows = None\n",
+ "postprocess_service = None\n",
+ "postprocess_summary = None\n",
+ "summary = 'Medium-Range High Water Probability Forecast'\n",
+ "description = 'Depicts the probability of forecast high water over the next 5 days using ensembles from the medium-range configuration of the National Water Model (NWM) over the contiguous U.S. Shown are reaches that are expected to have flow at or above high water on Day 1, Day 2, Day 3, and Days 4-5, using the 7 ensemble members of the medium-range forecast. Reaches are colored by the probability that they will meet or exceed the high water threshold on Day 1, Day 2, Day 3, and Days 4-5. Probabilities are computed as the % agreement across the 7 ensemble members, equally weighted. Also shown are USGS HUC8 polygons for basins with greater than 50% of NWM features with flow expected to be at or above high water over the next 5 days, symbolized by the average probability. High water thresholds (regionally varied) were derived using the 40-year NWM v2.1 reanalysis simulation.'\n",
+ "tags = 'medium,range,forecast,high,water,ensemble,national,water,model,nwm,probability'\n",
+ "credits = 'National Water Model, NOAA/NWS National Water Center'\n",
+ "egis_server = 'server'\n",
+ "egis_folder = 'nwm'\n",
+ "fim_service = False\n",
+ "feature_service = False\n",
+ "run = False\n",
+ "fim_configs = None\n",
+ "public_service = True\n",
+ "\n",
+ "update_service_metadata(service, configuration, summary, description, tags, credits, egis_server, egis_folder, max_flows_sql_name=postprocess_max_flows, service_sql_name=postprocess_service, summary_sql_name=postprocess_summary, fim_service=fim_service, feature_service=feature_service, run=run, fim_configs=fim_configs, public_service=public_service)\n",
+ "\n",
+ "flow_id = '1'\n",
+ "step = 'ingest'\n",
+ "file_format = \"common/data/model/com/nwm/prod/nwm.{{datetime:%Y%m%d}}/medium_range_mem1/nwm.t{{datetime:%H}}z.medium_range.channel_rt_1.f{{range:3,121,3,%03d}}.conus.nc\"\n",
+ "source_table = None\n",
+ "target_table = 'ingest.nwm_channel_rt_mrf_mem1'\n",
+ "target_keys = '(feature_id, streamflow)'\n",
+ "file_step = None\n",
+ "file_window = None\n",
+ "\n",
+ "update_service_data_flows(service, flow_id, step, source_table, target_table, target_keys, file_step=file_step, file_window=file_window, file_format=file_format)\n",
+ "\n",
+ "flow_id = '2'\n",
+ "step = 'ingest'\n",
+ "file_format = \"common/data/model/com/nwm/prod/nwm.{{datetime:%Y%m%d}}/medium_range_mem2/nwm.t{{datetime:%H}}z.medium_range.channel_rt_2.f{{range:3,121,3,%03d}}.conus.nc\"\n",
+ "source_table = None\n",
+ "target_table = 'ingest.nwm_channel_rt_mrf_mem2'\n",
+ "target_keys = '(feature_id, streamflow)'\n",
+ "file_step = None\n",
+ "file_window = None\n",
+ "\n",
+ "update_service_data_flows(service, flow_id, step, source_table, target_table, target_keys, file_step=file_step, file_window=file_window, file_format=file_format)\n",
+ "\n",
+ "flow_id = '3'\n",
+ "step = 'ingest'\n",
+ "file_format = \"common/data/model/com/nwm/prod/nwm.{{datetime:%Y%m%d}}/medium_range_mem3/nwm.t{{datetime:%H}}z.medium_range.channel_rt_3.f{{range:3,121,3,%03d}}.conus.nc\"\n",
+ "source_table = None\n",
+ "target_table = 'ingest.nwm_channel_rt_mrf_mem3'\n",
+ "target_keys = '(feature_id, streamflow)'\n",
+ "file_step = None\n",
+ "file_window = None\n",
+ "\n",
+ "update_service_data_flows(service, flow_id, step, source_table, target_table, target_keys, file_step=file_step, file_window=file_window, file_format=file_format)\n",
+ "\n",
+ "flow_id = '4'\n",
+ "step = 'ingest'\n",
+ "file_format = \"common/data/model/com/nwm/prod/nwm.{{datetime:%Y%m%d}}/medium_range_mem4/nwm.t{{datetime:%H}}z.medium_range.channel_rt_4.f{{range:3,121,3,%03d}}.conus.nc\"\n",
+ "source_table = None\n",
+ "target_table = 'ingest.nwm_channel_rt_mrf_mem4'\n",
+ "target_keys = '(feature_id, streamflow)'\n",
+ "file_step = None\n",
+ "file_window = None\n",
+ "\n",
+ "update_service_data_flows(service, flow_id, step, source_table, target_table, target_keys, file_step=file_step, file_window=file_window, file_format=file_format)\n",
+ "\n",
+ "flow_id = '5'\n",
+ "step = 'ingest'\n",
+ "file_format = \"common/data/model/com/nwm/prod/nwm.{{datetime:%Y%m%d}}/medium_range_mem5/nwm.t{{datetime:%H}}z.medium_range.channel_rt_5.f{{range:3,121,3,%03d}}.conus.nc\"\n",
+ "source_table = None\n",
+ "target_table = 'ingest.nwm_channel_rt_mrf_mem5'\n",
+ "target_keys = '(feature_id, streamflow)'\n",
+ "file_step = None\n",
+ "file_window = None\n",
+ "\n",
+ "update_service_data_flows(service, flow_id, step, source_table, target_table, target_keys, file_step=file_step, file_window=file_window, file_format=file_format)\n",
+ "\n",
+ "flow_id = '6'\n",
+ "step = 'ingest'\n",
+ "file_format = \"common/data/model/com/nwm/prod/nwm.{{datetime:%Y%m%d}}/medium_range_mem6/nwm.t{{datetime:%H}}z.medium_range.channel_rt_6.f{{range:3,121,3,%03d}}.conus.nc\"\n",
+ "source_table = None\n",
+ "target_table = 'ingest.nwm_channel_rt_mrf_mem6'\n",
+ "target_keys = '(feature_id, streamflow)'\n",
+ "file_step = None\n",
+ "file_window = None\n",
+ "\n",
+ "update_service_data_flows(service, flow_id, step, source_table, target_table, target_keys, file_step=file_step, file_window=file_window, file_format=file_format)\n",
+ "\n",
+ "flow_id = '7'\n",
+ "step = 'ingest'\n",
+ "file_format = \"common/data/model/com/nwm/prod/nwm.{{datetime:%Y%m%d}}/medium_range_mem7/nwm.t{{datetime:%H}}z.medium_range.channel_rt_7.f{{range:3,121,3,%03d}}.conus.nc\"\n",
+ "source_table = None\n",
+ "target_table = 'ingest.nwm_channel_rt_mrf_mem7'\n",
+ "target_keys = '(feature_id, streamflow)'\n",
+ "file_step = None\n",
+ "file_window = None\n",
+ "\n",
+ "update_service_data_flows(service, flow_id, step, source_table, target_table, target_keys, file_step=file_step, file_window=file_window, file_format=file_format)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "9b0b7820",
+ "metadata": {},
+ "source": [
+ "Medium-Range Maximum High Flow Magnitude Forecast"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 15,
+ "id": "ba20e1c0",
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n"
+ ]
+ }
+ ],
+ "source": [
+ "service = 'mrf_max_high_flow_magnitude'\n",
+ "configuration = 'medium_range_mem1'\n",
+ "postprocess_max_flows = 'mrf_max_flows'\n",
+ "postprocess_service = 'mrf_max_high_flow_magnitude'\n",
+ "postprocess_summary = None\n",
+ "summary = 'Medium-Range Maximum High Flow Magnitude Forecast'\n",
+ "description = 'Depicts the magnitude of the peak National Water Model (NWM) streamflow forecast over the next 3, 5 and 10 days where the NWM is signaling high water. This service is derived from the medium-range configuration of the NWM over the contiguous U.S. Shown are reaches with peak flow at or above high water thresholds. Reaches are colored by the annual exceedance probability (AEP) of their forecast peak flow. High water thresholds (regionally varied) and AEPs were derived using the 40-year NWM v2.1 reanalysis simulation.'\n",
+ "tags = 'medium range forecast, maximum, high flow, magnitude, national water model, nwm'\n",
+ "credits = 'National Water Model, NOAA/NWS National Water Center'\n",
+ "egis_server = 'server'\n",
+ "egis_folder = 'nwm'\n",
+ "fim_service = False\n",
+ "feature_service = False\n",
+ "run = True\n",
+ "fim_configs = None\n",
+ "public_service = True\n",
+ "\n",
+ "update_service_metadata(service, configuration, summary, description, tags, credits, egis_server, egis_folder, max_flows_sql_name=postprocess_max_flows, service_sql_name=postprocess_service, summary_sql_name=postprocess_summary, fim_service=fim_service, feature_service=feature_service, run=run, fim_configs=fim_configs, public_service=public_service)\n",
+ "\n",
+ "flow_id = '1'\n",
+ "step = 'ingest'\n",
+ "file_format = 'common/data/model/com/nwm/prod/nwm.{{datetime:%Y%m%d}}/medium_range_mem1/nwm.t{{datetime:%H}}z.medium_range.channel_rt_1.f{{range:3,243,3,%03d}}.conus.nc'\n",
+ "source_table = None\n",
+ "target_table = 'ingest.nwm_channel_rt_mrf_mem1'\n",
+ "target_keys = '(feature_id, streamflow)'\n",
+ "file_step = None\n",
+ "file_window = None\n",
+ "\n",
+ "update_service_data_flows(service, flow_id, step, source_table, target_table, target_keys, file_step=file_step, file_window=file_window, file_format=file_format)\n",
+ "\n",
+ "flow_id = '2'\n",
+ "step = 'max_flows'\n",
+ "file_format = None\n",
+ "source_table = 'ingest.nwm_channel_rt_mrf_mem1'\n",
+ "target_table = 'cache.max_flows_mrf'\n",
+ "target_keys = '(feature_id, streamflow)'\n",
+ "file_step = None\n",
+ "file_window = None\n",
+ "\n",
+ "update_service_data_flows(service, flow_id, step, source_table, target_table, target_keys, file_step=file_step, file_window=file_window, file_format=file_format)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "04a0dae6",
+ "metadata": {},
+ "source": [
+ "Medium-Range Peak Flow Arrival Time Forecast"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 14,
+ "id": "c2df6424",
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n"
+ ]
+ }
+ ],
+ "source": [
+ "service = 'mrf_peak_flow_arrival_time'\n",
+ "configuration = 'medium_range_mem1'\n",
+ "postprocess_max_flows = None\n",
+ "postprocess_service = 'mrf_peak_flow_arrival_time'\n",
+ "postprocess_summary = None\n",
+ "summary = 'Medium-Range Peak Flow Arrival Time Forecast'\n",
+ "description = 'Depicts expected peak flow arrival times derived from the operational National Water Model (NWM) (v2.2) medium-range forecast. Shown are reaches that are expected to have flow at or above the high water threshold over the next 3 and 10 days. Reaches are colored by the time at which they are expected to be at their maximum flow within the forecast period. High water flows were derived using a 40-year retrospective analysis of the NWM (v2.1). Updated hourly.'\n",
+ "tags = 'medium range forecast, peak flow, arrival time, national water model, nwm, conus'\n",
+ "credits = 'National Water Model, NOAA/NWS National Water Center'\n",
+ "egis_server = 'server'\n",
+ "egis_folder = 'nwm'\n",
+ "fim_service = False\n",
+ "feature_service = False\n",
+ "run = False\n",
+ "fim_configs = None\n",
+ "public_service = False\n",
+ "\n",
+ "update_service_metadata(service, configuration, summary, description, tags, credits, egis_server, egis_folder, max_flows_sql_name=postprocess_max_flows, service_sql_name=postprocess_service, summary_sql_name=postprocess_summary, fim_service=fim_service, feature_service=feature_service, run=run, fim_configs=fim_configs, public_service=public_service)\n",
+ "\n",
+ "flow_id = '1'\n",
+ "step = 'ingest'\n",
+ "file_format = 'common/data/model/com/nwm/prod/nwm.{{datetime:%Y%m%d}}/medium_range_mem1/nwm.t{{datetime:%H}}z.medium_range.channel_rt_1.f{{range:3,243,3,%03d}}.conus.nc'\n",
+ "source_table = None\n",
+ "target_table = 'ingest.nwm_channel_rt_mrf_mem1'\n",
+ "target_keys = '(feature_id, streamflow)'\n",
+ "file_step = None\n",
+ "file_window = None\n",
+ "\n",
+ "update_service_data_flows(service, flow_id, step, source_table, target_table, target_keys, file_step=file_step, file_window=file_window, file_format=file_format)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "35a00e3c",
+ "metadata": {},
+ "source": [
+ "Medium-Range Rapid Onset Flooding Forecast"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 16,
+ "id": "c8bb17a1",
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n"
+ ]
+ }
+ ],
+ "source": [
+ "service = 'mrf_rapid_onset_flooding'\n",
+ "configuration = 'medium_range_mem1'\n",
+ "postprocess_max_flows = None\n",
+ "postprocess_service = 'mrf_rapid_onset_flooding'\n",
+ "postprocess_summary = 'mrf_rapid_onset_flooding_hucs'\n",
+ "summary = 'Medium-Range Rapid Onset Flooding Forecast'\n",
+ "description = 'Depicts forecast rapid onset flooding using the medium-range configuration of the National Water Model (NWM) over the contiguous U.S. Shown are reaches (stream order 4 and below) with a forecast flow increase of 100% or greater within 3 hours, and which are expected to be at or above the high water threshold within 6 hours of that increase (all calculated in 3 hour increments). Also shown are USGS HUC08 polygons symbolized by the percentage of NWM waterway length (within each HUC08) that is expected to meet the previously mentioned criteria. High water thresholds (regionally varied) were derived using the 40-year NWM v2.1 reanalysis simulation.'\n",
+ "tags = 'medium, range, forecast, streamflow, rapid, onset, flood, national, water, model, nwm'\n",
+ "credits = 'National Water Model, NOAA/NWS National Water Center'\n",
+ "egis_server = 'server'\n",
+ "egis_folder = 'nwm'\n",
+ "fim_service = False\n",
+ "feature_service = False\n",
+ "run = True\n",
+ "fim_configs = None\n",
+ "public_service = True\n",
+ "\n",
+ "update_service_metadata(service, configuration, summary, description, tags, credits, egis_server, egis_folder, max_flows_sql_name=postprocess_max_flows, service_sql_name=postprocess_service, summary_sql_name=postprocess_summary, fim_service=fim_service, feature_service=feature_service, run=run, fim_configs=fim_configs, public_service=public_service)\n",
+ "\n",
+ "flow_id = '1'\n",
+ "step = 'ingest'\n",
+ "file_format = 'common/data/model/com/nwm/prod/nwm.{{datetime:%Y%m%d}}/medium_range_mem1/nwm.t{{datetime:%H}}z.medium_range.channel_rt_1.f{{range:3,243,3,%03d}}.conus.nc'\n",
+ "source_table = None\n",
+ "target_table = 'ingest.nwm_channel_rt_mrf_mem1'\n",
+ "target_keys = '(feature_id, streamflow)'\n",
+ "file_step = None\n",
+ "file_window = None\n",
+ "\n",
+ "update_service_data_flows(service, flow_id, step, source_table, target_table, target_keys, file_step=file_step, file_window=file_window, file_format=file_format)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "278ddf2c",
+ "metadata": {
+ "tags": []
+ },
+ "source": [
+ "Medium-Range Rapid Onset Flooding Probability Forecast"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 17,
+ "id": "90fd3b64",
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n"
+ ]
+ }
+ ],
+ "source": [
+ "service = 'mrf_rapid_onset_flooding_probability'\n",
+ "configuration = 'medium_range_mem1'\n",
+ "postprocess_max_flows = None\n",
+ "postprocess_service = None\n",
+ "postprocess_summary = None\n",
+ "summary = 'Medium-Range Rapid Onset Flooding Probability Forecast'\n",
+ "description = 'Depicts the probability of forecast rapid onset flooding over the next 5 days using ensembles from the medium-range configuration of the National Water Model (NWM) over the contiguous U.S. Shown are reaches that are expected to have flow at or above high water thresholds on Day 1, Day 2, Day 3, Days 4-5, and Days 1-5 using the 7 ensemble members of the medium-range forecast. Reaches are colored by the probability that they will meet or exceed rapid onset conditions on Day 1, Day 2, Day 3, Days 4-5, and Days 1-5. Probabilities are computed as the % agreement across the 7 ensemble members, equally weighted. Hotspots show average 1-5 day rapid onset flooding probability, weighted by reach length, for USGS HUC8 basins with greater than 10% of NWM feature length meeting rapid onset criteria in the next 5 days. High water thresholds (regionally varied) were derived using the 40-year NWM v2.1 reanalysis simulation.'\n",
+ "tags = 'medium,range,forecast,streamflow,rapid,onset,flood,probability,national,water,model,nwm'\n",
+ "credits = 'National Water Model, NOAA/NWS National Water Center'\n",
+ "egis_server = 'server'\n",
+ "egis_folder = 'nwm'\n",
+ "fim_service = False\n",
+ "feature_service = True\n",
+ "run = False\n",
+ "fim_configs = None\n",
+ "public_service = True\n",
+ "\n",
+ "update_service_metadata(service, configuration, summary, description, tags, credits, egis_server, egis_folder, max_flows_sql_name=postprocess_max_flows, service_sql_name=postprocess_service, summary_sql_name=postprocess_summary, fim_service=fim_service, feature_service=feature_service, run=run, fim_configs=fim_configs, public_service=public_service)\n",
+ "\n",
+ "flow_id = '1'\n",
+ "step = 'ingest'\n",
+ "file_format = \"common/data/model/com/nwm/prod/nwm.{{datetime:%Y%m%d}}/medium_range_mem1/nwm.t{{datetime:%H}}z.medium_range.channel_rt_1.f{{range:3,121,3,%03d}}.conus.nc\"\n",
+ "source_table = None\n",
+ "target_table = 'ingest.nwm_channel_rt_mrf_mem1'\n",
+ "target_keys = '(feature_id, streamflow)'\n",
+ "file_step = None\n",
+ "file_window = None\n",
+ "\n",
+ "update_service_data_flows(service, flow_id, step, source_table, target_table, target_keys, file_step=file_step, file_window=file_window, file_format=file_format)\n",
+ "\n",
+ "flow_id = '2'\n",
+ "step = 'ingest'\n",
+ "file_format = \"common/data/model/com/nwm/prod/nwm.{{datetime:%Y%m%d}}/medium_range_mem2/nwm.t{{datetime:%H}}z.medium_range.channel_rt_2.f{{range:3,121,3,%03d}}.conus.nc\"\n",
+ "source_table = None\n",
+ "target_table = 'ingest.nwm_channel_rt_mrf_mem2'\n",
+ "target_keys = '(feature_id, streamflow)'\n",
+ "file_step = None\n",
+ "file_window = None\n",
+ "\n",
+ "update_service_data_flows(service, flow_id, step, source_table, target_table, target_keys, file_step=file_step, file_window=file_window, file_format=file_format)\n",
+ "\n",
+ "flow_id = '3'\n",
+ "step = 'ingest'\n",
+ "file_format = \"common/data/model/com/nwm/prod/nwm.{{datetime:%Y%m%d}}/medium_range_mem3/nwm.t{{datetime:%H}}z.medium_range.channel_rt_3.f{{range:3,121,3,%03d}}.conus.nc\"\n",
+ "source_table = None\n",
+ "target_table = 'ingest.nwm_channel_rt_mrf_mem3'\n",
+ "target_keys = '(feature_id, streamflow)'\n",
+ "file_step = None\n",
+ "file_window = None\n",
+ "\n",
+ "update_service_data_flows(service, flow_id, step, source_table, target_table, target_keys, file_step=file_step, file_window=file_window, file_format=file_format)\n",
+ "\n",
+ "flow_id = '4'\n",
+ "step = 'ingest'\n",
+ "file_format = \"common/data/model/com/nwm/prod/nwm.{{datetime:%Y%m%d}}/medium_range_mem4/nwm.t{{datetime:%H}}z.medium_range.channel_rt_4.f{{range:3,121,3,%03d}}.conus.nc\"\n",
+ "source_table = None\n",
+ "target_table = 'ingest.nwm_channel_rt_mrf_mem4'\n",
+ "target_keys = '(feature_id, streamflow)'\n",
+ "file_step = None\n",
+ "file_window = None\n",
+ "\n",
+ "update_service_data_flows(service, flow_id, step, source_table, target_table, target_keys, file_step=file_step, file_window=file_window, file_format=file_format)\n",
+ "\n",
+ "flow_id = '5'\n",
+ "step = 'ingest'\n",
+ "file_format = \"common/data/model/com/nwm/prod/nwm.{{datetime:%Y%m%d}}/medium_range_mem5/nwm.t{{datetime:%H}}z.medium_range.channel_rt_5.f{{range:3,121,3,%03d}}.conus.nc\"\n",
+ "source_table = None\n",
+ "target_table = 'ingest.nwm_channel_rt_mrf_mem5'\n",
+ "target_keys = '(feature_id, streamflow)'\n",
+ "file_step = None\n",
+ "file_window = None\n",
+ "\n",
+ "update_service_data_flows(service, flow_id, step, source_table, target_table, target_keys, file_step=file_step, file_window=file_window, file_format=file_format)\n",
+ "\n",
+ "flow_id = '6'\n",
+ "step = 'ingest'\n",
+ "file_format = \"common/data/model/com/nwm/prod/nwm.{{datetime:%Y%m%d}}/medium_range_mem6/nwm.t{{datetime:%H}}z.medium_range.channel_rt_6.f{{range:3,121,3,%03d}}.conus.nc\"\n",
+ "source_table = None\n",
+ "target_table = 'ingest.nwm_channel_rt_mrf_mem6'\n",
+ "target_keys = '(feature_id, streamflow)'\n",
+ "file_step = None\n",
+ "file_window = None\n",
+ "\n",
+ "update_service_data_flows(service, flow_id, step, source_table, target_table, target_keys, file_step=file_step, file_window=file_window, file_format=file_format)\n",
+ "\n",
+ "flow_id = '7'\n",
+ "step = 'ingest'\n",
+ "file_format = \"common/data/model/com/nwm/prod/nwm.{{datetime:%Y%m%d}}/medium_range_mem7/nwm.t{{datetime:%H}}z.medium_range.channel_rt_7.f{{range:3,121,3,%03d}}.conus.nc\"\n",
+ "source_table = None\n",
+ "target_table = 'ingest.nwm_channel_rt_mrf_mem7'\n",
+ "target_keys = '(feature_id, streamflow)'\n",
+ "file_step = None\n",
+ "file_window = None\n",
+ "\n",
+ "update_service_data_flows(service, flow_id, step, source_table, target_table, target_keys, file_step=file_step, file_window=file_window, file_format=file_format)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "aeb4dfea",
+ "metadata": {},
+ "source": [
+ "RFC 5-Day Maximum Streamflow"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 18,
+ "id": "294f9687",
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n"
+ ]
+ }
+ ],
+ "source": [
+ "service = 'rfc_5day_max_downstream_streamflow'\n",
+ "configuration = 'replace_route'\n",
+ "postprocess_max_flows = None\n",
+ "postprocess_service = 'rfc_5day_max_downstream_streamflow'\n",
+ "postprocess_summary = 'rfc_5day_max_downstream_streamflow_rfc_points'\n",
+ "summary = 'RFC 5-Day Maximum Streamflow'\n",
+ "description = 'Depicts maximum forecast streamflow over the next 5 days derived from the official River Forecast Center (RFC) forecast routed downstream through the National Water Model (NWM) stream network. Maximum streamflows are available downstream of RFC forecast points whose forecast reaches action status or greater.'\n",
+ "tags = 'streamflow, replace and route, rfc'\n",
+ "credits = 'NOAA/NWS National Water Center'\n",
+ "egis_server = 'server'\n",
+ "egis_folder = 'rfc'\n",
+ "fim_service = False\n",
+ "feature_service = True\n",
+ "run = True\n",
+ "fim_configs = None\n",
+ "public_service = True\n",
+ "\n",
+ "update_service_metadata(service, configuration, summary, description, tags, credits, egis_server, egis_folder, max_flows_sql_name=postprocess_max_flows, service_sql_name=postprocess_service, summary_sql_name=postprocess_summary, fim_service=fim_service, feature_service=feature_service, run=run, fim_configs=fim_configs, public_service=public_service)\n",
+ "\n",
+ "flow_id = '1'\n",
+ "step = 'ingest'\n",
+ "file_format = 'max_flows/replace_route/{{datetime:%Y%m%d}}/rnr_{{datetime:%H}}_max_flows.csv'\n",
+ "source_table = None\n",
+ "target_table = 'ingest.rnr_max_flows'\n",
+ "target_keys = '(feature_id)'\n",
+ "file_step = None\n",
+ "file_window = None\n",
+ "\n",
+ "update_service_data_flows(service, flow_id, step, source_table, target_table, target_keys, file_step=file_step, file_window=file_window, file_format=file_format)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "52e9d942",
+ "metadata": {},
+ "source": [
+ "RFC Maximum Stage Forecast"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 19,
+ "id": "5bb0b000",
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n"
+ ]
+ }
+ ],
+ "source": [
+ "service = 'rfc_max_stage'\n",
+ "configuration = 'ahps'\n",
+ "postprocess_max_flows = None\n",
+ "postprocess_service = 'rfc_max_stage'\n",
+ "postprocess_summary = None\n",
+ "summary = 'RFC Maximum Stage Forecast'\n",
+ "description = 'Depicts Advanced Hydrologic Prediction Service (AHPS) River Forecast Center (RFC) forecast points with forecasts at or above action stage. Circles represent forecast points where stages are changing by less than +/- 5% over the entire forecast period. Upward-pointing triangles represent forecast points where a greater than 5% increase in stage is expected sometime during the forecast. If stage increases greater than 5% are not expected, downward-pointing triangles represent forecast points where a greater than 5% decrease in stage is expected sometime during the forecast. Forecast points are colored by their maximum forecast flood category.'\n",
+ "tags = 'rfc,ahps,forecast,maximum,stage'\n",
+ "credits = 'NOAA/NWS National Water Center'\n",
+ "egis_server = 'server'\n",
+ "egis_folder = 'rfc'\n",
+ "fim_service = False\n",
+ "feature_service = True\n",
+ "run = True\n",
+ "fim_configs = None\n",
+ "public_service = True\n",
+ "\n",
+ "update_service_metadata(service, configuration, summary, description, tags, credits, egis_server, egis_folder, max_flows_sql_name=postprocess_max_flows, service_sql_name=postprocess_service, summary_sql_name=postprocess_summary, fim_service=fim_service, feature_service=feature_service, run=run, fim_configs=fim_configs, public_service=public_service)\n",
+ "\n",
+ "flow_id = '1'\n",
+ "step = 'ingest'\n",
+ "file_format = 'max_stage/ahps/{{datetime:%Y%m%d}}/{{datetime:%H_%M}}_ahps_forecasts.csv'\n",
+ "source_table = None\n",
+ "target_table = 'ingest.ahps_forecasts'\n",
+ "target_keys = '(nws_lid)'\n",
+ "file_step = None\n",
+ "file_window = None\n",
+ "\n",
+ "update_service_data_flows(service, flow_id, step, source_table, target_table, target_keys, file_step=file_step, file_window=file_window, file_format=file_format)\n",
+ "\n",
+ "flow_id = '2'\n",
+ "step = 'ingest'\n",
+ "file_format = 'max_stage/ahps/{{datetime:%Y%m%d}}/{{datetime:%H_%M}}_ahps_metadata.csv'\n",
+ "source_table = None\n",
+ "target_table = 'ingest.ahps_metadata'\n",
+ "target_keys = '(nws_lid)'\n",
+ "file_step = None\n",
+ "file_window = None\n",
+ "\n",
+ "update_service_data_flows(service, flow_id, step, source_table, target_table, target_keys, file_step=file_step, file_window=file_window, file_format=file_format)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "fe59bf78",
+ "metadata": {},
+ "source": [
+ "Short-Range High Water Arrival Time Forecast"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 20,
+ "id": "5afc0360",
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n"
+ ]
+ }
+ ],
+ "source": [
+ "service = 'srf_high_water_arrival_time'\n",
+ "configuration = 'short_range'\n",
+ "postprocess_max_flows = None\n",
+ "postprocess_service = 'srf_high_water_arrival_time'\n",
+ "postprocess_summary = None\n",
+ "summary = 'Short-Range High Water Arrival Time Forecast'\n",
+ "description = 'Depicts the forecast arrival time of high water over the next 18 hours. This service is derived from the short-range configuration of the National Water Model (NWM) over the contiguous U.S. Shown are reaches that are expected to have flow at or above the high water threshold over the next 18 hours. Reaches are colored by the time at which they are forecast to reach high water. High water thresholds (regionally varied) were derived using the 40-year NWM v2.1 reanalysis simulation.'\n",
+ "tags = 'short range forecast, high water, arrival time, national water model, nwm'\n",
+ "credits = 'National Water Model, NOAA/NWS National Water Center'\n",
+ "egis_server = 'server'\n",
+ "egis_folder = 'nwm'\n",
+ "fim_service = False\n",
+ "feature_service = False\n",
+ "run = True\n",
+ "fim_configs = None\n",
+ "public_service = True\n",
+ "\n",
+ "update_service_metadata(service, configuration, summary, description, tags, credits, egis_server, egis_folder, max_flows_sql_name=postprocess_max_flows, service_sql_name=postprocess_service, summary_sql_name=postprocess_summary, fim_service=fim_service, feature_service=feature_service, run=run, fim_configs=fim_configs, public_service=public_service)\n",
+ "\n",
+ "flow_id = '1'\n",
+ "step = 'ingest'\n",
+ "file_format = \"common/data/model/com/nwm/prod/nwm.{{datetime:%Y%m%d}}/short_range/nwm.t{{datetime:%H}}z.short_range.channel_rt.f{{range:1,19,1,%03d}}.conus.nc\"\n",
+ "source_table = None\n",
+ "target_table = 'ingest.nwm_channel_rt_srf'\n",
+ "target_keys = '(feature_id, streamflow)'\n",
+ "file_step = None\n",
+ "file_window = None\n",
+ "\n",
+ "update_service_data_flows(service, flow_id, step, source_table, target_table, target_keys, file_step=file_step, file_window=file_window, file_format=file_format)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "81f03ea2",
+ "metadata": {},
+ "source": [
+ "Short-Range High Water Arrival Time Forecast - Hawaii"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 21,
+ "id": "99060bc2",
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n"
+ ]
+ }
+ ],
+ "source": [
+ "service = 'srf_high_water_arrival_time_hi'\n",
+ "configuration = 'short_range_hawaii'\n",
+ "postprocess_max_flows = None\n",
+ "postprocess_service = 'srf_high_water_arrival_time_hi'\n",
+ "postprocess_summary = None\n",
+ "summary = 'Short-Range High Water Arrival Time Forecast - Hawaii'\n",
+ "description = 'Depicts the forecast arrival time of high water over the next 48 hours. This service is derived from the short-range configuration of the National Water Model (NWM) over Hawaii. Shown are reaches that are expected to have flow at or above the high water threshold over the next 48 hours. Reaches are colored by the time at which they are forecast to reach high water. High water flows were derived from USGS regression equations found at https://pubs.usgs.gov/sir/2010/5035/sir2010-5035_text.pdf.'\n",
+ "tags = 'short range forecast, high water, arrival time, national water model, nwm, hawaii'\n",
+ "credits = 'National Water Model, NOAA/NWS National Water Center'\n",
+ "egis_server = 'server'\n",
+ "egis_folder = 'nwm'\n",
+ "fim_service = False\n",
+ "feature_service = False\n",
+ "run = True\n",
+ "fim_configs = None\n",
+ "public_service = True\n",
+ "\n",
+ "update_service_metadata(service, configuration, summary, description, tags, credits, egis_server, egis_folder, max_flows_sql_name=postprocess_max_flows, service_sql_name=postprocess_service, summary_sql_name=postprocess_summary, fim_service=fim_service, feature_service=feature_service, run=run, fim_configs=fim_configs, public_service=public_service)\n",
+ "\n",
+ "flow_id = '1'\n",
+ "step = 'ingest'\n",
+ "file_format = \"common/data/model/com/nwm/prod/nwm.{{datetime:%Y%m%d}}/short_range_hawaii/nwm.t{{datetime:%H}}z.short_range.channel_rt.f{{range:100,4900,100,%05d}}.hawaii.nc\"\n",
+ "source_table = None\n",
+ "target_table = 'ingest.nwm_channel_rt_srf_hi'\n",
+ "target_keys = '(feature_id, streamflow)'\n",
+ "file_step = None\n",
+ "file_window = None\n",
+ "\n",
+ "update_service_data_flows(service, flow_id, step, source_table, target_table, target_keys, file_step=file_step, file_window=file_window, file_format=file_format)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "e36e4c5a",
+ "metadata": {},
+ "source": [
+ "Short-Range High Water Arrival Time - Puerto Rico/Virgin Islands"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 22,
+ "id": "f5c2b55a",
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n"
+ ]
+ }
+ ],
+ "source": [
+ "service = 'srf_high_water_arrival_time_prvi'\n",
+ "configuration = 'short_range_puertorico'\n",
+ "postprocess_max_flows = None\n",
+ "postprocess_service = 'srf_high_water_arrival_time_prvi'\n",
+ "postprocess_summary = None\n",
+ "summary = 'Short-Range High Water Arrival Time - Puerto Rico/Virgin Islands'\n",
+ "description = 'Depicts the forecast arrival time of high water over the next 48 hours. This service is derived from the short-range configuration of the National Water Model (NWM) over Puerto Rico and the U.S. Virgin Islands. Shown are reaches that are expected to have flow at or above the high water threshold over the next 48 hours. Reaches are colored by the time at which they are forecast to reach high water. High water flows were derived from USGS regression equations found at https://pubs.usgs.gov/wri/wri994142/pdf/wri99-4142.pdf.'\n",
+ "tags = 'short range forecast, high water, arrival time, national water model, nwm, puerto rico, virgin islands'\n",
+ "credits = 'National Water Model, NOAA/NWS National Water Center'\n",
+ "egis_server = 'server'\n",
+ "egis_folder = 'nwm'\n",
+ "fim_service = False\n",
+ "feature_service = False\n",
+ "run = True\n",
+ "fim_configs = None\n",
+ "public_service = True\n",
+ "\n",
+ "update_service_metadata(service, configuration, summary, description, tags, credits, egis_server, egis_folder, max_flows_sql_name=postprocess_max_flows, service_sql_name=postprocess_service, summary_sql_name=postprocess_summary, fim_service=fim_service, feature_service=feature_service, run=run, fim_configs=fim_configs, public_service=public_service)\n",
+ "\n",
+ "flow_id = '1'\n",
+ "step = 'ingest'\n",
+ "file_format = \"common/data/model/com/nwm/prod/nwm.{{datetime:%Y%m%d}}/short_range_puertorico/nwm.t{{datetime:%H}}z.short_range.channel_rt.f{{range:1,49,1,%03d}}.puertorico.nc\"\n",
+ "source_table = None\n",
+ "target_table = 'ingest.nwm_channel_rt_srf_prvi'\n",
+ "target_keys = '(feature_id, streamflow)'\n",
+ "file_step = None\n",
+ "file_window = None\n",
+ "\n",
+ "update_service_data_flows(service, flow_id, step, source_table, target_table, target_keys, file_step=file_step, file_window=file_window, file_format=file_format)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "2efb925f",
+ "metadata": {},
+ "source": [
+ "Short-Range Peak Flow Arrival Time Forecast"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 15,
+ "id": "adb59f27",
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n"
+ ]
+ }
+ ],
+ "source": [
+ "service = 'srf_peak_flow_arrival_time'\n",
+ "configuration = 'short_range'\n",
+ "postprocess_max_flows = None\n",
+ "postprocess_service = 'srf_peak_flow_arrival_time'\n",
+ "postprocess_summary = None\n",
+ "summary = 'Short-Range Peak Flow Arrival Time Forecast'\n",
+ "description = 'Depicts expected peak flow arrival times derived from the operational National Water Model (NWM) (v2.2) short-range forecast. Shown are reaches that are expected to have flow at or above the high water threshold over the next 18 hours. Reaches are colored by the time at which they are expected to be at their maximum flow within the forecast period. High water flows were derived using a 40-year retrospective analysis of the NWM (v2.1). Updated hourly.'\n",
+ "tags = 'short range forecast, peak flow, arrival time, national water model, nwm, conus'\n",
+ "credits = 'National Water Model, NOAA/NWS National Water Center'\n",
+ "egis_server = 'server'\n",
+ "egis_folder = 'nwm'\n",
+ "fim_service = False\n",
+ "feature_service = False\n",
+ "run = False\n",
+ "fim_configs = None\n",
+ "public_service = False\n",
+ "\n",
+ "update_service_metadata(service, configuration, summary, description, tags, credits, egis_server, egis_folder, max_flows_sql_name=postprocess_max_flows, service_sql_name=postprocess_service, summary_sql_name=postprocess_summary, fim_service=fim_service, feature_service=feature_service, run=run, fim_configs=fim_configs, public_service=public_service)\n",
+ "\n",
+ "flow_id = '1'\n",
+ "step = 'ingest'\n",
+ "file_format = \"common/data/model/com/nwm/prod/nwm.{{datetime:%Y%m%d}}/short_range/nwm.t{{datetime:%H}}z.short_range.channel_rt.f{{range:1,19,1,%03d}}.conus.nc\"\n",
+ "source_table = None\n",
+ "target_table = 'ingest.nwm_channel_rt_srf'\n",
+ "target_keys = '(feature_id, streamflow)'\n",
+ "file_step = None\n",
+ "file_window = None\n",
+ "\n",
+ "update_service_data_flows(service, flow_id, step, source_table, target_table, target_keys, file_step=file_step, file_window=file_window, file_format=file_format)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "6d694282",
+ "metadata": {},
+ "source": [
+ "Short-Range Peak Flow Arrival Time Forecast - Hawaii"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 16,
+ "id": "4fb14f8b",
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n"
+ ]
+ }
+ ],
+ "source": [
+ "service = 'srf_peak_flow_arrival_time_hi'\n",
+ "configuration = 'short_range_hawaii'\n",
+ "postprocess_max_flows = None\n",
+ "postprocess_service = 'srf_peak_flow_arrival_time_hi'\n",
+ "postprocess_summary = None\n",
+ "summary = 'Short-Range Peak Flow Arrival Time Forecast - Hawaii'\n",
+ "description = 'Depicts expected peak flow arrival times derived from the operational National Water Model (NWM) (v2.2) short-range forecast for the state of Hawaii. Shown are reaches that are expected to have flow at or above the high water threshold over the next 48 hours. Reaches are colored by the time at which they are expected to be at their maximum flow within the forecast period. High water threshold flows and annual exceedance probabilities were derived from USGS regression equations found at https://pubs.usgs.gov/sir/2010/5035/sir2010-5035_text.pdf. Updated every 12 hours.'\n",
+ "tags = 'short range forecast, peak flow, arrival time, national water model, nwm, hawaii'\n",
+ "credits = 'National Water Model, NOAA/NWS National Water Center'\n",
+ "egis_server = 'server'\n",
+ "egis_folder = 'nwm'\n",
+ "fim_service = False\n",
+ "feature_service = False\n",
+ "run = False\n",
+ "fim_configs = None\n",
+ "public_service = False\n",
+ "\n",
+ "update_service_metadata(service, configuration, summary, description, tags, credits, egis_server, egis_folder, max_flows_sql_name=postprocess_max_flows, service_sql_name=postprocess_service, summary_sql_name=postprocess_summary, fim_service=fim_service, feature_service=feature_service, run=run, fim_configs=fim_configs, public_service=public_service)\n",
+ "\n",
+ "flow_id = '1'\n",
+ "step = 'ingest'\n",
+ "file_format = \"common/data/model/com/nwm/prod/nwm.{{datetime:%Y%m%d}}/short_range_hawaii/nwm.t{{datetime:%H}}z.short_range.channel_rt.f{{range:100,4900,100,%05d}}.hawaii.nc\"\n",
+ "source_table = None\n",
+ "target_table = 'ingest.nwm_channel_rt_srf_hi'\n",
+ "target_keys = '(feature_id, streamflow)'\n",
+ "file_step = None\n",
+ "file_window = None\n",
+ "\n",
+ "update_service_data_flows(service, flow_id, step, source_table, target_table, target_keys, file_step=file_step, file_window=file_window, file_format=file_format)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "69cf6275",
+ "metadata": {},
+ "source": [
+ "Short-Range Peak Flow Arrival Time - Puerto Rico/Virgin Islands"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 17,
+ "id": "b7de97dc",
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n"
+ ]
+ }
+ ],
+ "source": [
+ "service = 'srf_peak_flow_arrival_time_prvi'\n",
+ "configuration = 'short_range_puertorico'\n",
+ "postprocess_max_flows = None\n",
+ "postprocess_service = 'srf_peak_flow_arrival_time_prvi'\n",
+ "postprocess_summary = None\n",
+ "summary = 'Short-Range Peak Flow Arrival Time - Puerto Rico/Virgin Islands'\n",
+ "description = 'Depicts expected peak flow arrival times derived from the operational National Water Model (NWM) (v2.2) short-range forecast for Puerto Rico and the U.S. Virgin Islands. Shown are reaches that are expected to have flow at or above the high water threshold over the next 48 hours. Reaches are colored by the time at which they are expected to be at their maximum flow within the forecast period. High water threshold flows and annual exceedance probabilities were derived from USGS regression equations found at https://pubs.usgs.gov/wri/wri994142/pdf/wri99-4142.pdf. Updated every 12 hours.'\n",
+ "tags = 'short range forecast, peak flow, arrival time, national water model, nwm, prvi'\n",
+ "credits = 'National Water Model, NOAA/NWS National Water Center'\n",
+ "egis_server = 'server'\n",
+ "egis_folder = 'nwm'\n",
+ "fim_service = False\n",
+ "feature_service = False\n",
+ "run = False\n",
+ "fim_configs = None\n",
+ "public_service = False\n",
+ "\n",
+ "update_service_metadata(service, configuration, summary, description, tags, credits, egis_server, egis_folder, max_flows_sql_name=postprocess_max_flows, service_sql_name=postprocess_service, summary_sql_name=postprocess_summary, fim_service=fim_service, feature_service=feature_service, run=run, fim_configs=fim_configs, public_service=public_service)\n",
+ "\n",
+ "flow_id = '1'\n",
+ "step = 'ingest'\n",
+ "file_format = \"common/data/model/com/nwm/prod/nwm.{{datetime:%Y%m%d}}/short_range_puertorico/nwm.t{{datetime:%H}}z.short_range.channel_rt.f{{range:1,49,1,%03d}}.puertorico.nc\"\n",
+ "source_table = None\n",
+ "target_table = 'ingest.nwm_channel_rt_srf_prvi'\n",
+ "target_keys = '(feature_id, streamflow)'\n",
+ "file_step = None\n",
+ "file_window = None\n",
+ "\n",
+ "update_service_data_flows(service, flow_id, step, source_table, target_table, target_keys, file_step=file_step, file_window=file_window, file_format=file_format)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "00570e84",
+ "metadata": {},
+ "source": [
+ "Short-Range High Water Probability Forecast"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 26,
+ "id": "8b7f4661",
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n"
+ ]
+ }
+ ],
+ "source": [
+ "service = 'srf_high_water_probability'\n",
+ "configuration = 'short_range'\n",
+ "postprocess_max_flows = None\n",
+ "postprocess_service = None\n",
+ "postprocess_summary = None\n",
+ "summary = 'Short-Range High Water Probability Forecast'\n",
+ "description = 'Depicts the probability of forecast high water over the next 12 hours using a time-lagged ensemble from the short-range forecast of the National Water Model (NWM) over the contiguous U.S. Shown are reaches that are forecast to have flow at or above high water within the next 12 hours in at least one of the last 7 forecasts. Reaches are colored by the probability that they will meet or exceed the high water threshold across the last 7 forecasts. Probabilities are derived by counting the number of forecasts that meet the high water condition within the next 12 hours, equally weighted. Also shown are USGS HUC10 polygons for basins with greater than 50% of NWM features with flow expected to be at or above high water over the next 12 hours, symbolized by the average probability. High water thresholds (regionally varied) were derived using the 40-year NWM v2.1 reanalysis simulation.'\n",
+ "tags = 'short range forecast, high water, probability, national water model, nwm'\n",
+ "credits = 'National Water Model, NOAA/NWS National Water Center'\n",
+ "egis_server = 'server'\n",
+ "egis_folder = 'nwm'\n",
+ "fim_service = False\n",
+ "feature_service = False\n",
+ "run = False\n",
+ "fim_configs = None\n",
+ "public_service = True\n",
+ "\n",
+ "update_service_metadata(service, configuration, summary, description, tags, credits, egis_server, egis_folder, max_flows_sql_name=postprocess_max_flows, service_sql_name=postprocess_service, summary_sql_name=postprocess_summary, fim_service=fim_service, feature_service=feature_service, run=run, fim_configs=fim_configs, public_service=public_service)\n",
+ "\n",
+ "flow_id = '1'\n",
+ "step = 'ingest'\n",
+ "file_format = \"common/data/model/com/nwm/prod/nwm.{{datetime:%Y%m%d}}/short_range/nwm.t{{datetime:%H}}z.short_range.channel_rt.f{{range:1,19,1,%03d}}.conus.nc\"\n",
+ "source_table = None\n",
+ "target_table = 'ingest.nwm_channel_rt_srf'\n",
+ "target_keys = '(feature_id, streamflow)'\n",
+ "file_step = None\n",
+ "file_window = None\n",
+ "\n",
+ "update_service_data_flows(service, flow_id, step, source_table, target_table, target_keys, file_step=file_step, file_window=file_window, file_format=file_format)\n",
+ "\n",
+ "flow_id = '2'\n",
+ "step = 'ingest'\n",
+ "file_format = \"common/data/model/com/nwm/prod/nwm.{{datetime:reftime-1H, %Y%m%d}}/short_range/nwm.t{{datetime:reftime-1H, %H}}z.short_range.channel_rt.f{{range:1,19,1,%03d}}.conus.nc\"\n",
+ "source_table = None\n",
+ "target_table = 'ingest.nwm_channel_rt_srf_past_1h'\n",
+ "target_keys = '(feature_id, streamflow)'\n",
+ "file_step = None\n",
+ "file_window = None\n",
+ "\n",
+ "update_service_data_flows(service, flow_id, step, source_table, target_table, target_keys, file_step=file_step, file_window=file_window, file_format=file_format)\n",
+ "\n",
+ "flow_id = '3'\n",
+ "step = 'ingest'\n",
+ "file_format = \"common/data/model/com/nwm/prod/nwm.{{datetime:reftime-2H, %Y%m%d}}/short_range/nwm.t{{datetime:reftime-2H, %H}}z.short_range.channel_rt.f{{range:1,19,1,%03d}}.conus.nc\"\n",
+ "source_table = None\n",
+ "target_table = 'ingest.nwm_channel_rt_srf_past_2h'\n",
+ "target_keys = '(feature_id, streamflow)'\n",
+ "file_step = None\n",
+ "file_window = None\n",
+ "\n",
+ "update_service_data_flows(service, flow_id, step, source_table, target_table, target_keys, file_step=file_step, file_window=file_window, file_format=file_format)\n",
+ "\n",
+ "flow_id = '4'\n",
+ "step = 'ingest'\n",
+ "file_format = \"common/data/model/com/nwm/prod/nwm.{{datetime:reftime-3H, %Y%m%d}}/short_range/nwm.t{{datetime:reftime-3H, %H}}z.short_range.channel_rt.f{{range:1,19,1,%03d}}.conus.nc\"\n",
+ "source_table = None\n",
+ "target_table = 'ingest.nwm_channel_rt_srf_past_3h'\n",
+ "target_keys = '(feature_id, streamflow)'\n",
+ "file_step = None\n",
+ "file_window = None\n",
+ "\n",
+ "update_service_data_flows(service, flow_id, step, source_table, target_table, target_keys, file_step=file_step, file_window=file_window, file_format=file_format)\n",
+ "\n",
+ "flow_id = '5'\n",
+ "step = 'ingest'\n",
+ "file_format = \"common/data/model/com/nwm/prod/nwm.{{datetime:reftime-4H, %Y%m%d}}/short_range/nwm.t{{datetime:reftime-4H, %H}}z.short_range.channel_rt.f{{range:1,19,1,%03d}}.conus.nc\"\n",
+ "source_table = None\n",
+ "target_table = 'ingest.nwm_channel_rt_srf_past_4h'\n",
+ "target_keys = '(feature_id, streamflow)'\n",
+ "file_step = None\n",
+ "file_window = None\n",
+ "\n",
+ "update_service_data_flows(service, flow_id, step, source_table, target_table, target_keys, file_step=file_step, file_window=file_window, file_format=file_format)\n",
+ "\n",
+ "flow_id = '6'\n",
+ "step = 'ingest'\n",
+ "file_format = \"common/data/model/com/nwm/prod/nwm.{{datetime:reftime-5H, %Y%m%d}}/short_range/nwm.t{{datetime:reftime-5H, %H}}z.short_range.channel_rt.f{{range:1,19,1,%03d}}.conus.nc\"\n",
+ "source_table = None\n",
+ "target_table = 'ingest.nwm_channel_rt_srf_past_5h'\n",
+ "target_keys = '(feature_id, streamflow)'\n",
+ "file_step = None\n",
+ "file_window = None\n",
+ "\n",
+ "update_service_data_flows(service, flow_id, step, source_table, target_table, target_keys, file_step=file_step, file_window=file_window, file_format=file_format)\n",
+ "\n",
+ "flow_id = '7'\n",
+ "step = 'ingest'\n",
+ "file_format = \"common/data/model/com/nwm/prod/nwm.{{datetime:reftime-6H, %Y%m%d}}/short_range/nwm.t{{datetime:reftime-6H, %H}}z.short_range.channel_rt.f{{range:1,19,1,%03d}}.conus.nc\"\n",
+ "source_table = None\n",
+ "target_table = 'ingest.nwm_channel_rt_srf_past_6h'\n",
+ "target_keys = '(feature_id, streamflow)'\n",
+ "file_step = None\n",
+ "file_window = None\n",
+ "\n",
+ "update_service_data_flows(service, flow_id, step, source_table, target_table, target_keys, file_step=file_step, file_window=file_window, file_format=file_format)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "4c183cbf",
+ "metadata": {},
+ "source": [
+ "Short-Range Maximum High Flow Magnitude Forecast"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 27,
+ "id": "0049b269",
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n"
+ ]
+ }
+ ],
+ "source": [
+ "service = 'srf_max_high_flow_magnitude'\n",
+ "configuration = 'short_range'\n",
+ "postprocess_max_flows = 'srf_max_flows'\n",
+ "postprocess_service = 'srf_max_high_flow_magnitude'\n",
+ "postprocess_summary = None\n",
+ "summary = 'Short-Range Maximum High Flow Magnitude Forecast'\n",
+ "description = 'Depicts the magnitude of the peak National Water Model (NWM) streamflow forecast over the next 18 hours where the NWM is signaling high water. This service is derived from the short-range configuration of the NWM over the contiguous U.S. Shown are reaches with peak flow at or above high water thresholds. Reaches are colored by the annual exceedance probability (AEP) of their forecast peak flow. High water thresholds (regionally varied) and AEPs were derived using the 40-year NWM v2.1 reanalysis simulation.'\n",
+ "tags = 'short range forecast, maximum, high flow, magnitude, national water model, nwm'\n",
+ "credits = 'National Water Model, NOAA/NWS National Water Center'\n",
+ "egis_server = 'server'\n",
+ "egis_folder = 'nwm'\n",
+ "fim_service = False\n",
+ "feature_service = False\n",
+ "run = True\n",
+ "fim_configs = None\n",
+ "public_service = True\n",
+ "\n",
+ "update_service_metadata(service, configuration, summary, description, tags, credits, egis_server, egis_folder, max_flows_sql_name=postprocess_max_flows, service_sql_name=postprocess_service, summary_sql_name=postprocess_summary, fim_service=fim_service, feature_service=feature_service, run=run, fim_configs=fim_configs, public_service=public_service)\n",
+ "\n",
+ "flow_id = '1'\n",
+ "step = 'ingest'\n",
+ "file_format = \"common/data/model/com/nwm/prod/nwm.{{datetime:%Y%m%d}}/short_range/nwm.t{{datetime:%H}}z.short_range.channel_rt.f{{range:1,19,1,%03d}}.conus.nc\"\n",
+ "source_table = None\n",
+ "target_table = 'ingest.nwm_channel_rt_srf'\n",
+ "target_keys = '(feature_id, streamflow)'\n",
+ "file_step = None\n",
+ "file_window = None\n",
+ "\n",
+ "update_service_data_flows(service, flow_id, step, source_table, target_table, target_keys, file_step=file_step, file_window=file_window, file_format=file_format)\n",
+ "\n",
+ "flow_id = '2'\n",
+ "step = 'max_flows'\n",
+ "file_format = None\n",
+ "source_table = 'ingest.nwm_channel_rt_srf'\n",
+ "target_table = 'cache.max_flows_srf'\n",
+ "target_keys = '(feature_id, streamflow)'\n",
+ "file_step = None\n",
+ "file_window = None\n",
+ "\n",
+ "update_service_data_flows(service, flow_id, step, source_table, target_table, target_keys, file_step=file_step, file_window=file_window, file_format=file_format)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "3ae97c12",
+ "metadata": {},
+ "source": [
+ "Short-Range Maximum High Flow Magnitude Hawaii"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 28,
+ "id": "58bf7f26",
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n"
+ ]
+ }
+ ],
+ "source": [
+ "service = 'srf_max_high_flow_magnitude_hi'\n",
+ "configuration = 'short_range_hawaii'\n",
+ "postprocess_max_flows = 'srf_max_flows_hi'\n",
+ "postprocess_service = 'srf_max_high_flow_magnitude_hi'\n",
+ "postprocess_summary = None\n",
+ "summary = 'Short-Range Maximum High Flow Magnitude Hawaii'\n",
+ "description = 'Depicts the magnitude of the peak National Water Model (NWM) streamflow forecast over the next 48 hours where the NWM is signaling high water. This service is derived from the short-range configuration of the NWM over Hawaii. Shown are reaches with peak flow at or above high water thresholds. Reaches are colored by the annual exceedance probability (AEP) of their peak flow. High water thresholds and AEPs were derived from USGS regression equations found at https://pubs.usgs.gov/sir/2010/5035/sir2010-5035_text.pdf.'\n",
+ "tags = 'high,flow,magnitude,national,water,model,nwm,hawaii,srf,hi'\n",
+ "credits = 'National Water Model, NOAA/NWS National Water Center'\n",
+ "egis_server = 'server'\n",
+ "egis_folder = 'nwm'\n",
+ "fim_service = False\n",
+ "feature_service = False\n",
+ "run = True\n",
+ "fim_configs = None\n",
+ "public_service = True\n",
+ "\n",
+ "update_service_metadata(service, configuration, summary, description, tags, credits, egis_server, egis_folder, max_flows_sql_name=postprocess_max_flows, service_sql_name=postprocess_service, summary_sql_name=postprocess_summary, fim_service=fim_service, feature_service=feature_service, run=run, fim_configs=fim_configs, public_service=public_service)\n",
+ "\n",
+ "flow_id = '1'\n",
+ "step = 'ingest'\n",
+ "file_format = \"common/data/model/com/nwm/prod/nwm.{{datetime:%Y%m%d}}/short_range_hawaii/nwm.t{{datetime:%H}}z.short_range.channel_rt.f{{range:100,4900,100,%05d}}.hawaii.nc\"\n",
+ "source_table = None\n",
+ "target_table = 'ingest.nwm_channel_rt_srf_hi'\n",
+ "target_keys = '(feature_id, streamflow)'\n",
+ "file_step = None\n",
+ "file_window = None\n",
+ "\n",
+ "update_service_data_flows(service, flow_id, step, source_table, target_table, target_keys, file_step=file_step, file_window=file_window, file_format=file_format)\n",
+ "\n",
+ "flow_id = '2'\n",
+ "step = 'max_flows'\n",
+ "file_format = None\n",
+ "source_table = 'ingest.nwm_channel_rt_srf_hi'\n",
+ "target_table = 'cache.max_flows_srf_hi'\n",
+ "target_keys = '(feature_id, streamflow)'\n",
+ "file_step = None\n",
+ "file_window = None\n",
+ "\n",
+ "update_service_data_flows(service, flow_id, step, source_table, target_table, target_keys, file_step=file_step, file_window=file_window, file_format=file_format)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "2e999ceb",
+ "metadata": {},
+ "source": [
+ "Short-Range Maximum High Flow Magnitude Puerto Rico/Virgin Islands"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 29,
+ "id": "e40bf8c7",
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n"
+ ]
+ }
+ ],
+ "source": [
+ "service = 'srf_max_high_flow_magnitude_prvi'\n",
+ "configuration = 'short_range_puertorico'\n",
+ "postprocess_max_flows = 'srf_max_flows_prvi'\n",
+ "postprocess_service = 'srf_max_high_flow_magnitude_prvi'\n",
+ "postprocess_summary = None\n",
+ "summary = 'Short-Range Maximum High Flow Magnitude Puerto Rico/Virgin Islands'\n",
+ "description = 'Depicts the magnitude of the peak National Water Model (NWM) streamflow forecast over the next 48 hours where the NWM is signaling high water. This service is derived from the short-range configuration of the NWM over Puerto Rico and the U.S. Virgin Islands. Shown are reaches with peak flow at or above high water thresholds. Reaches are colored by the annual exceedance probability (AEP) of their peak flow. High water thresholds and AEPs were derived from USGS regression equations found at https://pubs.usgs.gov/wri/wri994142/pdf/wri99-4142.pdf.'\n",
+ "tags = 'high flow, magnitude, national water model, nwm, puerto rico, virgin islands, srf'\n",
+ "credits = 'National Water Model, NOAA/NWS National Water Center'\n",
+ "egis_server = 'server'\n",
+ "egis_folder = 'nwm'\n",
+ "fim_service = False\n",
+ "feature_service = False\n",
+ "run = True\n",
+ "fim_configs = None\n",
+ "public_service = True\n",
+ "\n",
+ "update_service_metadata(service, configuration, summary, description, tags, credits, egis_server, egis_folder, max_flows_sql_name=postprocess_max_flows, service_sql_name=postprocess_service, summary_sql_name=postprocess_summary, fim_service=fim_service, feature_service=feature_service, run=run, fim_configs=fim_configs, public_service=public_service)\n",
+ "\n",
+ "flow_id = '1'\n",
+ "step = 'ingest'\n",
+ "file_format = \"common/data/model/com/nwm/prod/nwm.{{datetime:%Y%m%d}}/short_range_puertorico/nwm.t{{datetime:%H}}z.short_range.channel_rt.f{{range:1,49,1,%03d}}.puertorico.nc\"\n",
+ "source_table = None\n",
+ "target_table = 'ingest.nwm_channel_rt_srf_prvi'\n",
+ "target_keys = '(feature_id, streamflow)'\n",
+ "file_step = None\n",
+ "file_window = None\n",
+ "\n",
+ "update_service_data_flows(service, flow_id, step, source_table, target_table, target_keys, file_step=file_step, file_window=file_window, file_format=file_format)\n",
+ "\n",
+ "flow_id = '2'\n",
+ "step = 'max_flows'\n",
+ "file_format = None\n",
+ "source_table = 'ingest.nwm_channel_rt_srf_prvi'\n",
+ "target_table = 'cache.max_flows_srf_prvi'\n",
+ "target_keys = '(feature_id, streamflow)'\n",
+ "file_step = None\n",
+ "file_window = None\n",
+ "\n",
+ "update_service_data_flows(service, flow_id, step, source_table, target_table, target_keys, file_step=file_step, file_window=file_window, file_format=file_format)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "7f6934b2",
+ "metadata": {},
+ "source": [
+ "Short-Range Rapid Onset Flooding"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 30,
+ "id": "1ffdf1df",
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n"
+ ]
+ }
+ ],
+ "source": [
+ "service = 'srf_rapid_onset_flooding'\n",
+ "configuration = 'short_range'\n",
+ "postprocess_max_flows = None\n",
+ "postprocess_service = 'srf_rapid_onset_flooding'\n",
+ "postprocess_summary = 'srf_rapid_onset_flooding_hucs'\n",
+ "summary = 'Short-Range Rapid Onset Flooding'\n",
+ "description = 'Depicts forecast rapid onset flooding using the short-range configuration of the National Water Model (NWM) over the contiguous U.S. Shown are reaches (stream order 4 and below) with a forecast flow increase of 100% or greater within an hour, and which are expected to be at or above the high water threshold within 6 hours of that increase. Also shown are USGS HUC10 polygons symbolized by the percentage of NWM waterway length (within each HUC10) that is expected to meet the previously mentioned criteria. High water thresholds (regionally varied) were derived using the 40-year NWM v2.1 reanalysis simulation.'\n",
+ "tags = 'short, range, forecast, streamflow, rapid, onset, flood, national, water, model, nwm'\n",
+ "credits = 'National Water Model, NOAA/NWS National Water Center'\n",
+ "egis_server = 'server'\n",
+ "egis_folder = 'nwm'\n",
+ "fim_service = False\n",
+ "feature_service = False\n",
+ "run = True\n",
+ "fim_configs = None\n",
+ "public_service = True\n",
+ "\n",
+ "update_service_metadata(service, configuration, summary, description, tags, credits, egis_server, egis_folder, max_flows_sql_name=postprocess_max_flows, service_sql_name=postprocess_service, summary_sql_name=postprocess_summary, fim_service=fim_service, feature_service=feature_service, run=run, fim_configs=fim_configs, public_service=public_service)\n",
+ "\n",
+ "flow_id = '1'\n",
+ "step = 'ingest'\n",
+ "file_format = \"common/data/model/com/nwm/prod/nwm.{{datetime:%Y%m%d}}/short_range/nwm.t{{datetime:%H}}z.short_range.channel_rt.f{{range:1,19,1,%03d}}.conus.nc\"\n",
+ "source_table = None\n",
+ "target_table = 'ingest.nwm_channel_rt_srf'\n",
+ "target_keys = '(feature_id, streamflow)'\n",
+ "file_step = None\n",
+ "file_window = None\n",
+ "\n",
+ "update_service_data_flows(service, flow_id, step, source_table, target_table, target_keys, file_step=file_step, file_window=file_window, file_format=file_format)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "c00c4f4f",
+ "metadata": {},
+ "source": [
+ "Short-Range Rapid Onset Flooding Probability Forecast"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 31,
+ "id": "a057a4d3",
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n"
+ ]
+ }
+ ],
+ "source": [
+ "service = 'srf_rapid_onset_flooding_probability'\n",
+ "configuration = 'short_range'\n",
+ "postprocess_max_flows = None\n",
+ "postprocess_service = None\n",
+ "postprocess_summary = None\n",
+ "summary = 'Short-Range Rapid Onset Flooding Probability Forecast'\n",
+ "description = 'Depicts the probability of forecast rapid onset flooding over the next 18 hours using a time-lagged ensemble from the short-range configuration of the National Water Model (NWM) over the contiguous U.S. Shown are reaches (stream order 4 and below) that are expected to meet rapid onset flooding criteria (flow increase of 100% or greater within one hour and high water threshold conditions within 6 hours) using the most recent 7 forecasts. Reaches are colored by the probability that they will meet or exceed rapid onset conditions within hours 1-6, 7-12, and 1-12. Probabilities are computed as the % agreement across the 7 ensemble members that a given reach will meet rapid onset criteria at some point during the time period of interest. Hotspots show the average 1-12 hour rapid onset flooding probability, weighted by reach length, for USGS HUC10 basins with greater than 10% of NWM feature length meeting rapid onset criteria in the next 12 hours. High water thresholds (regionally varied) were derived using the 40-year NWM v2.1 reanalysis simulation.'\n",
+ "tags = 'short,range,forecast,streamflow,rapid,onset,flood,probability,national,water,model,nwm'\n",
+ "credits = 'National Water Model, NOAA/NWS National Water Center'\n",
+ "egis_server = 'server'\n",
+ "egis_folder = 'nwm'\n",
+ "fim_service = False\n",
+ "feature_service = True\n",
+ "run = False\n",
+ "fim_configs = None\n",
+ "public_service = True\n",
+ "\n",
+ "update_service_metadata(service, configuration, summary, description, tags, credits, egis_server, egis_folder, max_flows_sql_name=postprocess_max_flows, service_sql_name=postprocess_service, summary_sql_name=postprocess_summary, fim_service=fim_service, feature_service=feature_service, run=run, fim_configs=fim_configs, public_service=public_service)\n",
+ "\n",
+ "flow_id = '1'\n",
+ "step = 'ingest'\n",
+ "file_format = \"common/data/model/com/nwm/prod/nwm.{{datetime:%Y%m%d}}/short_range/nwm.t{{datetime:%H}}z.short_range.channel_rt.f{{range:1,19,1,%03d}}.conus.nc\"\n",
+ "source_table = None\n",
+ "target_table = 'ingest.nwm_channel_rt_srf'\n",
+ "target_keys = '(feature_id, streamflow)'\n",
+ "file_step = None\n",
+ "file_window = None\n",
+ "\n",
+ "update_service_data_flows(service, flow_id, step, source_table, target_table, target_keys, file_step=file_step, file_window=file_window, file_format=file_format)\n",
+ "\n",
+ "flow_id = '2'\n",
+ "step = 'ingest'\n",
+ "file_format = \"common/data/model/com/nwm/prod/nwm.{{datetime:reftime-1H, %Y%m%d}}/short_range/nwm.t{{datetime:reftime-1H, %H}}z.short_range.channel_rt.f{{range:1,19,1,%03d}}.conus.nc\"\n",
+ "source_table = None\n",
+ "target_table = 'ingest.nwm_channel_rt_srf_past_1h'\n",
+ "target_keys = '(feature_id, streamflow)'\n",
+ "file_step = None\n",
+ "file_window = None\n",
+ "\n",
+ "update_service_data_flows(service, flow_id, step, source_table, target_table, target_keys, file_step=file_step, file_window=file_window, file_format=file_format)\n",
+ "\n",
+ "flow_id = '3'\n",
+ "step = 'ingest'\n",
+ "file_format = \"common/data/model/com/nwm/prod/nwm.{{datetime:reftime-2H, %Y%m%d}}/short_range/nwm.t{{datetime:reftime-2H, %H}}z.short_range.channel_rt.f{{range:1,19,1,%03d}}.conus.nc\"\n",
+ "source_table = None\n",
+ "target_table = 'ingest.nwm_channel_rt_srf_past_2h'\n",
+ "target_keys = '(feature_id, streamflow)'\n",
+ "file_step = None\n",
+ "file_window = None\n",
+ "\n",
+ "update_service_data_flows(service, flow_id, step, source_table, target_table, target_keys, file_step=file_step, file_window=file_window, file_format=file_format)\n",
+ "\n",
+ "flow_id = '4'\n",
+ "step = 'ingest'\n",
+ "file_format = \"common/data/model/com/nwm/prod/nwm.{{datetime:reftime-3H, %Y%m%d}}/short_range/nwm.t{{datetime:reftime-3H, %H}}z.short_range.channel_rt.f{{range:1,19,1,%03d}}.conus.nc\"\n",
+ "source_table = None\n",
+ "target_table = 'ingest.nwm_channel_rt_srf_past_3h'\n",
+ "target_keys = '(feature_id, streamflow)'\n",
+ "file_step = None\n",
+ "file_window = None\n",
+ "\n",
+ "update_service_data_flows(service, flow_id, step, source_table, target_table, target_keys, file_step=file_step, file_window=file_window, file_format=file_format)\n",
+ "\n",
+ "flow_id = '5'\n",
+ "step = 'ingest'\n",
+ "file_format = \"common/data/model/com/nwm/prod/nwm.{{datetime:reftime-4H, %Y%m%d}}/short_range/nwm.t{{datetime:reftime-4H, %H}}z.short_range.channel_rt.f{{range:1,19,1,%03d}}.conus.nc\"\n",
+ "source_table = None\n",
+ "target_table = 'ingest.nwm_channel_rt_srf_past_4h'\n",
+ "target_keys = '(feature_id, streamflow)'\n",
+ "file_step = None\n",
+ "file_window = None\n",
+ "\n",
+ "update_service_data_flows(service, flow_id, step, source_table, target_table, target_keys, file_step=file_step, file_window=file_window, file_format=file_format)\n",
+ "\n",
+ "flow_id = '6'\n",
+ "step = 'ingest'\n",
+ "file_format = \"common/data/model/com/nwm/prod/nwm.{{datetime:reftime-5H, %Y%m%d}}/short_range/nwm.t{{datetime:reftime-5H, %H}}z.short_range.channel_rt.f{{range:1,19,1,%03d}}.conus.nc\"\n",
+ "source_table = None\n",
+ "target_table = 'ingest.nwm_channel_rt_srf_past_5h'\n",
+ "target_keys = '(feature_id, streamflow)'\n",
+ "file_step = None\n",
+ "file_window = None\n",
+ "\n",
+ "update_service_data_flows(service, flow_id, step, source_table, target_table, target_keys, file_step=file_step, file_window=file_window, file_format=file_format)\n",
+ "\n",
+ "flow_id = '7'\n",
+ "step = 'ingest'\n",
+ "file_format = \"common/data/model/com/nwm/prod/nwm.{{datetime:reftime-6H, %Y%m%d}}/short_range/nwm.t{{datetime:reftime-6H, %H}}z.short_range.channel_rt.f{{range:1,19,1,%03d}}.conus.nc\"\n",
+ "source_table = None\n",
+ "target_table = 'ingest.nwm_channel_rt_srf_past_6h'\n",
+ "target_keys = '(feature_id, streamflow)'\n",
+ "file_step = None\n",
+ "file_window = None\n",
+ "\n",
+ "update_service_data_flows(service, flow_id, step, source_table, target_table, target_keys, file_step=file_step, file_window=file_window, file_format=file_format)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "75e6b414",
+ "metadata": {},
+ "source": [
+ "Short-Range Rate of Change"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 18,
+ "id": "eff11341",
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n"
+ ]
+ }
+ ],
+ "source": [
+ "service = 'srf_rate_of_change'\n",
+ "configuration = 'short_range'\n",
+ "postprocess_max_flows = None\n",
+ "postprocess_service = 'srf_rate_of_change'\n",
+ "postprocess_summary = None\n",
+ "summary = 'NWM 18-Hr Streamflow Rate of Change Forecast'\n",
+ "description = 'Depicts expected change in discharge derived from the operational National Water Model (NWM) (v2.1) short-range forecast. Change is computed between the current streamflow and that expected over the next 18 hours at 3-hour intervals, and is only displayed for reaches that are expected to have flow at or above their high water threshold. High water thresholds (regionally varied) and AEPs were derived using the 40-year NWM v2.1 reanalysis simulation.'\n",
+ "tags = 'short,range,forecast,streamflow,rate,change,national,water,model,nwm'\n",
+ "credits = 'National Water Model, NOAA/NWS National Water Center'\n",
+ "egis_server = 'server'\n",
+ "egis_folder = 'nwm'\n",
+ "fim_service = False\n",
+ "feature_service = False\n",
+ "run = False\n",
+ "fim_configs = None\n",
+ "public_service = False\n",
+ "\n",
+ "update_service_metadata(service, configuration, summary, description, tags, credits, egis_server, egis_folder, max_flows_sql_name=postprocess_max_flows, service_sql_name=postprocess_service, summary_sql_name=postprocess_summary, fim_service=fim_service, feature_service=feature_service, run=run, fim_configs=fim_configs, public_service=public_service)\n",
+ "\n",
+ "flow_id = '1'\n",
+ "step = 'ingest'\n",
+ "file_format = \"common/data/model/com/nwm/prod/nwm.{{datetime:%Y%m%d}}/short_range/nwm.t{{datetime:%H}}z.short_range.channel_rt.f{{range:1,19,1,%03d}}.conus.nc\"\n",
+ "source_table = None\n",
+ "target_table = 'ingest.nwm_channel_rt_srf'\n",
+ "target_keys = '(feature_id, streamflow)'\n",
+ "file_step = None\n",
+ "file_window = None\n",
+ "\n",
+ "update_service_data_flows(service, flow_id, step, source_table, target_table, target_keys, file_step=file_step, file_window=file_window, file_format=file_format)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "0263828c",
+ "metadata": {},
+ "source": [
+ "FIM Performance"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 27,
+ "id": "e264cddb",
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n"
+ ]
+ }
+ ],
+ "source": [
+ "service = 'fim_performance'\n",
+ "configuration = 'reference'\n",
+ "postprocess_max_flows = None\n",
+ "postprocess_service = None\n",
+ "postprocess_summary = None\n",
+ "summary = 'Height Above Nearest Drainage (HAND) Flood Inundation Mapping (FIM) Performance at HUC8s and AHPS locations.'\n",
+ "description = 'Depicts the HAND FIM method skill metrics at HUC8s (polygons) and AHPS locations (points). Metrics are computed through static comparison of benchmark maps to HAND maps at specific flow magnitudes (e.g. 100yr, 500yr, Action, Minor, Moderate, Major). Metrics reported are Critical Success Index (CSI), True Positive Rate (TPR), also known as Probability of Detection (POD), False Alarm Ratio (FAR), and Probability Not Detected (PND). Polygons and sites are symbolized by CSI by default; however, the other metrics (TPR, FAR, and PND) are available in the attributes for each location. For a description of how metrics are computed, see https://github.com/NOAA-OWP/inundation-mapping/wiki/Evaluating-HAND-Performance. Benchmark sources for HUC8s vary among FEMA Base Level Engineering, Iowa Flood Center, and RAS2FIM, while benchmark sources for AHPS locations vary between USGS and AHPS FIM, based on data availability at each HUC8 or AHPS location. The flow magnitudes used also vary depending on the benchmark source. Each feature represents the HAND FIM method skill metrics for a unique benchmark data source and flow magnitude. This is a static service; it does not provide estimates of FIM skill in real time. This service only shows the HAND FIM method skill as compared to benchmark maps produced for scenarios of specific magnitudes.'\n",
+ "tags = 'performance, skill, csi, flood, hand, fim, inundation, height, above, nearest, drainage, ahps, usgs'\n",
+ "credits = 'National Water Model, NOAA/NWS National Water Center, Advanced Hydrologic Prediction Service, USGS, Iowa Flood Center, RAS2FIM'\n",
+ "egis_server = 'server'\n",
+ "egis_folder = 'fim_libs'\n",
+ "fim_service = False\n",
+ "feature_service = False\n",
+ "run = True\n",
+ "fim_configs = None\n",
+ "public_service = False\n",
+ "\n",
+ "update_service_metadata(service, configuration, summary, description, tags, credits, egis_server, egis_folder, max_flows_sql_name=postprocess_max_flows, service_sql_name=postprocess_service, summary_sql_name=postprocess_summary, fim_service=fim_service, feature_service=feature_service, run=run, fim_configs=fim_configs, public_service=public_service)\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "6f31c0e5",
+ "metadata": {},
+ "source": [
+ "Flow Based CatFIM"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 28,
+ "id": "4d9e76e2",
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n"
+ ]
+ }
+ ],
+ "source": [
+ "service = 'flow_based_catfim'\n",
+ "configuration = 'reference'\n",
+ "postprocess_max_flows = None\n",
+ "postprocess_service = None\n",
+ "postprocess_summary = None\n",
+ "summary = 'Depicts flood inundation extents derived from official NWS category thresholds and the Height Above Nearest Drainage (HAND) technique. Official threshold discharges are used.'\n",
+ "description = 'Depicts flood inundation extents derived from official NWS category thresholds and the Height Above Nearest Drainage (HAND) technique. Official threshold discharges are used. Flood inundation maps are colored according to their category. Mapping extends 5 miles upstream and downstream of AHPS gage locations. This is a static service and not a forecast service.'\n",
+ "# NOTE: tags is not assigned in this cell, so update_service_metadata below uses whatever value an earlier cell left in the kernel.\n",
+ "credits = 'National Water Model, NOAA/NWS National Water Center, Advanced Hydrologic Prediction Service, National River Layer Database'\n",
+ "egis_server = 'server'\n",
+ "egis_folder = 'fim_libs'\n",
+ "fim_service = False\n",
+ "feature_service = False\n",
+ "run = True\n",
+ "fim_configs = None\n",
+ "public_service = False\n",
+ "\n",
+ "update_service_metadata(service, configuration, summary, description, tags, credits, egis_server, egis_folder, max_flows_sql_name=postprocess_max_flows, service_sql_name=postprocess_service, summary_sql_name=postprocess_summary, fim_service=fim_service, feature_service=feature_service, run=run, fim_configs=fim_configs, public_service=public_service)\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "c5a7d71d",
+ "metadata": {},
+ "source": [
+ "CONUS Full Resolution FIM Catchment Boundaries"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 51,
+ "id": "50f5ee62",
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n"
+ ]
+ }
+ ],
+ "source": [
+ "service = 'full_resolution_fim_catchments'\n",
+ "configuration = 'reference'\n",
+ "postprocess_max_flows = None\n",
+ "postprocess_service = None\n",
+ "postprocess_summary = None\n",
+ "summary = 'CONUS Full Resolution FIM Catchment Boundaries'\n",
+ "description = 'Depicts the catchment boundaries for the full resolution FIM HAND dataset for CONUS'\n",
+ "tags = 'fim, inundation, national water model, nwm, reference'\n",
+ "credits = 'National Water Model, NOAA/NWS National Water Center'\n",
+ "egis_server = 'server'\n",
+ "egis_folder = 'reference'\n",
+ "fim_service = False\n",
+ "feature_service = False\n",
+ "run = True\n",
+ "fim_configs = None\n",
+ "public_service = False\n",
+ "\n",
+ "update_service_metadata(service, configuration, summary, description, tags, credits, egis_server, egis_folder, max_flows_sql_name=postprocess_max_flows, service_sql_name=postprocess_service, summary_sql_name=postprocess_summary, fim_service=fim_service, feature_service=feature_service, run=run, fim_configs=fim_configs, public_service=public_service)\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "25ed1b5b",
+ "metadata": {},
+ "source": [
+ "Hawaii Full Resolution FIM Catchment Boundaries"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 52,
+ "id": "2abc4cc6",
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n"
+ ]
+ }
+ ],
+ "source": [
+ "service = 'full_resolution_fim_catchments_hi'\n",
+ "configuration = 'reference'\n",
+ "postprocess_max_flows = None\n",
+ "postprocess_service = None\n",
+ "postprocess_summary = None\n",
+ "summary = 'Hawaii Full Resolution FIM Catchment Boundaries'\n",
+ "description = 'Depicts the catchment boundaries for the full resolution FIM HAND dataset for Hawaii'\n",
+ "tags = 'fim, inundation, national water model, nwm, reference, hawaii, hi'\n",
+ "credits = 'National Water Model, NOAA/NWS National Water Center'\n",
+ "egis_server = 'server'\n",
+ "egis_folder = 'reference'\n",
+ "fim_service = False\n",
+ "feature_service = False\n",
+ "run = True\n",
+ "fim_configs = None\n",
+ "public_service = False\n",
+ "\n",
+ "update_service_metadata(service, configuration, summary, description, tags, credits, egis_server, egis_folder, max_flows_sql_name=postprocess_max_flows, service_sql_name=postprocess_service, summary_sql_name=postprocess_summary, fim_service=fim_service, feature_service=feature_service, run=run, fim_configs=fim_configs, public_service=public_service)\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "ad28e86e",
+ "metadata": {},
+ "source": [
+ "Puerto Rico Full Resolution FIM Catchment Boundaries"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 53,
+ "id": "4bb621af",
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n"
+ ]
+ }
+ ],
+ "source": [
+ "service = 'full_resolution_fim_catchments_prvi'\n",
+ "configuration = 'reference'\n",
+ "postprocess_max_flows = None\n",
+ "postprocess_service = None\n",
+ "postprocess_summary = None\n",
+ "summary = 'Puerto Rico Full Resolution FIM Catchment Boundaries'\n",
+ "description = 'Depicts the catchment boundaries for the full resolution FIM HAND dataset for Puerto Rico and Virgin Islands'\n",
+ "tags = 'fim, inundation, national water model, nwm, reference, puertorico, prvi'\n",
+ "credits = 'National Water Model, NOAA/NWS National Water Center'\n",
+ "egis_server = 'server'\n",
+ "egis_folder = 'reference'\n",
+ "fim_service = False\n",
+ "feature_service = False\n",
+ "run = True\n",
+ "fim_configs = None\n",
+ "public_service = False\n",
+ "\n",
+ "update_service_metadata(service, configuration, summary, description, tags, credits, egis_server, egis_folder, max_flows_sql_name=postprocess_max_flows, service_sql_name=postprocess_service, summary_sql_name=postprocess_summary, fim_service=fim_service, feature_service=feature_service, run=run, fim_configs=fim_configs, public_service=public_service)\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "c0408ac1",
+ "metadata": {},
+ "source": [
+ "CONUS Main Stem FIM Catchment Boundaries"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 54,
+ "id": "57b754ae",
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n"
+ ]
+ }
+ ],
+ "source": [
+ "service = 'main_stem_fim_catchments'\n",
+ "configuration = 'reference'\n",
+ "postprocess_max_flows = None\n",
+ "postprocess_service = None\n",
+ "postprocess_summary = None\n",
+ "summary = 'CONUS Main Stem FIM Catchment Boundaries'\n",
+ "description = 'Depicts the catchment boundaries for the main stem FIM HAND dataset for CONUS'\n",
+ "tags = 'fim, inundation, national water model, nwm, reference'\n",
+ "credits = 'National Water Model, NOAA/NWS National Water Center'\n",
+ "egis_server = 'server'\n",
+ "egis_folder = 'reference'\n",
+ "fim_service = False\n",
+ "feature_service = False\n",
+ "run = True\n",
+ "fim_configs = None\n",
+ "public_service = False\n",
+ "\n",
+ "update_service_metadata(service, configuration, summary, description, tags, credits, egis_server, egis_folder, max_flows_sql_name=postprocess_max_flows, service_sql_name=postprocess_service, summary_sql_name=postprocess_summary, fim_service=fim_service, feature_service=feature_service, run=run, fim_configs=fim_configs, public_service=public_service)\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "b0cbe0ec",
+ "metadata": {},
+ "source": [
+ "Hawaii Main Stem FIM Catchment Boundaries"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 55,
+ "id": "b81c63ec",
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n"
+ ]
+ }
+ ],
+ "source": [
+ "service = 'main_stem_fim_catchments_hi'\n",
+ "configuration = 'reference'\n",
+ "postprocess_max_flows = None\n",
+ "postprocess_service = None\n",
+ "postprocess_summary = None\n",
+ "summary = 'Hawaii Main Stem FIM Catchment Boundaries'\n",
+ "description = 'Depicts the catchment boundaries for the main stem FIM HAND dataset for Hawaii'\n",
+ "tags = 'fim, inundation, national water model, nwm, reference, hawaii, hi'\n",
+ "credits = 'National Water Model, NOAA/NWS National Water Center'\n",
+ "egis_server = 'server'\n",
+ "egis_folder = 'reference'\n",
+ "fim_service = False\n",
+ "feature_service = False\n",
+ "run = True\n",
+ "fim_configs = None\n",
+ "public_service = False\n",
+ "\n",
+ "update_service_metadata(service, configuration, summary, description, tags, credits, egis_server, egis_folder, max_flows_sql_name=postprocess_max_flows, service_sql_name=postprocess_service, summary_sql_name=postprocess_summary, fim_service=fim_service, feature_service=feature_service, run=run, fim_configs=fim_configs, public_service=public_service)\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "e8a7d15c",
+ "metadata": {},
+ "source": [
+ "Puerto Rico Main Stem FIM Catchment Boundaries"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 56,
+ "id": "9667f251",
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n"
+ ]
+ }
+ ],
+ "source": [
+ "service = 'main_stem_fim_catchments_prvi'\n",
+ "configuration = 'reference'\n",
+ "postprocess_max_flows = None\n",
+ "postprocess_service = None\n",
+ "postprocess_summary = None\n",
+ "summary = 'Puerto Rico Main Stem FIM Catchment Boundaries'\n",
+ "description = 'Depicts the catchment boundaries for the main stem FIM HAND dataset for Puerto Rico and Virgin Islands'\n",
+ "tags = 'fim, inundation, national water model, nwm, reference, puertorico, prvi'\n",
+ "credits = 'National Water Model, NOAA/NWS National Water Center'\n",
+ "egis_server = 'server'\n",
+ "egis_folder = 'reference'\n",
+ "fim_service = False\n",
+ "feature_service = False\n",
+ "run = True\n",
+ "fim_configs = None\n",
+ "public_service = False\n",
+ "\n",
+ "update_service_metadata(service, configuration, summary, description, tags, credits, egis_server, egis_folder, max_flows_sql_name=postprocess_max_flows, service_sql_name=postprocess_service, summary_sql_name=postprocess_summary, fim_service=fim_service, feature_service=feature_service, run=run, fim_configs=fim_configs, public_service=public_service)\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "3cc51c49",
+ "metadata": {},
+ "source": [
+ "NWM Annual Exceedance Probability FIM"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 29,
+ "id": "5a48a5ca",
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n"
+ ]
+ }
+ ],
+ "source": [
+ "service = 'nwm_aep_fim'\n",
+ "configuration = 'reference'\n",
+ "postprocess_max_flows = None\n",
+ "postprocess_service = None\n",
+ "postprocess_summary = None\n",
+ "summary = 'Depicts flood inundation extents derived from NWM high water thresholds, annual exceedance probabilities, and the Height Above Nearest Drainage (HAND) technique.'\n",
+ "description = 'Depicts the inundation extent of the National Water Model (NWM) high water threshold and annual exceedance probabilities. High water thresholds (regionally varied) and AEPs were derived using the 40-year NWM v2.1 reanalysis simulation. This is a static service and not a forecast service.'\n",
+ "tags = 'nws, flood, categorical, hand, fim, inundation, height, above, nearest, drainage, aep, recurrence'\n",
+ "credits = 'National Water Model, NOAA/NWS National Water Center'\n",
+ "egis_server = 'server'\n",
+ "egis_folder = 'fim_libs'\n",
+ "fim_service = False\n",
+ "feature_service = False\n",
+ "run = True\n",
+ "fim_configs = None\n",
+ "public_service = False\n",
+ "\n",
+ "update_service_metadata(service, configuration, summary, description, tags, credits, egis_server, egis_folder, max_flows_sql_name=postprocess_max_flows, service_sql_name=postprocess_service, summary_sql_name=postprocess_summary, fim_service=fim_service, feature_service=feature_service, run=run, fim_configs=fim_configs, public_service=public_service)\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "bfe006ae",
+ "metadata": {},
+ "source": [
+ "NWM Waterbodies"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 58,
+ "id": "d5b20f7d",
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n"
+ ]
+ }
+ ],
+ "source": [
+ "service = 'nwm_waterbodies'\n",
+ "configuration = 'reference'\n",
+ "postprocess_max_flows = None\n",
+ "postprocess_service = None\n",
+ "postprocess_summary = None\n",
+ "summary = 'NWM Waterbodies'\n",
+ "description = 'Depicts the waterbodies for the National Water Model'\n",
+ "tags = 'nws, fim, reference, waterbodies'\n",
+ "credits = 'National Water Model, NOAA/NWS National Water Center'\n",
+ "egis_server = 'server'\n",
+ "egis_folder = 'reference'\n",
+ "fim_service = False\n",
+ "feature_service = False\n",
+ "run = True\n",
+ "fim_configs = None\n",
+ "public_service = False\n",
+ "\n",
+ "update_service_metadata(service, configuration, summary, description, tags, credits, egis_server, egis_folder, max_flows_sql_name=postprocess_max_flows, service_sql_name=postprocess_service, summary_sql_name=postprocess_summary, fim_service=fim_service, feature_service=feature_service, run=run, fim_configs=fim_configs, public_service=public_service)\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "06b5bd44",
+ "metadata": {},
+ "source": [
+ "NWS Flood Categorical HAND FIM"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 30,
+ "id": "873d3714",
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n"
+ ]
+ }
+ ],
+ "source": [
+ "service = 'nws_flood_categorical_hand_fim'\n",
+ "configuration = 'reference'\n",
+ "postprocess_max_flows = None\n",
+ "postprocess_service = None\n",
+ "postprocess_summary = None\n",
+ "summary = 'National Weather Service Flood Categorical Height Above Nearest Drainage Flood Inundation Maps'\n",
+ "description = 'Depicts flood inundation extents derived from official NWS category thresholds and the Height Above Nearest Drainage (HAND) technique. Flood inundation maps are colored according to their category. Mapping extends 5 miles upstream and downstream of AHPS gage locations. This is not a forecast service. Service also known as CatFIM.'\n",
+ "tags = 'nws, flood, categorical, hand, fim, catfim, inundation, library, rfc, height, above, nearest, drainage, ahps, usgs'\n",
+ "credits = 'National Water Model, NOAA/NWS National Water Center, Advanced Hydrologic Prediction Service, National River Layer Database'\n",
+ "egis_server = 'server'\n",
+ "egis_folder = 'fim_libs'\n",
+ "fim_service = False\n",
+ "feature_service = False\n",
+ "run = True\n",
+ "fim_configs = None\n",
+ "public_service = False\n",
+ "\n",
+ "update_service_metadata(service, configuration, summary, description, tags, credits, egis_server, egis_folder, max_flows_sql_name=postprocess_max_flows, service_sql_name=postprocess_service, summary_sql_name=postprocess_summary, fim_service=fim_service, feature_service=feature_service, run=run, fim_configs=fim_configs, public_service=public_service)\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "59f20326",
+ "metadata": {},
+ "source": [
+ "Possible Coastal Omission for FIM"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 60,
+ "id": "a8058768",
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n"
+ ]
+ }
+ ],
+ "source": [
+ "service = 'possible_coastal_omission'\n",
+ "configuration = 'reference'\n",
+ "postprocess_max_flows = None\n",
+ "postprocess_service = None\n",
+ "postprocess_summary = None\n",
+ "summary = 'Coastal Regions where coastal flooding processes are not being considered in the visualization FIM output.'\n",
+ "description = 'Depicts areas along the coast where coastal flooding processes are not being considered in the visualization FIM output.'\n",
+ "tags = 'national water model, nwm, reference'\n",
+ "credits = 'National Water Model, NOAA/NWS National Water Center'\n",
+ "egis_server = 'server'\n",
+ "egis_folder = 'reference'\n",
+ "fim_service = False\n",
+ "feature_service = False\n",
+ "run = True\n",
+ "fim_configs = None\n",
+ "public_service = False\n",
+ "\n",
+ "update_service_metadata(service, configuration, summary, description, tags, credits, egis_server, egis_folder, max_flows_sql_name=postprocess_max_flows, service_sql_name=postprocess_service, summary_sql_name=postprocess_summary, fim_service=fim_service, feature_service=feature_service, run=run, fim_configs=fim_configs, public_service=public_service)\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "b79a049c",
+ "metadata": {},
+ "source": [
+ "Stage Based CatFIM"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 31,
+ "id": "06700b37",
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "This result object does not return rows. It has been closed automatically.\n",
+ "This result object does not return rows. It has been closed automatically.\n"
+ ]
+ }
+ ],
+ "source": [
+ "service = 'stage_based_catfim'\n",
+ "configuration = 'reference'\n",
+ "postprocess_max_flows = None\n",
+ "postprocess_service = None\n",
+ "postprocess_summary = None\n",
+ "summary = 'Depicts flood inundation extents derived from official NWS category thresholds and the Height Above Nearest Drainage (HAND) technique. Official stage thresholds are used.'\n",
+ "description = 'Depicts flood inundation extents derived from official NWS category thresholds and the Height Above Nearest Drainage (HAND) technique. Official stage thresholds (converted to water surface elevation) are used. Flood inundation maps are colored according to their category. Mapping extends 5 miles upstream and downstream of AHPS gage locations. This is a static service and not a forecast service.'\n",
+ "tags = 'nws, flood, categorical, hand, fim, catfim, inundation, library, rfc, height, above, nearest, drainage, ahps, usgs'\n",
+ "credits = 'National Water Model, NOAA/NWS National Water Center, Advanced Hydrologic Prediction Service, National River Layer Database'\n",
+ "egis_server = 'server'\n",
+ "egis_folder = 'fim_libs'\n",
+ "fim_service = False\n",
+ "feature_service = False\n",
+ "run = True\n",
+ "fim_configs = None\n",
+ "public_service = False\n",
+ "\n",
+ "update_service_metadata(service, configuration, summary, description, tags, credits, egis_server, egis_folder, max_flows_sql_name=postprocess_max_flows, service_sql_name=postprocess_service, summary_sql_name=postprocess_summary, fim_service=fim_service, feature_service=feature_service, run=run, fim_configs=fim_configs, public_service=public_service)\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "c3ea30fe",
+ "metadata": {},
+ "source": [
+ "
"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "6e6210f1-f3c4-489a-aa1f-53646d7d0f59",
+ "metadata": {
+ "tags": []
+ },
+ "outputs": [],
+ "source": [
+ "import os \n",
+ "from helper_functions.shared_functions import * \n",
+ "import boto3 \n",
+ "import geopandas as gpd \n",
+ "import pandas as pd\n",
+ "import sqlalchemy\n",
+ "from geoalchemy2 import Geometry\n",
+ "\n",
+ "db_type = \"egis\" \n",
+ "db_engine = get_db_engine(db_type)\n",
+ "\n",
+ "s3 = boto3.client('s3')\n",
+ "\n",
+ "# Define bucket and parent directories.\n",
+ "bucket = \"hydrovis-ti-deployment-us-east-1\"\n",
+ "parent_directory = \"qc_fim_data\"\n",
+ "local_download_parent_directory = f'brad_data/qc_fim_data'\n",
+ "\n",
+ "#file_handles = ['stage_based_catfim_sites.csv', 'stage_based_catfim.csv']\n",
+ "\n",
+ "#file_handles = ['stage_based_catfim_sites.csv', 'catfim_library_dissolved.csv']\n",
+ "\n",
+ "file_handles = ['catfim_library_dissolved.csv']\n",
+ "\n",
+ "for file_handle in file_handles: \n",
+ " \n",
+ " # Define path to file to download and its local download path, then download. \n",
+ " filename = f\"{QA_DATASETS_DPATH}/{file_handle}\" \n",
+ " print(filename) \n",
+ " local_download_path = os.path.join(local_download_parent_directory, f'{file_handle}') \n",
+ " print(f\"--> Downloading {FIM_BUCKET}/{filename} to {local_download_path}\") \n",
+ " #s3.download_file(FIM_BUCKET, filename, local_download_path)\n",
+ "\n",
+ " # -- Open file and reformat -- #\n",
+ " print(\"Reading file...\")\n",
+ " df = pd.read_csv(local_download_path)\n",
+ " print(\"File read.\")\n",
+ "\n",
+ " if file_handle == 'catfim_library_dissolved.csv':\n",
+ " file_handle = 'stage_based_catfim.csv'\n",
+ " # Rename headers.\n",
+ " df = df.rename(columns={'Unnamed: 0': 'oid', 'geometry': 'geom', 'huc':'huc8'})\n",
+ "\n",
+ " # Convert all field names to lowercase (needed for ArcGIS Pro).\n",
+ " df.columns= df.columns.str.lower()\n",
+ "\n",
+ " # Remove sites that are in derived.ahps_restricted_sites\n",
+ " restricted_sites_df = get_db_values(\"derived.ahps_restricted_sites\", [\"*\"])\n",
+ " restricted_dict = restricted_sites_df.to_dict('records')\n",
+ "\n",
+ " # Change 'mapped' to 'no' if sites are present in restricted_sites_df\n",
+ " for site in restricted_dict:\n",
+ " nws_lid = site['nws_lid'].lower()\n",
+ " if \"sites\" in file_handle:\n",
+ " df.loc[df.ahps_lid==nws_lid, 'mapped'] = 'no'\n",
+ " df.loc[df.ahps_lid==nws_lid, 'status'] = site['restricted_reason']\n",
+ " else:\n",
+ " df.loc[df.ahps_lid==nws_lid, 'viz'] = 'no'\n",
+ " df = df[df['viz']=='yes'] # Subset df to only sites desired for mapping\n",
+ "\n",
+ " for sea_level_site in ['qutg1', 'augg1', 'baxg1', 'lamf1', 'adlg1', 'hrag1', 'stng1']:\n",
+ " if \"sites\" in file_handle:\n",
+ " df.loc[df.ahps_lid==sea_level_site, 'mapped'] = 'no'\n",
+ " df.loc[df.ahps_lid==sea_level_site, 'status'] = 'Stage thresholds seem to be based on sea level and not channel thalweg'\n",
+ " else:\n",
+ " df.loc[df.ahps_lid==sea_level_site, 'viz'] = 'no'\n",
+ " df = df[df['viz']=='yes'] # Subset df to only sites desired for mapping\n",
+ " \n",
+ " # Enforce data types on df before loading in DB (TODO: need to create special cases for each layer).\n",
+ " df = df.astype({'huc8': 'str'})\n",
+ " df = df.fillna(0)\n",
+ "\n",
+ " if file_handle == 'stage_based_catfim.csv':\n",
+ " df['fim_version'] = FIM_VERSION\n",
+ "\n",
+ " try:\n",
+ " df = df.astype({'feature_id': 'int'})\n",
+ " df = df.astype({'feature_id': 'str'})\n",
+ " except KeyError: # If there is no feature_id field\n",
+ " pass\n",
+ " try:\n",
+ " df = df.astype({'nwm_seg': 'int'})\n",
+ " df = df.astype({'nwm_seg': 'str'})\n",
+ " except KeyError: # If there is no nwm_seg field\n",
+ " pass\n",
+ " try:\n",
+ " df = df.astype({'usgs_gage': 'int'})\n",
+ " df = df.astype({'usgs_gage': 'str'})\n",
+ " except KeyError: # If there is no usgs_gage field\n",
+ " pass\n",
+ "\n",
+ " # zfill HUC8 field.\n",
+ " df['huc8'] = df['huc8'].apply(lambda x: x.zfill(8))\n",
+ "\n",
+ " if file_handle in ['stage_based_catfim_sites.csv']:\n",
+ " df = df.astype({'nws_data_rfc_forecast_point': 'str'})\n",
+ " df = df.astype({'nws_data_rfc_defined_fcst_point': 'str'})\n",
+ " df = df.astype({'nws_data_riverpoint': 'str'})\n",
+ "\n",
+ " # Upload df to database.\n",
+ " stripped_layer_name = file_handle.replace(\".csv\", \"\")\n",
+ " table_name = \"reference.\" + stripped_layer_name\n",
+ " print(\"Loading data into DB...\")\n",
+ "\n",
+ " print(\"Dataframe shape\")\n",
+ " print(df.shape[0])\n",
+ "\n",
+ " # Chunk load data into DB\n",
+ " if file_handle in ['stage_based_catfim.csv']:\n",
+ " print(\"Chunk loading...\")\n",
+ " # Create list of df chunks\n",
+ " n = 1000 #chunk row size\n",
+ " list_df = [df[i:i+n] for i in range(0,df.shape[0],n)]\n",
+ "\n",
+ " # Load the first chunk into the DB as a new table\n",
+ " first_chunk_df = list_df[0]\n",
+ " print(first_chunk_df.shape[0])\n",
+ " #geometry = 'MULTIPOLYGON'\n",
+ " first_chunk_df.to_sql(\n",
+ " name=stripped_layer_name, \n",
+ " con=db_engine, \n",
+ " schema='reference',\n",
+ " if_exists='replace', \n",
+ " index=False,\n",
+ " dtype={'oid': sqlalchemy.types.Integer(), \n",
+ " 'geom': Geometry('MULTIPOLYGON',srid=3857)\n",
+ " }\n",
+ " )\n",
+ " \n",
+ " # Load remaining chunks into newly created table\n",
+ " \n",
+ " for remaining_chunk in list_df[1:]:\n",
+ " print(remaining_chunk.shape[0])\n",
+ " remaining_chunk.to_sql(\n",
+ " name=stripped_layer_name, \n",
+ " con=db_engine, \n",
+ " schema='reference',\n",
+ " if_exists='append', \n",
+ " index=False,\n",
+ " dtype={'oid': sqlalchemy.types.Integer(),\n",
+ " 'geom': Geometry('MULTIPOLYGON',srid=3857)\n",
+ " }\n",
+ " )\n",
+ " \n",
+ " \n",
+ " \n",
+ " else:\n",
+ " geometry = 'POINT'\n",
+ " df.to_sql(\n",
+ " name=stripped_layer_name, \n",
+ " con=db_engine, \n",
+ " schema='reference',\n",
+ " if_exists='replace', \n",
+ " index=False,\n",
+ " dtype={'oid': sqlalchemy.types.Integer(),\n",
+ " 'geom': Geometry(geometry,srid=3857)\n",
+ " }\n",
+ " )\n",
+ " \n",
+ " \n",
+ " print(\"Creating index...\")\n",
+ " with db_engine.connect() as conn:\n",
+ " result = conn.execute(text(f'CREATE INDEX ON reference.{stripped_layer_name} USING GIST (geom);'))\n",
+ " "
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "422ec875-d590-481f-bf6c-f4453b76414b",
+ "metadata": {
+ "tags": []
+ },
+ "outputs": [],
+ "source": [
+ "db_type = \"egis\"\n",
+ "db_engine = get_db_engine(db_type)\n",
+ "print(db_engine)\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "28048dba-494c-45bf-b334-f8deca65d68e",
+ "metadata": {
+ "tags": []
+ },
+ "outputs": [],
+ "source": [
+ "s3 = boto3.client('s3')\n",
+ "local_download_parent_directory = f'brad_data/qc_fim_data'\n",
+ "local_download_path = os.path.join(local_download_parent_directory, f'{file_handle}')\n",
+ "filename = f\"/temp/catfim_library_exploded.gpkg\"\n",
+ "s3.download_file(FIM_BUCKET, filename, local_download_path)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "601192df",
+ "metadata": {},
+ "source": [
+ "10 - UPDATE FLOW-BASED CATFIM DATA IN DB"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "95c55cd0",
+ "metadata": {
+ "tags": []
+ },
+ "outputs": [],
+ "source": [
+ "import os\n",
+ "from helper_functions.shared_functions import *\n",
+ "import boto3\n",
+ "import geopandas as gpd\n",
+ "import pandas as pd\n",
+ "import sqlalchemy\n",
+ "from geoalchemy2 import Geometry\n",
+ "\n",
+ "# os.environ['EGIS_DB_HOST'] = '' #TI DB\n",
+ "\n",
+ "db_type = \"egis\"\n",
+ "db_engine = get_db_engine(db_type)\n",
+ "print(db_engine)\n",
+ "\n",
+ "s3 = boto3.client('s3')\n",
+ "\n",
+ "# Define bucket and parent directories.\n",
+ "bucket = \"hydrovis-ti-deployment-us-east-1\"\n",
+ "parent_directory = \"qc_fim_data\"\n",
+ "local_download_parent_directory = f'brad_data/qc_fim_data'\n",
+ "\n",
+ "file_handles = ['flow_based_catfim_sites.csv', 'catfim_library_dissolved_flow_based.csv']\n",
+ "\n",
+ "for file_handle in file_handles:\n",
+ " # Define path to file to download and its local download path, then download.\n",
+ " filename = f\"{FIM_ROOT_DPATH}/qa_datasets/{file_handle}\"\n",
+ " print(filename)\n",
+ " local_download_path = os.path.join(local_download_parent_directory, f'{file_handle}')\n",
+ " print(f\"--> Downloading {FIM_BUCKET}/{filename} to {local_download_path}\")\n",
+ " #s3.download_file(FIM_BUCKET, filename, local_download_path)\n",
+ " \n",
+ " # -- Open file and reformat -- #\n",
+ " print(\"Reading file...\")\n",
+ " df = pd.read_csv(local_download_path)\n",
+ " print(\"File read.\")\n",
+ " # Rename headers.\n",
+ " df = df.rename(columns={'Unnamed: 0': 'oid', 'geometry': 'geom', 'huc':'huc8'})\n",
+ " \n",
+ " if file_handle == 'catfim_library_dissolved_flow_based.csv':\n",
+ " file_handle = 'flow_based_catfim.csv'\n",
+ " \n",
+ " # Convert all field names to lowercase (needed for ArcGIS Pro).\n",
+ " df.columns= df.columns.str.lower()\n",
+ "\n",
+ " # Remove sites that are in derived.ahps_restricted_sites\n",
+ " restricted_sites_df = get_db_values(\"derived.ahps_restricted_sites\", [\"*\"])\n",
+ " restricted_dict = restricted_sites_df.to_dict('records')\n",
+ "\n",
+ " # Change 'mapped' to 'no' if sites are present in restricted_sites_df\n",
+ " for site in restricted_dict:\n",
+ " nws_lid = site['nws_lid'].lower()\n",
+ " print(nws_lid)\n",
+ " if \"sites\" in file_handle:\n",
+ " #print(True)\n",
+ " #print(nws_lid)\n",
+ " df.loc[df.ahps_lid==nws_lid, 'mapped'] = 'no'\n",
+ " df.loc[df.ahps_lid==nws_lid, 'status'] = site['restricted_reason']\n",
+ " #print(df.loc[df.ahps_lid==nws_lid]['status'])\n",
+ " else:\n",
+ " df.loc[df.ahps_lid==nws_lid, 'viz'] = 'no'\n",
+ " df = df[df['viz']=='yes']\n",
+ " \n",
+ " # Enforce data types on df before loading in DB (TODO: need to create special cases for each layer).\n",
+ " df = df.astype({'huc8': 'str'})\n",
+ " df = df.fillna(0)\n",
+ " try:\n",
+ " df = df.astype({'feature_id': 'int'})\n",
+ " df = df.astype({'feature_id': 'str'})\n",
+ " except KeyError: # If there is no feature_id field\n",
+ " pass\n",
+ " try:\n",
+ " df = df.astype({'nwm_seg': 'int'})\n",
+ " df = df.astype({'nwm_seg': 'str'})\n",
+ " except KeyError: # If there is no nwm_seg field\n",
+ " pass\n",
+ " try:\n",
+ " df = df.astype({'usgs_gage': 'int'})\n",
+ " df = df.astype({'usgs_gage': 'str'})\n",
+ " except KeyError: # If there is no usgs_gage field\n",
+ " pass\n",
+ " \n",
+ " # zfill HUC8 field.\n",
+ " df['huc8'] = df['huc8'].apply(lambda x: x.zfill(8))\n",
+ " \n",
+ " if file_handle in ['flow_based_catfim_sites.csv']:\n",
+ " df = df.astype({'nws_data_rfc_forecast_point': 'str'})\n",
+ " df = df.astype({'nws_data_rfc_defined_fcst_point': 'str'})\n",
+ " df = df.astype({'nws_data_riverpoint': 'str'})\n",
+ " \n",
+ " # Upload df to database.\n",
+ " stripped_layer_name = file_handle.replace(\".csv\", \"\")\n",
+ " table_name = \"reference.\" + stripped_layer_name\n",
+ " print(\"Loading data into DB...\")\n",
+ " \n",
+ " print(\"Dataframe shape\")\n",
+ " print(df.shape[0])\n",
+ " \n",
+ " # Chunk load data into DB\n",
+ " if file_handle in ['flow_based_catfim.csv']:\n",
+ " print(\"Chunk loading...\")\n",
+ " # Create list of df chunks\n",
+ " n = 1000 #chunk row size\n",
+ " list_df = [df[i:i+n] for i in range(0,df.shape[0],n)]\n",
+ " \n",
+ " # Load the first chunk into the DB as a new table\n",
+ " first_chunk_df = list_df[0]\n",
+ " print(first_chunk_df.shape[0])\n",
+ " #geometry = 'POLYGON'\n",
+ " \n",
+ " first_chunk_df.to_sql(\n",
+ " name=stripped_layer_name, \n",
+ " con=db_engine, \n",
+ " schema='reference',\n",
+ " if_exists='replace', \n",
+ " index=False,\n",
+ " dtype={'oid': sqlalchemy.types.Integer(),\n",
+ " 'geom': Geometry('MULTIPOLYGON',srid=3857)\n",
+ " }\n",
+ " )\n",
+ " \n",
+ " # Load remaining chunks into newly created table\n",
+ " for remaining_chunk in list_df[1:]:\n",
+ " print(remaining_chunk.shape[0])\n",
+ " \n",
+ " remaining_chunk.to_sql(\n",
+ " name=stripped_layer_name, \n",
+ " con=db_engine, \n",
+ " schema='reference',\n",
+ " if_exists='append', \n",
+ " index=False,\n",
+ " dtype={'oid': sqlalchemy.types.Integer(),\n",
+ " 'geom': Geometry('MULTIPOLYGON',srid=3857)\n",
+ " }\n",
+ " )\n",
+ " \n",
+ " \n",
+ " else:\n",
+ " geometry = 'POINT'\n",
+ " df.to_sql(\n",
+ " name=stripped_layer_name, \n",
+ " con=db_engine, \n",
+ " schema='reference',\n",
+ " if_exists='replace', \n",
+ " index=False,\n",
+ " dtype={'oid': sqlalchemy.types.Integer(),\n",
+ " 'geom': Geometry(geometry,srid=3857)\n",
+ " }\n",
+ " )"
+ ]
+ },
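+ {
+ "cell_type": "markdown",
+ "id": "sketch-chunked-to-sql",
+ "metadata": {},
+ "source": [
+ "The chunked CatFIM loads above are meant to follow a simple pattern: the first chunk creates (or replaces) the table, and every later chunk is appended to it, so each row is written exactly once. The sketch below shows only that pattern. It is an illustration, not part of the load: it assumes an in-memory SQLite database as a stand-in for the EGIS engine, and the table name `demo_catfim` and the data are made up.\n",
+ "\n",
+ "```python\n",
+ "import sqlite3\n",
+ "\n",
+ "import pandas as pd\n",
+ "\n",
+ "# Made-up data standing in for a CatFIM library table.\n",
+ "df = pd.DataFrame({'oid': range(2500), 'huc8': ['01010001'] * 2500})\n",
+ "\n",
+ "n = 1000  # chunk row size\n",
+ "chunks = [df[i:i + n] for i in range(0, len(df), n)]\n",
+ "\n",
+ "conn = sqlite3.connect(':memory:')  # hypothetical stand-in for the real DB engine\n",
+ "\n",
+ "# The first chunk creates (or replaces) the table.\n",
+ "chunks[0].to_sql('demo_catfim', conn, if_exists='replace', index=False)\n",
+ "\n",
+ "# The remaining chunks are appended to the table that now exists.\n",
+ "for chunk in chunks[1:]:\n",
+ "    chunk.to_sql('demo_catfim', conn, if_exists='append', index=False)\n",
+ "\n",
+ "loaded = pd.read_sql('SELECT COUNT(*) AS n FROM demo_catfim', conn)['n'][0]\n",
+ "assert loaded == len(df)  # every row was loaded exactly once\n",
+ "```\n",
+ "\n",
+ "pandas can also do the splitting itself through the `chunksize` argument of `DataFrame.to_sql`."
+ ]
+ },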
+ {
+ "cell_type": "markdown",
+ "id": "cb4c0634-0878-421e-88bd-c63cd0acbace",
+ "metadata": {
+ "tags": []
+ },
+ "source": [
+ "11 - UPDATE RAS2FIM DATA IN DB"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "00c42c0f-7e25-491c-9353-0e74a4089036",
+ "metadata": {},
+ "source": [
+ "Update from Tyler in early 2024: This process will need to be revisited, as Rob Hanna was working on updates to the ras2fim data model to sync up with our database. Brad and Corey were on point for this, so proper attention and planning will be needed to mitigate the knowledge-transfer loss and to properly test any new updates."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "f9be5017-b22e-43d5-96ff-73aa7313d9a5",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "sql = \"\"\"\n",
+ "ALTER TABLE ras2fim.geocurves ADD COLUMN previous_stage_ft double precision;\n",
+ "ALTER TABLE ras2fim.geocurves ADD COLUMN previous_stage_m double precision;\n",
+ "ALTER TABLE ras2fim.geocurves ADD COLUMN previous_discharge_cfs double precision;\n",
+ "ALTER TABLE ras2fim.geocurves ADD COLUMN previous_discharge_cms double precision;\n",
+ "\"\"\"\n",
+ "\n",
+ "sql = \"\"\"\n",
+ "WITH lagged as (SELECT \n",
+ " feature_id,\n",
+ " stage_ft,\n",
+ " (lag(stage_m, 1) OVER (PARTITION BY feature_id ORDER by stage_m)) as previous_stage_m,\n",
+ " (lag(stage_ft, 1) OVER (PARTITION BY feature_id ORDER by stage_ft)) as previous_stage_ft,\n",
+ " (lag(discharge_cfs, 1) OVER (PARTITION BY feature_id ORDER by discharge_cfs)) as previous_discharge_cfs,\n",
+ " (lag(discharge_cms, 1) OVER (PARTITION BY feature_id ORDER by discharge_cms)) as previous_discharge_cms\n",
+ "FROM ras2fim.geocurves)\n",
+ "\n",
+ "UPDATE ras2fim.geocurves gc\n",
+ "SET previous_stage_ft = lagged.previous_stage_ft,\n",
+ " previous_stage_m = lagged.previous_stage_m,\n",
+ " previous_discharge_cfs = lagged.previous_discharge_cfs,\n",
+ " previous_discharge_cms = lagged.previous_discharge_cms\n",
+ "FROM lagged\n",
+ "WHERE gc.feature_id = lagged.feature_id and gc.stage_ft = lagged.stage_ft;\n",
+ "\"\"\"\n",
+ "\n",
+ "sql = \"\"\"\n",
+ "SELECT\n",
+ " feature_id,\n",
+ " max(discharge_cfs) as max_rc_discharge_cfs,\n",
+ " max(stage_ft) as max_rc_stage_ft,\n",
+ " max(discharge_cms) as max_rc_discharge_cms,\n",
+ " max(stage_m) as max_rc_stage_m\n",
+ "INTO ras2fim.max_geocurves\n",
+ "FROM ras2fim.geocurves\n",
+ "GROUP BY feature_id;\n",
+ "\"\"\""
+ ]
+ },
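+ {
+ "cell_type": "markdown",
+ "id": "sketch-lag-previous-values",
+ "metadata": {},
+ "source": [
+ "The `lag(...) OVER (PARTITION BY feature_id ORDER BY ...)` expressions in the SQL above give each rating-curve row the value from the previous row of the same feature. A minimal pandas sketch of the same idea follows. The data are made up for illustration and only one of the four columns is shown.\n",
+ "\n",
+ "```python\n",
+ "import pandas as pd\n",
+ "\n",
+ "# Made-up rating-curve rows for two features.\n",
+ "curves = pd.DataFrame({\n",
+ "    'feature_id': [1, 1, 1, 2, 2],\n",
+ "    'stage_ft': [0.0, 1.0, 2.0, 0.0, 1.5],\n",
+ "})\n",
+ "\n",
+ "# ORDER BY stage_ft within each feature, then take the previous row's value.\n",
+ "curves = curves.sort_values(['feature_id', 'stage_ft'])\n",
+ "curves['previous_stage_ft'] = curves.groupby('feature_id')['stage_ft'].shift(1)\n",
+ "print(curves)\n",
+ "```\n",
+ "\n",
+ "The first row of each `feature_id` has no predecessor, so it gets `NaN` here, just as `lag` returns `NULL` in PostgreSQL."
+ ]
+ },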
+ {
+ "cell_type": "markdown",
+ "id": "3fae3f0b-7669-4aaf-b50a-bf82aea06a00",
+ "metadata": {},
+ "source": [
+ "12 - Clear the HAND Cache"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "40cf4236-7320-4780-a6f5-b83258073bfe",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "sql = \"\"\"\n",
+ "TRUNCATE TABLE fim_cache.hand_hydrotable_cached;\n",
+ "TRUNCATE TABLE fim_cache.hand_hydrotable_cached_max;\n",
+ "TRUNCATE TABLE fim_cache.hand_hydrotable_cached_geo;\n",
+ "TRUNCATE TABLE fim_cache.hand_hydrotable_cached_zero_stage;\n",
+ "\"\"\"\n",
+ "run_sql_in_db(sql)"
+ ]
+ }
+ ],
+ "metadata": {
+ "kernelspec": {
+ "display_name": "conda_python3",
+ "language": "python",
+ "name": "conda_python3"
+ },
+ "language_info": {
+ "codemirror_mode": {
+ "name": "ipython",
+ "version": 3
+ },
+ "file_extension": ".py",
+ "mimetype": "text/x-python",
+ "name": "python",
+ "nbconvert_exporter": "python",
+ "pygments_lexer": "ipython3",
+ "version": "3.10.14"
+ }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+}
diff --git a/Core/Manual_Workflows/FIM_data_loads/10. FIM Version 4.5.2.11 Update.ipynb b/Core/Manual_Workflows/FIM_data_loads/10. FIM Version 4.5.2.11 Update.ipynb
new file mode 100644
index 00000000..243a15e7
--- /dev/null
+++ b/Core/Manual_Workflows/FIM_data_loads/10. FIM Version 4.5.2.11 Update.ipynb
@@ -0,0 +1,2755 @@
+{
+ "cells": [
+ {
+ "cell_type": "markdown",
+ "id": "220f464f-0726-479f-8871-155c640458de",
+ "metadata": {
+ "tags": []
+ },
+ "source": [
+ "### This notebook was created by Corey Krewson in 2023 to facilitate all the steps needed to update the pipelines / EGIS map services when a new FIM version is released.\n",
+ "\n",
+ "Unfortunately, these steps are still very manual, and this notebook is the main source of documentation for making these updates."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "3aa5af89-47ec-45b6-a773-ea60bd07e002",
+ "metadata": {
+ "tags": []
+ },
+ "source": [
+ "### Load Status for FIM 4.5.2.11\n",
+ "\n",
+ "1. `Crosswalk` : Aug 5, 2024 (reloaded with new model_column Aug 14, 2024)\n",
+ "2. `Lambda FIM_PREFIX` : Aug 5, 2024\n",
+ "3. `Lambda FIM_VERSION and Memory` : Aug 5, 2024\n",
+ "4. `ras2fim` : Aug 21, 2024\n",
+ " - Had to reload Sep 16: wrong starting projection (was 5070, now 2277)\n",
+ " - Redone again because of mixed projections on files (Sep 18, 2024)\n",
+ "5. `ras2fim Boundaries`: Ran in Aug ?? 2024\n",
+ "6. `AEP`\n",
+ " - `2 year` : -- done - Aug 23, 2024 - redone : Sep 18\n",
+ " - `5 year` : -- done - Aug 23, 2024 - redone: Sep 18\n",
+ " - `10 year` : -- done - Aug 23, 2024 - redone: Sep 18\n",
+ " - `25 year` : -- done - Aug 23, 2024 - redone: Sep 18\n",
+ " - `50 year` : -- done - Aug 23, 2024 - redone: Sep 18\n",
+ " - `HW / High Water` : -- done - Aug 23, 2024 - redone: Sep 18\n",
+ " - `Change the hv-vpp-ti-viz-fim-data-prep Lambda memory back to 2048mb` : -- done - Aug 23, 2024\n",
+ " - Sep 16, 2024: needs reload as ras2fim needed reload\n",
+ "7. `Catchments`\n",
+ " - `Branch 0` : -- done - Aug 22, 2024\n",
+ " - `GMS` : -- done - Aug 22, 2024\n",
+ "8. `usgs_elev_table` : -- done - Aug 15, 2024\n",
+ "9. `hydrotable / hydrotable_staggered` : -- done - Aug 16, 2024\n",
+ "10. `usgs_rating_curve / usgs_rating_curves staggered` : -- done - Aug 16, 2024\n",
+ "11. `Skills Metrics` : -- Redone - Aug 19, 2024\n",
+ "12. `FIM Performance` : -- Aug 18/19 - points and polys done. Catchments re-done Aug 30\n",
+ "13. `CatFIM`\n",
+ " - `Stage Based CatFIM` : -- done Sep 3, 2024\n",
+ " - `Flow Based CatFIM` : -- done Sep 3, 2024\n",
+ " - `CatFIM FIM 30` : In progress\n",
+ "14. `Clear HAND cache` :\n",
+ "15. `GIT` and `Terraform ??` : TBD\n",
+ "\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "9e1ee2b5-b109-49e8-8255-f06449b44ee6",
+ "metadata": {
+ "scrolled": true,
+ "tags": []
+ },
+ "outputs": [],
+ "source": [
+ "# Cell to manually pip install packages that the Jupyter kernel has not retained\n",
+ "!pip install numpy\n",
+ "!pip install geopandas\n",
+ "!pip install pyarrow\n",
+ "!pip install xarray\n",
+ "!pip install geoalchemy2\n",
+ "!pip install contextily\n",
+ "!pip install rioxarray\n",
+ "print(\"All loaded\")"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "103a0d12-b2a0-4586-ba0f-0ed6b2c4eb89",
+ "metadata": {
+ "tags": []
+ },
+ "outputs": [],
+ "source": [
+ "# pd.set_option(\"max_info_rows\", 100000) # override "
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "5b656259",
+ "metadata": {
+ "scrolled": true,
+ "tags": []
+ },
+ "outputs": [],
+ "source": [
+ "import os\n",
+ "import codecs\n",
+ "import csv\n",
+ "import sys\n",
+ "\n",
+ "from datetime import datetime\n",
+ "\n",
+ "import boto3\n",
+ "import geopandas as gpd\n",
+ "import json\n",
+ "import pandas as pd\n",
+ "import s3fs\n",
+ "import sqlalchemy\n",
+ "import xarray as xr\n",
+ "\n",
+ "from geopandas import GeoDataFrame\n",
+ "from io import StringIO\n",
+ "from geoalchemy2 import Geometry\n",
+ "from shapely import wkt\n",
+ "from shapely.geometry import Polygon\n",
+ "from sqlalchemy.exc import DataError # yes, redundant, fix it later\n",
+ "from sqlalchemy.types import Text # yes, redundant, fix it later\n",
+ "\n",
+ "sys.path.append(os.path.join(os.path.abspath(''), '..'))\n",
+ "\n",
+ "import helper_functions.s3_shared_functions as s3_sf\n",
+ "import helper_functions.shared_functions as sf\n",
+ "# import helper_functions.viz_classes\n",
+ "\n",
+ "from helper_functions.viz_classes import database\n",
+ "from helper_functions.viz_db_ingest import lambda_handler as execute_db_ingest\n",
+ "\n",
+ "print(\"imports loaded\")\n",
+ "\n",
+ "# Note (Aug 2024): if you had to run the pip install cell above, you may need to run this cell twice,\n",
+ "# possibly because of a circular dependency or a forced package reload. Not sure why yet."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "e944eff3-6023-48f4-9f57-27304a240447",
+ "metadata": {
+ "tags": []
+ },
+ "outputs": [],
+ "source": [
+ "\n",
+ "# Aug 5, 2024: The variable named FIM_VERSION will continue to be the field that joins all data together.\n",
+ "# But we need two new public display fields in the UI. We will no longer show a UI field which previously would have been\n",
+ "# \"FIM_4_5_2_11\". That won't be displayed anymore. The public fields are:\n",
+ "# public_fim_version, which for this edition becomes \"FIM 5_0_0\" (yes, three segments)\n",
+ "# public_model_version, which for this edition becomes \"HAND 4_5_2_11\".\n",
+ "# When we add ras2fim into the system, its public_fim_version continues to be FIM 5_0_0,\n",
+ "# but the ras2fim public_model_version becomes \"ras2fim 2_0_3_0\".\n",
+ "\n",
+ "# NOTE: Sep 19, 2024: creating the 4.4.0.0 was just for learning purposes. Now Rob has access to the UAT DBs, so I can compare against\n",
+ "# those next time if needed. We might remove the references to 4.4.0.0 next time.\n",
+ "\n",
+ "OLD_FIM_VERSION = \"4.4.0.0\"\n",
+ "NEW_FIM_VERSION = \"4.5.2.11\"\n",
+ "PUBLIC_FIM_VERSION = \"FIM 5.0.0\" \n",
+ "FIM_MODEL_VERSION = \"HAND 4.5.2.11\" # on next major build (after Aug 2024, change this to space and dots. ie) HAND 4.5.2.11)\n",
+ "OLD_FIM_TAG = OLD_FIM_VERSION.replace('.', '_')\n",
+ "\n",
+ "FIM_ROOT_DPATH = f\"fim/fim_{NEW_FIM_VERSION.replace('.', '_')}\"\n",
+ "HAND_DATASETS_DPATH = f\"{FIM_ROOT_DPATH}/hand_datasets\"\n",
+ "QA_DATASETS_DPATH = f\"{FIM_ROOT_DPATH}/qa_datasets\"\n",
+ "\n",
+ "FIM_BUCKET = \"hydrovis-ti-deployment-us-east-1\"\n",
+ "FIM_CROSSWALK_FPATH = os.path.join(HAND_DATASETS_DPATH, \"crosswalk_table.csv\")\n",
+ "PIPELINE_ARN = 'arn:aws:states:us-east-1:526904826677:stateMachine:hv-vpp-ti-viz-pipeline'\n",
+ "\n",
+ "COLUMN_NAME_FIM_VERSION = \"fim_version\"\n",
+ "COLUMN_NAME_MODEL_VERSION = \"model_version\"\n",
+ "\n",
+ "# Sometimes these credential values get updated. To find the latest correct values, go to your AWS Console log page and click on the \"Access Key\"\n",
+ "# link to get the latest valid set. Use the \"AWS environment variables\" values.\n",
+ "# If this is not set correctly, you will get an HTTP error 400 when you call S3 further down.\n",
+ "# You might also see the error 'An error occurred (NoSuchKey) when calling the GetObject operation: The specified key does not exist.' if the creds are not correct.\n",
+ "\n",
+ "# Helps us get to the keys. Note: This was added Oct 16, 2024 and is untested\n",
+ "sys.path.append(os.path.join(os.path.abspath(''), '../../../../AWS_Secret_keys'))\n",
+ "import AWS_Keys\n",
+ "\n",
+ "\n",
+ "S3_CLIENT = boto3.client(\"s3\")\n",
+ "STEPFUNCTION_CLIENT = boto3.client('stepfunctions')\n",
+ "VIZ_DB_ENGINE = sf.get_db_engine('viz')\n",
+ "\n",
+ "print(\"Global Variables loaded\")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "6a5827eb-3ca3-4b29-8ee7-f7c4f3c2c736",
+ "metadata": {
+ "tags": []
+ },
+ "source": [
+ "1 - UPLOAD FIM4 HYDRO ID/FEATURE ID CROSSWALK"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "28195858-0cb6-4ad3-8966-db1371a8452a",
+ "metadata": {
+ "tags": []
+ },
+ "source": [
+ "February 2024 Update from Tyler: This code will need to be updated to handle a new hand_id unique integer that the fim team (Rob Hanna and Matt Luck) has added to the crosswalk, and is now important to fim runs. They also changed the field names / format to match our schema, so this chunk of code should be able to be simplified significantly."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "007c59f4-72ff-4e3e-9f99-36a8e830c426",
+ "metadata": {
+ "tags": []
+ },
+ "outputs": [],
+ "source": [
+ "\n",
+ "# This was already done for 4.4.0.0, so we can skip it for the jump from 4.5.2.0 to 4.5.2.11\n",
+ " \n",
+ "\n",
+ "'''\n",
+ "Be very careful about simply renaming tables. If they have indexes, the indexes will now point to the new\n",
+ "table name but keep their original index names. Those index names can really mess stuff up.\n",
+ "Best to never rename unless you rename the indexes as well. This particular one is ok.\n",
+ "Note: When various 'to_sql' tools are run on tables that have GIST indexes, this index name issue\n",
+ "will be the problem.\n",
+ "\n",
+ "Why drop instead of truncate? If the schema changes for the incoming data, truncate will have column\n",
+ "mismatches.\n",
+ "\n",
+ "We really should be backing up indexes and constraints as well.\n",
+ "\n",
+ "'''\n",
+ "\n",
+ "\n",
+ "# TODO: Aug 2024: Change this to a backup without indexes and not rename, it affects indexes that we might need\n",
+ "# tbl_name = \"derived.fim4_featureid_crosswalk\"\n",
+ "# new_table_name = f\"{tbl_name}_{OLD_FIM_TAG}\"\n",
+ "# sql = f'''\n",
+ "# CREATE TABLE IF NOT EXISTS {new_table_name} AS TABLE {tbl_name};\n",
+ "# '''\n",
+ "# sf.execute_sql(sql)\n",
+ "# print(f\"{tbl_name} copied to {new_table_name} if it does not already exist\")\n",
+ "\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "c96a49f2",
+ "metadata": {
+ "tags": []
+ },
+ "outputs": [],
+ "source": [
+ "\n",
+ "print(f\"Getting column name from {FIM_CROSSWALK_FPATH}\")\n",
+ "\n",
+ "data = S3_CLIENT.get_object(Bucket=FIM_BUCKET, Key=FIM_CROSSWALK_FPATH)\n",
+ "d_reader = csv.DictReader(codecs.getreader(\"utf-8\")(data[\"Body\"]))\n",
+ "headers = d_reader.fieldnames\n",
+ "\n",
+ "# Aug 5, 2024 - Updated column names for 4.5.2.11\n",
+ "header_str = \"(\"\n",
+ "for header in headers:\n",
+ "    header_str += header\n",
+ "    if header in ['hand_id', 'hydro_id', 'lake_id']:\n",
+ "        header_str += ' integer,'\n",
+ "    elif header in ['branch_id', 'feature_id']:\n",
+ "        header_str += ' bigint,'\n",
+ "    else:\n",
+ "        header_str += ' TEXT,'\n",
+ "header_str = header_str[:-1] + \")\"\n",
+ "print(header_str)\n",
+ "\n",
+ "db = database(db_type=\"viz\")\n",
+ "with db.get_db_connection() as conn, conn.cursor() as cur:\n",
+ " \n",
+ " print(f\"Deleting/Creating derived.fim4_featureid_crosswalk using columns {header_str}\")\n",
+ " sql = f\"DROP TABLE IF EXISTS derived.fim4_featureid_crosswalk; CREATE TABLE derived.fim4_featureid_crosswalk {header_str};\"\n",
+ " cur.execute(sql)\n",
+ " conn.commit()\n",
+ "\n",
+ " print(f\"Importing {FIM_CROSSWALK_FPATH} to derived.fim4_featureid_crosswalk\")\n",
+ " sql = f\"\"\"\n",
+ " SELECT aws_s3.table_import_from_s3(\n",
+ " 'derived.fim4_featureid_crosswalk',\n",
+ " '', \n",
+ " '(format csv, HEADER true)',\n",
+ " (SELECT aws_commons.create_s3_uri(\n",
+ " '{FIM_BUCKET}',\n",
+ " '{FIM_CROSSWALK_FPATH}',\n",
+ " 'us-east-1'\n",
+ " ) AS s3_uri\n",
+ " ),\n",
+ " aws_commons.create_aws_credentials('{TI_ACCESS_KEY}', '{TI_SECRET_KEY}', '{TI_TOKEN}')\n",
+ " );\n",
+ " \"\"\"\n",
+ " cur.execute(sql)\n",
+ " conn.commit()\n",
+ "\n",
+ " \n",
+ " # Aug 5, 2024: see notes at the top about the new FIM 5.0.0 system \n",
+ " # We will manually add a couple of new columns for public display\n",
+ " # New columns names are public_fim_version (FIM_5_0_0) and public model version (FIM_4_5_2_11)\n",
+ " print(f\"Adding {COLUMN_NAME_FIM_VERSION} column to derived.fim4_featureid_crosswalk\")\n",
+ " sql = f\"ALTER TABLE derived.fim4_featureid_crosswalk ADD COLUMN IF NOT EXISTS {COLUMN_NAME_FIM_VERSION} text DEFAULT '{PUBLIC_FIM_VERSION}';\"\n",
+ " cur.execute(sql)\n",
+ " conn.commit()\n",
+ " \n",
+ " print(f\"Adding {COLUMN_NAME_MODEL_VERSION} column to derived.fim4_featureid_crosswalk\")\n",
+ " sql = f\"ALTER TABLE derived.fim4_featureid_crosswalk ADD COLUMN IF NOT EXISTS {COLUMN_NAME_MODEL_VERSION} text DEFAULT '{FIM_MODEL_VERSION}';\"\n",
+ " cur.execute(sql)\n",
+ " conn.commit()\n",
+ "\n",
+ " print(\"Adding feature id index to derived.fim4_featureid_crosswalk\")\n",
+    "    # Drop it if it already exists\n",
+ " sql = \"DROP INDEX IF EXISTS derived.fim4_crosswalk_feature_id\"\n",
+ " cur.execute(sql)\n",
+ " conn.commit() \n",
+ " sql = \"CREATE INDEX fim4_crosswalk_feature_id ON derived.fim4_featureid_crosswalk USING btree (feature_id)\"\n",
+ " cur.execute(sql)\n",
+ " conn.commit()\n",
+ "\n",
+ " print(\"Adding hydro id index to derived.fim4_featureid_crosswalk\")\n",
+    "    # Drop it if it already exists\n",
+ " sql = \"DROP INDEX IF EXISTS derived.fim4_crosswalk_hydro_id\"\n",
+ " cur.execute(sql)\n",
+ " conn.commit() \n",
+ " sql = \"CREATE INDEX fim4_crosswalk_hydro_id ON derived.fim4_featureid_crosswalk USING btree (hydro_id)\"\n",
+ " cur.execute(sql)\n",
+ " conn.commit()\n",
+ "\n",
+ "print(\"\")\n",
+    "print(\"Successfully loaded derived.fim4_featureid_crosswalk and updated it\")\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "23500c40-b088-4b62-aa10-fe2f88c52ef9",
+ "metadata": {
+ "tags": []
+ },
+ "source": [
+    "2 - UPDATE FIM HAND PROCESSING LAMBDA ENV VARIABLE WITH NEW FIM PREFIX\n",
+ "\n",
+ "https://us-east-1.console.aws.amazon.com/lambda/home?region=us-east-1#/functions/hv-vpp-ti-viz-hand-fim-processing?tab=configure\n",
+ "\n",
+ "Lambda name: hv-vpp-ti-viz-hand-fim-processing\n",
+ "\n",
+    "In the `Configuration` tab, click on `Environment variables` (left menu), then change `FIM_PREFIX` to the location of the latest hand_dataset you are working on. The path is relative to the S3 bucket name.\n",
+ " \n",
+ "ie) fim/fim_4_5_2_11/hand_datasets\n",
+ "\n",
+ "Aug 5, 2024: changed my FIM_PREFIX:\n",
+ " from: fim/fim_4_5_2_0/hand_datasets\n",
+ " to: fim/fim_4_5_2_11/hand_datasets\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "040cf1f3-cdfd-467a-9ac0-5478a285f032",
+ "metadata": {
+ "tags": []
+ },
+ "source": [
+    "3 - UPDATE FIM DATA PREP LAMBDA ENV VARIABLE WITH NEW FIM VERSION AND MEMORY\n",
+ "\n",
+ "https://us-east-1.console.aws.amazon.com/lambda/home?region=us-east-1#/functions/hv-vpp-ti-viz-fim-data-prep?tab=code\n",
+ "\n",
+    "Lambda name: hv-vpp-ti-viz-fim-data-prep\n",
+ "\n",
+ "In the `Configuration` Tab, click on the `Environment variables` (left menu), then change the `FIM_VERSION` to the latest fim model version. \n",
+ " \n",
+ "ie) 4.5.2.11\n",
+    "\n",
+ "Aug 5, 2024: changed my FIM_VERSION:\n",
+ " from: 4.5.2.0\n",
+ " to: 4.5.2.11\n",
+    "\n",
+    "Then, still in the `Configuration` tab, click on `General configuration` (left menu), followed \n",
+    "by the `Edit` button on the far right side, to open the `General configuration` details page.\n",
+    "    Change (if they are not already set)\n",
+    "       Memory (text field) to 4096 (MB) and\n",
+    "       Ephemeral storage to 1024 (MB)\n",
+    "  \n",
+    "#### Note: Later in these steps we will change the Memory and Ephemeral storage back to default values, see below ####\n",
+    "\n",
+    "Aug 5, 2024: changed my Memory (4096) and Ephemeral storage (1024).\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "0ef11c15-c1c1-46a8-a65a-fa2a0c1dc97b",
+ "metadata": {
+ "tags": []
+ },
+ "source": [
+    "4 - LOAD AND UPDATE RAS2FIM DATA IN DB\n",
+ "\n",
+    "Note about ras2fim domain extents\n",
+    "As of Aug 2024, a new service came online with a new layer for ras2fim domain extents. Don took care of it.\n",
+    "The new extent data was loaded by different tools and processes, but we will likely want to consolidate\n",
+    "that loading here.\n",
+    "\n",
+    "When ras2fim datasets are released, they come in a \"release\" package that has all of the ras2fim models and geocurves\n",
+    "needed here, plus domain extents for each HUC in the release package. The entire package is uploaded to S3\n",
+    "for HV to load. Because new ras2fim data / geocurves and domain extents are uploaded at the same time, their load\n",
+    "scripts should all stay together (here for now). We can add that next time.\n",
+    "\n",
+    "However, ras2fim will likely do releases a lot more regularly than FIM, so it should get its own independent load scripts,\n",
+    "which this script can optionally reference (in future versions of this script, e.g. FIM 4.8.x.x or whatever)."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "dfc0a44c-8b46-4db5-a9a0-c65e0bf1e20b",
+ "metadata": {
+ "tags": []
+ },
+ "outputs": [],
+ "source": [
+ "\n",
+ "# Already done for 4.4.0.0 (4.5.2.11)\n",
+ "\n",
+ "\n",
+ "# TODO: Aug 2024: Change this to a backup without indexes and not rename\n",
+ "\n",
+ "\n",
+ "# By doing a backup, we are leaving the original tables with the indexes and we want to keep them with\n",
+ "# ras2fim as it loads geometry and without those pre-existing indexes, loading can fail\n",
+ "# tbl_name = \"ras2fim.geocurves\"\n",
+ "# new_table_name = f\"{tbl_name}_{OLD_FIM_TAG}\"\n",
+ "# sql = f\"CREATE TABLE IF NOT EXISTS {new_table_name} AS TABLE {tbl_name};\"\n",
+ "# sf.execute_sql(sql)\n",
+ "# print(f\"{tbl_name} copied to {new_table_name} if it does not already exists\")\n",
+ "\n",
+ "\n",
+ "# tbl_name = \"ras2fim.max_geocurves\"\n",
+ "# new_table_name = f\"{tbl_name}_{OLD_FIM_TAG}\"\n",
+ "# sql = f\"CREATE TABLE IF NOT EXISTS {new_table_name} AS TABLE {tbl_name};\"\n",
+ "# sf.execute_sql(sql)\n",
+ "# print(f\"{tbl_name} copied to {new_table_name} if it does not already exists\")\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "7e8c3939-f2fc-42de-921d-423d69e023f1",
+ "metadata": {
+ "tags": []
+ },
+ "outputs": [],
+ "source": [
+ "\n",
+ "# NOTE: This can be removed in future ras2fim loads.\n",
+ "\n",
+ "\n",
+ "# Temp Aug 2024: We originally just did a table rename for the {table name} to add _4_4_0_0 on it.\n",
+ "# Then discovered that renaming it means the indexes are now with the new renamed tables\n",
+ "# When we load the ras2fim tables, they can't have some of the indexes in place.\n",
+    "# So.. for now, we are going to rename the _4_4_0_0 tables back to their original name, then do the backup\n",
+ "# above.\n",
+ "#sf.execute_sql(f'ALTER TABLE IF EXISTS ras2fim.geocurves_4_4_0_0 RENAME TO geocurves;')\n",
+ "#sf.execute_sql(f'ALTER TABLE IF EXISTS ras2fim.max_geocurves_4_4_0_0 RENAME TO max_geocurves')\n",
+ "#print(\"Done renaming them back\")"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "60f910bf-4da7-4e62-b9b6-e981f21a23af",
+ "metadata": {
+ "tags": []
+ },
+ "outputs": [],
+ "source": [
+ "\n",
+    "# This function is not efficient, but because ras2fim csv's have a built-in geometry column that loads as a string and not a \"geometry\" object,\n",
+    "# we have to add the records one csv at a time.\n",
+    "\n",
+    "# Aug 2024: Maybe eventually I can make this more generic, but for now it is ras2fim specific\n",
+    "# see the new one for catfim as it can likely be rolled into one function\n",
+    "\n",
+    "# UPDATE: Sep 19, 2024: We had to remove the chunking portion as we discovered that each csv being loaded might have \n",
+    "# a different crs. You have to know the incoming crs in order to reproject, as the incoming csv's cannot be used\n",
+    "# to auto-detect the crs. We put in cards for the ras2fim team to have all final csv's come out in a standard\n",
+    "# projection (preferably 3857). Once that is done, going back to chunking will reduce the number of DB writes and speed the load back up.\n",
+    "\n",
+    "# Most of the commented-out chunking code is still in place.\n",
+ "\n",
+ "\n",
+ "def load_ras2fim_files_into_db(csv_file_list, s3_source_parent_prefix, schema_name, db_name):\n",
+ "\n",
+ " # TODO: change these to params and make more generic\n",
+ " # also tell that this is only if you have a geometry column (for now)\n",
+ "\n",
+ " print(f\"Loading data to database {schema_name}.{db_name}\")\n",
+ " print(\"\")\n",
+ "\n",
+ " if len(csv_file_list) == 0:\n",
+ " raise Exception(\"csv file list is empty\")\n",
+ "\n",
+ "\n",
+ " # source_crs = \"epsg:2277\" # (it is coming in as 5070) but we are changing it to 3857 as loading\n",
+ "\n",
+    "    # The server has limited memory but it is faster to load as many csv's at a time\n",
+    "    # as reasonably possible. We are going to try it at chunks of 50 (50 csv files) which for ras2fim\n",
+    "    # should be appx 2,000 records, but for ras2fim V2, we have 750 (ish) files.\n",
+    "\n",
+    "    # We can leave this open the entire time as well.\n",
+ " s3_client = boto3.client(\"s3\")\n",
+ " default_kwargs = {\"Bucket\": FIM_BUCKET, \"Prefix\": s3_source_parent_prefix}\n",
+ "\n",
+ "\n",
+ " # chunk_size = 25 # number of csv's to load per set\n",
+ " total_row_count = 0 # all csv row counts combined. You should see this as a record count in the db when done\n",
+ " r2f_df = None # a re-used concatenating pd dataframe loading up sets of 20 csvs\n",
+ " # is_new_df = True # After we db load a set, we reset this to start a new set\n",
+ " is_first_db_set = True # Very first db load\n",
+ "\n",
+ " num_recs = len(csv_file_list)\n",
+ " print(f\"Total number of files to process are {num_recs}\")\n",
+ "\n",
+    "    # We are going to keep the db connection open the entire time. \n",
+    "    # It is slow to open/close connections\n",
+    "    # It \"should not\" ?? block any other scripts / services from using it\n",
+    "    # most SQL servers allow for more than one connection at a time.\n",
+ " db = database(db_type=\"viz\")\n",
+ "\n",
+ " for idx, full_file_url in enumerate(csv_file_list):\n",
+ "\n",
+    "        print(f\"Downloading {idx + 1} of {num_recs} files: {full_file_url}\")\n",
+ "\n",
+ " is_first_db_set = idx == 0\n",
+ " # if idx > 4:\n",
+ " # return # stub test\n",
+ "\n",
+ "# if is_new_df is True:\n",
+ " s3_client = boto3.client(\"s3\")\n",
+ " default_kwargs = {\"Bucket\": FIM_BUCKET, \"Prefix\": s3_source_parent_prefix}\n",
+ "\n",
+ "\n",
+ " # is_new_df = False\n",
+ " # else:\n",
+ " # temp_df = pd.read_csv(full_file_url)\n",
+ " # total_row_count += len(temp_df)\n",
+ " # r2f_df = pd.concat([r2f_df, temp_df])\n",
+ "\n",
+ " # we want it merge into the db on each xth (chunk size) record or the last record\n",
+ " # if ((idx + 1) == num_recs) or ((idx+1) % chunk_size == 0):\n",
+ "\n",
+ " # download the csv via pandas into a dataframe\n",
+ " r2f_df = pd.read_csv(full_file_url)\n",
+ "\n",
+ " total_row_count += len(r2f_df)\n",
+ " r2f_df = r2f_df.fillna(0)\n",
+ "\n",
+ " cur_csv = r2f_df.loc[0, 'crs']\n",
+ " # print(f\"Original crs = {cur_csv}\")\n",
+ "\n",
+ " # Create a new source_unit_id which traces back to the folder and code to create\n",
+ " # this specific huc and model in ras2fim\n",
+ " r2f_df['source_unit_id'] = r2f_df.apply(lambda row: row.unit_name + \"_\" + \n",
+ " str(row.unit_version), axis=1)\n",
+ " r2f_df.rename(columns={'source_code': 'feature_id_source_code', 'geometry': 'geom'}, inplace=True)\n",
+ " r2f_df['geom'] = r2f_df['geom'].apply(wkt.loads)\n",
+ "\n",
+ " # print(f\"... Next set of downloads and adjustments complete, now to db load - Last Idx: {idx + 1} \")\n",
+ "\n",
+ " r2f_geodf = gpd.GeoDataFrame(data=r2f_df, geometry='geom', crs=cur_csv)\n",
+ " # print(r2f_geodf)\n",
+ " # print(\"\")\n",
+ " r2f_reproj = r2f_geodf.to_crs(\"epsg:3857\")\n",
+ "\n",
+ " # If this is the first load, the type must be the value of \"replace\", else \"append\"\n",
+ " load_type = 'replace' if is_first_db_set is True else 'append'\n",
+ "\n",
+ " r2f_reproj.to_postgis(\n",
+ " name=db_name,\n",
+ " con=VIZ_DB_ENGINE,\n",
+ " schema=schema_name,\n",
+ " if_exists=load_type,\n",
+ " index=False,\n",
+ " )\n",
+ " print(\"... db load complete\")\n",
+ "\n",
+ " # Sanity check on crs\n",
+ " # if is_first_db_set:\n",
+ " # print(sf.run_sql_in_db(f\"SELECT ST_SRID(geom) FROM {schema_name}.{db_name} LIMIT 1\"))\n",
+ "\n",
+ " r2f_df = None\n",
+ " r2f_geodf = None\n",
+ " r2f_reproj = None\n",
+ " # is_new_df = True # reset it for the next set\n",
+    "        s3_client = None  # resets it so it is not open so long. It times out if open too long\n",
+ " is_first_db_set = False\n",
+ "\n",
+ " # break\n",
+ "\n",
+    "        # else don't write to db but continue on to the next record\n",
+ "\n",
+ " # end for\n",
+ " print(\"\")\n",
+ " print(\"--------------------------------------------------------------\")\n",
+ " print(\"All records now loaded to the database\")\n",
+ "\n",
+ " with db.get_db_connection() as conn, conn.cursor() as cur:\n",
+ " # after all records are loaded to the db.\n",
+ " print(f\"Adding {COLUMN_NAME_FIM_VERSION} column to {schema_name}.{db_name}\")\n",
+ " sql = f\"ALTER TABLE {schema_name}.{db_name} ADD COLUMN IF NOT EXISTS {COLUMN_NAME_FIM_VERSION} text DEFAULT '{PUBLIC_FIM_VERSION}';\"\n",
+ " cur.execute(sql)\n",
+ " conn.commit()\n",
+ "\n",
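+    "        # NOTE: model_version is not a parameter; it is read from the notebook's global scope\n",
+    "        # (set in the cell that calls this function).\n",
+    "        print(f\"Adding {COLUMN_NAME_MODEL_VERSION} column to {schema_name}.{db_name}\")\n",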
+ " sql = f\"ALTER TABLE {schema_name}.{db_name} ADD COLUMN IF NOT EXISTS {COLUMN_NAME_MODEL_VERSION} text DEFAULT '{model_version}';\"\n",
+ " cur.execute(sql)\n",
+ " conn.commit()\n",
+ "\n",
+ " print(\"Dropping un-necessary columns from DB ...\")\n",
+ " drop_col_names = [\"profile_num\", \"model_id\", \"xs_us\", \"xs_ds\", \"unit_name\", \"unit_version\", \"version\", \"crs\"]\n",
+ " # print(drop_col_names)\n",
+ " # print(\"\")\n",
+ "\n",
+ " sql = f\"ALTER TABLE {schema_name}.{db_name} \"\n",
+ " for col_name in drop_col_names:\n",
+ " sql += f\" DROP COLUMN {col_name},\"\n",
+ "\n",
+    "        # the last char is a comma and we need to change it to be \" CASCADE;\"\n",
+ " sql = sql[0:-1] + \" CASCADE;\"\n",
+ " print(sql)\n",
+ " cur.execute(sql)\n",
+ " conn.commit()\n",
+ " print(f\"Total Rows loaded to DB is {total_row_count}\")\n",
+ " # end of def\n",
+ "\n",
+ "print(\"Download and db load ras2fim function loaded\")\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "e57d7bd1-c852-416e-b259-715b023bf743",
+ "metadata": {
+ "tags": []
+ },
+ "outputs": [],
+ "source": [
+ "# Load the ras2fim.geocurves\n",
+ "\n",
+    "# Note: For Aug 2024 (ras 2.0.1.0 with appx 11 hucs), this took appx 1 hr 15 mins to run\n",
+ "\n",
+ "model_version = \"ras2fim 2.0.1.0\"\n",
+ "new_s3_version_folder = \"v2_0\"\n",
+ "s3_source_parent_prefix = f\"ras2fim/{new_s3_version_folder}\"\n",
+ "\n",
+ "start_dt = datetime.now()\n",
+ "print(\"\")\n",
+ "print(\"Starting loading of ras2fim.geocurves\")\n",
+ "\n",
+ "\n",
+    "# Aug 21 2024, AWS Creds expired and the load died just after loading rec 2475 of 7948.\n",
+    "# Commented out truncate, reset csv_list to be recs 2476 and higher and restarted.\n",
+    "# The exact overall time was lost, but it can be estimated.\n",
+ "sql = '''\n",
+ " TRUNCATE TABLE ras2fim.geocurves;\n",
+ " TRUNCATE TABLE ras2fim.max_geocurves;\n",
+ "'''\n",
+ "print(sf.execute_sql(sql))\n",
+ "print(\"geocurves and max_geocurves tables truncated to start clean\")\n",
+ "print(\"\")\n",
+ "\n",
+ "# Now download the s3 geocurves\n",
+ "# Overloaded the server as the memory couldn't handle it.\n",
+ "# r2f_df = s3_sf.download_S3_csv_files_to_df(FIM_BUCKET, s3_source_parent_prefix, True)\n",
+ "\n",
+ "# lets just get a list of files, then iterate over them to load each to the db one at a time.\n",
+ "r2f_file_names = s3_sf.get_s3_subfolder_file_names(FIM_BUCKET, s3_source_parent_prefix, False)\n",
+ "\n",
+ "if len(r2f_file_names) == 0:\n",
+ " raise Exception(\"No file names found\")\n",
+ "\n",
+ "\n",
+    "csv_file_list = [x for x in r2f_file_names if x.endswith(\".csv\")]\n",
+ "if len(csv_file_list) == 0:\n",
+ " raise Exception(\"No csv file names found\")\n",
+ "\n",
+ "# print(csv_file_list)\n",
+ "\n",
+ "# Test against just 20 records for a timing test\n",
+ "# test_list = csv_file_list[:20]\n",
+ "# print(test_list)\n",
+ "\n",
+ "print(\"Loading df into ras2fim geocurve db\")\n",
+ "\n",
+ "load_ras2fim_files_into_db(csv_file_list, s3_source_parent_prefix, 'ras2fim', 'geocurves')\n",
+ "\n",
+    "# See note above about having to restart at rec num 2475 (our index displays were 1 based and not zero based)\n",
+ "# restart_list = csv_file_list[974:]\n",
+ "# load_ras2fim_files_into_db(restart_list, s3_source_parent_prefix, 'ras2fim', 'geocurves')\n",
+ "\n",
+ "end_dt = datetime.now()\n",
+ "time_duration = end_dt - start_dt\n",
+ "print(\".... ras2fim files now loaded to ras2fim.geocurves\")\n",
+ "print(f\"... duration was {str(time_duration).split('.')[0]}\")\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "ded5372c-d94e-4276-be77-b13a94db55cc",
+ "metadata": {
+ "tags": []
+ },
+ "outputs": [],
+ "source": [
+ "# ras2fim \"previous\" columns loading\n",
+ "\n",
+ "print(\"... Starting ras2fim previous stage adding and max_geocurves creating\")\n",
+ "start_dt = datetime.now()\n",
+ "\n",
+ "sql = \"\"\"\n",
+ "ALTER TABLE IF EXISTS ras2fim.geocurves ADD COLUMN IF NOT EXISTS previous_stage_ft double precision;\n",
+ "ALTER TABLE IF EXISTS ras2fim.geocurves ADD COLUMN IF NOT EXISTS previous_stage_m double precision;\n",
+ "ALTER TABLE IF EXISTS ras2fim.geocurves ADD COLUMN IF NOT EXISTS previous_discharge_cfs double precision;\n",
+ "ALTER TABLE IF EXISTS ras2fim.geocurves ADD COLUMN IF NOT EXISTS previous_discharge_cms double precision;\n",
+ "ALTER TABLE IF EXISTS ras2fim.geocurves ADD COLUMN IF NOT EXISTS oid INTEGER PRIMARY KEY GENERATED ALWAYS AS IDENTITY;\n",
+ "\"\"\" \n",
+ "print(sf.execute_sql(sql))\n",
+ "\n",
+ "sql = '''DROP TABLE IF EXISTS ras2fim.temp_ras2fim_lagged;'''\n",
+ "print(sf.execute_sql(sql))\n",
+ "\n",
+ "\n",
+ "db = database(db_type=\"viz\")\n",
+ "with db.get_db_connection() as conn, conn.cursor() as cur:\n",
+ "\n",
+ " # PS. It is ok that there are some nulls in the four \"previous\" columns\n",
+ " sql = \"\"\"\n",
+ " CREATE TABLE ras2fim.temp_ras2fim_lagged as (SELECT\n",
+ " feature_id,\n",
+ " stage_ft,\n",
+ " (lag(stage_m, 1) OVER (PARTITION BY feature_id ORDER by stage_m)) as previous_stage_m,\n",
+ " (lag(stage_ft, 1) OVER (PARTITION BY feature_id ORDER by stage_ft)) as previous_stage_ft,\n",
+ " (lag(discharge_cfs, 1) OVER (PARTITION BY feature_id ORDER by discharge_cfs)) as previous_discharge_cfs,\n",
+ " (lag(discharge_cms, 1) OVER (PARTITION BY feature_id ORDER by discharge_cms)) as previous_discharge_cms\n",
+ " FROM ras2fim.geocurves)\n",
+ " \"\"\"\n",
+ " cur.execute(sql)\n",
+ " conn.commit()\n",
+ "\n",
+ " sql = \"\"\"\n",
+ " UPDATE ras2fim.geocurves gc\n",
+ " SET previous_stage_ft = lagged.previous_stage_ft,\n",
+ " previous_stage_m = lagged.previous_stage_m,\n",
+ " previous_discharge_cfs = lagged.previous_discharge_cfs,\n",
+ " previous_discharge_cms = lagged.previous_discharge_cms\n",
+ " FROM ras2fim.temp_ras2fim_lagged as lagged\n",
+ " WHERE gc.feature_id = lagged.feature_id\n",
+ " and gc.stage_ft = lagged.stage_ft;\n",
+ " \"\"\"\n",
+ " cur.execute(sql)\n",
+ " conn.commit()\n",
+ "\n",
+ "\n",
+ "print(\"Removing ras2fim.temp_ras2fim_lagged table\")\n",
+ "sql = \"DROP TABLE IF EXISTS ras2fim.temp_ras2fim_lagged;\"\n",
+ "print(sf.execute_sql(sql))\n",
+ "\n",
+ "\n",
+ "print(\"Adding indexes if required\")\n",
+ "\n",
+ "sql = \"ALTER TABLE IF EXISTS ras2fim.geocurves OWNER to viz_proc_admin_rw_user;\"\n",
+ "print(sf.execute_sql(sql))\n",
+ "\n",
+ "sql = \"DROP INDEX IF EXISTS ras2fim.geocurves_discharge_cms_index;\"\n",
+ "print(sf.execute_sql(sql))\n",
+ "\n",
+ "\n",
+ "sql = \"\"\"\n",
+ "CREATE INDEX IF NOT EXISTS geocurves_discharge_cms_index ON ras2fim.geocurves USING btree (discharge_cms ASC NULLS LAST)\n",
+ "\"\"\"\n",
+ "print(sf.execute_sql(sql))\n",
+ "\n",
+ "\n",
+ "sql = \"DROP INDEX IF EXISTS ras2fim.geocurves_feature_id_index;\"\n",
+ "print(sf.execute_sql(sql))\n",
+ "\n",
+ "\n",
+ "sql = \"\"\"\n",
+ "CREATE INDEX IF NOT EXISTS geocurves_feature_id_index ON ras2fim.geocurves USING btree (feature_id ASC NULLS LAST)\n",
+ "\"\"\"\n",
+ "print(sf.execute_sql(sql))\n",
+ "\n",
+ "\n",
+ "sql = \"DROP INDEX IF EXISTS ras2fim.geocurves_previous_discharge_cms_index;\"\n",
+ "print(sf.execute_sql(sql))\n",
+ "\n",
+ "\n",
+ "sql = \"\"\"\n",
+ "CREATE INDEX IF NOT EXISTS geocurves_previous_discharge_cms_index\n",
+ " ON ras2fim.geocurves USING btree (previous_discharge_cms ASC NULLS LAST)\n",
+ "\"\"\"\n",
+ "print(sf.execute_sql(sql))\n",
+ "\n",
+ "\n",
+ "# Skip for now.. not sure if it will be needed in the next set\n",
+ "# -- DROP INDEX IF EXISTS ras2fim.idx_geocurves_geom;\n",
+ "# CREATE INDEX IF NOT EXISTS idx_geocurves_geom\n",
+    "#   ON ras2fim.geocurves USING gist\n",
+    "#   (geom)\n",
+    "#   TABLESPACE pg_default;\n",
+ "\n",
+ "\n",
+ "end_dt = datetime.now()\n",
+ "time_duration = end_dt - start_dt\n",
+ "print(\".... Done - ras2fim previous stage columns added\")\n",
+ "print(f\"... duration was {str(time_duration).split('.')[0]}\")\n",
+ "\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "a1bd997c-f813-49dd-94e0-adf062f3b74b",
+ "metadata": {
+ "tags": []
+ },
+ "outputs": [],
+ "source": [
+ "# ras2fim max_geocurve loading\n",
+ "\n",
+ "start_dt = datetime.now()\n",
+ "\n",
+ "print(\"Start of creating and loading max_geocurves table\")\n",
+ "\n",
+    "# Table can't have any indexes as nothing is unique enough\n",
+    "# We should have an oid column though\n",
+ "sql = \"DROP TABLE IF EXISTS ras2fim.max_geocurves;\"\n",
+ "print(sf.execute_sql(sql))\n",
+ "\n",
+ "sql = \"\"\"\n",
+ "CREATE TABLE ras2fim.max_geocurves as (\n",
+ " SELECT\n",
+ " feature_id,\n",
+ " max(discharge_cfs) as max_rc_discharge_cfs,\n",
+ " max(stage_ft) as max_rc_stage_ft,\n",
+ " max(discharge_cms) as max_rc_discharge_cms,\n",
+ " max(stage_m) as max_rc_stage_m\n",
+ " FROM ras2fim.geocurves\n",
+ " GROUP BY feature_id )\n",
+ "\"\"\"\n",
+ "print(sf.execute_sql(sql))\n",
+ "\n",
+ "sql = \"DROP INDEX IF EXISTS ras2fim.max_geocurves_feature_id_index;\"\n",
+ "print(sf.execute_sql(sql))\n",
+ "\n",
+ "sql = \"\"\"\n",
+ "CREATE INDEX IF NOT EXISTS max_geocurves_feature_id_index ON \n",
+ " ras2fim.max_geocurves USING btree (feature_id ASC NULLS LAST);\n",
+ "\"\"\"\n",
+ "print(sf.execute_sql(sql))\n",
+ "\n",
+ "print(\"max_geocurves table created and filled\")\n",
+ "\n",
+ "\n",
+ "print(\"\")\n",
+ "\n",
+ "end_dt = datetime.now()\n",
+ "time_duration = end_dt - start_dt\n",
+ "print(f\"... duration was {str(time_duration).split('.')[0]}\")\n",
+ "\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "bec1b8bf-e1dc-4525-9066-afef7273e289",
+ "metadata": {
+ "tags": []
+ },
+ "source": [
+    "5 - Load the Ras2Fim boundaries into egis"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "251905aa-259a-4ef4-9590-dcd31375c624",
+ "metadata": {
+ "tags": []
+ },
+ "outputs": [],
+ "source": [
+ "# Moves data from local shapefile to EGIS\n",
+ "\n",
+    "# Importing Modules\n",
+    "import os\n",
+    "import sys\n",
+    "\n",
+    "# Add the parent directory to sys.path (if needed) before importing helper_functions\n",
+    "sys.path.append(os.path.join(os.path.abspath(''), '..'))\n",
+    "\n",
+    "import geopandas as gpd\n",
+    "import helper_functions.shared_functions as sf\n",
+ "\n",
+ "# Dir location of your data\n",
+ "DATA_DPATH = r\"/home/ec2-user/SageMaker/Don - Campground/Don - Store\"\n",
+ "# File location of your shapefile data\n",
+ "DATASET_DPATH = f\"{DATA_DPATH}/main_huc8.shp\"\n",
+ "# Check path by printing\n",
+ "print(DATASET_DPATH)\n",
+ "\n",
+ "# Only use when you want to create something new\n",
+ "gdf = gpd.read_file(DATASET_DPATH, columns='geometry')\n",
+ "gdf.to_postgis(name=\"boundaries2\", con=sf.get_db_engine(db_type=\"viz\"),\n",
+ " schema=\"ras2fim\", if_exists=\"replace\")\n",
+ "\n",
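+    "# add an oid field to your data\n",
+    "# NOTE: the load above writes to ras2fim.boundaries2 while this statement alters ras2fim.boundaries;\n",
+    "# confirm the table names match before running.\n",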
+ "sql = \"\"\"\n",
+ "ALTER TABLE ras2fim.boundaries ADD COLUMN oid SERIAL PRIMARY KEY;\n",
+ "\"\"\"\n",
+ "sf.execute_sql(sql)\n",
+ "\n",
+ "# delete a table in egis if you made a mistake or such\n",
+ "sql = \"\"\"\n",
+ "DROP TABLE IF EXISTS reference.ras2fim_boundaries;\n",
+ "\"\"\"\n",
+ "sf.execute_sql(sql, db_type=\"egis\")\n",
+ "\n",
+ "# The function move_data_from_viz_to_egis is called because the data needs to \n",
+ "# be in the egis before publishing to uat or prd"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "316b1321-f570-4f53-aeee-f8798330642f",
+ "metadata": {
+ "tags": []
+ },
+ "source": [
+    "6 - Run AEP FIM Pipelines.\n",
+    "Updated documentation from Tyler, early 2024: This can be done in a couple of different ways.\n",
+ "\n",
+    "1) One option is to use the pipeline_input code created below by Corey to start the AEP pipelines directly from this notebook. \n",
+    "   However, those pipeline_input dictionaries may very well be out of date, pending more recent updates to the pipelines. \n",
+ "{\n",
+ " \"configuration\": \"reference\",\n",
+ " \"products_to_run\": \"static_nwm_aep_inundation_extent_library\",\n",
+ " \"invoke_step_function\": false\n",
+ "}\n",
+ "\n",
+    "Using this test event will produce the pipeline instructions, printing any errors that come up, and you can simply change the invoke_step_function flag to true when you're ready to actually invoke a pipeline run (which you can monitor/manage in the step function gui). You will need to manually update the static_nwm_aep_inundation_extent_library.yml product config file to only run 1 aep configuration at a time, and work through the configs as the pipelines finish (takes about an hour each). I've also found that the fim_data_prep lambda function needs to be temporarily increased to ~4,500 MB of memory to run these pipelines. It's also worth noting that these are very resource-intensive pipelines, as FIM is calculated for every reach in the nation. AWS costs can amount to hundreds or even thousands of dollars by running these pipelines, so use responsibly.\n",
+ "\n",
+ "A couple other important notes:\n",
+ "- These AEP configurations write data directly to the aep_fim schema in the egis RDS database, instead of the viz database.\n",
+ "- You'll need to dump the aep_fim schema after that is complete for backup / deployment into other environments.\n",
+ "- This process has not been tested with new NWM 3.0 Recurrence Flows, and a good thorough audit / QC check of output data is warranted, given those changes and the recent updates to the pipelines.\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "6a698067",
+ "metadata": {
+ "tags": []
+ },
+ "outputs": [],
+ "source": [
+ "\n",
+ "# Aug 6, 2024: Note: This was created after all intervals were created, so only HW was tested against\n",
+ "\n",
+ "def get_aep_pipeline_input(stage_interval):\n",
+ " pipeline_input = {\n",
+ " \"configuration\": \"reference\",\n",
+ " \"job_type\": \"auto\",\n",
+ " \"data_type\": \"channel\",\n",
+ " \"keep_raw\": False,\n",
+ " \"reference_time\": datetime.now().strftime('%Y-%m-%d 00:00:00'),\n",
+ " \"configuration_data_flow\": {\n",
+ " \"db_max_flows\": [],\n",
+ " \"db_ingest_groups\": [],\n",
+ " \"python_preprocessing\": []\n",
+ " },\n",
+ " \"pipeline_products\": [\n",
+ " {\n",
+ " \"product\": \"static_nwm_aep_inundation_extent_library\",\n",
+ " \"configuration\": \"reference\",\n",
+ " \"product_type\": \"fim\",\n",
+ " \"run\": True,\n",
+ " \"fim_configs\": [\n",
+ " {\n",
+ " \"name\": f\"rf_{stage_interval}_inundation\",\n",
+ " \"target_table\": f\"aep_fim.rf_{stage_interval}_inundation\",\n",
+ " \"fim_type\": \"hand\",\n",
+ " \"sql_file\": f\"rf_{stage_interval}_inundation\"\n",
+ " }\n",
+ " ],\n",
+ " \"services\": [\n",
+ " \"static_nwm_aep_inundation_extent_library_noaa\"\n",
+ " ],\n",
+ " \"raster_outputs\": {\n",
+ " \"output_bucket\": \"\",\n",
+ " \"output_raster_workspaces\": []\n",
+ " },\n",
+ " \"postprocess_sql\": [],\n",
+ " \"product_summaries\": [],\n",
+ " \"python_preprocesing_dependent\": False\n",
+ " }\n",
+ " ],\n",
+ " \"sql_rename_dict\": {},\n",
+ " \"logging_info\": {\n",
+ " \"Timestamp\": int(datetime.now().timestamp())\n",
+ " }\n",
+ " }\n",
+ "\n",
+ " return pipeline_input\n",
+ "\n",
+ "print(\"function: get_aep_pipeline_input loaded\")\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "f6d6ee69",
+ "metadata": {
+ "tags": []
+ },
+ "outputs": [],
+ "source": [
+ "\n",
+ "#### 2 Year Flow\n",
+ "pipeline_input = get_aep_pipeline_input(\"2\")\n",
+ "\n",
+ "# notice, slightly different object name\n",
+ "pipeline_name = f\"sagemaker_aep_2_{datetime.now().strftime('%Y%m%dT%H%M')}\"\n",
+ "\n",
+ "STEPFUNCTION_CLIENT.start_execution(\n",
+ " stateMachineArn = PIPELINE_ARN,\n",
+ " name = pipeline_name,\n",
+ " input= json.dumps(pipeline_input)\n",
+ ")\n",
+ "\n",
+ "print(f\"AEP : 2 year flows ie: rf_2_inundation kicked off. Can take 45 mins. Pipeline : hv-vpp-ti-viz-pipeline - {pipeline_name}\")\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "a4f89d9a",
+ "metadata": {
+ "tags": []
+ },
+ "outputs": [],
+ "source": [
+ "\n",
+ "#### 5 Year Flow\n",
+ "pipeline_input = get_aep_pipeline_input(\"5\")\n",
+ "\n",
+ "# notice, slightly different object name\n",
+ "pipeline_name = f\"sagemaker_aep_5_{datetime.now().strftime('%Y%m%dT%H%M')}\"\n",
+ "\n",
+ "STEPFUNCTION_CLIENT.start_execution(\n",
+ " stateMachineArn = PIPELINE_ARN,\n",
+ " name = pipeline_name,\n",
+ " input= json.dumps(pipeline_input)\n",
+ ")\n",
+ "\n",
+ "print(f\"AEP : 5 year flows ie: rf_5_inundation kicked off. Can take 45 mins. Pipeline : hv-vpp-ti-viz-pipeline - {pipeline_name}\")\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "791d1a8b",
+ "metadata": {
+ "tags": []
+ },
+ "outputs": [],
+ "source": [
+ "\n",
+ "#### 10 Year Flow\n",
+ "pipeline_input = get_aep_pipeline_input(\"10\")\n",
+ "\n",
+ "# notice, slightly different object name\n",
+ "pipeline_name = f\"sagemaker_aep_10_{datetime.now().strftime('%Y%m%dT%H%M')}\"\n",
+ "\n",
+ "STEPFUNCTION_CLIENT.start_execution(\n",
+ " stateMachineArn = PIPELINE_ARN,\n",
+ " name = pipeline_name,\n",
+ " input= json.dumps(pipeline_input)\n",
+ ")\n",
+ "\n",
+ "print(f\"AEP : 10 year flows ie: rf_10_inundation kicked off. Can take 45 mins. Pipeline : hv-vpp-ti-viz-pipeline - {pipeline_name}\")\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "1bb87128",
+ "metadata": {
+ "tags": []
+ },
+ "outputs": [],
+ "source": [
+ "\n",
+ "#### 25 Year Flow\n",
+ "pipeline_input = get_aep_pipeline_input(\"25\")\n",
+ "\n",
+ "# notice, slightly different object name\n",
+ "pipeline_name = f\"sagemaker_aep_25_{datetime.now().strftime('%Y%m%dT%H%M')}\"\n",
+ "\n",
+ "STEPFUNCTION_CLIENT.start_execution(\n",
+ " stateMachineArn = PIPELINE_ARN,\n",
+ " name = pipeline_name,\n",
+ " input= json.dumps(pipeline_input)\n",
+ ")\n",
+ "\n",
+ "print(f\"AEP : 25 year flows ie: rf_25_inundation kicked off. Can take 45 mins. Pipeline : hv-vpp-ti-viz-pipeline - {pipeline_name}\")\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "4832e4e0",
+ "metadata": {
+ "tags": []
+ },
+ "outputs": [],
+ "source": [
+ "\n",
+ "#### 50 Year Flow\n",
+ "pipeline_input = get_aep_pipeline_input(\"50\")\n",
+ "\n",
+ "# notice, slightly different object name\n",
+ "pipeline_name = f\"sagemaker_aep_50_{datetime.now().strftime('%Y%m%dT%H%M')}\"\n",
+ "\n",
+ "STEPFUNCTION_CLIENT.start_execution(\n",
+ " stateMachineArn = PIPELINE_ARN,\n",
+ " name = pipeline_name,\n",
+ " input= json.dumps(pipeline_input)\n",
+ ")\n",
+ "\n",
+ "print(f\"AEP : 50 year flows ie: rf_50_inundation kicked off. Can take 45 mins. Pipeline : hv-vpp-ti-viz-pipeline - {pipeline_name}\")\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "187e83bc-ebbe-4615-a046-e0ef7b09ad3b",
+ "metadata": {
+ "tags": []
+ },
+ "outputs": [],
+ "source": [
+ "\n",
+ "#### HW (High Water) Flow\n",
+ "pipeline_input = get_aep_pipeline_input(\"high_water\")\n",
+ "\n",
+ "# notice, slightly different object name\n",
+ "pipeline_name = f\"sagemaker_aep_hw_{datetime.now().strftime('%Y%m%dT%H%M')}\"\n",
+ "\n",
+ "STEPFUNCTION_CLIENT.start_execution(\n",
+ " stateMachineArn = PIPELINE_ARN,\n",
+ " name = pipeline_name,\n",
+ " input= json.dumps(pipeline_input)\n",
+ ")\n",
+ "\n",
+ "print(f\"AEP : High Water flows ie: rf_hw_inundation kicked off. Can take 45 mins. Pipeline : hv-vpp-ti-viz-pipeline - {pipeline_name}\")\n",
+ "print(\"\")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "2c810767-2f5d-46b2-860c-5d5c549f2e2a",
+ "metadata": {
+ "tags": []
+ },
+ "source": [
+ "IMPORTANT: Return hv-vpp-ti-viz-fim-data-prep Lambda memory to 2048 MB"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "37f37c81-d105-4c38-aa49-b4aef40a7543",
+ "metadata": {
+ "tags": []
+ },
+ "outputs": [],
+ "source": [
+ "''' Function to load CatFIM data (for any flow / stage / library / sites but non public)'''\n",
+ "\n",
+ "\n",
+ "def load_catfim_table(catfim_type):\n",
+ "\n",
+ " '''\n",
+ " Inputs:\n",
+ " - catfim_type: name identifier for the dataset, such as \"flow_based_catfim\" or \"flow_based_catfim_sites\", etc\n",
+ " It is used as the table name in the reference schema and as the base name of the S3 csv file\n",
+ " (the library datasets add a \"_library\" suffix to the file name).\n",
+ " Options: flow_based_catfim, flow_based_catfim_sites, stage_based_catfim, stage_based_catfim_sites\n",
+ " '''\n",
+ "\n",
+ " db_type = \"egis\"\n",
+ " db_engine = sf.get_db_engine(db_type)\n",
+ " src_crs = \"3857\"\n",
+ "\n",
+ " # --------------------------------------\n",
+ " # Drop the original table if it is already in place\n",
+ " table_name = catfim_type # the table name is the same as the catfim type, for now\n",
+ "\n",
+ " sf.execute_sql(f\"DROP TABLE IF EXISTS reference.{table_name};\", db_type=db_type)\n",
+ " print(f\"Dropping reference.{table_name} table if it existed\")\n",
+ " print(\"\")\n",
+ "\n",
+ " # --------------------------------------\n",
+ " # Get the data from S3 and load it into a df\n",
+ " if catfim_type in ['flow_based_catfim', 'stage_based_catfim']:\n",
+ " file_to_download = f\"{QA_DATASETS_DPATH}/{catfim_type}_library.csv\"\n",
+ " else:\n",
+ " file_to_download = f\"{QA_DATASETS_DPATH}/{catfim_type}.csv\"\n",
+ "\n",
+ " # print(f\"Downloading {file_to_download} ... \")\n",
+ "\n",
+ " df = s3_sf.download_S3_csv_files_to_df_from_list(FIM_BUCKET, [file_to_download], True)\n",
+ " num_recs = len(df)\n",
+ " print(f\"File read. {num_recs} records to load\")\n",
+ "\n",
+ " # --------------------------------------\n",
+ " # Adjust columns and data\n",
+ " # Rename headers. All files use these column names\n",
+ " df = df.rename(columns={'Unnamed: 0': 'oid',\n",
+ " 'geometry': 'geom',\n",
+ " 'huc': 'huc8'})\n",
+ "\n",
+ " # 4.5.2.11, fixing a column name bug\n",
+ " if catfim_type == 'stage_based_catfim_sites':\n",
+ " df = df.rename(columns={'nws_lid': 'ahps_lid'})\n",
+ "\n",
+ " # Convert all field names to lowercase (needed for ArcGIS Pro).\n",
+ " df.columns = df.columns.str.lower()\n",
+ "\n",
+ " # Remove sites that are in derived.ahps_restricted_sites\n",
+ " # TODO: Aug 2024: Need to see if this list needs to be updated. Submitted card.\n",
+ " restricted_sites_df = sf.get_db_values(\"derived.ahps_restricted_sites\", [\"*\"])\n",
+ " restricted_dict = restricted_sites_df.to_dict('records')\n",
+ "\n",
+ " for site in restricted_dict:\n",
+ " nws_lid = site['nws_lid'].lower()\n",
+ " #print(nws_lid)\n",
+ " if \"sites\" in catfim_type:\n",
+ " # print(True)\n",
+ " # print(nws_lid)\n",
+ " df.loc[df.ahps_lid == nws_lid, 'mapped'] = 'no'\n",
+ " df.loc[df.ahps_lid == nws_lid, 'status'] = site['restricted_reason']\n",
+ " # print(df.loc[df.ahps_lid==nws_lid]['status'])\n",
+ " else:\n",
+ " df.loc[df.ahps_lid == nws_lid, 'viz'] = 'no'\n",
+ " df = df[df['viz'] == 'yes']\n",
+ "\n",
+ " # TODO: Aug 2024: This may be a bug or very outdated. It was in the code to load stage for 4.4.0.0\n",
+ " # and I left it here for 4.5.2.11, but made a card with the FIM team to review and fix it in their code\n",
+ " # so we can drop this.\n",
+ " if 'stage_based' in catfim_type:\n",
+ " for sea_level_site in ['qutg1', 'augg1', 'baxg1', 'lamf1', 'adlg1', 'hrag1', 'stng1']:\n",
+ " if \"sites\" in catfim_type:\n",
+ " df.loc[df.ahps_lid==sea_level_site, 'mapped'] = 'no'\n",
+ " df.loc[df.ahps_lid==sea_level_site, 'status'] = 'Stage thresholds seem to be based on sea level and not channel thalweg'\n",
+ " else:\n",
+ " df.loc[df.ahps_lid==sea_level_site, 'viz'] = 'no'\n",
+ " df = df[df['viz']=='yes'] # Subset df to only sites desired for mapping\n",
+ " # end if\n",
+ "\n",
+ " # Enforce data types on df before loading in DB (TODO: need to create special cases for each layer).\n",
+ " df = df.astype({'huc8': 'str'})\n",
+ " df = df.fillna(0)\n",
+ " try:\n",
+ " df = df.astype({'feature_id': 'int'})\n",
+ " df = df.astype({'feature_id': 'str'})\n",
+ " except KeyError: # If there is no feature_id field\n",
+ " pass\n",
+ " try:\n",
+ " df = df.astype({'nwm_seg': 'int'})\n",
+ " df = df.astype({'nwm_seg': 'str'})\n",
+ " except KeyError: # If there is no nwm_seg field\n",
+ " pass\n",
+ " try:\n",
+ " df = df.astype({'usgs_gage': 'int'})\n",
+ " df = df.astype({'usgs_gage': 'str'})\n",
+ " except KeyError: # If there is no usgs_gage field\n",
+ " pass\n",
+ "\n",
+ " # zfill HUC8 field.\n",
+ " df['huc8'] = df['huc8'].apply(lambda x: x.zfill(8))\n",
+ "\n",
+ " if '_sites' in catfim_type:\n",
+ " df = df.astype({'nws_data_rfc_forecast_point': 'str'})\n",
+ " df = df.astype({'nws_data_rfc_defined_fcst_point': 'str'})\n",
+ " df = df.astype({'nws_data_riverpoint': 'str'})\n",
+ "\n",
+ " # TODO: Aug 27, 2024: For now, let's just override the \"version\" column and fix it when we\n",
+ " # reconcile the fim_version and model_version columns\n",
+ " df['version'] = PUBLIC_FIM_VERSION\n",
+ " df[COLUMN_NAME_FIM_VERSION] = PUBLIC_FIM_VERSION\n",
+ " df[COLUMN_NAME_MODEL_VERSION] = FIM_MODEL_VERSION\n",
+ "\n",
+ " # --------------------------------------\n",
+ " # Load to DB\n",
+ " # Chunk load data into DB\n",
+ " if catfim_type in ['flow_based_catfim', 'stage_based_catfim']:\n",
+ "\n",
+ " # Create list of df chunks\n",
+ " n = 1000 # chunk row size\n",
+ " print(f\"Chunk loading... into {table_name} -- {n} records at a time\")\n",
+ " print(\"\")\n",
+ " chunk_df = [df[i:i+n] for i in range(0, df.shape[0], n)]\n",
+ "\n",
+ " # Load the first chunk into the DB as a new table\n",
+ " first_chunk_df = chunk_df[0]\n",
+ " num_chunks = len(chunk_df)\n",
+ "\n",
+ " print(f\" ... loading chunk 1 of {num_chunks}\")\n",
+ "\n",
+ " first_chunk_df.to_sql(\n",
+ " name=table_name,\n",
+ " con=db_engine,\n",
+ " schema='reference',\n",
+ " if_exists='replace',\n",
+ " index=False,\n",
+ " dtype={'oid': sqlalchemy.types.Integer(),\n",
+ " 'geom': Geometry('MULTIPOLYGON', srid=src_crs)}\n",
+ " )\n",
+ "\n",
+ " # Load remaining chunks into newly created table\n",
+ " ctr = 1 # Already loaded one\n",
+ " for remaining_chunk in chunk_df[1:]:\n",
+ " # print(remaining_chunk.shape[0])\n",
+ " ctr += 1\n",
+ " print(f\" ... loading chunk {ctr} of {num_chunks}\")\n",
+ " remaining_chunk.to_sql(\n",
+ " name=table_name,\n",
+ " con=db_engine,\n",
+ " schema='reference',\n",
+ " if_exists='append',\n",
+ " index=False,\n",
+ " dtype={'oid': sqlalchemy.types.Integer(),\n",
+ " 'geom': Geometry('MULTIPOLYGON', srid=src_crs)\n",
+ " }\n",
+ " )\n",
+ " # end for\n",
+ " else: # sites tables\n",
+ " print(f\"Loading data into {table_name} ...\")\n",
+ "\n",
+ " df.to_sql(\n",
+ " name=table_name,\n",
+ " con=db_engine,\n",
+ " schema='reference',\n",
+ " if_exists='replace',\n",
+ " index=False,\n",
+ " dtype={'oid': sqlalchemy.types.Integer(),\n",
+ " 'geom': Geometry('POINT', srid=src_crs)}\n",
+ " )\n",
+ "\n",
+ " # This should automatically create a gist index on the geometry column.\n",
+ " # If an index with that name already exists, the upload will fail, so the index cannot pre-exist.\n",
+ " # Best to drop the table before loading.\n",
+ "\n",
+ " # return\n",
+ "\n",
+ "print(\"load_catfim_table function loaded\")\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "e2b43cdc-8591-47f4-8db7-1058af5863f5",
+ "metadata": {
+ "tags": []
+ },
+ "source": [
+ "13.a - Back up old DBs and prepare new databases (but not the \"public\" FIM 10/30 DBs)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "f9577edb-4aa6-423b-819e-df8c922c7ec2",
+ "metadata": {
+ "tags": []
+ },
+ "outputs": [],
+ "source": [
+ "# This covers both Stage Based and Flow Based (but not the \"public\" catfim db's)\n",
+ "\n",
+ "# The \"Public\" db backups and loads are in the cells further down (13.c and 13.d)\n",
+ "\n",
+ "# DONE for 4.4.0.0. (4.5.2.11)\n",
+ "\n",
+ "# # print(\"Starting Data Backups and table drops for stage and flow based catfim\")\n",
+ "# db_names = [\"stage_based_catfim\", \"stage_based_catfim_sites\",\n",
+ "# \"flow_based_catfim\", \"flow_based_catfim_sites\"]\n",
+ "\n",
+ "# for db_name in db_names:\n",
+ "# new_table_name = f\"reference.{db_name}_{OLD_FIM_TAG}\"\n",
+ "# sql = f'''\n",
+ "# CREATE TABLE IF NOT EXISTS {new_table_name} AS TABLE reference.{db_name};\n",
+ "# '''\n",
+ "# sf.execute_sql(sql, db_type='egis')\n",
+ "# print(f\"{db_name} copied to {new_table_name} if it does not already exist\")\n",
+ "\n",
+ "\n",
+ "# Aug 2024: Now we can drop the tables as we don't have any indexes on them at this time other than the gist geom index.\n",
+ "# By dropping them, we can auto adjust the tables schema. (don't truncate)\n",
+ "\n",
+ "# for db_name in db_names:\n",
+ "# sf.execute_sql(f\"DROP TABLE IF EXISTS reference.{db_name};\", db_type='egis')\n",
+ "# print(f\"reference.{db_name} table dropped if it existed\")\n",
+ "\n",
+ "\n",
+ "# print(\"Data Backups of flow based catfim are complete\")\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "4377dc64-963e-447e-a553-1a59c7cb1781",
+ "metadata": {
+ "tags": []
+ },
+ "source": [
+ "13.b - Updated Flow and Stage Based CatFIM Data (Non Public)\n",
+ "\n",
+ "AUG 2024: IMPORTANT NOTE:\n",
+ "\n",
+ "The stage based catfim (library) csv has grown to approximately 10 GiB. Our current notebook server, hv-vpp-ti-viz-notebook, only has 15 GiB of memory.\n",
+ "Running this tool can easily overwhelm the notebook server and freeze it, forcing a reboot.\n",
+ "Sometimes when the notebook instance comes back up, it no longer has the swap system in place. You will need most of the memory\n",
+ "and some swap to load the file. Keep a \"terminal\" window open and keep entering `free -h` to keep an eye on memory usage.\n",
+ "\n",
+ "We will need to review whether we want to:\n",
+ "\n",
+ "1. Upgrade this notebook server with more memory (more hard drive space would also be good)\n",
+ "\n",
+ "2. Move the load of the catfim library (non sites) data to another system. Maybe we can load it via a lambda or an EC2 instance?\n",
+ "\n",
+ "3. Get the FIM Team to break the file into smaller pieces, but watch carefully for the OID system (unique id for all records)\n",
+ "\n",
+ "**When you are done running this script, please restart this kernel, as it does not appear to release all of its memory (memory leak?).**\n",
+ "\n",
+ "\n",
+ "It also looks like Tyler has some notebooks where he was moving this load into a lambda. We need to look into that.\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "95c55cd0",
+ "metadata": {
+ "tags": []
+ },
+ "outputs": [],
+ "source": [
+ "\n",
+ "print(\"Starting load of CatFIM data\")\n",
+ "\n",
+ "# catfim_types = ['flow_based_catfim', 'flow_based_catfim_sites']\n",
+ "# catfim_types = ['stage_based_catfim', 'stage_based_catfim_sites']\n",
+ "catfim_types = ['stage_based_catfim_sites']\n",
+ "# catfim_types = ['stage_based_catfim']\n",
+ "\n",
+ "start_dt = datetime.now()\n",
+ "\n",
+ "for catfim_type in catfim_types:\n",
+ " print(f\"Loading {catfim_type} data\")\n",
+ " load_catfim_table(catfim_type)\n",
+ "\n",
+ "print(\"\")\n",
+ "end_dt = datetime.now()\n",
+ "time_duration = end_dt - start_dt\n",
+ "print(f\"... duration was {str(time_duration).split('.')[0]}\")\n",
+ "\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "db3eef2c-8764-4c0a-9e4d-07d9bb6d4dfe",
+ "metadata": {
+ "tags": []
+ },
+ "source": [
+ "13.c - CatFIM: Back up old \"public\" FIM 10 / 30 DBs and prepare new databases"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "f5cf108d-7360-48a9-a4a2-81b23b9e51b4",
+ "metadata": {
+ "tags": []
+ },
+ "outputs": [],
+ "source": [
+ "'''\n",
+ "This covers ONLY Catfim public FIM 10/30 for both flow based and stage based\n",
+ "'''\n",
+ "\n",
+ "''' DONE for 4.4.0.0. (4.5.2.11)'''\n",
+ "\n",
+ "# db_name_appendix = f\"{OLD_FIM_TAG}_fim_10\"\n",
+ "\n",
+ "# print(\"Starting Data Backups and table drops for stage and flow based PUBLIC catfim\")\n",
+ "# # db_names = [\"stage_based_catfim_public\", \"stage_based_catfim_sites_public\",\n",
+ "# # \"flow_based_catfim_public\", \"flow_based_catfim_sites_public\"]\n",
+ "\n",
+ "# # stage_based_catfim_sites_public didn't exist for fim 10 but should have in TI (does in other enviros likely)\n",
+ "# db_names = [\"stage_based_catfim_public\", \n",
+ "# \"flow_based_catfim_public\", \"flow_based_catfim_sites_public\"]\n",
+ "\n",
+ "# for db_name in db_names:\n",
+ "# new_table_name = f\"reference.{db_name}_{db_name_appendix}\"\n",
+ "# sql = f\"CREATE TABLE IF NOT EXISTS {new_table_name} AS TABLE reference.{db_name}\"\n",
+ "# sf.execute_sql(sql, db_type='egis')\n",
+ "# print(f\"{db_name} copied to {new_table_name} if it does not already exist\")\n",
+ "\n",
+ " \n",
+ "# # Aug 2024: Now we can drop the tables as we don't have any indexes on them at this time other than the gist geom index.\n",
+ "# # By dropping them, we can auto adjust the tables schema. (don't truncate)\n",
+ "\n",
+ "# for db_name in db_names:\n",
+ "# sf.execute_sql(f\"DROP TABLE IF EXISTS reference.{db_name};\", db_type='egis')\n",
+ "# print(f\"reference.{db_name} table dropped if it existed\")\n",
+ "\n",
+ "# print(\"Data Backups of flow based catfim are complete\")\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "3e451ed4-148f-4ccd-83bf-dff900915efb",
+ "metadata": {
+ "tags": []
+ },
+ "source": [
+ "13.d - Load CatFIM \"public\" FIM 30 DBs"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "fd38c00d-22ad-476e-ab05-cf293e5bbc15",
+ "metadata": {
+ "tags": []
+ },
+ "outputs": [],
+ "source": [
+ "\n",
+ "\n",
+ "print(\"Loading CatFIM Public datasets (FIM 30)\")\n",
+ "\n",
+ "catfim_types = [\"stage_based_catfim\", \"stage_based_catfim_sites\",\n",
+ " \"flow_based_catfim\", \"flow_based_catfim_sites\"]\n",
+ "\n",
+ "__public_fim_release = \"fim_30\" # The new fim public release being loaded (e.g. fim_10, fim_30, fim_60)\n",
+ "\n",
+ "start_dt = datetime.now()\n",
+ "\n",
+ "for catfim_type in catfim_types:\n",
+ " print(\"\")\n",
+ " sql = f'''\n",
+ " DROP TABLE IF EXISTS reference.{catfim_type}_public;\n",
+ "\n",
+ " SELECT\n",
+ " catfim.*,\n",
+ " '{__public_fim_release}' as public_fim_release\n",
+ " INTO reference.{catfim_type}_public\n",
+ " FROM reference.{catfim_type} as catfim\n",
+ " JOIN reference.public_fim_domain as fim_domain ON ST_Intersects(catfim.geom, fim_domain.geom)\n",
+ " '''\n",
+ " print(sf.execute_sql(sql, db_type='egis'))\n",
+ " print(f\"public {__public_fim_release} data load for {catfim_type} is complete\")\n",
+ "\n",
+ "# what about indexes again?\n",
+ "\n",
+ "# for db_name in db_names:\n",
+ "# new_table_name = f\"reference.{db_name}_{db_name_appendix}\"\n",
+ "# sql = f\"CREATE TABLE IF NOT EXISTS {new_table_name} AS TABLE reference.{db_name}\"\n",
+ "# sf.execute_sql(sql, db_type='egis')\n",
+ "# print(f\"{db_name} copied to {new_table_name} if it does not already exist\")\n",
+ "\n",
+ "print(\"\")\n",
+ "end_dt = datetime.now()\n",
+ "time_duration = end_dt - start_dt\n",
+ "print(f\"... duration was {str(time_duration).split('.')[0]}\")\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "65578391-f29e-4de4-8d98-ab48a0375603",
+ "metadata": {
+ "tags": []
+ },
+ "source": [
+ "