Gradient generation tool to support GENN models in FOQUS (#1164)

* Brady's gradient generation script * Adding helpful docstrings and comments * Brady's working files for gradient model regression * Integrate Brady's regression script into main file, clean up * fix formatting for black and pylint * Generalize for multiple outputs, add settings optimization loop * Add documentation for gradient tool * Manually run spellchecker on ML AI Plugin files * Fix links and file names * Add option to use simple partial derivative in lieu of chain rule formula * Fix comment * run black again * minor typo * Give docs page reference name --------- Co-authored-by: Keith Beattie <ksbeattie@lbl.gov>
CCSI-Toolset · Sep 29, 2023 · 237022b · 237022b
1 parent baa13c8
commit 237022b
Show file tree

Hide file tree

Showing 7 changed files with 659 additions and 2 deletions.
diff --git a/docs/source/chapt_surrogates/gradients.rst b/docs/source/chapt_surrogates/gradients.rst
@@ -0,0 +1,83 @@
+.. _gengrad:
+
+Gradient Generation to Support Gradient-Enhanced Neural Networks
+================================================================
+
+Neural networks are useful in instances where multivariate process data
+is available and the mathematical functions describing the variable
+relationships are unknown. Training deep neural networks is most efficient
+when samples of the variable derivatives, or gradients, are collected
+simultaneously with process data. However, gradient data is often unavailable
+unless the physics of the system are known and predetermined such as in
+fluid dynamics with outputs of known physical properties.
+
+These gradients may be estimated numerically using solely the process data. The
+gradient generation tool described below requires a Comma-Separated Value (CSV) file
+containing process samples (rows), with inputs in the left columns and outputs in the rightmost
+columns. Multiple outputs are supported, as long as they are the rightmost columns, and
+the variable columns may have string (text) headings or data may start in row 1. The method
+produces a CSV file for each output variable containing gradients with respect to each input
+variable (columns), for each sample point (rows). After navigating to the FOQUS directory
+*examples/other_files/ML_AI_Plugin*, the code below sets up and calls the gradient generation
+method on the example dataset *MEA_carbon_capture_dataset_mimo.csv*:
+
+.. code:: python
+
+  # required imports
+  >>> import pandas as pd
+  >>> import numpy as np
+  >>> from generate_gradient_data import generate_gradients
+  >>> 
+  >>> data = pd.read_csv(r"MEA_carbon_capture_dataset_mimo.csv")  # get dataset
+  >>> data_array = np.array(data, ndmin=2)  # convert to Numpy array
+  >>> n_x = 6  # we have 6 input variables, in the leftmost 6 columns
+
+  >>> gradients = generate_gradients(
+  >>>   xy_data=data_array,
+  >>>   n_x=n_x,
+  >>>   show_plots=False,  # flag to plot regression results during gradient training
+  >>>   optimize_training=True,  # will try many regression settings and pick the best result
+  >>>   use_simple_diff=True  # flag to use simple partials instead of chain rule formula; defaults to False if not passed
+  >>>   )
+  >>> print("Gradient generation complete.")
+
+  >>> for output in range(len(gradients)):  # save each gradient array to a CSV file
+  >>>   pd.DataFrame(gradients[output]).to_csv("gradients_output" + str(output) + ".csv")
+  >>>   print("Gradients for output ", str(output), " written to gradients_output" + str(output) + ".csv",)
+  
+Internally, the gradient generation methods automatically executes a series of actions on the dataset:
+
+1. Import process data of size *(m, n_x + n_y)*, where *m* is the number of sample rows,
+*n_x* is the number of input columns and *n_y* is the number of output columns. Given *n_x*,
+the data is split into an input array *X* and an output array *Y*.
+
+2. For each input *xi* and each output *yj*, estimate the gradient using a multivariate
+chain rule approximation. For example, the gradient of y1 with respect to x1 is
+calculated at each point as:
+
+:math:`\frac{Dy_1}{Dx_1} = \frac{dy_1}{dx_1} \frac{dx_1}{dx_1} + \frac{dy_1}{dx_2} \frac{dx_2}{dx_1} + \frac{dy_1}{dx_3} \frac{dx_3}{dx_1} + ...`
+
+where *D/D* represents the total derivative, *d/d* represents a partial derivative at each
+sample point. *y1*, *x1*, *x2*, *x3*, and so on are vectors with values at each sample point *m*, and
+this formula produces the gradients of each output with respect to each input at each sample point by iterating
+through the dataset. The partial derivatives are calculated by simple finite difference. For example:
+
+:math:`\frac{dy_1}{dx_1} (m_{1.5}) = \frac{y_1 (m_2) - y_1 (m_1)}{x_1 (m_2) - x_1 (m_1)}`
+
+where *m_1.5* is the midpoint between sample points *m_2* and *m_1*. As a result, this scheme
+calculates gradients at the points between the sample points, not the actual sample points.
+
+3. Train an MLP model on the calculated midpoint and midpoint-gradient values. After normalizing the data
+via linear scaling (see :ref:`mlaiplugin.datanorm`),
+the algorithm leverages a small neural network model to generate gradient data for the actual
+sampe points. Passing the argument *optimize_training=True* will train models using the optimizers
+*Adam* or *RMSProp*, with activation functions *ReLu* or *Sigmoid* on hidden layers, using a *Linear*
+or *ReLu* activation function on the output layer, building *2* or *8* hidden layers with *6* or *12*
+neurons per hidden layer. The algorithm employs cross-validation to check the mean-squared-error (MSE) loss
+on each model and uses the model with the smallest error to predict the sample gradients.
+
+4. Predict the gradients at each sample point from the regressed model. This produces *n_y*
+arrays with each having size *(m, n_x)* - the same size as the original input array *X*.
+
+5. Concatenate the predicted gradients into a single array of size *(m, n_x, n_y)*. This is the
+single object returned by the gradient generation method.
diff --git a/docs/source/chapt_surrogates/index.rst b/docs/source/chapt_surrogates/index.rst
@@ -7,6 +7,7 @@ Contents
 .. toctree::
     :maxdepth: 2
 
+    gradients
     mlaiplugin
     reference
     tutorial/index
diff --git a/docs/source/chapt_surrogates/mlaiplugin.rst b/docs/source/chapt_surrogates/mlaiplugin.rst
@@ -1,3 +1,5 @@
+.. _mlaiplugin:
+
 Machine Learning & Artificial Intelligence Flowsheet Model Plugins
 ==================================================================
 
@@ -99,7 +101,7 @@ Currently, FOQUS supports the following custom attributes:
   bounds for each output variable (default: (0, 1E5))
 - *normalized* – Boolean flag for whether the user is passing a normalized
   neural network model; to use this flag, users must train their models with
-  data normalized according to a specifc scaling form and add all input and
+  data normalized according to a specific scaling form and add all input and
   output bounds custom attributes. The section below details scaling options.
 - *normalization_form* - string flag required when *normalization* is *True*
   indicating a scaling option for FOQUS to automatically scale flowsheet-level
@@ -108,6 +110,8 @@ Currently, FOQUS supports the following custom attributes:
 - *normalization_function* - optional string argument that is required when a
   'Custom' *normalization_form* is used. The section below details scaling options.
 
+.. _mlaiplugin.datanorm:
+
 Data Normalization For Neural Network Models
 --------------------------------------------