Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Gradient generation tool to support GENN models in FOQUS #1164

Merged
merged 16 commits into from
Sep 29, 2023
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
83 changes: 83 additions & 0 deletions docs/source/chapt_surrogates/gradients.rst
Original file line number Diff line number Diff line change
@@ -0,0 +1,83 @@
.. _gengrad:

Gradient Generation to Support Gradient-Enhanced Neural Networks
================================================================

Neural networks are useful in instances where multivariate process data
is available and the mathematical functions describing the variable
relationships are unknown. Training deep neural networks is most efficient
when samples of the variable derivatives, or gradients, are collected
simultaneously with process data. However, gradient data is often unavailable
unless the physics of the system are known and predetermined such as in
fluid dynamics with outputs of known physical properties.

These gradients may be estimated numerically using solely the process data. The
gradient generation tool described below requires a Comma-Separated Value (CSV) file
containing process samples (rows), with inputs in the left columns and outputs in the rightmost
columns. Multiple outputs are supported, as long as they are the rightmost columns, and
the variable columns may have string (text) headings or data may start in row 1. The method
produces a CSV file for each output variable containing gradients with respect to each input
variable (columns), for each sample point (rows). After navigating to the FOQUS directory
*examples/other_files/ML_AI_Plugin*, the code below sets up and calls the gradient generation
method on the example dataset *MEA_carbon_capture_dataset_mimo.csv*:

.. code:: python

# required imports
>>> import pandas as pd
>>> import numpy as np
>>> from generate_gradient_data import generate_gradients
>>>
>>> data = pd.read_csv(r"MEA_carbon_capture_dataset_mimo.csv") # get dataset
>>> data_array = np.array(data, ndmin=2) # convert to Numpy array
>>> n_x = 6 # we have 6 input variables, in the leftmost 6 columns

>>> gradients = generate_gradients(
>>> xy_data=data_array,
>>> n_x=n_x,
>>> show_plots=False, # flag to plot regression results during gradient training
>>> optimize_training=True, # will try many regression settings and pick the best result
>>> use_simple_diff=True # flag to use simple partials instead of chain rule formula; defaults to False if not passed
>>> )
>>> print("Gradient generation complete.")

>>> for output in range(len(gradients)): # save each gradient array to a CSV file
>>> pd.DataFrame(gradients[output]).to_csv("gradients_output" + str(output) + ".csv")
>>> print("Gradients for output ", str(output), " written to gradients_output" + str(output) + ".csv",)

Internally, the gradient generation methods automatically executes a series of actions on the dataset:

1. Import process data of size *(m, n_x + n_y)*, where *m* is the number of sample rows,
*n_x* is the number of input columns and *n_y* is the number of output columns. Given *n_x*,
the data is split into an input array *X* and an output array *Y*.

2. For each input *xi* and each output *yj*, estimate the gradient using a multivariate
chain rule approximation. For example, the gradient of y1 with respect to x1 is
calculated at each point as:

:math:`\frac{Dy_1}{Dx_1} = \frac{dy_1}{dx_1} \frac{dx_1}{dx_1} + \frac{dy_1}{dx_2} \frac{dx_2}{dx_1} + \frac{dy_1}{dx_3} \frac{dx_3}{dx_1} + ...`

where *D/D* represents the total derivative, *d/d* represents a partial derivative at each
sample point. *y1*, *x1*, *x2*, *x3*, and so on are vectors with values at each sample point *m*, and
this formula produces the gradients of each output with respect to each input at each sample point by iterating
through the dataset. The partial derivatives are calculated by simple finite difference. For example:

:math:`\frac{dy_1}{dx_1} (m_{1.5}) = \frac{y_1 (m_2) - y_1 (m_1)}{x_1 (m_2) - x_1 (m_1)}`

where *m_1.5* is the midpoint between sample points *m_2* and *m_1*. As a result, this scheme
calculates gradients at the points between the sample points, not the actual sample points.

3. Train an MLP model on the calculated midpoint and midpoint-gradient values. After normalizing the data
via linear scaling (see :ref:`mlaiplugin.datanorm`),
the algorithm leverages a small neural network model to generate gradient data for the actual
sampe points. Passing the argument *optimize_training=True* will train models using the optimizers
*Adam* or *RMSProp*, with activation functions *ReLu* or *Sigmoid* on hidden layers, using a *Linear*
or *ReLu* activation function on the output layer, building *2* or *8* hidden layers with *6* or *12*
neurons per hidden layer. The algorithm employs cross-validation to check the mean-squared-error (MSE) loss
on each model and uses the model with the smallest error to predict the sample gradients.

4. Predict the gradients at each sample point from the regressed model. This produces *n_y*
arrays with each having size *(m, n_x)* - the same size as the original input array *X*.

5. Concatenate the predicted gradients into a single array of size *(m, n_x, n_y)*. This is the
single object returned by the gradient generation method.
1 change: 1 addition & 0 deletions docs/source/chapt_surrogates/index.rst
Original file line number Diff line number Diff line change
Expand Up @@ -7,6 +7,7 @@ Contents
.. toctree::
:maxdepth: 2

gradients
mlaiplugin
reference
tutorial/index
6 changes: 5 additions & 1 deletion docs/source/chapt_surrogates/mlaiplugin.rst
Original file line number Diff line number Diff line change
@@ -1,3 +1,5 @@
.. _mlaiplugin:

Machine Learning & Artificial Intelligence Flowsheet Model Plugins
==================================================================

Expand Down Expand Up @@ -99,7 +101,7 @@ Currently, FOQUS supports the following custom attributes:
bounds for each output variable (default: (0, 1E5))
- *normalized* – Boolean flag for whether the user is passing a normalized
neural network model; to use this flag, users must train their models with
data normalized according to a specifc scaling form and add all input and
data normalized according to a specific scaling form and add all input and
output bounds custom attributes. The section below details scaling options.
- *normalization_form* - string flag required when *normalization* is *True*
indicating a scaling option for FOQUS to automatically scale flowsheet-level
Expand All @@ -108,6 +110,8 @@ Currently, FOQUS supports the following custom attributes:
- *normalization_function* - optional string argument that is required when a
'Custom' *normalization_form* is used. The section below details scaling options.

.. _mlaiplugin.datanorm:

Data Normalization For Neural Network Models
--------------------------------------------

Expand Down
Loading
Loading