Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Keras file format updates #1401

Merged
merged 27 commits into from
Aug 19, 2024
Merged

Conversation

rundxdi
Copy link
Contributor

@rundxdi rundxdi commented Apr 25, 2024

Fixes

Addresses file format changes associated with updates to tensorflow and keras mentioned in issue #1369

Summary/Motivation:

Tensorflow and keras version updates changed how they save/load files. This PR updates the keras_surrogate.py file to accommodate those changes.

Changes proposed in this PR:

  • Update keras_surrogate.py to save new .keras file format
  • Update test_keras_surrogate.py with new versions of saved models and accounting for new save/load methods

Legal Acknowledgement

By contributing to this software project, I agree to the following terms and conditions for my contribution:

  1. I agree my contributions are submitted under the license terms described in the LICENSE.txt file at the top level of this directory.
  2. I represent I am authorized to make the contributions and grant the license. If my employer has rights to intellectual property that includes these contributions, I represent that I have received permission to make contributions and grant the required license on behalf of that employer.

@@ -263,7 +263,7 @@ def save_to_folder(self, keras_folder_name):
The name of the folder to contain the Keras model and additional
IDAES metadata
"""
self._keras_model.save(keras_folder_name)
self._keras_model.save(os.path.join(keras_folder_name, "idaes_keras_model.keras"))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should the file name be hard-coded here, or should we make this an input from the user (either optional or required).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I support letting users pass a string for the file name, so something like

Suggested change
self._keras_model.save(os.path.join(keras_folder_name, "idaes_keras_model.keras"))
self._keras_model.save(os.path.join(keras_folder_name, keras_file_name + ".keras"))

where keras_file_name is an input to the save method. I don't know if we still need to specify the folder name, or if a single string for the entire path is sufficient.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The folder name is used to save/load some model weights and some idaes specific information separate from the filename. But I think the file name should be rename-able as you say -- making that change now

@bpaul4
Copy link
Contributor

bpaul4 commented Apr 25, 2024

@rundxdi thank you for opening this. Once this is ready to test, it would be good to undo the temporary fix implemented in #1373 to ensure the latest version of TensorFlow is used in the tests.

@ksbeattie ksbeattie added the Priority:High High Priority Issue or PR label Apr 25, 2024
@ksbeattie
Copy link
Member

@rundxdi any progress on this?

@rundxdi
Copy link
Contributor Author

rundxdi commented May 4, 2024

@rundxdi any progress on this?

Nothing this week -- still trying to figure out plotting tests.

@rundxdi rundxdi requested a review from AlexNoring as a code owner May 8, 2024 20:28
@rundxdi
Copy link
Contributor Author

rundxdi commented May 9, 2024

I have two remaining issues:

  1. Python 3.8 doesn't like test_keras_surrogate.py. Tests all produce failure statuses looking like:

Exception encountered: Unrecognized keyword arguments: ['batch_shape']

These tests all pass on Python 3.9+.

  1. The test_sofc_surrogates.py tests rely on a saved Keras surrogate that cannot be loaded with Keras v3+ (and thus cannot be loaded with Tensorflow 2.16.1). I've found no obvious way to generate an equivalent surrogate in Keras v3 equivalent surrogate; their suggested workaround doesn't work.

@lbianchi-lbl
Copy link
Contributor

1. Python 3.8 doesn't like test_keras_surrogate.py.  Tests all produce failure statuses looking like:

Exception encountered: Unrecognized keyword arguments: ['batch_shape']

These tests all pass on Python 3.9+.

My guess is that this is due to NumPy (and consequently I imagine most of the downstream projects depending on it) having dropped support for Python 3.8 a while back. So, when installing on Python 3.8, an older version of the package (the latest compatible) gets installed, which I assume doesn't support the same arguments as the current version.

We're considering removing support for (i.e. stopping testing with) Python 3.8. In the meantime, you should be able to use pytest.mark.skipif() to skip the failing tests if they're being run: https://docs.pytest.org/en/latest/how-to/skipping.html#id1

2. The test_sofc_surrogates.py tests rely on a saved Keras surrogate that cannot be loaded with Keras v3+ (and thus cannot be loaded with Tensorflow 2.16.1).  I've found no obvious way to generate an equivalent surrogate in Keras v3 equivalent surrogate; their suggested workaround doesn't work.

With the disclaimer that I don't have any significant hands-on experience using Keras, I'm at least aware of there being limited or no support for cross-version compatibility (i.e. a model serialized with Keras version X won't work when you try to deserialize it using version Y). I'm not sure how much the new (current) serialization format changes things. If this is the case, I guess the solution would be to regenerate the file used in the tests using the current version. I'm not sure if this addresses the original issue, though.

@andrewlee94
Copy link
Member

As mentioned in the dev call, regarding the test failure in the power generation models you should:

  1. Mark the failing tests with pytest.mark.xfail
  2. In the failing code, add a deprecation warning indicating the tests for this code have started failing and the code will be removed no earlier than August if it is not fixed.
  3. Open an issue to fix this and assign it to the code owner(s), and put this on the August release board as a High priority so we track it.

@ksbeattie ksbeattie linked an issue May 23, 2024 that may be closed by this pull request
@ksbeattie
Copy link
Member

@rundxdi we expect to cut the May release next week, so if this PR is to be included, it needs to be merged very soon.

@ksbeattie
Copy link
Member

ksbeattie commented Jun 6, 2024

@rundxdi, this missed the May release, now on the Aug release board.

Copy link
Contributor

@bpaul4 bpaul4 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @rundxdi for investigating and implementing the fix. The changes to the Keras/OMLT calls and file writing look good.

I have a couple questions. First, the models previously import from a folder containing saved_model.pb, keras_metadata.pb, and idaes_info.json files and a variables/ folder containing variables.index and variables.data-00000-of-00001 files. Are these still necessary, or can we remove them in favor of the .keras files? I've seen in my research into Keras 3 that it now doesn't support SavedModel files and includes a read-only TFSM format instead.

Second, do we want to look into the Python 3.8 failures? Since it isn't occurring in the newer Python environments, it may just be a compatibility issue that's best skipping in that one environment.

@bpaul4
Copy link
Contributor

bpaul4 commented Jul 9, 2024

@rundxdi when you have a moment, could you please provide a status update? What is required to finalize this?

@rundxdi
Copy link
Contributor Author

rundxdi commented Jul 17, 2024

@rundxdi when you have a moment, could you please provide a status update? What is required to finalize this?

Hi @bpaul4 -- we can safely remove any of the non .keras files. The Python 3.8 tests should be skipped at this point.

It should be ready to finalize/review for finalization.

@codecov-commenter
Copy link

codecov-commenter commented Jul 17, 2024

Codecov Report

Attention: Patch coverage is 73.33333% with 4 lines in your changes missing coverage. Please review.

Project coverage is 76.36%. Comparing base (c2825ca) to head (0c2775e).
Report is 25 commits behind head on main.

Files with missing lines Patch % Lines
idaes/core/surrogate/keras_surrogate.py 77.77% 2 Missing ⚠️
...generation/properties/sofc/sofc_keras_surrogate.py 66.66% 2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1401      +/-   ##
==========================================
- Coverage   76.38%   76.36%   -0.03%     
==========================================
  Files         394      394              
  Lines       65121    65126       +5     
  Branches    14427    14426       -1     
==========================================
- Hits        49745    49732      -13     
- Misses      12813    12833      +20     
+ Partials     2563     2561       -2     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Contributor

@bpaul4 bpaul4 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this looks good now, thanks @rundxdi!

lbianchi-lbl added a commit to bpaul4/examples that referenced this pull request Jul 18, 2024
@lbianchi-lbl
Copy link
Contributor

Before merging this, we should remove Python 3.8 from the CI and see if the tests pass without needing to mark them as xfailing.

@bpaul4 bpaul4 requested a review from andrewlee94 August 16, 2024 14:25
@bpaul4
Copy link
Contributor

bpaul4 commented Aug 16, 2024

@andrewlee94 it looks like all of your prior review comments have been addressed.

I removed the xfail markers from the keras surrogate tests and everything looks good.

Copy link
Member

@andrewlee94 andrewlee94 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bpaul4 I was waiting for the Python 3.8 issue to be merged to make sure the tests passed. I have just one minor request.

@@ -53,6 +53,7 @@ def build_rom():


@pytest.mark.unit
@pytest.mark.xfail
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add an inline comment to explain why this is xfailed. I know this is mentioned in the deprecation warning in the code, but having comments here will help reinforce that this issue needs to be fixed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll add the comments now. @AlexNoring is aware of the SOFC surrogate issue and will continue the discussion here: #1431

@bpaul4
Copy link
Contributor

bpaul4 commented Aug 16, 2024

@andrewlee94 how would you suggest handling the failing example integration tests? The failures are resolved as part of IDAES/examples#128.

@andrewlee94
Copy link
Member

@bpaul4 I would say we need to get that examples PR merged first.

Copy link
Contributor

@lbianchi-lbl lbianchi-lbl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly minor comments, questions, and suggestions.

@@ -53,6 +53,7 @@ def build_rom():


@pytest.mark.unit
@pytest.mark.xfail # test xfailed due to out-of-date, incompatible Keras surrogates
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion: use the reason kwarg instead of a comment allows the message to be displayed when running the tests, increasing its visibility:

Suggested change
@pytest.mark.xfail # test xfailed due to out-of-date, incompatible Keras surrogates
@pytest.mark.xfail(reason="test xfailed due to out-of-date, incompatible Keras surrogates")

@@ -78,6 +79,7 @@ def test_basic_build(build_rom):

@pytest.mark.skipif(solver is None, reason="Solver not available")
@pytest.mark.component
@pytest.mark.xfail # test xfailed due to out-of-date, incompatible Keras surrogates
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion: use the reason kwarg instead of a comment allows the message to be displayed when running the tests, increasing its visibility:

Suggested change
@pytest.mark.xfail # test xfailed due to out-of-date, incompatible Keras surrogates
@pytest.mark.xfail(reason="test xfailed due to out-of-date, incompatible Keras surrogates")

@@ -97,9 +99,20 @@ def test_initialize(build_rom):

@pytest.mark.skipif(solver is None, reason="Solver not available")
@pytest.mark.component
@pytest.mark.xfail # test xfailed due to out-of-date, incompatible Keras surrogates
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion: use the reason kwarg instead of a comment allows the message to be displayed when running the tests, increasing its visibility:

Suggested change
@pytest.mark.xfail # test xfailed due to out-of-date, incompatible Keras surrogates
@pytest.mark.xfail(reason="test xfailed due to out-of-date, incompatible Keras surrogates")

Comment on lines +31 to +32
msg = "Tests for sofc_keras_surrogate.py have started failing. The code will be removed no early than August if it is not fixed."
deprecation_warning(msg=msg, logger=_log, version="2.5.0", remove_in="3.0.0")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
msg = "Tests for sofc_keras_surrogate.py have started failing. The code will be removed no early than August if it is not fixed."
deprecation_warning(msg=msg, logger=_log, version="2.5.0", remove_in="3.0.0")
msg = "Tests for sofc_keras_surrogate.py have started failing following Tensorflow/Keras version updates (see IDAES/idaes-pse#1401). The code will be removed no earlier than August 2024 if it is not fixed."
deprecation_warning(msg=msg, logger=_log, version="2.5.0", remove_in="3.0.0")

@@ -93,7 +93,7 @@ class ExtraDependencies:
]
omlt = [
"omlt==1.1", # fix the version for now as package evolves
'tensorflow < 2.16.1 ; python_version < "3.12"',
"tensorflow",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Question: given the changes in this PR, would it make sense to require a minimum version of 2.16.1 here?

Copy link
Contributor

@bpaul4 bpaul4 Aug 16, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe that would be appropriate. although removing the tag will automatically search for the latest version. Pinning a minimum version to support Keras 3 would be a good safety net.

Comment on lines +266 to +268
self._keras_model.save(
os.path.join(keras_folder_name, keras_model_name + ".keras")
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion: since Keras seems to support pathlib.Path objects now, this could be rewritten to be slightly more concise/readable (requires replacing import os.path with from pathlib import Path at the top of the file):

Suggested change
self._keras_model.save(
os.path.join(keras_folder_name, keras_model_name + ".keras")
)
self._keras_model.save(
Path(keras_folder_name, keras_model_name).with_suffix(".keras")
)

Comment on lines +298 to +302

keras_model = keras.models.load_model(
os.path.join(keras_folder_name, keras_model_name + ".keras")
)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See previous comments for using pathlib.Path instead of os.path.

Comment on lines -322 to +334
nn.save_weights(os.path.join(path, "{}.h5".format(name)))
nn.save(os.path.join(path, "{}.keras".format(name)))
nn.save_weights(os.path.join(path, "{}.weights.h5".format(name)))


def load_keras_json_hd5(path, name):
with open(os.path.join(path, "{}.json".format(name)), "r") as json_file:
json_model = json_file.read()
nn = keras.models.model_from_json(json_model)
nn.load_weights(os.path.join(path, "{}.h5".format(name)))
nn = keras.models.load_model(os.path.join(path, "{}.keras".format(name)))
nn.load_weights(os.path.join(path, "{}.weights.h5".format(name)))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See comments above for using pathlib.Path instead of os.path.

lbianchi-lbl added a commit to IDAES/examples that referenced this pull request Aug 16, 2024
* update save, load syntax

* regenerate outdated notebooks

* generate new model files

* remove old keras files

* Test using IDAES/idaes-pse#1401

* Update version constraint for TensorFlow

* Address Python 3.8 failures due to Tensorflow 2.16.1 not being available

* Remove exclusion for Python 3.12 for Tensorflow

* Remove Python 3.8 support

* Restore installing idaes-pse from main branch

---------

Co-authored-by: Ludovico Bianchi <lbianchi@lbl.gov>
@lbianchi-lbl
Copy link
Contributor

lbianchi-lbl commented Aug 16, 2024

There might be another test that needs to be skipped/xfailed as it's currently causing a failure in the "user install" integration checks: https://github.com/IDAES/idaes-pse/actions/runs/10424410674/job/28873157657?pr=1401#step:6:538

image

@andrewlee94
Copy link
Member

@lbianchi-lbl I think we should look into that one a bit more first, and fix it if we can. Being that is in the core code, we should at least know why it fails.

@lbianchi-lbl
Copy link
Contributor

@lbianchi-lbl I think we should look into that one a bit more first, and fix it if we can. Being that is in the core code, we should at least know why it fails.

The fact that it's only failing in the user-mode check, plus the exception raised, seem to point towards the file not being included in the non-developer installation:

image

@lbianchi-lbl lbianchi-lbl merged commit 9dce004 into IDAES:main Aug 19, 2024
37 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Priority:High High Priority Issue or PR
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Resolve Keras File Format Issues
6 participants