Implement tool for saved Keras model files inspection, diff and patching #19768

Closed
wants to merge 2 commits into from


@pmasousa pmasousa commented May 28, 2024

This PR consists of three sub-features:
• Visualization of the contents of .keras and .weights.h5 files in a notebook, where entries can be expanded all the way down to the individual weights.
• Diff of a model against a reference model, presented side by side with the differences highlighted.
• Patching of a model, which allows renaming a layer and changing a specified weight.

Please let us know if this is what you had in mind.

closes #19705


Co-authored-by: Pedro Curto <pedro.a.curto@tecnico.ulisboa.pt>
@fchollet
Member

Thanks for the PR! Do you have a Colab notebook that demos the new features?

@pmasousa
Author

Yes, here

I couldn't implement the changes in the saved weights file, only in a .keras file.
If you could explain how to do it I would appreciate it.

@fchollet
Member

Great work! Here are some things we should do.

  1. For every display output, there should be a shell mode (plain text with text color tags) and a notebook mode (html). We route to one or the other based on whether we detect we are in a notebook or not.
  2. The diff isn't super useful. What we should do is:
    • Compare the model structure; for instance if a layer is absent from one model but present in the other, we should highlight that. We should display the names of those layers, and the count of weights and sublayers associated with it.
    • For each layer that matches (by path) across the models, we should compare the weight structure (number of weights, weights shapes, dtypes) and highlight any differences.
  3. The edit tool can be a class, e.g.:
         editor = KerasFileEditor(filepath)
         editor.list_layer_paths()  # Return all layer paths
         editor.layer_info(layer_path)  # Show weight structure for this layer
         editor.edit_layer(layer_path, new_name=..., new_vars=...)
         editor.write_out(filepath)
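A minimal sketch of the structure comparison described in point 2, assuming each model has already been read into a plain dict that maps a layer path to a list of `(shape, dtype)` pairs. That dict layout and the names `compare_structures` and `describe_missing` are hypothetical, chosen for illustration only; they are not part of Keras or of this PR.

```python
from typing import Dict, List, Tuple

# Hypothetical in-memory form of a saved model: layer path -> [(shape, dtype), ...]
WeightSpec = Tuple[Tuple[int, ...], str]
Structure = Dict[str, List[WeightSpec]]


def describe_missing(path: str, structure: Structure) -> dict:
    """Summarize a layer that exists in only one model: weight and sublayer counts."""
    sublayers = [p for p in structure if p.startswith(path + "/")]
    return {"path": path, "weights": len(structure[path]), "sublayers": len(sublayers)}


def compare_structures(model: Structure, reference: Structure) -> dict:
    """Report layers present in only one model and weight mismatches in shared layers."""
    only_in_model = sorted(set(model) - set(reference))
    only_in_reference = sorted(set(reference) - set(model))

    mismatched = {}
    for path in sorted(set(model) & set(reference)):
        ours, theirs = model[path], reference[path]
        issues = []
        if len(ours) != len(theirs):
            issues.append(f"weight count: {len(ours)} vs {len(theirs)}")
        # Compare weights pairwise by position; extra weights are covered by the count check.
        for i, ((shape_a, dtype_a), (shape_b, dtype_b)) in enumerate(zip(ours, theirs)):
            if shape_a != shape_b:
                issues.append(f"weight {i} shape: {shape_a} vs {shape_b}")
            if dtype_a != dtype_b:
                issues.append(f"weight {i} dtype: {dtype_a} vs {dtype_b}")
        if issues:
            mismatched[path] = issues

    return {
        "only_in_model": [describe_missing(p, model) for p in only_in_model],
        "only_in_reference": [describe_missing(p, reference) for p in only_in_reference],
        "mismatched": mismatched,
    }
```

Matching layers by path keeps the comparison independent of layer order, which is one plausible reading of "matches (by path)" above.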

> I couldn't implement the changes in the saved weights file, only in a .keras file.

They aren't very different -- the weights file is one of the files present in the .keras file. You only need to implement these features for the weights file, then the same code will also work for the .keras file.
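To illustrate the point that the weights file is just one member of the .keras archive, here is a small sketch using only the standard library. It assumes the archive is a zip file and that the weights member is named `model.weights.h5`; the helper names `list_archive_members` and `extract_weights_file` are hypothetical.

```python
import zipfile
from pathlib import Path

# Assumed name of the weights member inside a .keras archive.
WEIGHTS_MEMBER = "model.weights.h5"


def list_archive_members(keras_path):
    """Return the names of all files stored inside the .keras archive."""
    with zipfile.ZipFile(keras_path) as archive:
        return archive.namelist()


def extract_weights_file(keras_path, out_dir):
    """Extract the weights member to out_dir and return its path.

    Raises KeyError if the archive has no member named WEIGHTS_MEMBER.
    """
    with zipfile.ZipFile(keras_path) as archive:
        return Path(archive.extract(WEIGHTS_MEMBER, path=out_dir))
```

Once the weights member is extracted, any tooling written for a standalone weights file could be pointed at the extracted path, which is the reuse suggested above.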

@pedro-curto

Thank you very much for your feedback, we really appreciate it!

We're currently in the final two weeks of our degree, so everything is very intensive right now and we barely have any time because of the projects. We would really like to make those changes, because we enjoyed working on this feature and want it to be as good as possible and to match your needs and the specification. We will start on the required changes next week and get back to you, if that is OK with you. Again, thanks for your time, patience and feedback!

@fchollet
Member

Sure -- there's no rush! Thanks for working on this!

Implemented the requested changes:
• Changed inspect_file to distinguish between shell mode (plain text with text color tags) and notebook mode (HTML).
• Changed the diff functionality to compare model structure and weight structure as specified in the PR discussion, with clearer output.
• Reworked the edit tool into a class with the requested methods (listing layer paths, showing weight structure, editing layers, and writing out to a path).

Co-authored-by: Pedro Curto <pedro.a.curto@tecnico.ulisboa.pt>
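The shell/notebook routing mentioned in this commit could look roughly like the sketch below. It is not the code from the PR: `in_notebook` and `render_entry` are hypothetical names, and checking for `IPKernelApp` in the IPython config is a common heuristic rather than an official detection API.

```python
import html

BOLD, RESET = "\033[1m", "\033[0m"  # ANSI escape codes for the plain-text mode


def in_notebook() -> bool:
    """Best-effort guess at whether we run inside a Jupyter/Colab kernel (heuristic)."""
    try:
        from IPython import get_ipython
    except ImportError:
        return False  # IPython is not installed, so this cannot be a notebook
    shell = get_ipython()
    if shell is None:
        return False  # plain Python interpreter
    return "IPKernelApp" in getattr(shell, "config", {})


def render_entry(name: str, detail: str, notebook=None) -> str:
    """Format one line of output as HTML (notebook) or ANSI-colored text (shell)."""
    if notebook is None:
        notebook = in_notebook()
    if notebook:
        return f"<b>{html.escape(name)}</b>: {html.escape(detail)}"
    return f"{BOLD}{name}{RESET}: {detail}"
```

Passing `notebook=True` or `notebook=False` explicitly overrides the detection, which is handy for testing both output modes from a terminal.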
@pedro-curto

Hello. Sorry for taking so long.
We've pushed a commit with changes that we believe address your feedback on the tool. We also prepared a Colab, in case you want to test the functionality interactively and see what we changed and how: it's this colab. Is this what you had in mind?
Any feedback is greatly appreciated, and thank you for your time and patience with us so far!

@fchollet
Member

fchollet commented Jul 6, 2024

Thanks for the update -- the functionality looks great! I think the interface could look more professional though. Maybe we can first focus on the HTML version of the interface (for Colab / notebooks) and then we can figure out later what the CLI / text-only version should look like?

@pedro-curto

Glad you liked it!
I didn't understand what you meant when you said the interface could look more professional. Could you clarify a bit so we know what we should change? Thanks for guiding us so far!

@fchollet
Member

fchollet commented Jul 8, 2024

> I didn't understand what you meant when you said the interface could look more professional. Could you clarify a bit so we know what we should change?

Sure. What you had at the very end of this notebook was quite nice, for instance. Penzai is also reasonably nice.

@pedro-curto

Just to clarify, you would like us to make the compare_models interface look more professional, like the inspect_file interface?

@fchollet
Member

fchollet commented Jul 8, 2024

> Just to clarify, you would like us to make the compare_models interface look more professional, like the inspect_file interface?

Yes, exactly -- preferably something with interactive HTML. It could list the layers for which there was a discrepancy, and clicking on the layer would reveal the issue. Interactiveness enables greater UX clarity.
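One lightweight way to get the click-to-reveal behavior described here, without any JavaScript, is the HTML `<details>`/`<summary>` element. The sketch below is only an illustration under that assumption; `discrepancies_to_html` and its input format (layer path mapped to a list of issue strings) are hypothetical.

```python
import html


def discrepancies_to_html(discrepancies):
    """Render {layer_path: [issue, ...]} as collapsible HTML sections.

    Each layer becomes a <details> element: the summary line shows the layer
    path and issue count, and clicking it reveals the list of issues.
    """
    sections = []
    for path, issues in discrepancies.items():
        items = "".join(f"<li>{html.escape(issue)}</li>" for issue in issues)
        sections.append(
            f"<details><summary>{html.escape(path)} ({len(issues)})</summary>"
            f"<ul>{items}</ul></details>"
        )
    return "\n".join(sections)
```

In a notebook, the returned string could be shown with `IPython.display.HTML(...)`; outside a notebook it can be written to an .html file and opened in a browser.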

@pedro-curto

Okay, thank you for explaining. My friend and I are both working at the moment, so we can only do this in our free time, but we will keep you updated on any progress!


This PR is stale because it has been open for 14 days with no activity. It will be closed if no further activity occurs. Thank you.

@github-actions github-actions bot added the stale label Jul 27, 2024
@pedro-curto

I'd really like to make the required changes, but I haven't been able to find the time, so I'm commenting to keep the PR from being closed.

@fchollet
Member

Thank you for the contributions here so far -- we merged a subset of this feature in keras/src/saving/file_editor.py. Currently it only really works with the CLI and not HTML/js, but you are more than welcome to contribute improvements in this direction going forward!

@fchollet fchollet closed this Sep 22, 2024