Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add tensor_hash #460

Closed
wants to merge 2 commits into from
Closed

Conversation

lessw2020
Copy link
Contributor

Summary:
This adds a tensor hash function for multi-dim tensors.
It's purpose is to allow easy full tensor verification on forward tensors for other unit tests
by storing the hash of the answer tensor.
This strikes a middle ground between simple verification (i.e. first row, mean of an axis) and every item in the tensor verification which would require storing the entire tensor in a unit test.

Usage =
hash the answer tensor, save in your unit test.
Run a forward in unit test, hash the output tensor.
Compare hashed tensors to confirm unit test result.

Test plan:
Verified a + .01 change to single item in a multi-dim tensor is detected, verified identical tensors generate identical hashes.
Since this is a unit test function, not sure that unit test for a unit test function is needed as will not change unless function itself is modified.

Fixes #{issue number}

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Sep 5, 2023
@lessw2020
Copy link
Contributor Author

unit test failure is not related to this PR

@pbontrager
Copy link
Contributor

Could you provide a little more context for this function? It seems to compare sums for the final dimension, does this provide more precision than mean? It's just one extra dimension of data. I guess this would be more likely to catch tensors that had the right data but the wrong shape, is that the primary use case? To add clarity to my question, I am wondering what the potential mistakes are that are not caught by mean or mean(-1) and if this implementation of a hash catches all of them.

@lessw2020
Copy link
Contributor Author

Hi @pbontrager,
My concern with the mean test is that it's an average, and multiple combinations of numbers could potentially end up at the same mean, so I felt the mean test alone is not sufficient to fully check a tensor.

I can't guarantee that a hash will catch all issues and probably using both would result in the best coverage, but I show a simple example below where using the mean fails to detect the difference of .01 to an entry, vs the hash does using the default all_close settings that are in the multi-modal tests.

Here adding .01 to a single entry is not detected by mean (see line 18 allclose = True), but is detected with the hashing.

hash_and_mean

Copy link
Contributor

@rohan-varma rohan-varma left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just testing

Copy link
Contributor

@rohan-varma rohan-varma left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

test

@rohan-varma rohan-varma self-requested a review September 5, 2023 22:44
@facebook-github-bot
Copy link
Contributor

@ebsmothers has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@facebook-github-bot
Copy link
Contributor

@ebsmothers merged this pull request in 1034ed7.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. Merged
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants