Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

How much value normalization will objecthash do? #16

Open
pedrocr opened this issue Jul 5, 2017 · 6 comments
Open

How much value normalization will objecthash do? #16

pedrocr opened this issue Jul 5, 2017 · 6 comments

Comments

@pedrocr
Copy link

pedrocr commented Jul 5, 2017

I see from the code that unicode normalization is done. Is the idea also to do NaN normalization in floats? There are uses where this is not ideal and you don't want normalization at all. Will this be an option in the future? I'd actually argue the default case should be that otherwise it's easy to generate matching hashes for objects that will then behave differently when passed to certain functions.

@tarcieri
Copy link
Member

tarcieri commented Jul 5, 2017

All other objecthash implementations make Unicode normalization a configurable toggle. The present ObjectHasher API does not carry around the context to provide such toggles, although it could.

I will go ahead and leave this issue open to discuss whether such a toggle should be added.

That said, there are two places objecthash performs the sort of normalization you might see in a canonicalization scheme:

  • Unicode normalization
  • Sorting object keys by their objecthashes

@pedrocr
Copy link
Author

pedrocr commented Jul 5, 2017

And how about float, any plan for that?

@tarcieri
Copy link
Member

tarcieri commented Jul 5, 2017

@pedrocr personally I hate floating points, and the nascent objecthash-inspired scheme I have been working on does not support them at all.

That said, NaN (along with Infinity) is not part of the JSON data model, and IMO is best avoided. Attempting to serialize those (or hash them) should be an error.

@pedrocr
Copy link
Author

pedrocr commented Jul 5, 2017

Ok, between this and the performance penalty of all those text conversions I'll have to roll my own then. Supporting float is essential and NaN's are just part of the data. serde->bincode->Sha256 seems to be working well but I may take a stab at just implementing some form of #[derive(CryptoHash)].

@tarcieri
Copy link
Member

tarcieri commented Jul 5, 2017

Note the scheme I'm working on avoids the text conversion

@pedrocr
Copy link
Author

pedrocr commented Jul 5, 2017

If it doesn't do floats it's a no-go for me. Good image processing is all about f32. I'd have to do a bunch of manual conversions to/from fixed-point or something of the sort and that just makes for extremely ugly code.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants