Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Hashing the expression that produces the value #13

Closed
CMCDragonkai opened this issue Mar 12, 2019 · 6 comments
Closed

Hashing the expression that produces the value #13

CMCDragonkai opened this issue Mar 12, 2019 · 6 comments

Comments

@CMCDragonkai
Copy link

CMCDragonkai commented Mar 12, 2019

I've been looking for a simple content-store written in Python that does similar things to Nix and Funflow.

Nix store is a content store, but instead of storing hashes of the content, they store hashes of a canonical and reproducible expression that produces that content. This is useful in package build graphs where you can avoid rebuilding a package (i.e. running a package build expression) if you already know that the package is already built. This is possible because it is the hash of the package build expression that was hashed, not the package itself.

In translating that to Python. Suppose I had a lambda function that was pure with no free variables (as in no usage of closures). Then as long I can hash the lambda function and the input parameters together, I can store the resulting value in a dictionary, and not have to re-run that function with the same input parameters if I check if the hash exists in the dictionary already.

This generalises to lambdas with closures, and what Nix does is that it traverses the closure and adds each pointer to outside the function into the content store as well.

What's your opinion of this for hashfs?

@dgilland
Copy link
Owner

Are you suggesting memoization around a function that results in the return from the function being stored by it's content hash and storing the input parameters to the function as a key that points to the content hash so that if the function is called again and the input parameters key exists, then the content would be returned without having to run the function again?

@CMCDragonkai
Copy link
Author

Yes that is exactly it!

@CMCDragonkai
Copy link
Author

CMCDragonkai commented Mar 14, 2019

There's only the pesky issue of closures. They would have to be lexically closed.

However up to programmer discipline to use default values to ensure lexically captured parameters. But only for things that aren't objects.

@dgilland
Copy link
Owner

Do you have an example of the closure issue you're talking about? I want to make sure I understand what the issue is.

@CMCDragonkai
Copy link
Author

In this you can see that the closure l is not lexically closed.

x = 1
l = lambda: x + 1
print(l())
x = 2
print(l())

@dgilland
Copy link
Owner

I'm not opposed to having memoization support added to hashfs. I wouldn't likely have any time myself to work on it, though.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants