Consider using array-api-compat to unify backends over time #2253

Open
1 task done
matthewfeickert opened this issue Jul 16, 2023 · 3 comments
Labels
feat/enhancement New feature or request needs-triage Needs a maintainer to categorize and assign

Comments

@matthewfeickert
Member

Summary

In a similar vein to Issue #2249, it should be possible over time to drastically reduce the code surface area that the pyhf backends need to provide themselves if array-api-compat is used. An advantage to array-api-compat compared to keras-core is that array-api-compat has no additional dependencies and extends the possible backends to anything that implements the Array API.

I'm not fully clear on the best way to implement this, but given the usage example from the README, I was thinking (this might be wrong, and @asmeurer might have more ideas) of creating an array backend, src/pyhf/tensor/array_backend.py (maybe this should be called array_compat instead?), that defines the typical tensor operations. The tensor backends that are based on Array API compatible libraries (at the moment NumPy and PyTorch) could then just be implementations of the array_backend class that extend or override the API as needed for things that aren't in the standard yet. This would allow for an easy transition from a full backend to the array backend (in the case of JAX, where support is on the way) and would still allow the use of custom backends, as in the case of TensorFlow, which has no official plans to switch at this point.
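
To make the idea concrete, here is a minimal sketch of what such a layering could look like. It is not pyhf's actual code: the class names `ArrayBackend` and `NumPyBackend` and the small set of methods are assumptions chosen for illustration, and the only array-api-compat call used is `array_namespace`. The `ImportError` fallback is a stand-in so the sketch still runs with only NumPy installed.

```python
import numpy as np

try:
    # Real dependency: resolves the Array API namespace for the given arrays.
    from array_api_compat import array_namespace
except ImportError:
    # Stand-in (assumption) so the sketch runs without the package;
    # it only knows about NumPy.
    def array_namespace(*arrays):
        return np


class ArrayBackend:
    """Hypothetical generic backend: each operation is written once
    against an Array API compatible namespace ``xp``."""

    name = "array"

    def __init__(self, xp):
        self.xp = xp

    def astensor(self, data):
        return self.xp.asarray(data, dtype=self.xp.float64)

    def sum(self, tensor, axis=None):
        return self.xp.sum(tensor, axis=axis)

    def log(self, tensor):
        return self.xp.log(tensor)

    def where(self, mask, tensor_a, tensor_b):
        return self.xp.where(mask, tensor_a, tensor_b)


class NumPyBackend(ArrayBackend):
    """Library-specific backend: inherits the standard operations and
    only adds what the standard does not cover."""

    name = "numpy"

    def __init__(self):
        super().__init__(array_namespace(np.empty(0)))

    def tolist(self, tensor):
        # Converting to nested Python lists is library specific,
        # so it lives in the subclass rather than in ArrayBackend.
        return tensor.tolist()


backend = NumPyBackend()
x = backend.astensor([1.0, 2.0, 3.0])
masked = backend.where(x > 1.5, x, backend.astensor([0.0, 0.0, 0.0]))
```

With this shape, a library that later gains Array API support would only need a thin subclass like `NumPyBackend`, while a library without support could keep a fully hand-written backend that exposes the same method names.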

For the way that we currently implement things like the PyTorch backend (https://github.com/scikit-hep/pyhf/blob/ff9cb94025e5485b23ea81a06ce8916055297c7f/src/pyhf/tensor/pytorch_backend.py) and set_backend and get_backend in the manager (https://github.com/scikit-hep/pyhf/blob/ff9cb94025e5485b23ea81a06ce8916055297c7f/src/pyhf/tensor/manager.py), I would probably need to think about this with @kratsg. I am hoping that this would not be too difficult to do.
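
As a rough illustration of how the manager side might look, the sketch below routes a backend name either to a generic Array API based backend or to a hand-written one. Everything here is hypothetical: the placeholder classes, the `_REGISTRY` mapping, and the simplified `set_backend` / `get_backend` signatures are assumptions for discussion and do not reproduce pyhf's manager (the real `get_backend` also returns an optimizer, which is ignored here).

```python
class GenericArrayBackend:
    """Placeholder for a backend built on an Array API namespace."""

    def __init__(self, name):
        self.name = name


class CustomBackend:
    """Placeholder for a fully hand-written backend."""

    def __init__(self, name):
        self.name = name


# Hypothetical registry: which kind of backend serves each name.
_REGISTRY = {
    "numpy": lambda: GenericArrayBackend("numpy"),
    "pytorch": lambda: GenericArrayBackend("pytorch"),
    "tensorflow": lambda: CustomBackend("tensorflow"),
}

# Module-level state holding the currently active backend.
_state = {"backend": None}


def set_backend(name):
    """Activate the backend registered under ``name``."""
    try:
        factory = _REGISTRY[name]
    except KeyError:
        raise ValueError(f"Unknown backend: {name!r}") from None
    _state["backend"] = factory()


def get_backend():
    """Return the active backend, defaulting to NumPy on first use."""
    if _state["backend"] is None:
        set_backend("numpy")
    return _state["backend"]
```

Under this arrangement, moving a library from a custom backend to the generic one would be a one-line change in the registry, and callers of `set_backend` and `get_backend` would not need to change.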

If this works, a decent test would be to also try to see how implementing a CuPy backend would work (though I don't really think we need to add it).

Additional Information

cf. @asmeurer's SciPy 2023 talk: Python Array API Standard: Toward Array Interoperability in the Scientific Python Ecosystem

Code of Conduct

  • I agree to follow the Code of Conduct
@matthewfeickert
Member Author

cf. scipy/scipy#18668 and scikit-learn/scikit-learn#25956 for how SciPy and scikit-learn added support for array-api-compat. 👍

@matthewfeickert
Member Author

And here's a Quansight Labs blog post(!) by @thomasjpfan on how the scikit-learn support was done: Array API Support in scikit-learn

@matthewfeickert
Member Author

And another Quansight Labs blog post by @lucascolley on how the SciPy support was done: The Array API Standard in SciPy 👍
