MCM - tree-traversal implementation of native MCM execution (#5180)
### Before submitting

Please complete the following checklist when submitting a PR:

- [x] All new features must include a unit test. If you've fixed a bug or added code that should be tested, add a test to the test directory!
- [x] All new functions and code must be clearly commented and documented. If you do make documentation changes, make sure that the docs build and render correctly by running `make docs`.
- [x] Ensure that the test suite passes, by running `make test`.
- [x] Add a new entry to the `doc/releases/changelog-dev.md` file, summarizing the change, and including a link back to the PR.
- [x] The PennyLane source code conforms to [PEP8 standards](https://www.python.org/dev/peps/pep-0008/). We check all of our code against [Pylint](https://www.pylint.org/). To lint modified files, simply `pip install pylint`, and then run `pylint pennylane/path/to/file.py`.

When all the above are checked, delete everything above the dashed line and fill in the pull request template.

------------------------------------------------------------------------------------------------------------

**Context:** Native mid-circuit measurement (MCM) execution is slow because executing `n_shots` separate tapes is largely redundant and carries significant overhead.

**Description of the Change:** Introduce `simulate_tree_mcm` and make it the default execution mode when using finite shots and MCMs. `dynamic_one_shot` can still be applied explicitly as a transform. `simulate_tree_mcm` implements a "high-memory" depth-first tree-traversal algorithm. It is deemed high-memory because a copy of the state vector is made at every node in the tree. Since this is a depth-first traversal, it incurs a memory cost proportional to `(n_mcm + 1) * 2 ** n_qubit` to store the state vectors at any moment.

**Benefits:** Much faster execution in almost every case. It also opens avenues for improvements and other features, for example low-memory depth-first tree traversal, high-probability-first traversal, and quantum noise simulations.
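To make the traversal idea concrete, here is a minimal NumPy sketch of a depth-first walk over MCM outcomes. It is not the `simulate_tree_mcm` code from this PR: the circuit model (a list of "apply one single-qubit gate, then measure that wire" segments) and every name in it (`traverse`, `project`, `apply_gate`, `zero_state`, `peak_amplitudes`) are hypothetical and chosen only for illustration. It assumes shots are split between the two branches of each measurement by binomial sampling, and it skips any branch that receives zero shots.

```python
import numpy as np

H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
X = np.array([[0, 1], [1, 0]])
I2 = np.eye(2)


def zero_state(n_qubits):
    """Return |0...0> as a tensor of shape (2,) * n_qubits."""
    state = np.zeros((2,) * n_qubits, dtype=complex)
    state[(0,) * n_qubits] = 1.0
    return state


def apply_gate(state, gate, wire):
    """Apply a single-qubit gate to `wire` of the state tensor."""
    moved = np.tensordot(gate, state, axes=([1], [wire]))
    return np.moveaxis(moved, 0, wire)


def project(state, wire, outcome):
    """Return a normalized copy of `state` collapsed onto `outcome` on `wire`."""
    index = [slice(None)] * state.ndim
    index[wire] = 1 - outcome
    collapsed = state.copy()  # one state copy per visited node
    collapsed[tuple(index)] = 0.0
    return collapsed / np.linalg.norm(collapsed)


def traverse(state, segments, shots, rng, outcomes=(), counts=None):
    """Depth-first traversal of the MCM outcome tree.

    `segments` is a list of (gate, wire) pairs: apply `gate` to `wire`,
    then measure `wire`. Returns a dict mapping outcome tuples to counts.
    """
    if counts is None:
        counts = {}
    if not segments:
        # Leaf: every shot routed here shares the same MCM record.
        counts[outcomes] = counts.get(outcomes, 0) + shots
        return counts
    (gate, wire), rest = segments[0], segments[1:]
    state = apply_gate(state, gate, wire)
    # Probability of measuring 0 on `wire`, clipped against rounding error.
    p0 = float(np.sum(np.abs(np.take(state, 0, axis=wire)) ** 2))
    p0 = min(max(p0, 0.0), 1.0)
    shots0 = int(rng.binomial(shots, p0))
    for outcome, branch_shots in ((0, shots0), (1, shots - shots0)):
        if branch_shots == 0:
            continue  # prune branches that no shot reaches
        branch_state = project(state, wire, outcome)
        traverse(branch_state, rest, branch_shots, rng,
                 outcomes + (outcome,), counts)
    return counts


def peak_amplitudes(n_mcm, n_qubit):
    """Amplitudes held at once under the description's memory estimate."""
    return (n_mcm + 1) * 2 ** n_qubit


rng = np.random.default_rng(0)
print(traverse(zero_state(2), [(H, 0)] * 3, 1000, rng))
print(peak_amplitudes(10, 20))
```

Because each leaf is reached at most once and carries all the shots routed to it, the number of simulated branches is bounded by both the number of shots and `2 ** n_mcm`, rather than growing linearly with the number of shots as in a one-tape-per-shot scheme.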
Here are some benchmarks to illustrate the gains. The following synthetic workload shows that for small circuits with few MCMs, deferred measurements is the fastest approach. The tree-traversal approach is slower than deferred measurements, but much faster than the one-shot implementation.

![synthetic_time_vs_shots](https://github.com/PennyLaneAI/pennylane/assets/8711156/ad32a445-68b7-4823-b1ff-a14d87c020bf)

A more meaningful example is to run iterative QPE for 10 iterations with a varying number of shots. The one-shot implementation is again sluggish. The tree-traversal implementation does better, but appears to scale worse than deferred measurements, which is again the fastest.

![iterqpe_time_vs_shots](https://github.com/PennyLaneAI/pennylane/assets/8711156/8cf92fe5-84dd-43fe-9ac1-70c990080910)

The picture changes when running iterative QPE with 1e6 samples for a varying number of iterations. We do not perform one-shot benchmarks here because it is too slow. The tree-traversal implementation is indeed much slower in the 10-20 iteration range, but starts winning over deferred measurements beyond that. It thus appears to have a larger prefactor, which is eventually compensated by slightly better scaling.

![iterqpe_time_vs_iters](https://github.com/PennyLaneAI/pennylane/assets/8711156/836fb41f-492b-4c59-ba00-4d23a191e111)

Finally, we perform few-shot calculations to illustrate regimes where one-shot could be useful. There is indeed an observable cross-over between one-shot and deferred measurements. The tree-traversal implementation, however, is usually faster even with few shots, because it then has a limited number of branches to explore before running out of shots.
![iterqpe_time_vs_iters_all](https://github.com/PennyLaneAI/pennylane/assets/8711156/2bf338a7-e79c-46fc-929e-9e1f326e6915)

**Possible Drawbacks:** Some features are not tested yet:

- `jax.jit`
- Catalyst `qjit`

**Related GitHub Issues:** Mid-circuit measurements tree-traversal implementation [sc-56035] [sc-65242]

---------

Co-authored-by: Mudit Pandey <mudit.pandey@xanadu.ai>
Co-authored-by: Christina Lee <christina@xanadu.ai>
Co-authored-by: Matthew Silverman <matthews@xanadu.ai>
Co-authored-by: Thomas R. Bromley <49409390+trbromley@users.noreply.github.com>