Faster Layout and MDC Store Creation #448

coreyostrove · 2024-06-05T05:37:24Z

Building on the changes in PR #445, and as part of an effort to reduce the overhead associated with running GST analysis, this PR further speeds up the construction of COPA layouts and MDC stores/objective functions. The speedups apply in general, but are most pronounced in the context of iterative GST estimation.

Note: This branch is forked off of 'feature-faster-circuit-primitives', and so some of these commits are from that branch and will get cleaned up whenever #445 gets merged into develop (in case you notice duplicates).

Here is a summary of the changes:

Caching for layout creation: One of the main bottlenecks in layout creation is performing a variety of (relatively expensive) preprocessing steps on circuits. In the context of iterative GST we almost always use a nested circuit structure wherein for each new iteration we keep all of the previous circuits and add a few new ones. New layouts are constructed at each iteration, and previously we started from scratch each time. Now, prior to starting the iterative estimation, we precompute and construct a cache for the more expensive circuit structures used in layout generation, significantly reducing the amount of repeated calculation.
Caching for MDC store construction: Similar story to layout creation. Previously one of the bottlenecks in MDC store construction was computing the number of possible outcomes for a circuit (including POVMs and potentially instruments). This resulted in repeated calls to a method called expand_instruments_and_separate_povm, which is quite expensive. As with layouts, we construct a fresh MDC store at each iteration, so by caching this information we significantly reduce the amount of repeated calculation.
The expand_instruments_and_separate_povm method has been refactored, moving it from being a method of Circuit which takes a model as input to being a method of OpModel which takes a circuit as input. This method was a strange fit for the Circuit class, as it relies pretty heavily on knowledge of the internal details of a model, and from an abstraction hierarchy standpoint this was also an out-of-character thing for a Circuit to be asked to do. It makes much more sense for an OpModel (the implementation effectively assumed you'd only be calling it on an OpModel given the attributes and methods it assumed were implemented, so now that is enforced), and OpModel is also where the conceptually similar complete_circuit(s) and split_circuit(s) methods live. In addition to the refactor, I have also added a new bulk_expand_instruments_and_separate_povm which more efficiently performs this operation on a list of circuits.
Miscellaneous performance updates in service of speeding up layout and MDC store construction.
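
The caching idea for nested circuit lists can be illustrated with a minimal sketch. It does not use pyGSTi's actual data structures: circuits are plain tuples of gate names, and `expensive_preprocess`, `build_cache`, and `build_layout` are hypothetical stand-ins for the real preprocessing and layout code.

```python
from typing import Dict, Iterable, List, Optional, Tuple

Circuit = Tuple[str, ...]  # stand-in for a circuit; not pyGSTi's Circuit class
CALLS = {"count": 0}       # tracks how often the costly step actually runs


def expensive_preprocess(circuit: Circuit) -> Tuple[str, ...]:
    """Hypothetical stand-in for a costly per-circuit preprocessing step."""
    CALLS["count"] += 1
    return tuple(sorted(set(circuit)))


def build_cache(circuit_lists: Iterable[List[Circuit]]) -> Dict[Circuit, Tuple[str, ...]]:
    """Preprocess every distinct circuit once, before the iterations start."""
    unique_circuits = {ckt for circuit_list in circuit_lists for ckt in circuit_list}
    return {ckt: expensive_preprocess(ckt) for ckt in unique_circuits}


def build_layout(circuits: List[Circuit],
                 cache: Optional[Dict[Circuit, Tuple[str, ...]]] = None) -> List[Tuple[str, ...]]:
    """Toy 'layout': the preprocessed form of each circuit, reusing the cache if given."""
    if cache is None:
        return [expensive_preprocess(ckt) for ckt in circuits]
    return [cache[ckt] if ckt in cache else expensive_preprocess(ckt) for ckt in circuits]


# Nested circuit lists: each iteration keeps the previous circuits and adds one.
nested_lists = [
    [("Gx",), ("Gy",)],
    [("Gx",), ("Gy",), ("Gx", "Gx")],
    [("Gx",), ("Gy",), ("Gx", "Gx"), ("Gx", "Gy", "Gx")],
]

CALLS["count"] = 0
for circuit_list in nested_lists:
    build_layout(circuit_list)           # starts from scratch every iteration
calls_without_cache = CALLS["count"]

CALLS["count"] = 0
shared_cache = build_cache(nested_lists)  # one pass over the distinct circuits
for circuit_list in nested_lists:
    build_layout(circuit_list, shared_cache)
calls_with_cache = CALLS["count"]
```

In this toy setup the uncached path preprocesses a circuit once per iteration in which it appears, while the cached path preprocesses each distinct circuit only once. The real savings depend on how much of layout construction the cached structures cover.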

I look forward to any and all feedback.

P.S. I didn't mention how these changes affect the bottom line on runtime yet. For single-qubit GST with the XYI gate set, the full TP parameterization, and a maximum depth of 128 (as before, I haven't yet tested this on the 2Q case, or for other experiment designs, so ymmv) overall we have:

MatrixCOPALayout creation is ~4X faster
MapCOPALayout creation is ~2X faster
ModelDataSetCircuitStore/TimeIndependentMDCObjectiveFunction creation is ~5X faster.

Those changes are measured relative to the runtimes following the changes made on #445. Combined with #445 this gives an overall reduction in end-to-end runtime (for the 1Q test case described above) of ~50% when using the MatrixForwardSimulator, and ~15% when using the MapForwardSimulator (proportionally less of the overall runtime is spent performing these subroutines with map, so this difference makes sense).

The creation of COPA layouts relies on a number of specialized circuit structures which require non-trivial time to construct. In the context of iterative GST estimation with nested circuit lists (i.e. the default) this results in unnecessarily repeated construction of these objects. This is an initial implementation of a caching scheme allowing for more efficient re-use of these circuit structures across iterations.

…layout-creation

Cache the expanded SPAM-free circuits to reduce recomputing things unnecessarily.

This updates the implementation of the SeparatePOVMCircuit container class. The most important change is adding an attribute for the full_effect_labels that avoids unneeded reconstruction. To protect against this getting out of sync with everything else, the povm_label and effect_labels attributes (which feed into full_effect_labels) have been promoted to properties with setters that ensure the full_effect_labels are kept synced.
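
The property-with-setter pattern described in this commit message might look roughly like the sketch below. It is not the actual SeparatePOVMCircuit implementation: the class name `SeparatePOVMCircuitSketch`, the constructor arguments, and the `povm + "_" + effect` naming scheme are assumptions made for illustration.

```python
class SeparatePOVMCircuitSketch:
    """Toy container that caches a derived attribute and keeps it in sync."""

    def __init__(self, circuit_without_povm, povm_label, effect_labels):
        self.circuit_without_povm = circuit_without_povm
        self._povm_label = povm_label
        self._effect_labels = list(effect_labels)
        self._full_effect_labels = self._build_full_effect_labels()

    def _build_full_effect_labels(self):
        # Assumed naming scheme, for illustration only.
        return [f"{self._povm_label}_{el}" for el in self._effect_labels]

    @property
    def full_effect_labels(self):
        # Returned from the cached attribute; nothing is rebuilt on access.
        return self._full_effect_labels

    @property
    def povm_label(self):
        return self._povm_label

    @povm_label.setter
    def povm_label(self, value):
        self._povm_label = value
        self._full_effect_labels = self._build_full_effect_labels()

    @property
    def effect_labels(self):
        return self._effect_labels

    @effect_labels.setter
    def effect_labels(self, value):
        self._effect_labels = list(value)
        self._full_effect_labels = self._build_full_effect_labels()


sep = SeparatePOVMCircuitSketch(("Gx", "Gy"), "Mdefault", ["0", "1"])
labels_before = list(sep.full_effect_labels)
sep.povm_label = "Mz"          # the setter refreshes the cached labels
labels_after_povm = list(sep.full_effect_labels)
sep.effect_labels = ["0"]      # so does the other setter
labels_after_effects = list(sep.full_effect_labels)
```

One limitation of this sketch: mutating the list returned by `effect_labels` in place bypasses the setter, so the cached labels would go stale. Whether the real class guards against that is not stated in the commit message.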

Adds a new method to OpModel that allows for doing instrument expansion and POVM expansion in bulk, speeding things up by avoiding recomputation of shared quantities. Also adds a pipeline for re-using completed or split circuits (as produced by the related OpModel methods) so that work already done is not repeated.

Some minor performance oriented tweaks to the init for COPA layouts.

Refactor some of the ordered dictionaries in matrix layout creation into regular ones.

…layout-creation

Start adding infrastructure for caching things used in MDC store creation and for plumbing in stuff from layout creation.

Performance optimization for the method for adding omitted frequencies to incorporate caching of the number of outcomes per circuit (which is somewhat expensive since it goes through the instrument/povm expansion code). Additionally refactor some other parts of this code for improved efficiency. Also makes a few minor tweaks to the method for adding counts to speed that up as well. Can probably make this a bit faster still by merging the two calls to reduce redundancy, but that is a future us problem. Additionally make a few microoptimizations to the dataset code for grabbing counts, and to slicetools adding a function for directly giving a numpy array for a slice (instead of needing to cast from a list). Miscellaneous cleanup of old commented out code that doesn't appear needed any longer.
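
As a rough illustration of caching the number of outcomes per circuit when accounting for omitted (zero-count) outcomes, consider the sketch below. The functions `count_outcomes` and `omitted_outcome_counts`, the tuple-based circuits, and the `povm_sizes` table are all hypothetical; in particular, the real outcome count goes through the instrument/POVM expansion code rather than a simple lookup.

```python
from typing import Dict, List, Optional, Tuple

Circuit = Tuple[str, ...]  # stand-in for a circuit; the last label names its POVM
EXPANSIONS = {"count": 0}  # tracks how often the costly computation runs


def count_outcomes(circuit: Circuit, povm_sizes: Dict[str, int]) -> int:
    """Hypothetical stand-in for the costly expansion-based outcome count."""
    EXPANSIONS["count"] += 1
    return povm_sizes[circuit[-1]]


def omitted_outcome_counts(circuits: List[Circuit],
                           observed: Dict[Circuit, Dict[str, int]],
                           povm_sizes: Dict[str, int],
                           cache: Optional[Dict[Circuit, int]] = None) -> List[int]:
    """Number of possible outcomes that never appear in the data, per circuit."""
    if cache is None:
        cache = {}
    omitted = []
    for ckt in circuits:
        if ckt not in cache:
            cache[ckt] = count_outcomes(ckt, povm_sizes)
        omitted.append(cache[ckt] - len(observed[ckt]))
    return omitted


povm_sizes = {"Mdefault": 2}
observed = {
    ("Gx", "Mdefault"): {"0": 40, "1": 60},
    ("Gx", "Gx", "Mdefault"): {"1": 100},  # outcome "0" was never seen
}
outcome_cache: Dict[Circuit, int] = {}

first_iteration = omitted_outcome_counts([("Gx", "Mdefault")], observed, povm_sizes, outcome_cache)
second_iteration = omitted_outcome_counts(list(observed), observed, povm_sizes, outcome_cache)
expansions_run = EXPANSIONS["count"]
```

Because the same `outcome_cache` is passed to both calls, the circuit shared between iterations is only counted once.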

Fix a bug I introduced in dataset indexing into something that could be None.

Another minor bug caught by testing.

Not sure why this didn't get caught on the circuit update branch, but oh well...

…layout-creation

Fixes minor error in split_circuits.

Improve the performance of __getitem__ when indexing into static circuits by making use of the _copy_init code path.

Implement caching of circuit structures tailored to the map forward simulator's requirements.

This finishes the process of refactoring expand_instruments_and_separate_povm from a circuit method to a method of OpModel.

Refactor expand_instruments_and_separate_povm to use the multi-circuit version under the hood to reduce code duplication.

rileyjmurray

It looks like this branch came off of the feature-faster-circuit-primitives branch. That's making Git think this PR contains all the material from PR #445. I'll hold off on a full review of this PR until (1) #445 is merged and (2) this PR is updated, as needed, in order for Git to see the minimal set of changes.

That said, I do have three actionable comments. Two comments are given in-line. Here's a broader comment.

Right now the changes introduce free functions like create_matrix_copa_layout_circuit_cache and create_map_copa_layout_circuit_cache. I'd prefer that these were static class functions, and that the class signifier in the name (_matrix_, _map_, etc.) be dropped from the function names. From there, you could invoke the function just with mdl.sim.create_copa_layout_circuit_cache(...).

If you'd like you could make create_copa_layout_circuit_cache a method in the ForwardSimulator base class. If that method were to raise a NotImplementedError by default then you could replace some branching if-statements with

try:
    precomp_layout_circuit_cache = mdl.sim.create_copa_layout_circuit_cache(...)
except NotImplementedError:
    precomp_layout_circuit_cache = None
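
A self-contained sketch of this suggestion is below. The class names and the `(circuits, model, dataset)` signature are assumptions for illustration, not the actual pyGSTi forward simulator API; only the method name `create_copa_layout_circuit_cache` comes from the comment above.

```python
class ForwardSimulatorSketch:
    """Stand-in base class: caching is unsupported unless a subclass opts in."""

    @staticmethod
    def create_copa_layout_circuit_cache(circuits, model, dataset=None):
        raise NotImplementedError("This simulator does not provide a layout cache.")


class MatrixSimulatorSketch(ForwardSimulatorSketch):
    @staticmethod
    def create_copa_layout_circuit_cache(circuits, model, dataset=None):
        # Toy cache contents; the real cache holds preprocessed circuit structures.
        return {"circuits": list(circuits)}


class OtherSimulatorSketch(ForwardSimulatorSketch):
    pass  # inherits the base behavior and therefore raises NotImplementedError


def precompute_cache(sim, circuits, model=None, dataset=None):
    """Return a layout cache if the simulator supports one, else None."""
    try:
        return sim.create_copa_layout_circuit_cache(circuits, model, dataset)
    except NotImplementedError:
        return None


cache_for_matrix = precompute_cache(MatrixSimulatorSketch(), [("Gx",), ("Gy",)])
cache_for_other = precompute_cache(OtherSimulatorSketch(), [("Gx",), ("Gy",)])
```

With this shape the calling code needs no knowledge of which simulator classes implement caching. The cost is that a reader has to open each simulator class to find out where caching actually happens.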

pygsti/algorithms/core.py

…layout-creation

Refactor cache creation functions into static methods of the corresponding forward simulator class. Also add an empty base version of this method, and clean up a few miscellaneous things caught by review.

coreyostrove · 2024-06-05T23:07:27Z

Thanks for the feedback and suggestions, @rileyjmurray. I have incorporated your feedback and refactored the cache creation methods into static methods of the corresponding forward simulator classes. I also addressed the two inline comments, so I have marked those as resolved.

coreyostrove · 2024-06-05T23:24:45Z

pygsti/layouts/matrixlayout.py

@@ -78,24 +82,29 @@ def add_expanded_circuits(indices, add_to_this_dict):
 for i in indices:
 nospam_c = unique_nospam_circuits[i]
 for unique_i in circuits_by_unique_nospam_circuits[nospam_c]: # "unique" circuits: add SPAM to nospam_c
- observed_outcomes = None if (dataset is None) else dataset[ds_circuits[unique_i]].unique_outcomes


Question for folks, but particularly aimed at @enielse. A comparable line to this one for computing the quantity named observed_outcomes appears here, but in the map layout code this queries the outcomes property of DataSetRow instead of the unique_outcomes property as is done here. I think this should be unique_outcomes in both cases (I don't see a reason why it wouldn't also be for map), but I wanted to run this by you before I made that change.

I don't see any reason why map cannot also use unique_outcomes instead of outcomes, but am basing that just on looking through this code while reviewing. Expert @enielse opinion still welcome.

rileyjmurray

This looks good overall, but the current changes have broken unit tests for some forward simulators. I don't know why all of the failures are happening. One of the failures is because TorchForwardSimulator uses Circuit.expand_instruments_and_separate_povm.

coreyostrove · 2024-07-30T17:40:57Z

This looks good overall, but the current changes have broken unit tests for some forward simulators. I don't know why all of the failures are happening. One of the failures is because TorchForwardSimulator uses Circuit.expand_instruments_and_separate_povm.

TIL that you can copy a link to a specific line of the github actions log, neat.

Yup, this makes sense. One of the changes made on this branch was moving that method from the Circuit class to the OpModel class. This branch was forked off of a version of develop before the torch forward simulator code was merged in, so I'll need to make some minor updates.

Fix a few minor issues related to refactored code and updates made in this branch.

sserita

Excited to get this in and get those speedups in layout creation! First, I have a few comments: one which I think will error in certain edge cases, and others that are optional but may cut down on branching to increase maintainability.

sserita · 2024-07-30T20:47:31Z

pygsti/algorithms/core.py

+
+ #pre-compute a dictionary caching completed circuits for layout construction performance.
+ unique_circuits = list({ckt for circuit_list in circuit_lists for ckt in circuit_list})
+ if isinstance(mdl.sim, (_fwdsims.MatrixForwardSimulator, _fwdsims.MapForwardSimulator)):


Are we checking this because only matrix and map forward sims have this defined? If so, thoughts on a try/catch on NotImplementedError instead so that if create_[...]_cache is defined on another forward sim, we don't have to remember and come back to change this check?

This staticmethod is implemented in the ForwardSimulator base class, but there it raises a NotImplementedError. That might eventually change (it would be possible in principle to implement caching for the base COPA layout class's cache creation, but I haven't done so yet).

I've been spending too much time the past few months on SWE youtube, and one of the SWE design philosophies du jour that I've been enamored with as of late is 'Locality of Behavior' (LoB, as described here). While I'm not a purist by any means, in this instance I think it is easier (for someone reading this for the first time, or the first time in a while) to understand this portion of the code by explicitly signaling for which forward simulators the caching will happen, rather than the try/except paradigm which would require looking into the source code for the various forward simulator classes to identify when it won't happen.

That said, you're right there is a trade-off in doing so, which is needing to remember to come back to this and explicitly enable caching in protocols which support it. (Realistically it'd most likely be one of either me, you or Riley messing around with pyGSTi internals at this level, so hopefully we'd remember).

I hadn't heard of LoB before, but I like it.

sserita · 2024-07-30T20:54:52Z

pygsti/forwardsims/matrixforwardsim.py

+ cache['split_circuits'] = {ckt: split_ckt for ckt, split_ckt in zip(circuits, split_circuits)}
+
+ #There is some potential aliasing that happens in the init that I am not
+ #doing here, but I think 90+% of the time this ought to be fine.


What does this mean? In the <10% of cases that you expect to run into, will the cache/resulting layout still be created correctly?

Good question, it took me a minute to remember what I was thinking about here. (It wasn't loaded into cache, ba dum dum tsss...)

What happens in the init, but isn't happening here, is the following set of calls:

unique_circuits, to_unique = self._compute_unique_circuits(circuits)
aliases = circuits.op_label_aliases if isinstance(circuits, _CircuitList) else None
ds_circuits = _lt.apply_aliases_to_circuits(unique_circuits, aliases)

This allows for the possibility of aliasing in the keys of the DataSet. This aliasing is the part that I didn't implement in the code shown below. The thing that would happen in the <10% of cases I was alluding to, i.e. the cases where someone is using aliasing (as far as I am aware this is not a widely used feature) is that ds_row would be None some amount (maybe all) of the time, and we'd pass a list with None entries into bulk_expand_instruments_and_separate_povm. This is allowed (and is in fact the default), but it would result in expanded circuits being created for every POVM and instrument effect, instead of just those for which outcomes were observed. The resulting layout wouldn't be wrong, but would be less memory efficient than it could otherwise be, as it would have space allocated for probabilities we won't be calculating. It would also make the bulk_expand_instruments_and_separate_povm less efficient than it otherwise could be.
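
The effect described above can be reproduced in a small, self-contained sketch. Nothing here is pyGSTi code: the dataset is a plain dict keyed by tuples of labels, and `apply_aliases`, `lookup_row`, and `effects_to_expand` are hypothetical helpers that mimic the behavior in question.

```python
from typing import Dict, List, Optional, Tuple

Circuit = Tuple[str, ...]


def apply_aliases(circuit: Circuit, aliases: Optional[Dict[str, Circuit]]) -> Circuit:
    """Replace each aliased label with the labels it stands for."""
    if not aliases:
        return circuit
    expanded: List[str] = []
    for label in circuit:
        expanded.extend(aliases.get(label, (label,)))
    return tuple(expanded)


def lookup_row(dataset: Dict[Circuit, Dict[str, int]], circuit: Circuit,
               aliases: Optional[Dict[str, Circuit]] = None) -> Optional[Dict[str, int]]:
    """Return the data row for a circuit, or None if the key is absent."""
    return dataset.get(apply_aliases(circuit, aliases))


def effects_to_expand(povm_effects: List[str],
                      ds_row: Optional[Dict[str, int]]) -> List[str]:
    """With no data row, expand every effect; otherwise only the observed ones."""
    if ds_row is None:
        return list(povm_effects)
    return [effect for effect in povm_effects if effect in ds_row]


dataset = {("Gx", "Gx"): {"0": 100}}   # keyed by the aliased (expanded) circuit
aliases = {"Gxx": ("Gx", "Gx")}        # "Gxx" is shorthand for two Gx gates
circuit = ("Gxx",)
povm_effects = ["0", "1"]

without_aliasing = effects_to_expand(povm_effects, lookup_row(dataset, circuit))
with_aliasing = effects_to_expand(povm_effects, lookup_row(dataset, circuit, aliases))
```

Without alias handling the lookup misses, so every effect gets expanded. The result is still valid, but it is larger than necessary, which matches the memory-efficiency concern rather than a correctness bug.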

All of that said, I am not entirely sure why I didn't implement the aliasing for the DataSet keys at the time. The answer is probably something along the lines of: "I wasn't sure how the aliasing code worked and couldn't be arsed to figure it out at the time." Anyhow, I see no reason not to add this back in (I'm pretty sure I can just essentially copy and paste the lines above into the cache creation routine), so I'll give it a crack.

sserita · 2024-07-30T20:59:40Z

pygsti/layouts/matrixlayout.py

@@ -78,24 +82,29 @@ def add_expanded_circuits(indices, add_to_this_dict):
 for i in indices:
 nospam_c = unique_nospam_circuits[i]
 for unique_i in circuits_by_unique_nospam_circuits[nospam_c]: # "unique" circuits: add SPAM to nospam_c
- observed_outcomes = None if (dataset is None) else dataset[ds_circuits[unique_i]].unique_outcomes


I don't see any reason why map cannot also use unique_outcomes instead of outcomes, but am basing that just on looking through this code while reviewing. Expert @enielse opinion still welcome.

pygsti/layouts/matrixlayout.py

pygsti/objectivefns/objectivefns.py

Add in support for data set key aliasing in COPA layout cache creation.

Rework some of the if statement branching in the layout creation to instead use fallback behavior of get more.

I accidentally put down the wrong directory for temp testing files in the RB testing code.

coreyostrove · 2024-07-31T03:57:51Z

Thanks for the careful feedback, @sserita!

I have just pushed some changes addressing this feedback. To summarize the main changes:

I've added back in alias support during cache creation to better match the original implementations in the various COPA layout inits. (Inspired by one of your questions about a comment I had left).
I have replaced some of the branching statements for cache existence with an empty dictionary + get based version where appropriate. There are a few places where the behavior when the cache isn't present is meaningfully different from the fallback behavior for missing values (because they use bulk methods), and in those instances I haven't made this change.
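
The distinction between the two cases might be sketched as follows. All names here (`complete_one`, `complete_bulk`, the `"completed_circuits"` cache key) are hypothetical, and the second function assumes that a cache, when present, covers every requested circuit.

```python
from typing import Dict, List, Optional, Tuple

Circuit = Tuple[str, ...]
BULK_CALL_SIZES: List[int] = []  # records the size of each bulk call


def complete_one(circuit: Circuit) -> Circuit:
    """Toy per-circuit operation: add default prep and measurement labels."""
    return ("rho0",) + circuit + ("Mdefault",)


def complete_bulk(circuits: List[Circuit]) -> List[Circuit]:
    """Toy bulk operation; imagine it shares setup work across circuits."""
    BULK_CALL_SIZES.append(len(circuits))
    return [complete_one(ckt) for ckt in circuits]


def completed_per_item(circuits: List[Circuit],
                       cache: Optional[Dict[str, Dict[Circuit, Circuit]]] = None) -> List[Circuit]:
    """Empty dict + get: a missing cache and a missing entry take the same path."""
    table = (cache or {}).get("completed_circuits", {})
    results = []
    for ckt in circuits:
        hit = table.get(ckt)
        results.append(hit if hit is not None else complete_one(ckt))
    return results


def completed_in_bulk(circuits: List[Circuit],
                      cache: Optional[Dict[str, Dict[Circuit, Circuit]]] = None) -> List[Circuit]:
    """Branching kept: with no cache, one bulk call beats per-item fallback."""
    if cache is not None and "completed_circuits" in cache:
        table = cache["completed_circuits"]
        return [table[ckt] for ckt in circuits]
    return complete_bulk(circuits)


circuits = [("Gx",), ("Gy",)]
cache = {"completed_circuits": {("Gx",): complete_one(("Gx",))}}

per_item_partial_cache = completed_per_item(circuits, cache)
per_item_no_cache = completed_per_item(circuits)
bulk_no_cache = completed_in_bulk(circuits)
```

In the per-item version the fallback for a missing entry is the same work that would be done with no cache at all, so the branch can go. In the bulk version the no-cache path is a single call over all circuits, which a per-entry fallback would not reproduce.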

There is also a miscellaneous typo fix in the RB unit tests that I accidentally introduced with the IRB branch that I'm throwing in for good measure, in case you notice the bizarre inclusion.

sserita

The aliasing looks good. I just have one question left on the RB test change.

sserita · 2024-07-31T16:11:23Z

test/unit/protocols/test_rb.py

@@ -163,9 +163,9 @@ def test_combined_design_access(self):

 def test_serialization(self):

- self.irb_design.write('../../test_packages/temp_test_files/test_InterleavedRBDesign_serialization')
+ self.irb_design.write('../../test/test_packages/temp_test_files/test_InterleavedRBDesign_serialization')


I'm confused about why this change is needed. Aren't we in test/unit/protocols, so ../../test_packages is the right way to get to those files?

I thought so originally myself, but I noticed when running unit tests locally last night that a new test_packages directory was being created at the base level (i.e. at the same level as test and pygsti) and populated with these files. In retrospect I suppose I could have equivalently removed one of the sets of '..'. Not sure why this is the case though, tbh, maybe something to do with how pytest sets the cwd?

It is also possible this was related to which directory I was launching pytest from, so if you have a venv handy to run these tests on and have a different experience that'd be good to know.
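
One explanation consistent with these observations is that relative paths resolve against the process's working directory, not the test file's directory. The sketch below works through that arithmetic with pure path manipulation. The `/repo` root and the assumption that pytest ran with `test/unit` as its working directory are hypothetical; the thread does not confirm the actual working directory.

```python
import posixpath


def resolve_from(cwd: str, relative: str) -> str:
    """Resolve `relative` against `cwd`, as happens for a relative write path."""
    return posixpath.normpath(posixpath.join(cwd, relative))


def resolve_from_test_file(test_file: str, relative: str) -> str:
    """Resolve `relative` against the directory that contains `test_file`."""
    return resolve_from(posixpath.dirname(test_file), relative)


old_path = "../../test_packages/temp_test_files"
new_path = "../../test/test_packages/temp_test_files"

# What the original path assumed: the working directory is test/unit/protocols.
from_protocols_dir = resolve_from("/repo/test/unit/protocols", old_path)

# What would happen if the working directory were test/unit instead.
from_unit_dir_old = resolve_from("/repo/test/unit", old_path)
from_unit_dir_new = resolve_from("/repo/test/unit", new_path)

# Anchoring to the test file's location gives the same answer for any cwd.
anchored = resolve_from_test_file("/repo/test/unit/protocols/test_rb.py", old_path)
```

Under that assumption the old path lands in a `test_packages` directory next to `test` and `pygsti`, matching what was seen locally, and the new path lands in `test/test_packages`. In a real test, building the path from `pathlib.Path(__file__).resolve().parent` would avoid depending on where pytest is launched.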

Corey Ostrove added 19 commits May 22, 2024 16:50

Merge branch 'feature-faster-circuit-primitives' into feature-faster-…

66e7f78

…layout-creation

Add caching for spam-free circuit expansion

9bc47bc

Cache the expanded SPAM-free circuits to reduce recomputing things unnecessarily.

Minor COPA Layout __init__ tweaks

d97f786

Some minor performance oriented tweaks to the init for COPA layouts.

Refactor some OrderedDicts into regular ones

544fb55

Refactor some of the ordered dictionaries in matrix layout creation into regular ones.

Merge branch 'feature-faster-circuit-primitives' into feature-faster-…

1d4e5a0

…layout-creation

Start the process of adding caching to MDC store creation

91d5ebb

Start adding infrastructure for caching things used in MDC store creation and for plumbing in stuff from layout creation.

Fix dataset bug

e8e7004

Fix a bug I introduced in dataset indexing into something that could be None.

Another minor bugfix caught by testing

aa22c3c

Another minor bug caught by testing.

Another minor bugfix caught by testing

be80255

Update test_stdinputparser.py

ff13da6

Not sure why this didn't get caught on the circuit update branch, but oh well...

Merge branch 'feature-faster-circuit-primitives' into feature-faster-…

81bdacb

…layout-creation

Fix indentation error

f8c5840

Fixes minor error in split_circuits.

Faster implementation of __getitem__

0417c20

Improve the performance of __getitem__ when indexing into static circuits by making use of the _copy_init code path.

Implement caching for map layout creation

c39101d

Implement caching of circuit structures tailored to the map forward simulator's requirements.

Fix bugs in new extract_labels implementation

6cc69bc

coreyostrove changed the base branch from master to develop June 5, 2024 05:37

Corey Ostrove added 2 commits June 4, 2024 23:54

Finish refactoring expand_instruments_and_separate_povm

1ff8aeb

This finishes the process of refactoring expand_instruments_and_separate_povm from a circuit method to a method of OpModel.

Refactor expand_instruments_and_separate_povm

5db3e59

Refactor expand_instruments_and_separate_povm to use the multi-circuit version under the hood to reduce code duplication.

coreyostrove self-assigned this Jun 5, 2024

coreyostrove added this to the 0.9.13 milestone Jun 5, 2024

coreyostrove marked this pull request as ready for review June 5, 2024 06:48

coreyostrove requested review from rileyjmurray and a team as code owners June 5, 2024 06:48

coreyostrove requested a review from sserita June 5, 2024 06:48

rileyjmurray requested changes Jun 5, 2024

Corey Ostrove added 2 commits June 5, 2024 16:25

Merge branch 'feature-faster-circuit-primitives' into feature-faster-…

7f7a08d

…layout-creation

Refactor cache creation functions

53e2da6

Refactor cache creation functions into static methods of the corresponding forward simulator class. Also add an empty base version of this method, and clean up a few miscellaneous things caught by review.

coreyostrove commented Jun 5, 2024

Merge branch 'develop' into feature-faster-layout-creation

b735201

rileyjmurray requested changes Jul 30, 2024

Minor updates and unit test fixes

74dc215

Fix a few minor issues related to refactored code and updates made in this branch.

sserita requested changes Jul 30, 2024

Corey Ostrove added 3 commits July 30, 2024 20:29

Add in DataSet key aliasing

6f4af73

Add in support for data set key aliasing in COPA layout cache creation.

Minor refactors and updates

e7bad83

Rework some of the if statement branching in the layout creation to instead use fallback behavior of get more.

Unrelated RB testing fix

e0d3c47

I accidentally put down the wrong directory for temp testing files in the RB testing code.

coreyostrove requested a review from a team as a code owner July 31, 2024 03:43

coreyostrove requested a review from tjproct July 31, 2024 03:43

sserita reviewed Jul 31, 2024
