Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC648: Priority-Ordered Aspect Invocation #651

Open
wants to merge 11 commits into
base: main
Choose a base branch
from

Conversation

sumupitchayan
Copy link

This is a request for comments about Priority-Ordered Aspect Invocation. See #648 for
additional details.

APIs are signed off by @{BAR_RAISER}.


By submitting this pull request, I confirm that my contribution is made under
the terms of the Apache-2.0 license

Signed-off-by: Sumu <sumughan@amazon.com>
Signed-off-by: Sumu <sumughan@amazon.com>
@mrgrain
Copy link
Contributor

mrgrain commented Nov 7, 2024

You know we can have images in an RFC, right?

@jiayiwang7
Copy link
Member

You know we can have images in an RFC, right?

LOL +1. Your internal doc has a better narrative and it's more clear around what problem we are solving and why, whats the invocation order for existing system vs for proposed solution (with the images you had), etc.

I know RFC has a different format than a traditional design doc used in Amazon. This is a non-trivial problem and don't be afraid of sharing a good amount of details you already had in the internal doc team reviewed. It helps the community to understand the problem and your solution better. :0

…inal quip doc

Signed-off-by: Sumu <sumughan@amazon.com>
@sumupitchayan
Copy link
Author

You know we can have images in an RFC, right?

LOL +1. Your internal doc has a better narrative and it's more clear around what problem we are solving and why, whats the invocation order for existing system vs for proposed solution (with the images you had), etc.

I know RFC has a different format than a traditional design doc used in Amazon. This is a non-trivial problem and don't be afraid of sharing a good amount of details you already had in the internal doc team reviewed. It helps the community to understand the problem and your solution better. :0

Added the original images. Discussed with @mrgrain today, the RFC format isn't ideal for this issue. I added a Background section at the beginning to provide the overall context and problem statement explanation.

Signed-off-by: Sumu <sumughan@amazon.com>
@moltar
Copy link

moltar commented Nov 12, 2024

IMO setting an explicit numeric priority is z-index all over again.

Is there a way to set the dependency order using a higher-order Aspect construct chain? E.g. each Aspect is a valid Construct. Then, we can set the dependency order based on chaining.

@johnf
Copy link

johnf commented Nov 12, 2024

Not sure if it's possible but it would be nice to have this exposed via the CLI e.g. cdk aspects which would list all the aspects and their priority in order.

3. MyAspect creates NewBucket as another Child at the root level
4. But traversal has already finished with this level, so New Bucket is never visited again to be Tagged

![Illustration of current Aspect invocation order](../images/AspectsDiagram1.png)
Copy link

@DaniloTommasinaTR DaniloTommasinaTR Nov 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't the "red" aspect also be applied to the newly created S3 bucket (since the "red" aspect is inherited from the root node)?
This creates the risk for endless recursion, but in this specific use-case, the aspect itself should be dealing with the singleton nature of the LogBucket and not add a new one, which would interrupt the recursion.

Another issue is related to execution priority of inherited aspects when merged with aspects set on the child nodes.
E.g. use priority 50 for the red aspect. Add a blue aspect with priority 10 on the "Abstraction" node.
What would be the correct execution order ?
Should it be:

red-a, green-a
blue-b, red-b, green-b
blue-c, red-c, green-c
red-d, green-d

or should it be:

blue-b, blue-c,
red-a, red-b, red-c, red-d
green-a, green-b, green-c, green-d

what if the red-aspect adds the LogBucket and adds also a yellow-aspect with priority 0 to it?
Should the execution order be:

red-a, green-a
blue-b, red-b, green-b
blue-c, red-c, green-c
yellow-d, red-d, green-d

or should it be:

yellow-d
blue-b, blue-c,
red-a, red-b, red-c, red-d
green-a, green-b, green-c, green-d

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your comments here @DaniloTommasinaTR ! Responding to your points below:

Shouldn't the "red" aspect also be applied to the newly created S3 bucket (since the "red" aspect is inherited from the root node)?

In this situation, no - since the red aspect is not applied on the root (Stack), but rather to the Abstraction which is a child of the root. The Red aspect creates the New Bucket as a child of the root, not of the Abstraction. If it were applied to the root, then yes, the new Bucket would inherit the red aspect.

Another issue is related to execution priority of inherited aspects when merged with aspects set on the child nodes.
E.g. use priority 50 for the red aspect. Add a blue aspect with priority 10 on the "Abstraction" node.
What would be the correct execution order ?

There is no red-a here - the red aspect is only applied on the Abstraction.

Assuming green gets the default priority of 600, the correct execution order would be:

blue-b, blue-c
red-b, red-c
green-a, green-b, green-c, green-d

what if the red-aspect adds the LogBucket and adds also a yellow-aspect with priority 0 to it?

We do not allow nested aspects. The current algorithm checks if an aspect was added to a node while invoking another aspect. If it does, then this aspect will not be invoked an we emit a warning - we will continue this behavior in our aspects redesign.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't the "red" aspect also be applied to the newly created S3 bucket (since the "red" aspect is inherited from the root node)?

In this situation, no - since the red aspect is not applied on the root (Stack), but rather to the Abstraction which is a child of the root. The Red aspect creates the New Bucket as a child of the root, not of the Abstraction. If it were applied to the root, then yes, the new Bucket would inherit the red aspect.

ooh, sorry I picked the wrong line in the md file for this comment, my comments refer to images/ProposedAspectInvocationOrder.png

what if the red-aspect adds the LogBucket and adds also a yellow-aspect with priority 0 to it?

We do not allow nested aspects. The current algorithm checks if an aspect was added to a node while invoking another aspect. If it does, then this aspect will not be invoked an we emit a warning - we will continue this behavior in our aspects redesign.

Right, I forgot about that check. Is there a particular reason why this should not be allowed? I understand that it requires the code to run multiple passes of the tree to process dynamically added aspects, but otherwise I don't see the reason, why it should not be allowed.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ooh, sorry I picked the wrong line in the md file for this comment, my comments refer to images/ProposedAspectInvocationOrder.png

Ah I see - well in this case, we still wouldn't want the newly created Logging Bucket to inherit the red Aspect. Maybe I can make this more clear in the RFC, but if an Aspect creates a new node, that new node will not inherit the Aspect that created it.

In this example, if the new Bucket inherited the same aspect which created it, it would recurse infinitely creating new buckets. This behavior is designed to prevent infinite recursion.

Right, I forgot about that check. Is there a particular reason why this should not be allowed? I understand that it requires the code to run multiple passes of the tree to process dynamically added aspects, but otherwise I don't see the reason, why it should not be allowed.

An Aspect creating another Aspect is a weird scenario and I can't imagine a use case for it - we do not allow it as it is probably a mistake on the user's behalf.

Copy link

@DaniloTommasinaTR DaniloTommasinaTR Nov 15, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ooh, sorry I picked the wrong line in the md file for this comment, my comments refer to images/ProposedAspectInvocationOrder.png

Ah I see - well in this case, we still wouldn't want the newly created Logging Bucket to inherit the red Aspect. Maybe I can make this more clear in the RFC, but if an Aspect creates a new node, that new node will not inherit the Aspect that created it.

In this example, if the new Bucket inherited the same aspect which created it, it would recurse infinitely creating new buckets. This behavior is designed to prevent infinite recursion.

Well if that inheritance does not happen, then this RFC will introduce a new feature but won't fix issue # 21341

In this case there should not be an infinite recursion since the LogBucket is a singleton and as such won't be created multiple times. However this use case is what I reported to not be working in my original bug report.
Let's say you put an Aspect on the root node that enforces blocking public access to S3 buckets. I would expect this aspect to be applied also to the LogBucket.
Other valid use cases could be related to aspects that perform tagging, if I add a tagging aspect on the root node, I would expect it to be run on all nodes including those that have been dynamically added. The tagging is particularly important when the company defines that all taggable resources have to be tagged with a set of mandatory elements (e.g. for financial tracking and cost allocation, owner tracking, ...).

This issue manifests itself e.g. within cdk pipelines where aspects are not applied as expected: https://github.com/aws/aws-cdk/blob/6bb142e805fdd754755cc54c31c0e6e7970be7f9/packages/aws-cdk-lib/pipelines/lib/main/pipeline-base.ts#L82

This line adds an Aspect that builds the cdk pipeline (and adds a variety of constructs to the tree) just before synthesis. Aspects placed on the root node won't be applied to the cdk pipeline constructs.
The workaround is to explicitly call buildPipeline() before processing the Aspects, this will expand the tree before the processing of aspects is triggered and workaround the issue, however this is not the behavior I would expect. Both the "automatic" building of the pipeline and the explicit call to build it, should produce the same result.

Another use cases I could think of: Company policy requirements for audit logging, you want to enforce logging of a set of actions to be stored to a central S3 bucket. Along with the LogBucket, you would have possibly Lambdas, things that adds entries into the cloud-init scripts for EC2 instances to install/start agents, ...
One would expect the aspects placed on the root node to also be applied to all dynamically added constructs.

In the original bug report other people added their use-cases as well: aws/aws-cdk#21341 (comment)

For context: This is a very valid use-case like for the company I am employed at.
We are developing a central library to be used by all product delivery teams, the central library is developed in a inner-source setup, we have contributors from various departments, those that create re-usable constructs for complex patterns (e.g. web app with cloud-front, S3, API gateway, ....), those that deal with auditing requirements, those that create security related aspects (e.g. enforce blocking public access on S3, apply a permission boundary to all IAM roles, ...) and those that deal with financial optimizations (e.g. prefer graviton based instances, ARM64 lambdas, ....).
In an ideal world the various teams could develop and contribute in parallel their own portions of work as independent units, the end-result is a dynamic merge of all the aspects defined by the various contributors.
We should not require team 'security' to keep merging their extensions into team 'audit', things should happen automatically through correct propagation of aspects.

Right, I forgot about that check. Is there a particular reason why this should not be allowed? I understand that it requires the code to run multiple passes of the tree to process dynamically added aspects, but otherwise I don't see the reason, why it should not be allowed.

An Aspect creating another Aspect is a weird scenario and I can't imagine a use case for it - we do not allow it as it is probably a mistake on the user's behalf.

I don't think it is a weird one, if we look at the cdk pipelines use-case I mentioned above. If we are dynamically adding complex constructs via Aspects, these complex constructs might come with their own Aspects somewhere in the sub-tree.
Things get weird (I admit) when these dynamically added resources try to add aspects back to e.g. the root node, in that case the priority based processing order could get messed up.

Copy link
Contributor

@rix0rrr rix0rrr Nov 15, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @DaniloTommasinaTR,

That's good to hear! Let's talk about some details then.

If the inheritance rule of the aspects is not respected for dynamically added constructs, then bug aws/aws-cdk#21341 will not be solved.

In the current plan: if a construct is added by an aspect with priority X, then it will be visited by aspects with priority >X (but not necessarily by other aspects with priority X).

Is that respected enough in your opinion?

There is already code in place to ensure that the same aspect is invoked exactly once on the same node

I know. I'm speaking about a hypothetical future scenario where we wouldn't do priority-based aspect ordering, but we would do a stabilization loop instead. A stabilization loop is the only alternative to priority-based ordering, and it would have the property that aspects would need to be invoked however often is necessary for them to all stop making modifications.

I thought you were advocating against priorities, which is why I brought up the alternative and its consequences.

There is no requirement for the Aspect to return true or false, the loop on the tree is repeated as long as at least one Aspect is visited in a full pass of the tree.

Right. We considered this as well, and perhaps this doesn't come through in the doc as well as it should have. Your solution works perfectly well for newly-added constructs: we could detect them and invoke aspects only for constructs we haven't visited yet, and when aspects stop adding constructs we are done.

This does not cover the (possible) case however of mutations: when an aspects modifies a construct in-place, by calling a method or changing a property, another construct might want to respond to that as well (for example: one aspect sets a property on a bucket, and another aspect tags all buckets with that property set). Since in general there is no way of knowing if a construct was mutated, there is no way for us to detect this.

We would like to settle on a solution that covers both construct addition as well as construct mutation, which leaves us with either a full stabilization loop, or priorities.

We then prefer priorities because of its simplicity.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rix0rrr Thanks for the explanations, I looked again at the pseudo code in the proposed solution and I now understand. I was misinterpreting the collection of the Aspects and mapping to node happening once at the beginning and thought that this would not cover the use-case of inherited aspects. However now I see that it does at least under certain circumstances. The initial mapping just determines to nodes to which aspects have been explicitly added, but then you recurse into each child node, which covers the inheritance use-case. However the inheritance is valid only for the aspect itself being processed or any aspects with an higher priority.
So, issue # 213411 could be fixed, but only under some quite obscure conditions and you might end up with only part of the inherited Aspects to be processed, which is an "obscure" behavior in my opinion.

The image images/ProposedAspectInvocationOrder.png however does not reflect this behavior and it got me on the wrong track. The aspects being processed is inherited into the dynamically added child (with the risk of endless recursion if the Aspect does not explicit captures the case).

In my opinion there is no way around a stabilization loop, execution order-by-priority is defined per node, if you do not allow aspects adding other aspects, that's feasible. If you would allow aspects adding aspects you might keep execution order if you allow to add aspects only at higher priority values than the currently executed ones. Otherwise you would need to accept some trade-offs on execution order.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The initial mapping just determines to nodes to which aspects have been explicitly added, but then you recurse into each child node, which covers the inheritance use-case. However the inheritance is valid only for the aspect itself being processed or any aspects with an higher priority.

@DaniloTommasinaTR what do you mean by this last sentence? Do you have an example of where this would not work?

I'll update the image, thanks for pointing that out and apologies for the confusion there

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sumupitchayan
Yeah, this is getting complicated, it is this pseudo-code:

function invoke_aspect(node, aspect):
  aspect.visit(node)
  // Recurse and Invoke the aspect on all the node's children (inherited Aspect)
  for child of node.children:
    invoke_aspect(child, aspect)

You first collect all aspects and map them by prio, then you iterate on the map keys (the ordered priority value) and run the invoke_aspect() above on each aspect and node combination. This invocation will also recurse into the children. So e.g. if aspect A is triggered on the root node, then it starts iterating on its children, if Aspect A when visiting one child adds a node to the root, at some point the recursion will go back to the top for-loop and should also iterate on the newly added node which will invoke Aspect A also on the newly added node. This has a caveat though: If you have Aspect A, B and C all placed on the root node, with A having a stronger priority than B and B stronger than C (also A gets processed before B, B before C), if B is the one that adds a child to the root node, A will not be processed on the new child, while B and C will. This is where the things get somewhat obscure to the user in my opinion. One would expect A, B and C to all be processed on the new child following the inheritance rules... at least, I would expect it ;)

Copy link
Contributor

@rix0rrr rix0rrr Nov 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One would expect A, B and C to all be processed on the new child following the inheritance rules... at least, I would expect it ;)

I think this is where we have the crux:

The contract we were thinking of is that if aspect A adds an aspect B, it would only run if prio(B) > prio(A).

And crucially: the same would hold whether it would be a new aspect on an existing node, as well as a new aspect on a new node.

The reason this would be true is so that we don't have a distinction between node additions and node mutations: it's all the same system, which is nice and consistent.

But after thinking about it a little: sometimes constructs come with their own aspects, like Role and CdkPipeline... and I agree with you that it would be unexpected if aspect A does new Role, but that new Role doesn't behave the same as if you did it in another location.

So I think I'm coming around to the idea of newly added nodes run all aspects in priority order, and that priority ordering is only for the relative ordering of mutating aspects.

* The priority value to apply on an Aspect.
* Priority must be a non-negative integer.
*/
readonly priority?: number;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about adding an option:

/** Defines if an aspect inherits to its children or not
 * @default true
 */
 readonly inherits?: boolean;

This would add the ability to have an aspect applied to only the node itself without being propagated to its children, might be useful for avoiding endless recursions and having more control on the execution flow?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an interesting suggestion. I am trying to think of some specific use cases where a customer would want to use this option ...

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could imagine situations when you place e.g. a tagging aspect on a L2+ construct and you want it applied only on the main CfnResource (defaultChild), but not on other sub-constructs.

Anyways, if there is no specific request or need for this feature, best to leave it out and keep the code slicker as a good practice.

Comment on lines 444 to 449
aspects_map = collect_all_aspects(root)

// Iterate through aspects sorted on priority:
for prio in aspects_map.keys().sorted():
for (aspect, node) in aspect_map[prio]:
invoke_aspect(node, aspect)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aspects are not collected for nodes added dynamically by other Aspects.
The collection of the aspects happens at the beginning before invoking them, if one of the "write aspects" adds nodes to the tree, they won't be collected and the aspects they should inherit would not be processes. This would not solve issue 21341

would you add a loop to repeat the collection after the a pass of invoke_aspects you would end up with something similar to alternate solution # 2, however the processing order of the aspects would be inconsistent with the priorities.

a solution # 3 could be an hybrid between # 1 and # 2, where you:

  1. Collect only priorities for all aspects on the tree
  2. Pick the first priority and invoke all aspects on the tree for this priority that have not been processed before
  3. Repeat from 1. until no new aspects are processed
  4. Pick the second priority and do the same as above and repeat for all other priorities

However also for this solution there is a gap, if a dynamically added node has an Aspect with an higher execution priority, the above solution would miss it or execute it later than expected (if you re-gather the list of priorities)

I fear that the only solution is # 2, and the priority based execution order can be defined only as being valid within the scope of the node and not of the entire tree.


Cons:

* This solution doesn’t address the third constraint of Aspect ordering. We would still need a way to specify the order for cases where mutating and
Copy link

@DaniloTommasinaTR DaniloTommasinaTR Nov 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This depends on how you define the execution order, is the order relevant within the scope of each individual node, or is the order relevant for all aspects applied to a tree? (See comments about the diagram and on the solution pseudo code for more details)


* This solution doesn’t address the third constraint of Aspect ordering. We would still need a way to specify the order for cases where mutating and
validation Aspects apply to the same node.
* Validation aspects may still run before mutating ones, causing them to throw Errors and fail when they shouldn’t.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Depending on the definition of ordering, the order could be controlled on the node itself, giving the guarantee that a validation aspect is executed after a modification aspect on the node itself. However If the validation aspect has to scan through its children sub-tree, then you would have a situation where the modification aspects in the sub-tree was not yet executed.
Is this a relevant use-case? I would expect a validation aspect to be applied only to the node itself and not involve processing its children. If the children require a validation, then you would add it as prioritized aspects to the children, so that you can still control the execution order.

Comment on lines 415 to 417
* * This solution will change customer’s infrastructure.
* This reason alone is enough not to pursue this solution. A solution that will change customer’s infrastructure without them making any changes
to their CDK code would be considered breaking.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could happen as a consequence of fixing a bug, meaning that Aspects will be applied (as one would expect) also to dynamically added nodes.
So, yes a change would possibly happen without the customer making any changes to their CDK code (other than updating dependencies) however this is the consequence of fixing a bug, not of adding a new feature.
By the way, the exact same problem would happen if solution # 1 would be truly fixing the bug that originated this PR. (see comments on the pseudo code for more details)

* * This solution will change customer’s infrastructure.
* This reason alone is enough not to pursue this solution. A solution that will change customer’s infrastructure without them making any changes
to their CDK code would be considered breaking.
* Additionally, this solution would be difficult to understand for customers, compared to assigning priority values.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you control execution order per-node, this is probably even easier to understand for customers.

@michanto
Copy link
Contributor

How do I tell if an aspect has already been added? I've done some hacky code around that but a centralized approach would help

Signed-off-by: Sumu <sumughan@amazon.com>
Signed-off-by: Sumu <sumughan@amazon.com>
Signed-off-by: Sumu <sumughan@amazon.com>
Signed-off-by: Sumu <sumughan@amazon.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants