API Gateway: Too Many Requests on API creation #15573

Open
tuanardouin opened this issue Jul 15, 2021 · 29 comments
Labels
@aws-cdk/aws-apigateway Related to Amazon API Gateway bug This issue is a bug. needs-cfn This issue is waiting on changes to CloudFormation before it can be addressed. p2

Comments

@tuanardouin

Hello,

When creating an API that contains a lot of endpoints, we reach the API Gateway limit on resource creation and get the error

Too Many Requests (Service: ApiGateway, Status Code: 429, ...

The limits:
https://docs.amazonaws.cn/en_us/batch/latest/userguide/service_limits.html

Reproduction Steps

Create a REST API with a lot of resources.

What did you expect to happen?

I expected CDK to account for this and add a 'sleep' between calls if necessary.

Right now I'm just commenting out some of the nested stacks that contain my resources and uncommenting them in batches.

Linked to this, I think:
aws-cloudformation/cloudformation-coverage-roadmap#589

What actually happened?

Got the 429 error

Environment

  • CDK CLI Version: 1.109.0 (build c647e38)
  • Node.js Version: v12.18.1

This is 🐛 Bug Report

@tuanardouin tuanardouin added bug This issue is a bug. needs-triage This issue or PR still needs to be triaged. labels Jul 15, 2021
@github-actions github-actions bot added the @aws-cdk/aws-apigateway Related to Amazon API Gateway label Jul 15, 2021
@nija-at
Contributor

nija-at commented Jul 28, 2021

As far as I'm aware, the CDK does not invoke the API Gateway endpoint as part of its standard operation.

Please update the issue with details following these guidelines - http://sscce.org/

@nija-at nija-at added the response-requested Waiting on additional info and feedback. Will move to "closing-soon" in 7 days. label Jul 28, 2021
@danmactough

> As far as I'm aware, the CDK does not invoke the API Gateway endpoint as part of its standard operation.
>
> Please update the issue with details following these guidelines - http://sscce.org/

@nija-at I'm pretty sure what @tuanardouin is reporting is not that CDK invokes the API Gateway endpoint directly but that when the CloudFormation template is deployed, the service-to-service communication between CloudFormation and API Gateway gets rate-limited and the CloudFormation deploy fails (and reflects that rate-limit error). It's unclear to the user that this is the source of the error they see. If I'm correct about the source of this error (and it's not possible for the user to get any more information), this is actually an upstream bug in CloudFormation (since CDK can't control the behavior of CloudFormation), and I imagine it would be SUPER-HELPFUL if the CDK team could bubble this bug up to the CloudFormation team. It's not the first time they've heard about this long-standing bug https://forums.aws.amazon.com/thread.jspa?threadID=100414, and they don't appear to have taken any steps to solve it.

@nija-at
Contributor

nija-at commented Aug 23, 2021

This will depend on the number of stacks being deployed in parallel for that account/region, the number of API Gateway resources in each stack, custom resources that may be making calls directly to API Gateway, etc.

If you have a specific CDK stack or CloudFormation resource that replicates this error consistently, I'll be happy to forward it to the relevant teams.

Otherwise, I would recommend contacting the AWS API Gateway team via AWS support for this issue.

@nija-at nija-at added closing-soon This issue will automatically close in 4 days unless further comments are made. and removed response-requested Waiting on additional info and feedback. Will move to "closing-soon" in 7 days. labels Aug 23, 2021
@danmactough

> If you have a specific CDK stack or CloudFormation resource that replicates this error consistently, I'll be happy to forward it to the relevant teams.

@nija-at Oh, we definitely have example stacks where this happens consistently. We are a pretty small team, so there's usually max 1 stack being deployed at any time, and we don't have any custom resources on the stacks where this happens -- as you suggest, it is all about the number of API Gateway resources. But like I said, when we use CloudFormation to work with these resources, we have no ability to adapt to API Gateway rate limits for resources that CloudFormation is managing.

> I'll be happy to forward it to the relevant teams.

This would be really helpful. I would be happy to work with the team (I think it would be CloudFormation) to help isolate this issue. Please let me know what you need from me. You can reach me by email at my GH username at gmail.

@nija-at
Contributor

nija-at commented Aug 24, 2021

> Please let me know what you need from me

As mentioned, provide the simplest full CDK app that consistently replicates this issue.

@peterwoodworth peterwoodworth removed the needs-triage This issue or PR still needs to be triaged. label Sep 20, 2021
@peterwoodworth
Contributor

@danmactough have you provided the requested information? Ping me if you'd like to reopen this issue.

@github-actions

⚠️COMMENT VISIBILITY WARNING⚠️

Comments on closed issues are hard for our team to see.
If you need more assistance, please either tag a team member or open a new issue that references this one.
If you wish to keep having a conversation with other community members under this issue feel free to do so.

@besh0y

besh0y commented Sep 17, 2022

Hello @peterwoodworth @nija-at
I've been struggling with this issue for quite some time and I'd like to re-open it.

I've created this repository, which consistently replicates the issue. Please feel free to check it out.

The code creates a REST API with a large number of endpoints and methods, all created in multiple nested stacks.
Deployment of those stacks always fails with a "Too Many Requests" error, and the rollback fails for the same reason.

I could temporarily avoid this problem by decreasing the number of resources in each nested stack and making the stacks depend on each other during deployment so they don't get deployed in parallel, but that is much slower and less efficient.
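
The "make them depend on each other" workaround can be sketched roughly as follows. This is a minimal illustration, not code from the linked repository: `chainSequentially` is a hypothetical helper, and the local `DependsOnTarget` interface only assumes an `addDependency(target)` method of the kind that CDK's `Stack` (and therefore `NestedStack`) exposes.

```typescript
// Minimal structural type, declared locally so the sketch needs no imports.
// Assumption: the objects passed in expose addDependency(target), as CDK stacks do.
interface DependsOnTarget<T> {
  addDependency(target: T): void;
}

/**
 * Makes every item depend on the one before it, so the deployment engine
 * has to create them one after another instead of in parallel.
 */
function chainSequentially<T extends DependsOnTarget<T>>(items: T[]): void {
  let previous: T | undefined;
  for (const item of items) {
    if (previous !== undefined) {
      item.addDependency(previous); // item is created only after `previous`
    }
    previous = item;
  }
}
```

In a CDK app this might be called as `chainSequentially([usersStack, ordersStack, billingStack])`, where the three stack names are placeholders. The trade-off is the one described above: the stacks no longer deploy in parallel, so deployment takes longer.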

@jweyrich

jweyrich commented Sep 18, 2022

Like someone already said, the problem is the rate limit of API Gateway's own APIs. The CreateResource operation is limited to 5 requests per second per account.

We're facing the same problem with the Serverless Framework. Nobody has solved it properly. AWS premium support suggests introducing DependsOn, but it's not a definitive solution for sure. The 2nd link below shows AWS published a private resource type Community::CloudFormation::Delay, which also doesn't feel like a definitive solution alone. We thought of using WaitCondition, but it's about the same.
I believe AWS should be able to handle the throttling between its service calls transparently. The "user" is not making these service-to-service calls.
IMHO, since the "user" is providing a valid template that could be fully deployed, if we ignore the rate limit for APIs that the "user" itself is not calling, it should work flawlessly.
However, it may become a hard optimization problem to solve.

This issue is related to:

  1. Too Many Requests (Service: ApiGateway, Status Code: 429, ...) from AWS::ApiGateway::Model  aws-cloudformation/cloudformation-coverage-roadmap#1095
  2. AWS::CloudFormation::Delay (new resource) aws-cloudformation/cloudformation-coverage-roadmap#589
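
As a rough illustration of the DependsOn suggestion, the sketch below adds `DependsOn` entries so that resources of one type form a fixed number of independent chains, which caps how many of them can be created at the same time. Everything here is an assumption made for the example: `limitParallelism` is a hypothetical helper, the `Resource` type is a simplified view of a template entry (real templates also allow `DependsOn` as a single string), and the resources are assumed to be listed in an order consistent with their existing dependencies (parents before children), since otherwise the added edges could create a cycle.

```typescript
// Simplified shape of the "Resources" section of a CloudFormation template.
type Resource = { Type: string; DependsOn?: string[] };
type Resources = Record<string, Resource>;

/**
 * Returns a copy of `resources` in which the i-th resource of `type`
 * depends on the (i - maxParallel)-th one. The matching resources then
 * form `maxParallel` chains, so at most that many are created at once.
 */
function limitParallelism(
  resources: Resources,
  type: string,
  maxParallel: number,
): Resources {
  if (!Number.isInteger(maxParallel) || maxParallel < 1) {
    throw new Error("maxParallel must be a positive integer");
  }
  const result: Resources = {};
  const seen: string[] = []; // logical IDs of matching resources, in order
  for (const [id, res] of Object.entries(resources)) {
    const dependsOn = [...(res.DependsOn ?? [])];
    if (res.Type === type) {
      // Undefined for the first `maxParallel` matches, which start the chains.
      const predecessor = seen[seen.length - maxParallel];
      if (predecessor !== undefined && !dependsOn.includes(predecessor)) {
        dependsOn.push(predecessor);
      }
      seen.push(id);
    }
    result[id] = dependsOn.length > 0 ? { ...res, DependsOn: dependsOn } : { ...res };
  }
  return result;
}
```

With `maxParallel` set to 1 this degenerates into a single sequential chain. Larger values trade deployment speed against the risk of hitting the rate limit again, and the right value is not something this sketch can determine.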

@tuanardouin
Author

tuanardouin commented Sep 19, 2022

I encounter this problem only on the first deployment of a stack. After that, it's not an issue anymore, unless I end up deploying a huge change. So, the dependsOn ended up solving the issue for me.

It's not a perfect solution nor does it excuse the origin of the problem, but at least it's not slowing down our deployments.

@peterwoodworth peterwoodworth added p2 needs-cfn This issue is waiting on changes to CloudFormation before it can be addressed. and removed closing-soon This issue will automatically close in 4 days unless further comments are made. labels Sep 26, 2022
@peterwoodworth
Contributor

This type of issue will likely have to be fixed by either CloudFormation or API Gateway. I'm not a fan of using something like the potential new Delay construct as a permanent solution; handling this correctly will probably fall to API Gateway.

I would recommend opening an issue in the CloudFormation coverage roadmap repo so that they are aware of this specific issue, or opening an issue with premium support if you have it.

@jweyrich

jweyrich commented Sep 27, 2022

@peterwoodworth they're aware. My previous comment contains a link to the CloudFormation roadmap issue (see here). And the Premium Support has an article How do I prevent "Rate exceeded" errors in CloudFormation? in their Knowledge Center.

@oanhhuynhpositive

Hi @peterwoodworth, any specific idea on how to work around this problem temporarily?

@jweyrich

jweyrich commented Nov 11, 2022

@oanhhuynhpositive A coworker wrote this plugin for Serverless v2/v3 that uses a simple graph algorithm to solve the dependency tree. It does not generate the most performant tree, but it works fine. Here it is if you want to give it a try: https://github.com/AlexsandroBezerra/serverless-custom-depends-on

@oanhhuynhpositive

@jweyrich Thanks, but I'm actually looking for a solution for when I use CDK to deploy stack resources.

@jweyrich

jweyrich commented Nov 11, 2022

@oanhhuynhpositive oh, my bad. I mixed both repos (cdk and serverless) as we've been dealing with the same issue.

@chessbyte

@nija-at A CDK stack that reliably reproduces this issue was provided here. Is there any update on when this will be fixed in CloudFormation? If the CloudFormation deployment code were open-source, I would put in a PR myself to retry (with exponential backoff) on a 429 error. Conceptually, it seems quite straightforward. I am not sure why AWS is not really responding to this issue, as many people are facing it daily.
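
For readers unfamiliar with the idea, retrying with exponential backoff on a 429 looks roughly like the sketch below. It is a generic illustration and not CloudFormation's code, and it cannot be applied from a CDK app to the calls CloudFormation makes internally. The `retryOn429` name, the `HttpError` shape with a numeric `statusCode`, and the default values are all assumptions made for the example.

```typescript
// Hypothetical error shape: the sketch only assumes a numeric statusCode.
type HttpError = { statusCode?: number };

/**
 * Calls `call` and retries it when it fails with status 429, waiting
 * baseDelayMs, then 2x, then 4x, ... between attempts. Any other error,
 * or a 429 on the last allowed attempt, is rethrown to the caller.
 */
async function retryOn429<T>(
  call: () => Promise<T>,
  maxAttempts = 5,
  baseDelayMs = 200,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<T> {
  for (let attempt = 1; ; attempt++) {
    try {
      return await call();
    } catch (err) {
      const throttled = (err as HttpError | null | undefined)?.statusCode === 429;
      if (!throttled || attempt >= maxAttempts) {
        throw err;
      }
      // Delay doubles with every failed attempt.
      await sleep(baseDelayMs * 2 ** (attempt - 1));
    }
  }
}
```

The `sleep` parameter is injectable only so the waiting can be replaced in tests. Implementations of this pattern commonly add random jitter to the delay so that many throttled callers do not retry at the same instant; that is left out here to keep the sketch short.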

@nija-at
Contributor

nija-at commented May 6, 2023

@chessbyte unfortunately I no longer work for AWS so I'm unable to answer any of your questions.

@chessbyte

@nija-at wishing you well!

@peterwoodworth
Contributor

We're not the CloudFormation team, so we cannot answer these questions. There's no action CDK can take here with our construct library. While this bug persists, it will be up to customers to configure dependencies between the resources they create to ensure they deploy sequentially rather than in parallel. See this comment for an example.

I've created a ticket internally (P88246032) to make sure the right team sees this. I'll provide updates when they become available.

@bouwerp

bouwerp commented Jun 15, 2024

It has been more than a year - has there been any movement on this?

@tuanardouin
Author

@bouwerp No change and still a problem on CDK 2. Our legacy code still has this issue, but we don't deploy new CloudFormation often, so we just swept that under the rug.

We started using Terraform partly because of that and didn't encounter this problem.

@bouwerp

bouwerp commented Jun 18, 2024

Thanks for the info @tuanardouin. I have been looking for a reason to move to Terraform.

@erjenkins29

erjenkins29 commented Jun 24, 2024

Same issue here -- but suddenly started working after about half an hour and moving to a separate api gateway instance....

@danielMiron

Same issue, using the latest CDK 2 version.
This has become a major issue for us. We have stacks with many API routes and integrations, and we deploy them as part of our CI. Deploys used to work for us most of the time and failed every once in a while, but since yesterday it seems impossible to deploy, on different APIs and even on different AWS accounts.
CDK is creating one template with all routes, and there is nothing we can do to prevent it all from being deployed at once.
Did anyone find a workaround? Right now this completely breaks our workflow and might be the reason we give up on CDK altogether.

@tuanardouin
Author

@danielMiron are you encountering this on new deployments or for already available APIs?

@danielMiron

> @danielMiron are you encountering this on new deployments or for already available APIs?

new deployments

@tuanardouin
Author

@danielMiron A quick hack is to comment out half of your endpoints and then deploy your stack. Once it's done, you can uncomment the rest and deploy again.

@danielMiron

@tuanardouin This is what we ended up doing to deploy locally, but it doesn't help us much with CI
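
One way the two-pass hack might be made repeatable in CI is to pick the endpoints by a phase number instead of commenting code in and out. The sketch below is only an illustration under that assumption: `endpointsForPhase` is a hypothetical helper, and how the phase number reaches the app (for example a CDK context value passed with `-c` and read via `tryGetContext`) is left out.

```typescript
/**
 * Returns the endpoints to include when deploying phase `phase` out of
 * `totalPhases` (1-based). Phases are cumulative: each phase contains
 * everything from the previous one, and the last phase contains all.
 */
function endpointsForPhase<T>(all: T[], phase: number, totalPhases: number): T[] {
  if (!Number.isInteger(totalPhases) || totalPhases < 1) {
    throw new Error("totalPhases must be a positive integer");
  }
  if (!Number.isInteger(phase) || phase < 1 || phase > totalPhases) {
    throw new Error("phase must be an integer between 1 and totalPhases");
  }
  // Round up so early phases take the extra items when the split is uneven.
  const count = Math.ceil((all.length * phase) / totalPhases);
  return all.slice(0, count);
}
```

A pipeline would then deploy once per phase, in order, with the stack only defining the endpoints returned for the current phase. Because the phases are cumulative, a later deploy only adds resources. Deploying an earlier phase after a later one would remove endpoints, so the final phase should always be the last deploy.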
