Prefer routing traffic within an AWS availability zone to save $$$ #686
Comments
How do client-perceived latencies differ between nodes in different AZs? I'd rather use that data to rank targets than target specific cloud vendors in an RPC library.
So the idea here is more about $ savings than latencies tbh
Right, we can solve the problem without vendor-specific implementation.
Latencies are the same, give or take 0.1 to 0.2 ms.
Basically, this isn't a perf thing, it's a spend thing. And just to be clear, it's not $0.01 as the doc implies: Amazon are sneaky and charge you on the way in and on the way out, for a total of $0.02 per GB.
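To put rough numbers on the in-and-out charge, here is a back-of-the-envelope sketch. The $0.01 per GB per direction rate is the one quoted in this thread; the monthly traffic volume is a made-up figure for illustration, not a measurement.

```python
# Rate quoted in this thread: $0.01 per GB, billed on each side of a cross-AZ transfer.
RATE_PER_GB_EACH_WAY = 0.01


def cross_az_cost(gb_transferred: float, rate_each_way: float = RATE_PER_GB_EACH_WAY) -> float:
    """Total charge for moving data across AZs: billed once on the way out, once on the way in."""
    return gb_transferred * rate_each_way * 2


# Hypothetical volume: 50,000 GB of cross-AZ traffic in a month.
print(f"${cross_az_cost(50_000):,.2f}")
```

At that assumed volume the bill comes to about $1,000 a month, double what the one-way figure would suggest.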
And with latencies, especially when transitive calls are involved, you also start taking into account the downstream services' good or bad decisions: with latency you can't help but care about all the hops, whereas you really want to care about only the one you'd like to make. But nice try :)
Again, my point is that this is the wrong place to approach that type of problem.
Most of our services have nodes in different availability zones. Given that there's nothing constraining traffic in any AWS-specific way, every time a node of email-service wants to talk to a node of MP, it might pick a node in any availability zone. This means we're probably paying $$$ in cross-AZ traffic when we don't need to.
Pricing diagram from this blog post
It seems like if we can slightly bias connections towards staying in their availability zone (e.g. eu-west-1a <-> eu-west-1a) then we'd be able to cut down on our spend a bit.
Proposal
When a server is running on AWS, there's a magic IP address we can call to find out which region it's currently in, e.g.
(Atlasdb currently uses this).
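A minimal sketch of such a lookup, assuming the "magic IP" is the standard EC2 instance metadata address (169.254.169.254) and its placement/availability-zone path. The function names and the injectable `fetch` parameter are hypothetical, added so the lookup can be exercised off AWS. Instances that enforce IMDSv2 require a session token first, which this sketch skips; in that case it simply returns None.

```python
import urllib.request
from typing import Callable, Optional

# Standard EC2 instance metadata path; assumed to be the endpoint meant above.
AZ_METADATA_URL = "http://169.254.169.254/latest/meta-data/placement/availability-zone"


def _http_fetch(url: str, timeout: float) -> str:
    """Plain HTTP GET returning the response body as stripped text."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8").strip()


def current_availability_zone(
    fetch: Callable[[str, float], str] = _http_fetch,
    timeout: float = 0.5,
) -> Optional[str]:
    """Return e.g. 'eu-west-1a', or None when no metadata service answers.

    Returning None (rather than raising) is what lets callers degrade
    gracefully when running locally, in Docker, or on another cloud.
    """
    try:
        az = fetch(AZ_METADATA_URL, timeout)
    except OSError:  # covers URLError, HTTPError, timeouts, unreachable host
        return None
    return az or None
```

The short timeout matters: off AWS the link-local address usually does not answer at all, so the call should give up quickly and be made once at startup rather than per request.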
Then, servers could advertise this information somehow: using a header, a dedicated metadata endpoint, or perhaps even plumbed through YAML. We might even be able to DNS-resolve the hosts we're given and match them against Amazon's published IP ranges: https://ip-ranges.amazonaws.com/ip-ranges.json.
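The IP-range matching idea could look roughly like the sketch below. The sample data only mimics the shape of ip-ranges.json (a "prefixes" list whose entries carry "ip_prefix", "region" and "service"); the prefixes themselves are documentation-range placeholders, not real AWS ranges, and `region_for_ip` is a hypothetical helper. One caveat worth checking before relying on this: as far as I know the published file is keyed by region rather than availability zone, and it lists public ranges, so hosts that resolve to private VPC addresses would not match.

```python
import ipaddress
from typing import Optional

# Made-up sample in the shape of ip-ranges.json; not real AWS data.
SAMPLE_RANGES = {
    "prefixes": [
        {"ip_prefix": "192.0.2.0/24", "region": "eu-west-1", "service": "EC2"},
        {"ip_prefix": "198.51.100.0/24", "region": "us-east-1", "service": "EC2"},
    ]
}


def region_for_ip(ip: str, ranges: dict) -> Optional[str]:
    """Return the region of the most specific prefix containing ip, or None."""
    addr = ipaddress.ip_address(ip)
    best = None  # (prefix length, region) of the longest match so far
    for entry in ranges.get("prefixes", []):
        net = ipaddress.ip_network(entry["ip_prefix"])
        if addr in net and (best is None or net.prefixlen > best[0]):
            best = (net.prefixlen, entry["region"])
    return best[1] if best else None


print(region_for_ip("192.0.2.17", SAMPLE_RANGES))
print(region_for_ip("10.0.0.1", SAMPLE_RANGES))
```

The second lookup uses a private address and finds no match, which is the case the caveat above is about.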
With this information, I'd suggest that we add a tiny constant bias to the Balanced Channel's scores, so rather than starting everything off at 0, we'd say hosts that are in other availability zones get a minimum score of 1. This would mean that under zero utilization, the first request would always stay within the caller's AZ.
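The bias can be sketched as follows. This is a simplified stand-in, not the Balanced Channel's actual scoring: `Host`, `score`, `pick` and the in-flight count used as the base score are all hypothetical names for illustration. An unknown AZ on either side gets no penalty, so the behaviour outside AWS is unchanged.

```python
from dataclasses import dataclass
from typing import Optional, Sequence

CROSS_AZ_PENALTY = 1  # the "minimum score of 1" from the proposal


@dataclass
class Host:
    name: str
    az: Optional[str]  # None when the AZ could not be determined
    inflight: int = 0  # stand-in for whatever the real channel scores on


def score(host: Host, local_az: Optional[str]) -> int:
    """Lower is better. Only a known, different AZ is penalised."""
    both_known = local_az is not None and host.az is not None
    penalty = CROSS_AZ_PENALTY if both_known and host.az != local_az else 0
    return host.inflight + penalty


def pick(hosts: Sequence[Host], local_az: Optional[str]) -> Host:
    """Choose the lowest-scoring host (first one wins ties in this sketch)."""
    return min(hosts, key=lambda h: score(h, local_az))


hosts = [Host("mp-1", "eu-west-1b"), Host("mp-2", "eu-west-1a")]
print(pick(hosts, "eu-west-1a").name)  # idle cluster: stays in eu-west-1a
```

Because the penalty is a small constant, a busy same-AZ host still loses to an idle host elsewhere: with two requests in flight on mp-2, its score of 2 exceeds mp-1's score of 1 and traffic spills over.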
Possible downsides?
Obviously this would need to fail gracefully when running locally, in Docker or on Azure.