Prefer routing traffic within an AWS availability zone to save $$$ #686

iamdanfox · 2020-04-28T17:09:10Z

Most of our services have nodes in different availability zones. Since nothing constrains traffic in any AWS-specific way, every time a node of email-service wants to talk to a node of MP, it might pick a node in any availability zone. This means we're probably paying $$$ for cross-AZ traffic when we don't need to.

Pricing diagram from this blog post

It seems like if we can slightly bias connections towards staying in their availability zone (e.g. eu-west-1a <-> eu-west-1a), then we'd be able to cut down on our spend a bit.

Proposal

When a server is running on AWS, there's a magic IP address (the instance metadata endpoint) it can call to find out which availability zone it's currently in, e.g.

$ curl -s http://169.254.169.254/latest/meta-data/placement/availability-zone
eu-west-1a

(AtlasDB currently uses this.)
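
A minimal sketch of that lookup, assuming a plain GET as in the curl example above and a short timeout so that machines outside AWS give up quickly. The names `detect_availability_zone` and `http_get` and the 0.2 s timeout are illustrative assumptions, not existing code. Instances configured to require IMDSv2 session tokens would need a token request first, which is omitted here.

```python
import urllib.request
from typing import Callable, Optional

# Endpoint quoted in the proposal (EC2 instance metadata service).
AZ_METADATA_URL = "http://169.254.169.254/latest/meta-data/placement/availability-zone"


def http_get(url: str, timeout_seconds: float) -> str:
    """Plain GET, like the curl example; raises on any network or HTTP error."""
    with urllib.request.urlopen(url, timeout=timeout_seconds) as response:
        return response.read().decode("utf-8")


def detect_availability_zone(
    fetch: Callable[[str, float], str] = http_get,  # injectable for tests (assumption)
    timeout_seconds: float = 0.2,                   # arbitrary short timeout (assumption)
) -> Optional[str]:
    """Return the local AZ (e.g. 'eu-west-1a'), or None if it cannot be determined."""
    try:
        zone = fetch(AZ_METADATA_URL, timeout_seconds).strip()
    except Exception:
        # Any failure (no route, timeout, HTTP error) just means "zone unknown".
        return None
    return zone or None
```

Returning `None` rather than raising lets callers treat "not on AWS" as "apply no bias", which is one way to cover the local, docker and Azure cases.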

Then, servers could advertise this information somehow: using a header, a dedicated metadata endpoint, or perhaps even plumbed through YAML. We might even be able to DNS-resolve the hosts we're given and match them against Amazon's published IP ranges: https://ip-ranges.amazonaws.com/ip-ranges.json.

With this information, I'd suggest that we add a tiny constant bias to the Balanced Channel's scores: rather than starting everything off at 0, hosts in other availability zones would get a minimum score of 1. This would mean that under zero utilization, the first request would always go intra-AZ.
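
A rough sketch of the bias, not the Balanced Channel's actual implementation: it assumes a host's score is simply its in-flight request count and that hosts with an unknown zone get no penalty. `Host`, `score`, `pick` and the URIs are hypothetical names for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Sequence

CROSS_AZ_PENALTY = 1  # the "minimum score of 1" for hosts in other AZs


@dataclass
class Host:
    uri: str
    zone: Optional[str]  # None when the host's AZ is unknown
    in_flight: int = 0   # stand-in for the channel's real score (assumption)


def score(host: Host, local_zone: Optional[str]) -> int:
    # With no known zone on either side (laptop, docker, Azure), apply no bias.
    if local_zone is None or host.zone is None or host.zone == local_zone:
        return host.in_flight
    return host.in_flight + CROSS_AZ_PENALTY


def pick(hosts: Sequence[Host], local_zone: Optional[str]) -> Host:
    # min() returns the first minimal element, so ties fall back to list order.
    return min(hosts, key=lambda h: score(h, local_zone))


hosts = [Host("https://mp-1", "eu-west-1b"), Host("https://mp-2", "eu-west-1a")]
first = pick(hosts, "eu-west-1a")  # idle: the same-AZ host wins
first.in_flight += 1
second = pick(hosts, "eu-west-1a")  # scores now tie at 1, so traffic can spill over
```

Because the penalty is a small constant, a single in-flight request on the local host is already enough to make the remote host equally attractive, so the bias only matters at low utilization.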

Possible downsides?

Obviously this would need to fail gracefully when running locally, in docker or on Azure.


carterkozak · 2020-04-28T17:16:26Z

How do client-perceived latencies differ between nodes in different AZs? I'd rather use that data to rank targets than target specific cloud vendors in an RPC library.
Another option is for deployment infrastructure to provide a quality-factor based on availability zones along with URIs, centralizing that discovery.

iamdanfox · 2020-04-28T17:35:35Z

So the idea here is more about $ savings than latencies tbh

carterkozak · 2020-04-28T17:46:36Z

Right, we can solve the problem without vendor-specific implementation.

j-baker · 2020-04-28T20:13:47Z

Latencies are the same, +/- 0.1 to 0.2 ms.

j-baker · 2020-04-28T20:31:09Z

Basically, this isn't a perf thing, it's a spend thing. And just to be clear, it's not $0.01 per GB as the doc implies: Amazon are sneaky and charge you on the way in and on the way out, for a total of $0.02 per GB.
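
To make the arithmetic concrete, a small sketch using the per-GB rate quoted in this thread, charged once in each direction. The 10,000 GB volume is a made-up figure for illustration, not a measurement of our traffic.

```python
CENTS_PER_GB_EACH_DIRECTION = 1  # $0.01 per GB, billed on the way out and on the way in


def cross_az_cost_dollars(gigabytes: int) -> float:
    """Dollars billed for moving `gigabytes` between two availability zones."""
    return gigabytes * 2 * CENTS_PER_GB_EACH_DIRECTION / 100


monthly_cost = cross_az_cost_dollars(10_000)  # hypothetical 10,000 GB of cross-AZ traffic
```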

j-baker · 2020-04-28T20:32:46Z

And with latencies, especially when transitive calls are involved, you also start taking into account the downstream services' good or bad decisions: with latency you can't help but care about all the hops, whereas you really want to care about only the one you're about to make. But nice try :)

carterkozak · 2020-04-28T20:34:02Z

Again, my point is that this is the wrong place to approach that type of problem.

iamdanfox mentioned this issue May 27, 2020

Balanced channel measures RTT, hoping to stay within AZ (and save $) #794

Merged

4 tasks
