From everything I can see in terms of resource usage, we're actually massively underutilizing the i3.2xlarge instances. We're only even allocating half of the instance's memory to the task, and then less than half of that ends up getting used.
There are a few steps in the analysis that run in parallel, but much of it runs single-threaded. The parallel steps are big ones, like scoring neighborhood_ways segments, but still, most of the analysis time is spent running single-threaded tasks like indexing the big tables and calculating the accessibility scores for different destination types.
The i3 instances are actually somewhat old, so given the above, I think there's a good chance that Fargate would speed things up because the single-threaded performance would be better on the newer CPUs in the fleet. I tried to figure out what CPUs are actually used for Fargate and it appears that the answer is "it depends" and it's not configurable. But I found this Stack Overflow post where someone gathered their own statistics. The first processor listed there, the Xeon E5-2686 v4 @ 2.30GHz, is actually the processor used by the i3 instances, but it's the slowest one on the list, and that list is from a year ago, so possibly more of those have been cycled out of the fleet by now.
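Since the Fargate CPU model isn't configurable, one way to find out what we actually get would be to log it at the start of each job and compare it against runtimes later. Here's a minimal sketch, assuming the analysis container runs Linux and exposes `/proc/cpuinfo`. The function names are made up for illustration and nothing like this exists in the repo as far as I know.

```python
from typing import Optional


def parse_cpu_model(cpuinfo_text: str) -> Optional[str]:
    """Return the first 'model name' value found in /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "model name":
            return value.strip()
    return None


def read_cpu_model(path: str = "/proc/cpuinfo") -> Optional[str]:
    """Read the CPU model from the host, or None if the file is unavailable."""
    try:
        with open(path, encoding="utf-8") as f:
            return parse_cpu_model(f.read())
    except OSError:
        return None


if __name__ == "__main__":
    # Print once at job start so the model shows up in the job's log output.
    print(f"CPU model: {read_cpu_model() or 'unknown'}")
```

With that line in the logs, we could gather our own statistics instead of relying on a year-old list.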
If you scroll down to the "Container" section on the job detail page, you can see the parameters the job was run with. That's handy because the PFB_SHPFILE_URL value will tell you what city the job is for. The failed one I linked above was for Helsinki.
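If we end up checking a lot of failed jobs, clicking through the console for each one gets tedious. Here's a rough sketch of pulling the same value out of a job description programmatically. It assumes the jobs run on AWS Batch, where `describe_jobs` returns each job's container environment as a list of `name`/`value` pairs. The helper name and the sample URL are made up for illustration.

```python
from typing import Any, Dict, Optional


def container_env_value(job: Dict[str, Any], name: str) -> Optional[str]:
    """Look up one environment variable in a Batch job description dict."""
    env = job.get("container", {}).get("environment", [])
    for entry in env:
        if entry.get("name") == name:
            return entry.get("value")
    return None


# Hypothetical job description, shaped like one element of the "jobs" list
# returned by describe_jobs. The URL below is a placeholder, not a real job.
sample_job = {
    "jobId": "example-job-id",
    "status": "FAILED",
    "container": {
        "environment": [
            {"name": "PFB_SHPFILE_URL", "value": "https://example.com/helsinki.zip"},
        ]
    },
}

print(container_env_value(sample_job, "PFB_SHPFILE_URL"))

# Against the real API (needs boto3 and AWS credentials), this would look like:
#   import boto3
#   jobs = boto3.client("batch").describe_jobs(jobs=[job_id])["jobs"]
#   for job in jobs:
#       print(job["jobId"], container_env_value(job, "PFB_SHPFILE_URL"))
```

The job description also carries `status` and `statusReason` fields, which could be printed alongside the URL when sorting through failures.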
I started watching this one, for Houston, yesterday evening because it had been running for days. It actually finished since then, for a total runtime of just over 5 days. So it doesn't seem to be the case that the huge ones can't succeed. Though I don't know if that's what she was actually saying. But yeah, I think we should separate the time question from the failures question, and for the latter we should focus on diagnosing, for individual failed jobs, what actually brought them down.
Notes from Klaas: