-
-
Notifications
You must be signed in to change notification settings - Fork 259
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
suggestion: Refactor scaling strategy #1289
base: feat/auto-scale-clock-time
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Your implementation definitely is more eager to scale, which is probably a good thing. I also like that it's simpler 👍
func (n nullMetrics) GetWorkerQueueDepth(name string) int { | ||
return 0 | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doesn't this mean that we will never scale in case of null metrics?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, not sure if we want to do an actual implementation. nullMetrics
isn't really supposed to be used (it just saves us from writing if metrics != nil { metrics.RecordMetric }
in which case it would still be 0
-ish behavior).
// dispatch requests to all worker threads in order | ||
worker.threadMutex.RLock() | ||
for _, thread := range worker.threads { | ||
select { | ||
case thread.requestChan <- r: | ||
worker.threadMutex.RUnlock() | ||
<-fc.done | ||
metrics.StopWorkerRequest(worker.fileName, time.Since(fc.startedAt)) | ||
return | ||
default: | ||
// thread is busy, continue | ||
} | ||
} | ||
worker.threadMutex.RUnlock() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd like to keep the logic of dispatching requests to workers in order. It was introduced in #1289 and minimizes potential external connections and memory usage when not all threads are fully utilized. In other words, some workers stay 'hot' while higher index workers remain 'cold'.
It also reduces CPU contention, but that matter less than I initially expected so it's probably fine to remove it for regular threads.
I was thinking about cases where I'd want it to scale beyond my default setting:
and things like that, so in those cases we probably do want to be more eager in scaling. I was thinking it might be good to refactor it so that it can implement "autoscaling strategies" so we could have "conservative" which is more like your implementation, and "eager" which is more like mine, or mix-and-match them as needed. For example, an API probably wants aggressive scaling because it is more sensitive to latency, while an http job handler (such as those in google cloud) probably want a conservative one because they are less sensitive to latency and we don't want to steal threads from out api/frontend. |
This PR simplifies the logic around scaling with a goal of reducing latency. Using the provided benchmarking scenarios:
This is more of a bigger suggestion in the form of a PR, but feel free to merge it or close it!