Computing 930 2021 Tracks
- Burst scheduling support
  - QPS 40 per 10K cluster
    - 1TP/1RP QPS >= 40
    - 50K cluster QPS 200 with 5TP/5RP; 3TP/4RP possibly < 200
- Minimal management cost for 50K cluster
  - Number of TPs <= 5, number of RPs <= 5
- Service support in scale-out cluster
- Daemonset handling in RP (on hold in favor of service support)
  - System tenant only
  - Multi-tenancy daemonset out of scope
  - System partition pod handling raw design
- 5TP/5RP 50K cluster 20 QPS, pod start-up latency <= 6s (p99)
a. 1TP/1RP maximal nodes - 1x30K
Date | Cluster Size | QPS (saturation/latency) | p50(s) | p90(s) | p99(s) | Changes | Note |
---|---|---|---|---|---|---|---|
8/26 | 1x25K | 100/5 | 1.82 | 2.65 | 5.00 | Reduced list/watch pods in perf test | |
9/01 | 1x25K | 100/5 | 1.82 | 2.65 | 4.88 | Using event receiving time as watch time in perf test | |
9/08 | 1x25K | 100/5 | 1.81 | 2.62 | 4.51 | Index pods by label selector in perf test (see indexer sketch below) | |
9/14 | 1x25K | 150/25 | 1.82 | 2.66 | 5.53 | Increased cache size for 25K cluster (previous cache size was for 10K cluster); send bookmark event to client; saturation pod QPS 100 -> 150, latency pod QPS 5 -> 25 | |
9/15 | 1x30K | 150/30 | 1.83 | 2.76 | 6.96 | Cache size increased for 30K cluster; latency pod QPS 30 | |
9/16 | 1x35K | 150/35 | 1.87 | 2.89 | 8.63 | Cache size increased for 35K cluster; latency pod QPS 35 | KCM 500 error |
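The 9/08 improvement came from indexing pods by a label in the perf test (PR 1169 in the Arktos perf-change list further down) instead of filtering the full pod list on every watch event. The following is only a rough sketch of that pattern using a client-go `cache.Indexer`; the index name and label key are placeholders, not the actual test code.

```go
// Sketch: bucket pods by a label value so the latency measurement can do a keyed
// lookup instead of scanning every cached pod against a label selector.
package main

import (
	"fmt"

	v1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/tools/cache"
)

const groupLabel = "group" // assumed label key used by the test pods

func main() {
	indexer := cache.NewIndexer(cache.MetaNamespaceKeyFunc, cache.Indexers{
		"byGroup": func(obj interface{}) ([]string, error) {
			pod, ok := obj.(*v1.Pod)
			if !ok {
				return nil, fmt.Errorf("expected *v1.Pod, got %T", obj)
			}
			return []string{pod.Labels[groupLabel]}, nil
		},
	})

	// Normally the watch/informer handler calls Add/Update/Delete; shown inline here.
	_ = indexer.Add(&v1.Pod{ObjectMeta: metav1.ObjectMeta{
		Name: "latency-pod-0", Namespace: "test", Labels: map[string]string{groupLabel: "latency"},
	}})

	// Keyed bucket lookup instead of filtering the whole pod cache per selector.
	pods, _ := indexer.ByIndex("byGroup", "latency")
	fmt.Println("pods in group latency:", len(pods))
}
```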
b. 2TP/2RP 2x25K=50K node
Date | Cluster Size | QPS | p50(s) | p90(s) | p99(s) | Changes |
---|---|---|---|---|---|---|
8/17 | 2x25K | 2x100 | 1.87 / 1.87 | 2.74 / 2.73 | 6.59 / 6.29 | Could be incorrect: a misconfiguration caused the watcher count to be much lower than usual (use as a reference only) |
9/02 | 2x25K | 2x100 | 1.87 / 1.87 | 2.90 / 2.84 | 9.14 / 8.27 | Same as the 9/01 1x25K cluster run |
9/09 | 2x25K | 2x100 | 1.83 / 1.85 | 2.81 / 2.80 | 9.89 / 9.01 | 9/08 + increased pod cache size to accommodate 25K cluster |
9/13 | 2x25K | 2x100 | 1.79 / 1.79 | 2.54 / 2.54 | 2.99 / 2.96 | 9/09 + send bookmark event to client (see bookmark sketch below) |
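The 9/13 drop in p99 (9.89s / 9.01s down to 2.99s / 2.96s) came from sending bookmark events to clients (PR 1179 in the Arktos perf-change list further down) so that re-established watches do not need a large set of init events. The Arktos change itself is on the apiserver side; the sketch below only illustrates the client half of the mechanism with standard client-go: opt into bookmarks and advance the remembered resourceVersion from them.

```go
// Sketch of a watch client that opts into bookmark events and uses them to keep
// its resourceVersion current, so a restarted watch does not replay a large backlog.
// Generic client-go usage, not the Arktos PR 1179 code.
package main

import (
	"context"
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/watch"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	cfg, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		panic(err)
	}
	client := kubernetes.NewForConfigOrDie(cfg)

	rv := "0" // serve the initial state from the watch cache
	w, err := client.CoreV1().Pods("").Watch(context.TODO(), metav1.ListOptions{
		ResourceVersion:     rv,
		AllowWatchBookmarks: true, // ask the apiserver for periodic bookmark events
	})
	if err != nil {
		panic(err)
	}
	defer w.Stop()

	for ev := range w.ResultChan() {
		if ev.Type == watch.Bookmark {
			if m, ok := ev.Object.(metav1.Object); ok {
				// A bookmark carries no object change, only "you have seen up to here";
				// rv would seed the next Watch call if this one is dropped.
				rv = m.GetResourceVersion()
			}
			continue
		}
		fmt.Println("event:", ev.Type)
	}
}
```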
c. 930 release test plan
- Scale up/scale out/1TP1RP/2TP2RP 50K
- Density/Load
- Cluster QPS/Latency pod QPS
d. 50K cluster tp99 improvement thoughts
- Reduce # of secret watchers - Yunwen (TBD)
- Implemented components, need to enable & verify (WIP)
  - kubernetes service entries: have to be network-specific instead of kubernetes-global (done)
  - kube-dns (in kube-system namespace) service entries: have to be network-specific; each network should have its own deployment (done)
  - Make flannel work in Arktos
    - Scale up (done)
    - Scale out (WIP)
  - Start DNS pods in Arktos
    - Scale up (done)
    - Scale out (WIP)
  - Arktos network controller: whenever a tenant is created, the default network object should be created automatically, along with its kubernetes and kube-dns service entries; for the flat network type, it should also take care of the kube-dns deployment
  - kubelet: when initializing a pod sandbox, provision /etc/resolv.conf with the proper kube-dns_{network} service IP (see the resolv.conf sketch after this list)
  - Make kube-proxy multi-tenancy aware
  - Simple on/off feature gate(s)
  - Containerize network controller
- Data entry in Prometheus (and solve previous 404 issue)
- Add new node into scale-up cluster, update manual - DONE
- Add new node into scale-out cluster, update manual - TBD
- Issue 1126 - GitHub dependabot alerts
  - github.com/gorilla/websocket to v1.4.1: code/PR ready: https://github.com/CentaurusInfra/arktos/pull/1127 - (DONE - Sonya)
  - containerd to v1.4.8: as a dependency, an upgrade of k8s.io/utils is necessary; tracked by issue https://github.com/CentaurusInfra/arktos/issues/924
  - runc to v1.0.0-rc95: as a dependency, an upgrade of k8s.io/utils is necessary; tracked by issue https://github.com/CentaurusInfra/arktos/issues/924
- Reduce perf test duration
  - Increase QPS for latency pod creation - perhaps in cluster loader (saturation 20 -> 100, latency 5 -> 25?)
- Fine tuning
  - Evaluate all list requests from clients that go to etcd directly (on demand)
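For the kubelet item in the list above (provisioning /etc/resolv.conf with the per-network kube-dns service IP), here is a hypothetical sketch of the idea. The function name, parameters, and the assumption that each network's kube-dns service has its own ClusterIP are illustrative only, not Arktos code.

```go
// Hypothetical sketch: when a pod sandbox is created, point its /etc/resolv.conf
// at the kube-dns service that belongs to the pod's Arktos network rather than a
// single cluster-global kube-dns.
package main

import (
	"fmt"
	"strings"
)

// resolvConfForNetwork renders resolv.conf content for a pod whose network has
// its own DNS service with the given ClusterIP.
func resolvConfForNetwork(dnsServiceIP, podNamespace, clusterDomain string) string {
	searchDomains := []string{
		fmt.Sprintf("%s.svc.%s", podNamespace, clusterDomain),
		fmt.Sprintf("svc.%s", clusterDomain),
		clusterDomain,
	}
	var b strings.Builder
	fmt.Fprintf(&b, "nameserver %s\n", dnsServiceIP) // per-network kube-dns service IP
	fmt.Fprintf(&b, "search %s\n", strings.Join(searchDomains, " "))
	fmt.Fprintf(&b, "options ndots:5\n")
	return b.String()
}

func main() {
	// Example: a pod in namespace "default" on a network whose kube-dns service
	// got ClusterIP 10.0.12.34 (the IP is made up for illustration).
	fmt.Print(resolvConfForNetwork("10.0.12.34", "default", "cluster.local"))
}
```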
- Burst scheduling support
  - 1TP/1RP 10K cluster 100 QPS - 6/21, pod start-up latency <= 3s (p99)
  - 1TP/1RP 15K cluster 100 QPS - 7/26, pod start-up latency p50 1.8s, p90 2.6s, p99 4.2s
  - 1TP/1RP 20K cluster 100 QPS - 7/29, pod start-up latency p50 1.8s, p90 2.7s, p99 5.5s
  - 1TP/1RP 25K cluster 100 QPS - 8/26, pod start-up latency p50 1.8s, p90 2.7s, p99 5.0s
  - 1.18.5 15K cluster 100 QPS (1 API server, 1 ETCD) - 7/1, pod start-up latency p50 1.41s, p90 2.17s, p99 5.25s
    - Diff between pod_start_up and run_to_watch in whole seconds: 0s (9192), 1s (4961), 2s (449), 3s (211), 4s, 5s (34)
  - 1.21 15K cluster 100 QPS - 7/15, pod start-up latency p50 1.54s, p90 2.61s, p99 5.65s
  - 1.21 20K cluster 100 QPS - 7/16. Scheduler restarted multiple times due to lost leader election
  - 1.21 20K cluster 100 QPS - 7/22. p50 1.7s, p90 3.3s, p99 9.3s; saturation latency bad (p50 926s)
  - Set up Prometheus for k8s 1.18.5 & 1.21 (7/12)
  - Identify 1.18 perf improvement changes
    - Node controller has an expensive pod list; switched to watch: PR 1129, PR 1151 (Issue 77733) (see the informer sketch at the end of this page)
    - Reduce cache size for events in apiserver (https://github.com/kubernetes/kubernetes/pull/96117)
  - Arktos perf changes
    - Reduce kubelet getting node: PR 835
    - Increased watch timeout from a 5 min mean to a 30 min mean (reverted - a watch cannot be longer than 10 min)
    - Reduce list pods from perf test: PR 1163
    - Add indexer to perf test: PR 1169
    - Increase pod cache size - YingH, PR 1175
    - Send fake bookmark event to client to reduce size of initEvents - YingH, PR 1179
  - Bug fix
    - Fix user agent of event client - PR 1120 https://github.com/CentaurusInfra/arktos/pull/1120
- Minimal management cost for 50K cluster
  - Start TPs in parallel, start RPs in parallel - Done 7/8, PR 1113
- Daemonset handling in RP
  - Design - Hongwei (Done 7/6)
- Reduce perf test duration
  - Skip garbage collection step - (Done 8/23)
- Prometheus support for k8s perf test
  - Automatically preserve historical Prometheus data
  - Periodically pull profiling data automatically
- API server performance: log, code analysis
- Kubelet container died - Yunwen
- Pod creation event diff in 1.18 & Arktos audit logs - YingH (Parking)
  - 1.18.5 behaves the same as Arktos in local cluster-up
  - 1.18.5 uses v1 for events in kube-up, no audit log
  - Arktos uses v1beta1 for events in kube-up (same as local cluster-up)
- Scan all K8s performance improvements - Carl - still necessary?
  - Current focus is on watch improvement
- Daemonset handling
  - Implementation - TBD
- Scalability improvement thoughts
  - Utilization of audit log (post 930)
    - Enable apiserver audit in local dev env
    - Auto-scan audit logs; summarize all request types, resources, durations, etc. (look for existing tools) (see the audit-log scan sketch at the end of this page)
  - Enable apiserver request latency (post 930)
  - Migrate to scalability metrics framework: PR 980
  - Start cluster in parallel
    - Start TP/RP in parallel - needs a lot of work, does not bring significant improvement
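For the node controller list-to-watch item above (PR 1129 / PR 1151, Issue 77733), the sketch below shows the general pattern under the assumption that a watch-backed shared informer cache replaces repeated LIST calls; it is generic client-go usage, not the PR code.

```go
// Sketch: keep a watch-maintained informer cache and read pods from its lister,
// instead of issuing repeated LIST requests that land on the apiserver/etcd.
package main

import (
	"fmt"
	"time"

	"k8s.io/apimachinery/pkg/labels"
	"k8s.io/client-go/informers"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	cfg, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		panic(err)
	}
	client := kubernetes.NewForConfigOrDie(cfg)

	// One initial LIST, then a long-lived WATCH keeps the local cache current.
	factory := informers.NewSharedInformerFactory(client, 30*time.Second)
	podLister := factory.Core().V1().Pods().Lister()

	stop := make(chan struct{})
	defer close(stop)
	factory.Start(stop)
	factory.WaitForCacheSync(stop)

	// Reads hit the in-memory cache, not the apiserver.
	pods, err := podLister.Pods("kube-system").List(labels.Everything())
	if err != nil {
		panic(err)
	}
	fmt.Println("cached pods in kube-system:", len(pods))
}
```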
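For the "auto-scan audit logs" item above, a rough sketch of a standalone scanner, assuming JSON-lines audit output and the upstream audit.k8s.io event fields (verb, objectRef.resource, requestReceivedTimestamp, stageTimestamp). The aggregation keys and output format are arbitrary choices, not a prescribed tool.

```go
// Sketch: count requests per verb/resource in an apiserver audit log and derive a
// duration from stageTimestamp - requestReceivedTimestamp for ResponseComplete events.
package main

import (
	"bufio"
	"encoding/json"
	"fmt"
	"os"
	"time"
)

type auditEvent struct {
	Stage                    string    `json:"stage"`
	Verb                     string    `json:"verb"`
	RequestReceivedTimestamp time.Time `json:"requestReceivedTimestamp"`
	StageTimestamp           time.Time `json:"stageTimestamp"`
	ObjectRef                *struct {
		Resource string `json:"resource"`
	} `json:"objectRef"`
}

func main() {
	f, err := os.Open(os.Args[1]) // path to the apiserver audit log (JSON lines)
	if err != nil {
		panic(err)
	}
	defer f.Close()

	counts := map[string]int{}
	totals := map[string]time.Duration{}

	sc := bufio.NewScanner(f)
	sc.Buffer(make([]byte, 1024*1024), 16*1024*1024) // audit lines can be large
	for sc.Scan() {
		var ev auditEvent
		if err := json.Unmarshal(sc.Bytes(), &ev); err != nil || ev.Stage != "ResponseComplete" {
			continue
		}
		resource := ""
		if ev.ObjectRef != nil {
			resource = ev.ObjectRef.Resource
		}
		key := ev.Verb + " " + resource
		counts[key]++
		totals[key] += ev.StageTimestamp.Sub(ev.RequestReceivedTimestamp)
	}

	for key, n := range counts {
		fmt.Printf("%-30s count=%-8d avg=%v\n", key, n, totals[key]/time.Duration(n))
	}
}
```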