<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<title>sandflee blog</title>
<meta name="viewport" content="width=device-width, initial-scale=1, maximum-scale=1">
<meta property="og:type" content="website">
<meta property="og:title" content="sandflee blog">
<meta property="og:url" content="http://yoursite.com/index.html">
<meta property="og:site_name" content="sandflee blog">
<meta name="twitter:card" content="summary">
<meta name="twitter:title" content="sandflee blog">
<link rel="alternate" href="/atom.xml" title="sandflee blog" type="application/atom+xml">
<link rel="icon" href="/favicon.png">
<link href="//fonts.googleapis.com/css?family=Source+Code+Pro" rel="stylesheet" type="text/css">
<link rel="stylesheet" href="/css/style.css">
</head>
<body>
<div id="container">
<div id="wrap">
<header id="header">
<div id="banner"></div>
<div id="header-outer" class="outer">
<div id="header-title" class="inner">
<h1 id="logo-wrap">
<a href="/" id="logo">sandflee blog</a>
</h1>
<h2 id="subtitle-wrap">
<a href="/" id="subtitle">Learn, summarize, reflect</a>
</h2>
</div>
<div id="header-inner" class="inner">
<nav id="main-nav">
<a id="main-nav-toggle" class="nav-icon"></a>
<a class="main-nav-link" href="/">Home</a>
<a class="main-nav-link" href="/archives">Archives</a>
</nav>
<nav id="sub-nav">
<a id="nav-rss-link" class="nav-icon" href="/atom.xml" title="RSS Feed"></a>
<a id="nav-search-btn" class="nav-icon" title="Search"></a>
</nav>
<div id="search-form-wrap">
<form action="//google.com/search" method="get" accept-charset="UTF-8" class="search-form"><input type="search" name="q" results="0" class="search-form-input" placeholder="Search"><button type="submit" class="search-form-submit"></button><input type="hidden" name="sitesearch" value="http://yoursite.com"></form>
</div>
</div>
</div>
</header>
<div class="outer">
<section id="main">
<article id="post-tracepoint" class="article article-type-post" itemscope itemprop="blogPost">
<div class="article-meta">
<a href="/2016/12/20/tracepoint/" class="article-date">
<time datetime="2016-12-19T16:00:00.000Z" itemprop="datePublished">2016-12-20</time>
</a>
<div class="article-category">
<a class="article-category-link" href="/categories/tools/">tools</a>
</div>
</div>
<div class="article-inner">
<header class="article-header">
<h1 itemprop="name">
<a class="article-title" href="/2016/12/20/tracepoint/">Using tracepoints to inspect system call activity</a>
</h1>
</header>
<div class="article-entry" itemprop="articleBody">
<h2 id="利用tracepoint查看系统调用相关信息"><a href="#利用tracepoint查看系统调用相关信息" class="headerlink" title="利用tracepoint查看系统调用相关信息"></a>Using tracepoints to inspect system call activity</h2><p>A production machine was accumulating a large number of sockets in TIME_WAIT. netstat showed that a local process kept connecting to the container's exported port, but neither netstat nor lsof could identify which process it was.<br>zhiguo pointed me to tracepoints.</p>
<h3 id="step"><a href="#step" class="headerlink" title="step"></a>Steps</h3><ol>
<li>Running mount shows that debugfs is already mounted: debugfs on /sys/kernel/debug type debugfs (rw,relatime)</li>
<li>Check that connect can be traced: cat /sys/kernel/debug/tracing/available_events | grep connect prints<br>syscalls:sys_exit_connect<br>syscalls:sys_enter_connect</li>
<li>Enable tracing of connect: echo 1 > /sys/kernel/debug/tracing/events/syscalls/sys_enter_connect/enable</li>
<li>Read the output: cat /sys/kernel/debug/tracing/trace. It showed haproxy calling connect continuously as part of its health checks.</li>
<li>Disable tracing: echo 0 > /sys/kernel/debug/tracing/events/syscalls/sys_enter_connect/enable</li>
</ol>
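<p>The steps above can also be scripted. The following is a minimal Go sketch, assuming debugfs is mounted at /sys/kernel/debug and the program runs as root. The helper names enablePath and setEvent are made up for illustration, and the two-second collection window is arbitrary.</p>

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"time"
)

// tracingRoot assumes debugfs is mounted at /sys/kernel/debug, as in step 1.
const tracingRoot = "/sys/kernel/debug/tracing"

// enablePath returns the control file that switches one event on or off.
func enablePath(root, subsystem, event string) string {
	return filepath.Join(root, "events", subsystem, event, "enable")
}

// setEvent writes "1" or "0" to the event's enable file (steps 3 and 5).
func setEvent(root, subsystem, event string, on bool) error {
	val := "0"
	if on {
		val = "1"
	}
	return os.WriteFile(enablePath(root, subsystem, event), []byte(val), 0o644)
}

func main() {
	if err := setEvent(tracingRoot, "syscalls", "sys_enter_connect", true); err != nil {
		fmt.Fprintln(os.Stderr, "enable failed (needs root and a mounted debugfs):", err)
		return
	}
	// Always switch the event off again, even if reading the trace fails.
	defer setEvent(tracingRoot, "syscalls", "sys_enter_connect", false)

	time.Sleep(2 * time.Second) // collection window; the length is arbitrary

	out, err := os.ReadFile(filepath.Join(tracingRoot, "trace"))
	if err != nil {
		fmt.Fprintln(os.Stderr, "reading trace failed:", err)
		return
	}
	fmt.Print(string(out)) // each line names the process that called connect
}
```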
<h3 id="other"><a href="#other" class="headerlink" title="other"></a>other</h3><ol>
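<li>Tracepoints are static hooks compiled into the kernel, unlike kprobes, which instrument code dynamically. A probe attached to a tracepoint runs every time execution reaches that point.</li>
<li>SystemTap offers more powerful, programmable tracing.</li>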
</ol>
<p><a href="https://www.kernel.org/doc/Documentation/trace/tracepoints.txt" target="_blank" rel="external">https://www.kernel.org/doc/Documentation/trace/tracepoints.txt</a><br><a href="http://blog.csdn.net/trochiluses/article/details/10185951t" target="_blank" rel="external">http://blog.csdn.net/trochiluses/article/details/10185951t</a></p>
</div>
<footer class="article-footer">
<a data-url="http://yoursite.com/2016/12/20/tracepoint/" data-id="ciwxlcg8m00008t2lv34i5dsh" class="article-share-link">Share</a>
<ul class="article-tag-list"><li class="article-tag-list-item"><a class="article-tag-list-link" href="/tags/linux/">linux</a></li><li class="article-tag-list-item"><a class="article-tag-list-link" href="/tags/tool/">tool</a></li></ul>
</footer>
</div>
</article>
<article id="post-hdfs-node" class="article article-type-post" itemscope itemprop="blogPost">
<div class="article-meta">
<a href="/2016/11/26/hdfs-node/" class="article-date">
<time datetime="2016-11-25T16:00:00.000Z" itemprop="datePublished">2016-11-26</time>
</a>
<div class="article-category">
<a class="article-category-link" href="/categories/hadoop/">hadoop</a>
</div>
</div>
<div class="article-inner">
<header class="article-header">
<h1 itemprop="name">
<a class="article-title" href="/2016/11/26/hdfs-node/">HDFS notes</a>
</h1>
</header>
<div class="article-entry" itemprop="articleBody">
<h2 id="HDFS笔记"><a href="#HDFS笔记" class="headerlink" title="HDFS笔记"></a>HDFS notes</h2><blockquote>
<p>HDFS is designed more for batch processing rather than interactive use by users. The emphasis is on high throughput of data access rather than low latency of data access</p>
</blockquote>
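<h3 id="数据模型:"><a href="#数据模型:" class="headerlink" title="数据模型:"></a>Data model:</h3><ul>
<li>Write once, read many times; writing in the middle of a file is not supported. This suits MapReduce-style batch processing.</li>
<li>The namespace is separated from the data: the namenode maintains the namespace and the datanodes hold the data, which improves scalability.</li>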
</ul>
<h3 id="模块:"><a href="#模块:" class="headerlink" title="模块:"></a>Modules:</h3><h4 id="namenode"><a href="#namenode" class="headerlink" title="namenode"></a>namenode</h4><p>The namenode's metadata consists of the namespace, the file-to-block mapping (both persisted as the image plus the edit log), and the block-to-datanode mapping (not persisted; it is kept dynamically from what the datanodes report).</p>
<h4 id="datanode"><a href="#datanode" class="headerlink" title="datanode"></a>datanode</h4><p>Stores the actual data and maintains the details of each block it holds.</p>
<h4 id="journalnode"><a href="#journalnode" class="headerlink" title="journalnode"></a>journalnode</h4><p>Provides persistent storage for the namenode's edit log.</p>
<h3 id="数据流:"><a href="#数据流:" class="headerlink" title="数据流:"></a>Data flow:</h3><h4 id="client"><a href="#client" class="headerlink" title="client"></a>client</h4><h5 id="DistributeFileSystem"><a href="#DistributeFileSystem" class="headerlink" title="DistributeFileSystem"></a>DistributedFileSystem</h5><p>Exposes the user-facing operations (create, open, mkdir, delete, and so on); the actual interaction with the namenode is implemented in DFSClient. create returns a DFSOutputStream and open returns a DFSInputStream.</p>
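<h5 id="DFSOutputstream"><a href="#DFSOutputstream" class="headerlink" title="DFSOutputstream"></a>DFSOutputStream</h5><ol>
<li>A write first goes into a local buffer (FSOutputSummer); when the buffer is full, DFSOutputStream#writeChunk is called.</li>
<li>writeChunk packs the data into packets and hands them to the DataStreamer.</li>
<li>The DataStreamer stores the packets in dataQueue, and a background thread sends them.</li>
<li>The DataStreamer asks the namenode, through DFSClient, for block locations and sends the data along a pipeline.</li>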
</ol>
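<p>The first three steps can be sketched as a small producer/consumer program. This is only an illustration in Go, not the HDFS client code: the type outputStream, the helper writeAll, and the tiny chunk and packet sizes are all assumptions, and step 4 (asking the namenode for block locations and writing to the datanode pipeline) is reduced to appending to a slice.</p>

```go
package main

import (
	"fmt"
	"sync"
)

// packet groups several chunks; sizes used here are tiny, illustrative values.
type packet struct {
	seq  int
	data []byte
}

// outputStream is a hypothetical stand-in for DFSOutputStream.
type outputStream struct {
	chunkSize, chunksPerPacket int
	buf       []byte      // local buffer, plays the role of FSOutputSummer
	cur       packet      // packet currently being filled
	chunks    int         // number of chunks already in cur
	nextSeq   int         // sequence number for the next packet
	dataQueue chan packet // drained by the streamer goroutine
}

// Write buffers bytes locally and calls writeChunk whenever a chunk is full.
func (o *outputStream) Write(p []byte) {
	for _, b := range p {
		o.buf = append(o.buf, b)
		if len(o.buf) == o.chunkSize {
			o.writeChunk(o.buf)
			o.buf = nil
		}
	}
}

// writeChunk copies one chunk into the current packet and enqueues full packets.
func (o *outputStream) writeChunk(chunk []byte) {
	o.cur.data = append(o.cur.data, chunk...)
	o.chunks++
	if o.chunks == o.chunksPerPacket {
		o.enqueue()
	}
}

// enqueue hands the current packet to the streamer and starts a new one.
func (o *outputStream) enqueue() {
	o.cur.seq = o.nextSeq
	o.nextSeq++
	o.dataQueue <- o.cur
	o.cur, o.chunks = packet{}, 0
}

// Close flushes the partial chunk and packet, then stops the streamer.
func (o *outputStream) Close() {
	if len(o.buf) > 0 {
		o.writeChunk(o.buf)
		o.buf = nil
	}
	if o.chunks > 0 {
		o.enqueue()
	}
	close(o.dataQueue)
}

// writeAll pushes data through the stream and returns the packets that the
// background goroutine "sent", in order.
func writeAll(data []byte, chunkSize, chunksPerPacket int) []packet {
	o := &outputStream{
		chunkSize:       chunkSize,
		chunksPerPacket: chunksPerPacket,
		dataQueue:       make(chan packet, 4),
	}
	var sent []packet
	var wg sync.WaitGroup
	wg.Add(1)
	go func() { // background sender, like the DataStreamer thread
		defer wg.Done()
		for p := range o.dataQueue {
			sent = append(sent, p) // a real client would write to the datanode pipeline here
		}
	}()
	o.Write(data)
	o.Close()
	wg.Wait()
	return sent
}

func main() {
	for _, p := range writeAll([]byte("hello hdfs pipeline"), 4, 2) {
		fmt.Printf("packet %d: %q\n", p.seq, p.data)
	}
}
```

<p>Decoupling the writer from the sender through a queue means the caller only blocks when the queue is full, which is one plausible reason for the background-thread design described above.</p>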
<h5 id="DFSInputStream"><a href="#DFSInputStream" class="headerlink" title="DFSInputStream"></a>DFSInputStream</h5><p>Fetches the relevant block locations from the namenode and reads from those locations.</p>
<h5 id="DFSClient"><a href="#DFSClient" class="headerlink" title="DFSClient"></a>DFSClient</h5><p>Handles the protocol exchange with the namenode; see ClientProtocol for details.</p>
<h3 id="模块协议:"><a href="#模块协议:" class="headerlink" title="模块协议:"></a>Module protocols:</h3><h4 id="ClientProtocol"><a href="#ClientProtocol" class="headerlink" title="ClientProtocol"></a>ClientProtocol</h4><p>Covers the basic file operations such as create, mkdir, and delete,<br>plus HDFS-specific calls: getBlockLocations (backs the open call), addBlock (called once a block has been written, to obtain the locations for the next block; the implementation also treats it as the commit of the previous block), renewLease, and others.</p>
<h4 id="DataNodeProtocol"><a href="#DataNodeProtocol" class="headerlink" title="DataNodeProtocol"></a>DatanodeProtocol</h4><ol>
<li>registerDatanode: registers a datanode.</li>
<li>sendHeartbeat: the periodic heartbeat, which also serves as the channel through which the namenode sends commands to the datanode.</li>
<li>blockReport: the datanode reports all of its blocks to the namenode.</li>
<li>blockReceivedAndDeleted: sent to the namenode after the datanode receives new blocks or deletes blocks.</li>
</ol>
<h4 id="DataTransferProtocol"><a href="#DataTransferProtocol" class="headerlink" title="DataTransferProtocol"></a>DataTransferProtocol</h4><p>The actual data exchange. The sending side is Sender (called by the client, nn, and dn); the receiving side is DataXceiver (implemented in the dn).</p>
<ol>
<li>readBlock: reads a block.</li>
<li>writeBlock: called during a pipelined write.</li>
<li>transferBlock: copies a block to another datanode (unverified); the balancer may use it.</li>
</ol>
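<h4 id="namenode-1"><a href="#namenode-1" class="headerlink" title="namenode"></a>namenode</h4><p>NameNodeRpcServer implements all of the RPC protocols and proxies each call to the component that handles it.<br>DatanodeProtocol: the only channel between datanode and namenode (registration and heartbeats), handled by FSNamesystem.BlockManager.<br>ClientProtocol: the channel between the client and the namespace, handled by FSNamesystem.</p>
<p>todo: how the namenode/datanode data exchange is implemented</p>
<h3 id="机制:"><a href="#机制:" class="headerlink" title="机制:"></a>Mechanisms:</h3><h4 id="replica-放置"><a href="#replica-放置" class="headerlink" title="replica 放置"></a>replica placement</h4><p>Placement does not try to balance storage usage across machines; that is left to the balancer, which runs afterwards.<br>Open question: if the local disk is full, can a replica still be placed locally?</p>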
<blockquote>
<p>The purpose of a rack-aware replica placement policy is to improve data reliability, availability, and network bandwidth utilization</p>
<p>For the common case, when the replication factor is three, HDFS’s placement policy is to put one replica on one node in the local rack, another on a different node in the local rack, and the last on a different node in a different rack. This policy cuts the inter-rack write traffic which generally improves write performance. The chance of rack failure is far less than that of node failure; this policy does not impact data reliability and availability guarantees. However, it does reduce the aggregate network bandwidth used when reading data since a block is placed in only two unique racks rather than three. With this policy, the replicas of a file do not evenly distribute across the racks. One third of replicas are on one node, two thirds of replicas are on one rack, and the other third are evenly distributed across the remaining racks. This policy improves write performance without compromising data reliability or read performance.</p>
</blockquote>
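<h4 id="namenode-ha"><a href="#namenode-ha" class="headerlink" title="namenode ha"></a>namenode ha</h4><p>There is an active nn and a standby nn. The active nn does the actual work and writes the edit log to the journal nodes; the standby reads the edit log from the journal nodes to keep its state current, and periodically takes checkpoints.<br>Datanodes are configured with both namenodes and send heartbeats and block reports to both, so the standby already has current block locations when it takes over.<br>A zkfc process can be deployed on each namenode host for automatic failover.</p>
<h4 id="lease机制"><a href="#lease机制" class="headerlink" title="lease机制"></a>lease mechanism</h4><p>Guarantees that a file has only one writer at a time.</p>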
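<p>A lease-based single-writer guarantee can be sketched as follows. This is an illustrative Go model under stated assumptions, not the namenode's lease code: leaseManager, acquire, and renew are made-up names, time is a plain integer tick, and the 60-tick lifetime is arbitrary.</p>

```go
package main

import "fmt"

type lease struct {
	holder  string
	expires int64 // tick at which the lease stops being valid
}

// leaseManager is a hypothetical, single-threaded stand-in for lease tracking.
type leaseManager struct {
	ttl    int64
	leases map[string]lease // file path to current lease
}

func newLeaseManager(ttl int64) *leaseManager {
	return &leaseManager{ttl: ttl, leases: map[string]lease{}}
}

// acquire grants the write lease on path to client unless a different
// client holds a lease that has not expired yet.
func (m *leaseManager) acquire(path, client string, now int64) bool {
	if l, ok := m.leases[path]; ok && l.holder != client && now < l.expires {
		return false
	}
	m.leases[path] = lease{holder: client, expires: now + m.ttl}
	return true
}

// renew extends a lease; it fails if client is not the current holder or
// the lease has already expired.
func (m *leaseManager) renew(path, client string, now int64) bool {
	l, ok := m.leases[path]
	if !ok || l.holder != client || now >= l.expires {
		return false
	}
	l.expires = now + m.ttl
	m.leases[path] = l
	return true
}

func main() {
	m := newLeaseManager(60)
	fmt.Println("a acquires at t=0:", m.acquire("/data/f", "a", 0))
	fmt.Println("b acquires at t=10:", m.acquire("/data/f", "b", 10))
	fmt.Println("a renews at t=50:", m.renew("/data/f", "a", 50))
	fmt.Println("b acquires at t=200:", m.acquire("/data/f", "b", 200))
}
```

<p>The expiry is what lets another writer take over when the original writer crashes without closing the file.</p>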
<h4 id="safe-mode"><a href="#safe-mode" class="headerlink" title="safe mode"></a>safe mode</h4><blockquote>
<p>During start up the NameNode loads the file system state from the fsimage and the edits log file. It then waits for DataNodes to report their blocks so that it does not prematurely start replicating the blocks though enough replicas already exist in the cluster. During this time NameNode stays in Safemode. Safemode for the NameNode is essentially a read-only mode for the HDFS cluster, where it does not allow any modifications to file system or blocks. Normally the NameNode leaves Safemode automatically after the DataNodes have reported that most file system blocks are available. If required, HDFS could be placed in Safemode explicitly using bin/hadoop dfsadmin -safemode command. NameNode front page shows whether Safemode is on or off. A more detailed description and configuration is maintained as JavaDoc for setSafeMode().</p>
</blockquote>
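<p>The "most file system blocks are available" condition in the quote can be read as a simple threshold check. The Go sketch below is an assumption-laden illustration: canLeaveSafeMode is a made-up name, and the 0.999 threshold is used only as an example value.</p>

```go
package main

import "fmt"

// canLeaveSafeMode reports whether the fraction of blocks reported by
// datanodes has reached the threshold. A namespace with no blocks has
// nothing to wait for.
func canLeaveSafeMode(reported, total int, threshold float64) bool {
	if total == 0 {
		return true
	}
	return float64(reported) >= threshold*float64(total)
}

func main() {
	const threshold = 0.999 // example value, chosen for illustration
	for _, reported := range []int{0, 900, 1000} {
		fmt.Printf("reported %d of 1000, leave safemode: %v\n",
			reported, canLeaveSafeMode(reported, 1000, threshold))
	}
}
```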
<h4 id="why-pipeline-write?"><a href="#why-pipeline-write?" class="headerlink" title="why pipeline write?"></a>why pipeline write?</h4><p>It minimizes network overhead across the cluster. If the client sent every replica itself, the client machine would carry too much of the load.</p>
<h4 id="为什么不保存机器和block的映射关系?"><a href="#为什么不保存机器和block的映射关系?" class="headerlink" title="为什么不保存机器和block的映射关系?"></a>Why isn't the machine-to-block mapping persisted?</h4><p>The GFS paper gives the reasoning:</p><blockquote><p>We initially attempted to keep chunk location information persistently at the master, but we decided that it was much simpler to request the data from chunkservers at startup, and periodically thereafter. This eliminated the problem of keeping the master and chunkservers in sync as chunkservers join and leave the cluster, change names, fail, restart, and so on.</p></blockquote>
<h3 id="问题:"><a href="#问题:" class="headerlink" title="问题:"></a>Problems:</h3><ol>
<li><p>When a batch job starts, hundreds of clients read the same file at once. The file's replica count can be raised manually to cope.</p>
</li>
<li><p>When programs start up and hit HDFS at the same time, local buffering makes memory usage spike. (NM)</p>
</li>
</ol>
<p><a href="http://itm-vm.shidler.hawaii.edu/HDFS/ArchDocCommunication.html" target="_blank" rel="external">http://itm-vm.shidler.hawaii.edu/HDFS/ArchDocCommunication.html</a><br><a href="http://blog.csdn.net/anzhsoft/article/details/23428355" target="_blank" rel="external">http://blog.csdn.net/anzhsoft/article/details/23428355</a><br><a href="https://hadoop.apache.org/docs/r2.7.2/hadoop-project-dist/hadoop-hdfs/HDFSHighAvailabilityWithQJM.html" target="_blank" rel="external">https://hadoop.apache.org/docs/r2.7.2/hadoop-project-dist/hadoop-hdfs/HDFSHighAvailabilityWithQJM.html</a></p>
</div>
<footer class="article-footer">
<a data-url="http://yoursite.com/2016/11/26/hdfs-node/" data-id="ciwxlcg94000e8t2lr3ytvhbh" class="article-share-link">Share</a>
<ul class="article-tag-list"><li class="article-tag-list-item"><a class="article-tag-list-link" href="/tags/hadoop/">hadoop</a></li><li class="article-tag-list-item"><a class="article-tag-list-link" href="/tags/hdfs/">hdfs</a></li><li class="article-tag-list-item"><a class="article-tag-list-link" href="/tags/存储/">storage</a></li></ul>
</footer>
</div>
</article>
<article id="post-k8s-apiserver" class="article article-type-post" itemscope itemprop="blogPost">
<div class="article-meta">
<a href="/2016/10/17/k8s-apiserver/" class="article-date">
<time datetime="2016-10-16T16:00:00.000Z" itemprop="datePublished">2016-10-17</time>
</a>
<div class="article-category">
<a class="article-category-link" href="/categories/k8s/">k8s</a>
</div>
</div>
<div class="article-inner">
<header class="article-header">
<h1 itemprop="name">
<a class="article-title" href="/2016/10/17/k8s-apiserver/">k8s apiserver analysis</a>
</h1>
</header>
<div class="article-entry" itemprop="articleBody">
<h1 id="k8s-apiserver"><a href="#k8s-apiserver" class="headerlink" title="k8s apiserver"></a>k8s apiserver</h1><blockquote>
<p>The Kubernetes API server validates and configures data for the api objects which include pods, services, replicationcontrollers, and others. The API Server services REST operations and provides the frontend to the cluster’s shared state through which all other components interact.</p>
</blockquote>
<h2 id="resource-amp-amp-Group-amp-amp-version"><a href="#resource-amp-amp-Group-amp-amp-version" class="headerlink" title="resource && Group && version"></a>resource && Group && version</h2><h3 id="resource"><a href="#resource" class="headerlink" title="resource"></a>resource</h3><h4 id="resource描述"><a href="#resource描述" class="headerlink" title="resource描述"></a>Describing a resource</h4><p>Resources are objects such as pods and services; a resource is the smallest unit stored in etcd.<br>A resource description usually has four parts:</p>
<ol>
<li>TypeMeta: metadata about the resource type, that is, its kind and the Group/version it belongs to</li>
<li>ObjectMeta: metadata about the object itself, such as its name, labels, and annotations</li>
<li>Spec: the desired state of the object</li>
<li>Status: the actual state of the object</li>
</ol>
<figure class="highlight go"><table><tr><td class="gutter"><pre><div class="line">1</div><div class="line">2</div><div class="line">3</div><div class="line">4</div><div class="line">5</div><div class="line">6</div></pre></td><td class="code"><pre><div class="line"><span class="keyword">type</span> Pod <span class="keyword">struct</span> {</div><div class="line"> unversioned.TypeMeta <span class="string">`json:",inline"`</span></div><div class="line"> ObjectMeta <span class="string">`json:"metadata,omitempty" protobuf:"bytes,1,opt,name=metadata"`</span></div><div class="line"> Spec PodSpec <span class="string">`json:"spec,omitempty" protobuf:"bytes,2,opt,name=spec"`</span></div><div class="line"> Status PodStatus <span class="string">`json:"status,omitempty" protobuf:"bytes,3,opt,name=status"`</span></div><div class="line">}</div></pre></td></tr></table></figure>
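<h3 id="Group"><a href="#Group" class="headerlink" title="Group"></a>Group</h3><p>Resources with related functionality are usually placed in the same Group; for example, the batch Group contains Job and ScheduledJob. Less mature resources go into extensions.</p>
<h3 id="version"><a href="#version" class="headerlink" title="version"></a>version</h3><p>Each Group can have several versions (for upgrades and compatibility). A resource belongs to a version, and a version belongs to a Group. To move to a new version, the resources of the previous version must be implemented in the new one, together with conversion functions that convert the same resource between versions.<br>Every Group must also provide an unversioned form of each resource, which the apiserver and other components work with.</p>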
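<p>Conversion through an unversioned form can be sketched like this. The Go types JobV1, JobV2, and JobInternal and the four conversion functions are hypothetical and only illustrate the idea; they are not the Kubernetes Scheme API.</p>

```go
package main

import "fmt"

// Hypothetical resource in two external versions plus the unversioned form.
type JobV1 struct {
	Name     string
	Parallel int
}

type JobV2 struct {
	Name        string
	Parallelism int
	Completions int
}

type JobInternal struct {
	Name        string
	Parallelism int
	Completions int
}

// Each version only needs conversions to and from the unversioned form,
// so adding a version does not require pairwise converters.
func v1ToInternal(in JobV1) JobInternal {
	// v1 has no Completions field; 1 is an assumed default.
	return JobInternal{Name: in.Name, Parallelism: in.Parallel, Completions: 1}
}

func internalToV1(in JobInternal) JobV1 {
	return JobV1{Name: in.Name, Parallel: in.Parallelism}
}

func v2ToInternal(in JobV2) JobInternal {
	return JobInternal{Name: in.Name, Parallelism: in.Parallelism, Completions: in.Completions}
}

func internalToV2(in JobInternal) JobV2 {
	return JobV2{Name: in.Name, Parallelism: in.Parallelism, Completions: in.Completions}
}

func main() {
	old := JobV1{Name: "backup", Parallel: 3}
	upgraded := internalToV2(v1ToInternal(old))
	fmt.Printf("%+v\n", upgraded)
}
```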
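<h2 id="ApiServer分层"><a href="#ApiServer分层" class="headerlink" title="ApiServer分层"></a>ApiServer layers</h2><ol>
<li>REST layer: exposes the REST API to users.</li>
<li>Resource storage layer: the implementation of each resource. Every REST endpoint is tied to a storage, which provides Create/Update/Delete/Get/Watch and so on. Storages are usually built on generic#store: each resource only implements its own strategy and generic#store calls back into it, so a resource author only has to care about the strategy.</li>
<li>Cache layer: with --watch-cache enabled there is an extra cache layer (cacher.go); otherwise generic#store operates on the raw storage directly.</li>
<li>Raw storage layer: talks to etcd. It holds the etcd cluster and version information and writes data straight to etcd.</li>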
</ol>
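<h3 id="REST接口"><a href="#REST接口" class="headerlink" title="REST接口"></a>REST interface</h3><p>api_installer.go#registerResourceHandlers binds each resource to its concrete REST API. Each handler:</p>
<ol>
<li>Creates a decoder, which decodes the byte stream of a versioned object into an unversioned object.</li>
<li>Runs admission control where needed (CREATE/UPDATE/DELETE).</li>
<li>Calls the matching resource storage method (create for a create request, update for an update, and so on).</li>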
</ol>
<h3 id="resource-storage层"><a href="#resource-storage层" class="headerlink" title="resource storage层"></a>resource storage layer</h3><h4 id="storage向apiserver注册"><a href="#storage向apiserver注册" class="headerlink" title="storage向apiserver注册"></a>Registering storage with the apiserver</h4><p>Each ApiGroup creates an ApiGroupInfo, which holds the group's versions and a resource map for each version. The ApiServer uses the ApiGroupInfo to bind the group to REST endpoints.<br><em>master.go#installApis</em> is the entry point for registration: it creates each ApiGroup and registers it.</p>
<h5 id="1-创建ApiGroupInfo"><a href="#1-创建ApiGroupInfo" class="headerlink" title="1. 创建ApiGroupInfo"></a>1. Create the ApiGroupInfo</h5><p>groupMeta is the GroupMeta that each Group registers in its install.go. Scheme, ParameterCodec, and NegotiatedSerializer are global variables and need no extra setup.<br>VersionedResourcesStorageMap records which resources each GroupVersion has and the storage implementation for each.</p>
<figure class="highlight go"><table><tr><td class="gutter"><pre><div class="line">1</div><div class="line">2</div><div class="line">3</div><div class="line">4</div><div class="line">5</div><div class="line">6</div><div class="line">7</div><div class="line">8</div><div class="line">9</div><div class="line">10</div><div class="line">11</div><div class="line">12</div><div class="line">13</div><div class="line">14</div></pre></td><td class="code"><pre><div class="line"><span class="comment">// genericapiserver.go</span></div><div class="line"></div><div class="line"><span class="keyword">type</span> APIGroupInfo <span class="keyword">struct</span> {</div><div class="line"> GroupMeta apimachinery.GroupMeta</div><div class="line"> VersionedResourcesStorageMap <span class="keyword">map</span>[<span class="keyword">string</span>]<span class="keyword">map</span>[<span class="keyword">string</span>]rest.Storage <span class="comment">// version, then resource, to storage</span></div><div class="line"> IsLegacyGroup <span class="keyword">bool</span></div><div class="line"> OptionsExternalVersion *unversioned.GroupVersion</div><div class="line"></div><div class="line"> Scheme *runtime.Scheme</div><div class="line"> NegotiatedSerializer runtime.NegotiatedSerializer</div><div class="line"> ParameterCodec runtime.ParameterCodec</div><div class="line"></div><div class="line"> SubresourceGroupVersionKind <span class="keyword">map</span>[<span class="keyword">string</span>]unversioned.GroupVersionKind</div><div class="line">}</div></pre></td></tr></table></figure>
<h5 id="2-从ApiGroupInfo生成ApiGroupVersion,对应每个版本的信息"><a href="#2-从ApiGroupInfo生成ApiGroupVersion,对应每个版本的信息" class="headerlink" title="2. 从ApiGroupInfo生成ApiGroupVersion,对应每个版本的信息"></a>2. Build an ApiGroupVersion from the ApiGroupInfo, one per version</h5><p>An apiGroupInfo describes a whole ApiGroup; an ApiGroupVersion describes one specific version of it.<br><figure class="highlight go"><table><tr><td class="gutter"><pre><div class="line">1</div><div class="line">2</div><div class="line">3</div><div class="line">4</div><div class="line">5</div><div class="line">6</div><div class="line">7</div><div class="line">8</div><div class="line">9</div><div class="line">10</div><div class="line">11</div><div class="line">12</div><div class="line">13</div><div class="line">14</div><div class="line">15</div><div class="line">16</div><div class="line">17</div><div class="line">18</div><div class="line">19</div><div class="line">20</div><div class="line">21</div></pre></td><td class="code"><pre><div class="line"><span class="function"><span class="keyword">func</span> <span class="params">(s *GenericAPIServer)</span> <span class="title">newAPIGroupVersion</span><span class="params">(apiGroupInfo *APIGroupInfo, groupVersion unversioned.GroupVersion)</span> <span class="params">(*apiserver.APIGroupVersion, error)</span></span> {</div><div class="line"> <span class="keyword">return</span> &apiserver.APIGroupVersion{</div><div class="line"> RequestInfoResolver: s.NewRequestInfoResolver(),</div><div class="line"></div><div class="line"> GroupVersion: groupVersion,</div><div class="line"></div><div class="line"> ParameterCodec: apiGroupInfo.ParameterCodec,</div><div class="line"> Serializer: apiGroupInfo.NegotiatedSerializer,</div><div class="line"> Creater: apiGroupInfo.Scheme,</div><div class="line"> Convertor: apiGroupInfo.Scheme,</div><div class="line"> Copier: apiGroupInfo.Scheme,</div><div class="line"> Typer: apiGroupInfo.Scheme,</div><div class="line"> SubresourceGroupVersionKind: apiGroupInfo.SubresourceGroupVersionKind,</div><div class="line"> 
Linker: apiGroupInfo.GroupMeta.SelfLinker,</div><div class="line"> Mapper: apiGroupInfo.GroupMeta.RESTMapper,</div><div class="line"></div><div class="line"> Admit: s.AdmissionControl,</div><div class="line"> Context: s.RequestContextMapper,</div><div class="line"> MinRequestTimeout: s.MinRequestTimeout,</div><div class="line"> }, <span class="literal">nil</span></div><div class="line">}</div></pre></td></tr></table></figure></p>
<h5 id="3-ApiGroupVersion中的每个资源注册到apiserver"><a href="#3-ApiGroupVersion中的每个资源注册到apiserver" class="headerlink" title="3. ApiGroupVersion中的每个资源注册到apiserver"></a>3. Register each resource in the ApiGroupVersion with the apiserver</h5><p>api_installer.go#registerResourceHandlers binds each resource in the storage map below to its storage:<br><figure class="highlight go"><table><tr><td class="gutter"><pre><div class="line">1</div><div class="line">2</div></pre></td><td class="code"><pre><div class="line">petsetStorage, petsetStatusStorage := petsetetcd.NewREST(restOptionsGetter(apps.Resource(<span class="string">"petsets"</span>)))</div><div class="line">storage[<span class="string">"petsets"</span>] = petsetStorage</div></pre></td></tr></table></figure></p>
<h4 id="generic-storage实现"><a href="#generic-storage实现" class="headerlink" title="generic storage实现"></a>generic storage implementation</h4><p>The generic store runs a number of hooks before the actual etcd operation, so a resource only has to implement its strategy.<br>The implementation lives in <em>generic/registry/store.go</em>.</p>
<h5 id="create"><a href="#create" class="headerlink" title="create"></a>create</h5><ol>
<li>Run rest#BeforeCreate:<br>1.1 strategy.PrepareForCreate<br>1.2 generate a UID, and generate a name if the object has none<br>1.3 strategy.Validate checks that the resource is valid</li>
<li>The underlying storage performs the create.</li>
<li>Run the AfterCreate callback.</li>
</ol>
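<p>The create flow above can be sketched in a few lines of Go. This is a simplified model, not the code in store.go: the object type, the trimmed createStrategy interface, the in-memory map standing in for etcd, and the counter-based UID and name generation are all assumptions made for illustration.</p>

```go
package main

import (
	"errors"
	"fmt"
)

// object is a hypothetical, flattened resource.
type object struct {
	Name, GenerateName, UID, Status string
}

// createStrategy is a trimmed stand-in for a per-resource create strategy.
type createStrategy interface {
	PrepareForCreate(obj *object)
	Validate(obj *object) error
}

// store plays the role of the generic store; data stands in for raw storage.
type store struct {
	strategy    createStrategy
	data        map[string]object
	afterCreate func(obj *object)
	nextID      int
}

func newStore(s createStrategy) *store {
	return &store{strategy: s, data: map[string]object{}}
}

// beforeCreate runs the hooks from step 1.
func (s *store) beforeCreate(obj *object) error {
	s.strategy.PrepareForCreate(obj) // 1.1 normalize the object
	s.nextID++
	obj.UID = fmt.Sprintf("uid-%d", s.nextID) // 1.2 assign a UID
	if obj.Name == "" && obj.GenerateName != "" {
		obj.Name = fmt.Sprintf("%s%d", obj.GenerateName, s.nextID) // 1.2 generate a name
	}
	return s.strategy.Validate(obj) // 1.3 validate
}

// Create runs the hooks, writes to storage, then fires the callback.
func (s *store) Create(obj *object) error {
	if err := s.beforeCreate(obj); err != nil {
		return err
	}
	if _, exists := s.data[obj.Name]; exists {
		return errors.New("already exists: " + obj.Name)
	}
	s.data[obj.Name] = *obj // step 2: the underlying storage performs the create
	if s.afterCreate != nil {
		s.afterCreate(obj) // step 3
	}
	return nil
}

// podStrategy is the only resource-specific code in this sketch.
type podStrategy struct{}

func (podStrategy) PrepareForCreate(obj *object) { obj.Status = "Pending" }

func (podStrategy) Validate(obj *object) error {
	if obj.Name == "" {
		return errors.New("name or generateName is required")
	}
	return nil
}

func main() {
	s := newStore(podStrategy{})
	s.afterCreate = func(obj *object) { fmt.Println("created", obj.Name) }
	pod := &object{GenerateName: "web-"}
	if err := s.Create(pod); err != nil {
		fmt.Println("create failed:", err)
		return
	}
	fmt.Printf("%+v\n", *pod)
}
```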
<h5 id="update-todo"><a href="#update-todo" class="headerlink" title="update todo"></a>update todo</h5><h5 id="Delete-todo"><a href="#Delete-todo" class="headerlink" title="Delete todo"></a>Delete todo</h5><h5 id="Get-todo"><a href="#Get-todo" class="headerlink" title="Get todo"></a>Get todo</h5><h5 id="rest-策略层"><a href="#rest-策略层" class="headerlink" title="rest 策略层"></a>rest strategy layer</h5><p>Provides the BeforeUpdate/BeforeCreate/BeforeDelete implementations, which call back into each resource's create and update strategies.<br><figure class="highlight go"><table><tr><td class="gutter"><pre><div class="line">1</div><div class="line">2</div><div class="line">3</div><div class="line">4</div><div class="line">5</div><div class="line">6</div><div class="line">7</div><div class="line">8</div><div class="line">9</div><div class="line">10</div><div class="line">11</div><div class="line">12</div><div class="line">13</div><div class="line">14</div><div class="line">15</div><div class="line">16</div><div class="line">17</div><div class="line">18</div><div class="line">19</div><div class="line">20</div><div class="line">21</div><div class="line">22</div><div class="line">23</div><div class="line">24</div><div class="line">25</div><div class="line">26</div><div class="line">27</div><div class="line">28</div><div class="line">29</div><div class="line">30</div><div class="line">31</div><div class="line">32</div><div class="line">33</div><div class="line">34</div><div class="line">35</div><div class="line">36</div><div class="line">37</div><div class="line">38</div><div class="line">39</div><div class="line">40</div><div class="line">41</div><div class="line">42</div><div class="line">43</div><div class="line">44</div><div class="line">45</div><div class="line">46</div><div class="line">47</div><div class="line">48</div><div class="line">49</div><div class="line">50</div></pre></td><td class="code"><pre><div class="line"><span class="keyword">type</span> RESTCreateStrategy <span class="keyword">interface</span> {</div><div class="line"> 
runtime.ObjectTyper</div><div class="line"> <span class="comment">// The name generate is used when the standard GenerateName field is set.</span></div><div class="line"> <span class="comment">// The NameGenerator will be invoked prior to validation.</span></div><div class="line"> api.NameGenerator</div><div class="line"></div><div class="line"> <span class="comment">// NamespaceScoped returns true if the object must be within a namespace.</span></div><div class="line"> NamespaceScoped() <span class="keyword">bool</span></div><div class="line"> <span class="comment">// PrepareForCreate is invoked on create before validation to normalize</span></div><div class="line"> <span class="comment">// the object. For example: remove fields that are not to be persisted,</span></div><div class="line"> <span class="comment">// sort order-insensitive list fields, etc. This should not remove fields</span></div><div class="line"> <span class="comment">// whose presence would be considered a validation error.</span></div><div class="line"> PrepareForCreate(ctx api.Context, obj runtime.Object)</div><div class="line"> <span class="comment">// Validate is invoked after default fields in the object have been filled in before</span></div><div class="line"> <span class="comment">// the object is persisted. This method should not mutate the object.</span></div><div class="line"> Validate(ctx api.Context, obj runtime.Object) field.ErrorList</div><div class="line"> <span class="comment">// Canonicalize is invoked after validation has succeeded but before the</span></div><div class="line"> <span class="comment">// object has been persisted. 
This method may mutate the object.</span></div><div class="line"> Canonicalize(obj runtime.Object)</div><div class="line">}</div><div class="line"></div><div class="line"><span class="keyword">type</span> RESTGracefulDeleteStrategy <span class="keyword">interface</span> {</div><div class="line"> <span class="comment">// CheckGracefulDelete should return true if the object can be gracefully deleted and set</span></div><div class="line"> <span class="comment">// any default values on the DeleteOptions.</span></div><div class="line"> CheckGracefulDelete(ctx api.Context, obj runtime.Object, options *api.DeleteOptions) <span class="keyword">bool</span></div><div class="line">}</div><div class="line"></div><div class="line"><span class="keyword">type</span> RESTUpdateStrategy <span class="keyword">interface</span> {</div><div class="line"> runtime.ObjectTyper</div><div class="line"> <span class="comment">// NamespaceScoped returns true if the object must be within a namespace.</span></div><div class="line"> NamespaceScoped() <span class="keyword">bool</span></div><div class="line"> <span class="comment">// AllowCreateOnUpdate returns true if the object can be created by a PUT.</span></div><div class="line"> AllowCreateOnUpdate() <span class="keyword">bool</span></div><div class="line"> <span class="comment">// PrepareForUpdate is invoked on update before validation to normalize</span></div><div class="line"> <span class="comment">// the object. For example: remove fields that are not to be persisted,</span></div><div class="line"> <span class="comment">// sort order-insensitive list fields, etc. 
This should not remove fields</span></div><div class="line"> <span class="comment">// whose presence would be considered a validation error.</span></div><div class="line"> PrepareForUpdate(ctx api.Context, obj, old runtime.Object)</div><div class="line"> <span class="comment">// ValidateUpdate is invoked after default fields in the object have been</span></div><div class="line"> <span class="comment">// filled in before the object is persisted. This method should not mutate</span></div><div class="line"> <span class="comment">// the object.</span></div><div class="line"> ValidateUpdate(ctx api.Context, obj, old runtime.Object) field.ErrorList</div><div class="line"> <span class="comment">// Canonicalize is invoked after validation has succeeded but before the</span></div><div class="line"> <span class="comment">// object has been persisted. This method may mutate the object.</span></div><div class="line"> Canonicalize(obj runtime.Object)</div><div class="line"> <span class="comment">// AllowUnconditionalUpdate returns true if the object can be updated</span></div><div class="line"> <span class="comment">// unconditionally (irrespective of the latest resource version), when</span></div><div class="line"> <span class="comment">// there is no resource version specified in the object.</span></div><div class="line"> AllowUnconditionalUpdate() <span class="keyword">bool</span></div><div class="line">}</div></pre></td></tr></table></figure></p>
<h3 id="cacher层"><a href="#cacher层" class="headerlink" title="cacher层"></a>cacher layer</h3><p>Caches watch requests; the other operations (Get/Update/Create/Delete) go straight to the raw storage layer.</p>
<h4 id="创建storage"><a href="#创建storage" class="headerlink" title="创建storage"></a>Creating the storage</h4><p>When master.go builds generic.RESTOptions, it sets the Decorator field from StorageDecorator().</p>
<figure class="highlight go"><table><tr><td class="gutter"><pre><div class="line">1</div><div class="line">2</div><div class="line">3</div><div class="line">4</div><div class="line">5</div><div class="line">6</div><div class="line">7</div></pre></td><td class="code"><pre><div class="line">genericapiserver.<span class="keyword">go</span></div><div class="line"><span class="function"><span class="keyword">func</span> <span class="params">(s *GenericAPIServer)</span> <span class="title">StorageDecorator</span><span class="params">()</span> <span class="title">generic</span>.<span class="title">StorageDecorator</span></span> {</div><div class="line"> <span class="keyword">if</span> s.enableWatchCache {</div><div class="line"> <span class="keyword">return</span> registry.StorageWithCacher</div><div class="line"> }</div><div class="line"> <span class="keyword">return</span> generic.UndecoratedStorage</div><div class="line">}</div></pre></td></tr></table></figure>
<figure class="highlight go"><table><tr><td class="gutter"><pre><div class="line">1</div><div class="line">2</div><div class="line">3</div><div class="line">4</div><div class="line">5</div><div class="line">6</div><div class="line">7</div></pre></td><td class="code"><pre><div class="line">master.<span class="keyword">go</span></div><div class="line">generic.RESTOptions{</div><div class="line"> StorageConfig: storageConfig,</div><div class="line"> Decorator: m.StorageDecorator(),</div><div class="line"> DeleteCollectionWorkers: m.deleteCollectionWorkers,</div><div class="line"> ResourcePrefix: c.StorageFactory.ResourcePrefix(resource),</div><div class="line"> }</div></pre></td></tr></table></figure>
<figure class="highlight go"><table><tr><td class="gutter"><pre><div class="line">1</div><div class="line">2</div><div class="line">3</div><div class="line">4</div><div class="line">5</div><div class="line">6</div><div class="line">7</div><div class="line">8</div><div class="line">9</div></pre></td><td class="code"><pre><div class="line"><span class="keyword">type</span> RESTOptions <span class="keyword">struct</span> {</div><div class="line"> <span class="comment">// etcd configuration: backend type (etcd2/etcd3), server location, key prefix, and the codec that converts between the in-memory version and the storage version of a resource</span></div><div class="line"> StorageConfig *storagebackend.Config</div><div class="line"> <span class="comment">// Storage decorator: a func that builds the concrete storage interface, either StorageWithCacher or UndecoratedStorage</span></div><div class="line"> Decorator StorageDecorator</div><div class="line"> DeleteCollectionWorkers <span class="keyword">int</span></div><div class="line"></div><div class="line"> ResourcePrefix <span class="keyword">string</span></div><div class="line">}</div></pre></td></tr></table></figure>
<ul>
<li>Each group has to create its own store object, calling RESTOptions.Decorator to build the storage (cacher or raw).</li>
</ul>
<figure class="highlight go"><table><tr><td class="gutter"><pre><div class="line">1</div><div class="line">2</div><div class="line">3</div><div class="line">4</div><div class="line">5</div><div class="line">6</div><div class="line">7</div><div class="line">8</div><div class="line">9</div><div class="line">10</div><div class="line">11</div><div class="line">12</div><div class="line">13</div></pre></td><td class="code"><pre><div class="line"> registry#tapp/etcd#etcd.<span class="keyword">go</span></div><div class="line">storageInterface, _ := opts.Decorator(</div><div class="line"> opts.StorageConfig,</div><div class="line"> cachesize.GetWatchCacheSizeByResource(cachesize.TApp),</div><div class="line"> &gaiaapi.TApp{},</div><div class="line"> prefix,</div><div class="line"> tapp.Strategy,</div><div class="line"> newListFunc,</div><div class="line"> storage.NoTriggerPublisher,</div><div class="line">)</div><div class="line"> store := &registry.Store{</div><div class="line"> Storage: storageInterface,</div><div class="line"> }</div></pre></td></tr></table></figure>
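<p>The decorator idea above can be sketched in isolation. This is a minimal illustration, not the apiserver code: the <code>Storage</code> interface, <code>rawStorage</code>, <code>cacher</code>, and <code>pickDecorator</code> are hypothetical, reduced names, and the real decorator takes many more parameters (see the call above).</p>

```go
package main

import "fmt"

// Storage is a hypothetical, heavily reduced stand-in for the storage interface.
type Storage interface {
	Create(key, value string)
	Watch(key string) string
}

// rawStorage plays the role of the etcd-backed raw storage layer.
type rawStorage struct{ data map[string]string }

func (r *rawStorage) Create(key, value string) { r.data[key] = value }
func (r *rawStorage) Watch(key string) string  { return "raw watch on " + key }

// cacher wraps a raw storage: writes pass through, watches are served by the cache.
type cacher struct{ raw Storage }

func (c *cacher) Create(key, value string) { c.raw.Create(key, value) }
func (c *cacher) Watch(key string) string  { return "cached watch on " + key }

// storageDecorator turns a raw storage into the storage the registry will use.
type storageDecorator func(raw Storage) Storage

func undecorated(raw Storage) Storage { return raw }
func withCacher(raw Storage) Storage  { return &cacher{raw: raw} }

// pickDecorator mirrors the enableWatchCache switch shown earlier.
func pickDecorator(enableWatchCache bool) storageDecorator {
	if enableWatchCache {
		return withCacher
	}
	return undecorated
}

func main() {
	raw := &rawStorage{data: map[string]string{}}
	store := pickDecorator(true)(raw)
	store.Create("/tapps/demo", "{}")
	fmt.Println(store.Watch("/tapps"), len(raw.data))
}
```

<p>The registry only ever sees the <code>Storage</code> interface, so switching the watch cache on or off does not change any caller.</p>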
<h4 id="cacher实现-todo"><a href="#cacher实现-todo" class="headerlink" title="cacher实现 todo"></a>Cacher implementation (TODO)</h4><h3 id="raw-storage层"><a href="#raw-storage层" class="headerlink" title="raw storage层"></a>Raw storage layer</h3><p>This layer is the concrete implementation on top of the etcd library. The etcd2 implementation is in storage/etcd_helper.go and the etcd3 implementation is in storage/etcd3/store.go.<br>Note:<br>1. Before writing, the encoder converts the unversioned resource into a versioned resource byte stream.<br>2. No field stores ResourceVersion; the etcd modified index is used instead.</p>
<h2 id="what-happend-when-create-a-resource?"><a href="#what-happend-when-create-a-resource?" class="headerlink" title="what happend when create a resource?"></a>What happens when a resource is created?</h2><ol>
<li>The client sends a REST API request to create a petset.</li>
<li>apiserver runs the restHandler#createHandler callback, which decodes the byte stream into an unversioned resource object; once admission control passes, it calls generic#store.Create.</li>
<li>generic#store.Create follows the flow described earlier and calls cacher#create.</li>
<li>cacher#create does no processing of its own and calls straight into raw storage; with etcd2 that is etcd_helper#create.</li>
<li>The raw storage etcd helper converts the unversioned object into a versioned object and stores it in etcd.</li>
<li>Raw storage decodes the value returned by etcd back into an unversioned resource and sets the object's resourceVersion from the returned modified index.</li>
<li>generic#store runs the AfterCreate callback.</li>
<li>apiserver converts the unversioned resource into a versioned resource and returns it.</li>
</ol>
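<h3 id="scheme"><a href="#scheme" class="headerlink" title="scheme"></a>scheme</h3><p>The scheme records the mapping between GroupVersionKind and Go types.<br>It is mainly used to convert resources between versions.<br>Important interfaces:</p>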
<ol>
<li>addKnownTypes </li>
<li>addDefaultFuncs</li>
<li>addConversionFuncs </li>
<li>AddFieldLabelConversionFunc field label?</li>
</ol>
</div>
<footer class="article-footer">
<ul class="article-tag-list"><li class="article-tag-list-item"><a class="article-tag-list-link" href="/tags/apiserver/">apiserver</a></li><li class="article-tag-list-item"><a class="article-tag-list-link" href="/tags/k8s/">k8s</a></li></ul>
</footer>
</div>
</article>
<article id="post-controller-manager" class="article article-type-post" itemscope itemprop="blogPost">
<div class="article-meta">
<a href="/2016/09/26/controller-manager/" class="article-date">
<time datetime="2016-09-25T16:00:00.000Z" itemprop="datePublished">2016-09-26</time>
</a>
<div class="article-category">
<a class="article-category-link" href="/categories/k8s/">k8s</a>
</div>
</div>
<div class="article-inner">
<header class="article-header">
<h1 itemprop="name">
<a class="article-title" href="/2016/09/26/controller-manager/">k8s controller manager analysis</a>
</h1>
</header>
<div class="article-entry" itemprop="articleBody">
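<h1 id="controller-manager"><a href="#controller-manager" class="headerlink" title="controller manager"></a>controller manager</h1><p>A k8s user only describes the desired state of an object; the system then acts to make the real state match the desired state.<br>The controller manager is responsible for reconciling the state of each resource, and the actual logic is implemented by independent, single-purpose controllers.</p>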
<p>PRE NOTE</p>
<ol>
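<li>Where this article says a controller watches a resource, the actual implementation may be listAndWatch.</li>
<li>When a change to one resource requires finding a related resource, label matching is used.</li>
<li>Updating or deleting a resource generally means doing so through the apiserver API, which eventually writes the change back to etcd.</li>
<li>During resource synchronization, all operations take place within a single namespace.</li>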
</ol>
<h2 id="controller介绍"><a href="#controller介绍" class="headerlink" title="controller介绍"></a>Controller overview</h2><h3 id="replication-controller"><a href="#replication-controller" class="headerlink" title="replication controller"></a>replication controller</h3><p>Purpose: keep the number of alive pods in the system equal to the number the rc expects (rc.spec.replicas).<br>Implementation:<br>Watch rc and pod objects in apiserver; when a pod or rc changes, find the corresponding rc (by label matching) and synchronize it.<br>Synchronization process:</p>
<ul>
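<li>If the number of pods the rc expects, N1 (rc.spec.replicas), differs from the number of alive pods in podCache, N2 (pod.status.phase not in (FAILED, SUCCEEDED) and pod.DeletionTimestamp == null), call the apiserver API to create or delete pods.</li>
<li>Call the apiserver API to update the rc.status.replicas field to N2.</li>
<li>Creating or deleting pods triggers another round of synchronization, until eventually N1 == N2.</li>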
</ul>
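<p>The comparison between N1 and N2 can be sketched as follows. This is a simplified illustration under stated assumptions: the <code>Pod</code> struct and the <code>Deleting</code> flag (standing in for a set DeletionTimestamp) are hypothetical, not the real API types.</p>

```go
package main

import "fmt"

// Pod is a hypothetical, reduced pod: only the fields the sync decision needs.
type Pod struct {
	Phase    string
	Deleting bool // stands in for pod.DeletionTimestamp being set
}

// isAlive reports whether a pod counts towards N2.
func isAlive(p Pod) bool {
	return p.Phase != "Failed" && p.Phase != "Succeeded" && !p.Deleting
}

// replicaDiff returns N2 (alive pods) and N1-N2.
// A positive diff means pods must be created, a negative one that pods must be deleted.
func replicaDiff(desired int, pods []Pod) (alive, diff int) {
	for _, p := range pods {
		if isAlive(p) {
			alive++
		}
	}
	return alive, desired - alive
}

func main() {
	pods := []Pod{
		{Phase: "Running"},
		{Phase: "Failed"},
		{Phase: "Running", Deleting: true},
		{Phase: "Pending"},
	}
	alive, diff := replicaDiff(3, pods)
	fmt.Printf("alive=%d diff=%d\n", alive, diff)
}
```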
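<p>By comparison, gaia currently notifies the AM through a container complete message, and the AM requests a new container according to the failure type and relaunches it. If any step in that chain fails, the container relaunch breaks, so a lot of fault-tolerance work is needed (the container complete notification mechanism, persisting AM state).<br>NOTE: a pod can exist independently of an rc.</p>
<h3 id="node-controller"><a href="#node-controller" class="headerlink" title="node-controller"></a>node-controller</h3><p>Maintains node status.</p>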
<ul>
<li>Watches pod, node, and daemonSet objects. For pods with pod.DeletionTimestamp set, the pod is deleted if its node no longer exists. Node and daemonSet objects are kept in a local cache.</li>
<li>Periodically runs monitorNodeStatus:</li>
</ul>
<ol>
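<li>When a node is deleted, clean up the containers on it.</li>
<li>The controller records the latest node READY condition. If the current READY condition != the saved condition, it saves the latest one and updates nodeStatus.probetimestamp. If probetimestamp has not been updated for a long time (40s by default), the node is assumed to be in trouble, and its READY condition is set to UNKNOWN and written back to apiserver.</li>
<li>When a node goes from ready to not ready, set the ready condition of every pod on that node to false.</li>
<li>If the node's ready condition has been false/unknown for more than 5min, clean up the containers on it.</li>
<li>When a node goes from not ready to ready, the node controller does nothing; the scheduler is the component interested in this event.</li>
<li>If a node is not ready, the controller asks the cloud provider whether the node still exists; if not, the node is deleted from etcd.<br>NOTE: when cleaning up pods, the node controller skips pods that belong to a DaemonSet and leaves them for the DaemonSet controller.</li>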
</ol>
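<p>Steps 2 and 4 above can be sketched as a small state machine. This is a rough model, not the node controller's code: <code>readyCondition</code>, <code>nodeRecord</code>, <code>monitor</code>, and the helper <code>at</code> are hypothetical names, and it is assumed here that the saved condition includes the kubelet heartbeat time, so a silent kubelet leaves the condition unchanged.</p>

```go
package main

import (
	"fmt"
	"time"
)

const (
	monitorGracePeriod = 40 * time.Second // default quoted in the text
	podEvictionTimeout = 5 * time.Minute  // default quoted in the text
)

// readyCondition is a hypothetical, reduced READY condition.
type readyCondition struct {
	Status    string    // "True", "False" or "Unknown"
	Heartbeat time.Time // last time the kubelet reported the condition
}

// nodeRecord is what the controller remembers about one node.
type nodeRecord struct {
	saved          readyCondition
	probeTimestamp time.Time
	notReadySince  time.Time // zero while the node is ready
}

// at builds a timestamp from a number of seconds, to keep examples short.
func at(seconds int) time.Time { return time.Unix(int64(seconds), 0) }

// monitor returns the effective READY status and whether pods should be evicted.
func (r *nodeRecord) monitor(current readyCondition, now time.Time) (string, bool) {
	changed := current.Status != r.saved.Status || !current.Heartbeat.Equal(r.saved.Heartbeat)
	if changed {
		r.saved = current
		r.probeTimestamp = now
	}
	status := r.saved.Status
	if now.Sub(r.probeTimestamp) > monitorGracePeriod {
		status = "Unknown" // no fresh condition within the grace period
	}
	if status == "True" {
		r.notReadySince = time.Time{}
		return status, false
	}
	if r.notReadySince.IsZero() {
		r.notReadySince = now
	}
	return status, now.Sub(r.notReadySince) > podEvictionTimeout
}

func main() {
	r := &nodeRecord{}
	cond := readyCondition{Status: "True", Heartbeat: at(0)}
	for _, s := range []int{0, 30, 41, 342} {
		status, evict := r.monitor(cond, at(s))
		fmt.Println(s, status, evict)
	}
}
```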
<ul>
<li>A routine periodically scans pods bound to a node (pod.spec.nodeName != "") and deletes any pod whose node can no longer be found in nodeCache.</li>
</ul>
<h3 id="petset"><a href="#petset" class="headerlink" title="petset"></a>petset</h3><ul>
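<li>Let Pod1 be the set of petset pods that exist in the system and Pod2 the set of expected pods. The pods to synchronize are Pod2, and the pods to delete are Pod1 - Pod2.</li>
<li>Synchronization: if the pod does not exist in the system, create it. If it does, compare the petId (a signature over the name, network, and PVC identifiers); if it differs, update the pod.</li>
<li>Deletion: the delete call to apiserver only sets the DeletionTimestamp; the pod is physically removed later by the kubelet.</li>
<li>Note:<br>Each petset can create or delete only one pod at a time.<br>After creating a pod, it waits for the pod to reach running before the next operation.<br>After deleting a pod, it waits for the pod to be physically removed from apiserver before the next operation.<br>The pod a petset is currently operating on is placed in unhealthyPetTracker#store.</li>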
</ul>
<h3 id="service-controller"><a href="#service-controller" class="headerlink" title="service-controller"></a>service-controller</h3><p>Maintains the mapping between services and load balancers.</p>
<ul>
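<li>When a service changes, create or delete the corresponding lb.</li>
<li>When nodes change, call the lb update API to refresh the hosts list.</li>
<li>There are three kinds of service: clusterIP, NodePort, and lb. Resource allocation for the first two is handled in apiserver.</li>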
</ul>
<h3 id="endpoint-controller"><a href="#endpoint-controller" class="headerlink" title="endpoint-controller"></a>endpoint-controller</h3><p>Maintains the mapping between service and endpoint objects.</p>
<ul>
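<li>On startup, fetch all endpoint objects and synchronize their services.</li>
<li>Watch services; when a service changes, synchronize it.</li>
<li>Service synchronization logic: <br>– If the service has been deleted, delete the corresponding endpoint object<br>– If the service was added or changed, fetch all related pods, build the endpoint object from the pod IPs, and update it through the apiserver API</li>
<li>Example<br>– service : {"ports":[{"protocol":"TCP","port":8000,"targetPort":80}],"clusterIP":"10.0.0.72"}<br>– endpoint : {"addresses":[{"ip":"1.1.1.1"},{"ip":"1.1.1.2"}],"ports":[{"port":80,"protocol":"TCP"}]}</li>
<li>Changes to endpoint objects are picked up by kube-proxy, which maintains the corresponding routing rules so that other pods can reach the service through the service IP.</li>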
</ul>
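<p>The "service added or changed" branch can be sketched as follows, reproducing the example above. All types here are hypothetical reductions of the real API objects, and it is an assumption of this sketch that a service without a selector gets no generated endpoints.</p>

```go
package main

import "fmt"

// Hypothetical, reduced API types for illustration only.
type ServicePort struct {
	Protocol   string
	Port       int
	TargetPort int
}

type Service struct {
	Selector map[string]string
	Ports    []ServicePort
}

type Pod struct {
	Labels map[string]string
	IP     string
}

type EndpointPort struct {
	Port     int
	Protocol string
}

type Endpoints struct {
	Addresses []string
	Ports     []EndpointPort
}

// selects reports whether every selector entry is present in labels.
// An empty selector selects nothing in this sketch.
func selects(selector, labels map[string]string) bool {
	if len(selector) == 0 {
		return false
	}
	for k, v := range selector {
		if labels[k] != v {
			return false
		}
	}
	return true
}

// buildEndpoints collects the IPs of matching pods and maps each service
// port to its targetPort, which is the port the pods actually listen on.
func buildEndpoints(svc Service, pods []Pod) Endpoints {
	var ep Endpoints
	for _, p := range pods {
		if p.IP != "" && selects(svc.Selector, p.Labels) {
			ep.Addresses = append(ep.Addresses, p.IP)
		}
	}
	for _, sp := range svc.Ports {
		ep.Ports = append(ep.Ports, EndpointPort{Port: sp.TargetPort, Protocol: sp.Protocol})
	}
	return ep
}

func main() {
	svc := Service{
		Selector: map[string]string{"app": "web"},
		Ports:    []ServicePort{{Protocol: "TCP", Port: 8000, TargetPort: 80}},
	}
	pods := []Pod{
		{Labels: map[string]string{"app": "web"}, IP: "1.1.1.1"},
		{Labels: map[string]string{"app": "web"}, IP: "1.1.1.2"},
		{Labels: map[string]string{"app": "db"}, IP: "1.1.1.3"},
	}
	fmt.Printf("%+v\n", buildEndpoints(svc, pods))
}
```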
<h3 id="namespace-controller"><a href="#namespace-controller" class="headerlink" title="namespace-controller"></a>namespace-controller</h3><ul>
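<li>After creation a namespace is in the active state, and resources can be created in it.</li>
<li>When a namespace is deleted it enters the terminating state and Namespace.ObjectMeta.DeletionTimestamp is set to the current time. The namespace controller notices this event and cleans up the known resources in the namespace; when done, it removes "kubernetes" from Namespace.Spec.Finalizers.</li>
<li>When Namespace.Spec.Finalizers is empty, the namespace is deleted from etcd. This logic mainly protects users who create their own resource types in their namespace: the namespace is deleted only after all resources are gone.</li>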
</ul>
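<p>The finalizer handshake above can be sketched like this. The <code>Namespace</code> struct, <code>syncNamespace</code>, and the <code>cleanup</code> callback are hypothetical; only the "kubernetes" finalizer name comes from the text.</p>

```go
package main

import "fmt"

const kubernetesFinalizer = "kubernetes"

// Namespace is a hypothetical, reduced namespace object.
type Namespace struct {
	Name       string
	Deleting   bool // stands in for DeletionTimestamp being set
	Finalizers []string
}

func contains(items []string, target string) bool {
	for _, it := range items {
		if it == target {
			return true
		}
	}
	return false
}

func without(items []string, target string) []string {
	var out []string
	for _, it := range items {
		if it != target {
			out = append(out, it)
		}
	}
	return out
}

// syncNamespace cleans up known resources for a terminating namespace,
// drops the "kubernetes" finalizer, and reports whether the namespace
// itself may now be removed (no finalizers left).
func syncNamespace(ns *Namespace, cleanup func(namespace string) error) (bool, error) {
	if !ns.Deleting {
		return false, nil
	}
	if contains(ns.Finalizers, kubernetesFinalizer) {
		if err := cleanup(ns.Name); err != nil {
			return false, err
		}
		ns.Finalizers = without(ns.Finalizers, kubernetesFinalizer)
	}
	return len(ns.Finalizers) == 0, nil
}

func main() {
	ns := &Namespace{Name: "demo", Deleting: true, Finalizers: []string{"kubernetes", "example.com/custom"}}
	remove, err := syncNamespace(ns, func(string) error { return nil })
	fmt.Println(remove, err, ns.Finalizers)
}
```

<p>In the run above the namespace survives the first sync because a second, user-owned finalizer is still present, which is the protection the last bullet describes.</p>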
<h3 id="resourcequota-controller"><a href="#resourcequota-controller" class="headerlink" title="resourcequota-controller"></a>resourcequota-controller</h3><ul>
<li><p>Quota is enforced per namespace and tracks requested resources, not limits.<br>apiserver checks the quota when an object is created and rejects the request if the quota would be exceeded.</p>
<blockquote>
<p>$ kube-apiserver --admission-control=ResourceQuota</p>
</blockquote>
</li>
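<li><p>resourcequota-controller watches resourcequota, Pod, service, rc, PersistentVolumeClaim, and Secret resources.<br>A resourcequota sync is triggered when a resourcequota changes, when a pod's state changes (to succeeded/failed), or when one of the other resources is deleted. Pods affect the memory/cpu quota; the other resources affect the object-count quota.</p>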
</li>
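<li>Synchronization: fetch the aggregated usage of the relevant resources through the quota#registry interface and compare it with quota.status.used; if they differ, update quota.status.used in apiserver.</li>
<li>Scope: a scope can be specified when a resourcequota is created. When usage is computed, the controller first checks whether the pod belongs to that scope.<br>If a pod has no explicit resource requests, isBestEffort(pod) is true.</li>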
</ul>
<figure class="highlight go"><table><tr><td class="gutter"><pre><div class="line">1</div><div class="line">2</div><div class="line">3</div><div class="line">4</div><div class="line">5</div><div class="line">6</div><div class="line">7</div><div class="line">8</div><div class="line">9</div><div class="line">10</div></pre></td><td class="code"><pre><div class="line"><span class="keyword">switch</span> scope {</div><div class="line"><span class="keyword">case</span> api.ResourceQuotaScopeTerminating:</div><div class="line"> <span class="keyword">return</span> isTerminating(pod)</div><div class="line"><span class="keyword">case</span> api.ResourceQuotaScopeNotTerminating:</div><div class="line"> <span class="keyword">return</span> !isTerminating(pod)</div><div class="line"><span class="keyword">case</span> api.ResourceQuotaScopeBestEffort:</div><div class="line"> <span class="keyword">return</span> isBestEffort(pod)</div><div class="line"><span class="keyword">case</span> api.ResourceQuotaScopeNotBestEffort:</div><div class="line"> <span class="keyword">return</span> !isBestEffort(pod)</div><div class="line">}</div></pre></td></tr></table></figure>
<h3 id="garbage-collector"><a href="#garbage-collector" class="headerlink" title="garbage-collector"></a>garbage-collector</h3><p>Every 20s, if the number of terminated pods (pod.status.phase not in (RUNNING, PENDING, UNKNOWN)) exceeds a threshold (12500 by default), the oldest pods are selected and deleted from apiserver.</p>
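<p>The selection of pods to collect can be sketched as follows. This is an illustration only: the <code>Pod</code> struct and <code>podsToDelete</code> are hypothetical, and creation time is modelled as a plain Unix timestamp.</p>

```go
package main

import (
	"fmt"
	"sort"
)

// Pod is a hypothetical, reduced pod.
type Pod struct {
	Name        string
	Phase       string
	CreatedUnix int64 // creation time in seconds since the epoch
}

// isTerminated follows the rule in the text: any phase other than
// Running, Pending or Unknown counts as terminated.
func isTerminated(p Pod) bool {
	switch p.Phase {
	case "Running", "Pending", "Unknown":
		return false
	}
	return true
}

// podsToDelete returns the oldest terminated pods beyond the threshold.
func podsToDelete(pods []Pod, threshold int) []Pod {
	var terminated []Pod
	for _, p := range pods {
		if isTerminated(p) {
			terminated = append(terminated, p)
		}
	}
	excess := len(terminated) - threshold
	if excess <= 0 {
		return nil
	}
	sort.Slice(terminated, func(i, j int) bool {
		return terminated[i].CreatedUnix < terminated[j].CreatedUnix
	})
	return terminated[:excess]
}

func main() {
	pods := []Pod{
		{Name: "a", Phase: "Succeeded", CreatedUnix: 30},
		{Name: "b", Phase: "Failed", CreatedUnix: 10},
		{Name: "c", Phase: "Running", CreatedUnix: 1},
		{Name: "d", Phase: "Succeeded", CreatedUnix: 20},
	}
	for _, p := range podsToDelete(pods, 1) {
		fmt.Println("delete", p.Name)
	}
}
```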
<h3 id="horizontal-pod-autoscaler"><a href="#horizontal-pod-autoscaler" class="headerlink" title="horizontal-pod-autoscaler"></a>horizontal-pod-autoscaler</h3><p>Automatically adds or removes pods according to pod load.</p>
<ul>
<li>Each hpa object is bound to an rc/deployment when it is created; later pod additions and removals go through the scale interface of that rc/deployment.<blockquote>
<p>kubectl autoscale rc foo --max=5 --cpu-percent=80</p>
</blockquote>
</li>
<li>By default only CPU-based autoscaling is supported, although users can add custom metrics. Metrics are obtained through HeapsterMetrics.</li>
<li>hpa objects are synchronized periodically. Synchronization (horizontal#reconcileAutoscaler): fetch the actual CPU utilization, compare it with the target utilization, and decide whether to scale; if so, call the rc/deployment scale interface and update the hpa status.<figure class="highlight java"><table><tr><td class="gutter"><pre><div class="line">1</div><div class="line">2</div><div class="line">3</div><div class="line">4</div><div class="line">5</div><div class="line">6</div></pre></td><td class="code"><pre><div class="line">usageRatio := float64(*currentUtilization) / float64(targetUtilization)</div><div class="line"><span class="keyword">if</span> math.Abs(<span class="number">1.0</span>-usageRatio) > <span class="number">0.1</span> {</div><div class="line"> <span class="keyword">return</span> <span class="keyword">int</span>(math.Ceil(usageRatio * float64(currentReplicas))), currentUtilization, timestamp, nil</div><div class="line">} <span class="keyword">else</span> {</div><div class="line"> <span class="keyword">return</span> currentReplicas, currentUtilization, timestamp, nil</div><div class="line">}</div></pre></td></tr></table></figure>
</li>
</ul>
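<p>A worked example of the calculation quoted above, reduced to a standalone function. The function name <code>desiredReplicas</code> and its integer parameters are assumptions made for this sketch; the 0.1 tolerance and the ceiling come from the quoted snippet.</p>

```go
package main

import (
	"fmt"
	"math"
)

// tolerance is the dead band around the target: within 10% nothing changes.
const tolerance = 0.1

// desiredReplicas scales the replica count by current/target utilization,
// rounding up, unless the ratio is within the tolerance.
func desiredReplicas(currentReplicas, currentUtilization, targetUtilization int) int {
	usageRatio := float64(currentUtilization) / float64(targetUtilization)
	if math.Abs(1.0-usageRatio) > tolerance {
		return int(math.Ceil(usageRatio * float64(currentReplicas)))
	}
	return currentReplicas
}

func main() {
	// 4 replicas at 120% CPU with an 80% target: ratio 1.5, so scale to 6.
	fmt.Println(desiredReplicas(4, 120, 80))
	// 84% against 80% is within the 10% tolerance: stay at 4.
	fmt.Println(desiredReplicas(4, 84, 80))
	// 40% against 80%: ratio 0.5, so scale down to 2.
	fmt.Println(desiredReplicas(4, 40, 80))
}
```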
<h3 id="daemon-set-controller"><a href="#daemon-set-controller" class="headerlink" title="daemon-set-controller"></a>daemon-set-controller</h3><p>Controls launching a specified pod on nodes.<br>If .spec.template.spec.nodeSelector or .spec.template.metadata.annotations is set, the pod is launched on matching nodes; otherwise it is launched on every node. Pods created by a daemonSet have pod.spec.nodeName set directly and bypass the scheduler.</p>
<ul>
<li><p>Watches three resources: daemonSet, node, and pod. A daemonSet sync is triggered in the following cases:<br>– A daemonSet is added, updated, or deleted<br>– A pod changes: the daemonSets related to it (by label matching) are synchronized<br>– A node is added and nodeShouldRunDaemonPod returns true, or a node is updated and nodeShouldRunDaemonPod(oldNode) != nodeShouldRunDaemonPod(newNode)</p>
</li>
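<li><p>DaemonSet synchronization<br>1. Iterate over all pods of the daemonSet in podStore and put them in a map keyed by nodeName<br>2. Iterate over the nodes in nodeStore, use nodeShouldRunDaemonPod to decide whether each node should run the pod, and compare against the map from step 1 to decide whether pods must be created or deleted. A created pod has pod.spec.nodeName set to the target node's name, so it does not go through the scheduler</p>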
</li>
<li>nodeShouldRunDaemonPod takes into account the nodeCondition, whether free resources are available, and whether pod ports conflict.</li>
</ul>
<h3 id="job-controller"><a href="#job-controller" class="headerlink" title="job-controller"></a>job-controller</h3><p>Maintains the lifecycle of short-lived jobs.</p>
<ul>
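<li><p>Parameters<br>job.Spec.Completions: the number of pods that must complete before the job is considered successful<br>job.Spec.Parallelism: the job's degree of parallelism, i.e. the maximum number of active pods</p>
</li>
<li><p>If a timeout job.Spec.ActiveDeadlineSeconds is set and the job does not finish within that time, all active pods are killed and the job state is set to FAILED.</p>
</li>
<li><p>The job controller watches job and pod objects and synchronizes the job on any related change.</p>
</li>
<li><p>Job synchronization: find the job's own pods in podStore and sort them into active, succeeded, and failed. If the number of succeeded pods has reached job.Spec.Completions, the job is considered successfully finished; if it is lower, compare the expected number of active pods with the number found and create or delete pods if they differ.</p>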
</li>
</ul>
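<p>The sync decision can be sketched as follows. This is a simplified model: <code>syncJob</code> is a hypothetical function, and it is an assumption of this sketch that the expected number of active pods is the remaining completions capped at the parallelism.</p>

```go
package main

import "fmt"

// syncJob decides what a job sync should do.
// done is true once enough pods have succeeded. Otherwise diff is the
// number of pods to create (positive) or delete (negative).
func syncJob(completions, parallelism, active, succeeded int) (done bool, diff int) {
	if succeeded >= completions {
		return true, 0
	}
	wantActive := completions - succeeded
	if wantActive > parallelism {
		wantActive = parallelism
	}
	return false, wantActive - active
}

func main() {
	fmt.Println(syncJob(5, 2, 0, 0)) // fresh job: start 2 pods
	fmt.Println(syncJob(5, 2, 2, 4)) // one completion left: stop 1 pod
	fmt.Println(syncJob(5, 2, 0, 5)) // all completions reached
}
```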
<h3 id="deployment-controller"><a href="#deployment-controller" class="headerlink" title="deployment-controller"></a>deployment-controller</h3><p>A deployment rolls out pods and an rs together. Creating, updating, deleting, and rolling back a deployment are supported.</p>
<ul>
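<li>A deployment rolls out pods and a replicaset together. It also has a notion of revisions: a deployment can be upgraded, for example by replacing the image, and can be rolled back to a given revision.</li>
<li>The deployment controller watches deployment, replicaset, and pod objects and synchronizes the deployment when any of them changes.</li>
<li>The deployment finds all new and old replicasets and, according to deployment.Spec.Strategy.Type, decides whether to upgrade a few pods at a time or to kill all the old ones first (by manipulating the replicaset.spec.replicas field).</li>
<li>Telling new from old rs: hash(deployment.Spec.Template) yields a value that is compared with rs.labels[DefaultDeploymentUniqueLabelKey]; if equal the rs is new, otherwise it is old.</li>
<li>Revision numbers: rs.Annotations[deploymentutil.RevisionAnnotation] stores the rs's revision. To roll back to a revision, copy that revision's rs.spec.template into deployment.spec.Template; the rolled-back revision then becomes the latest.</li>
<li>The upgrade procedure in detail</li>
<li>How does a deployment create an rs? The deployment's labels affect neither the rs nor the pods; its annotations are passed to the rs. The hash key is added to template.labels and so ends up on both the rs and the pods.</li>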
</ul>
<ol>
<li>newTemplate = deployment.spec.template</li>
<li>add hashKey label to newTemplate.ObjectMeta.Labels (step 1 has already copied the labels from the template)</li>
<li>newRS.spec.selector = deployment.Selector + hashKey selector</li>
<li>newRS.annotations = deployment.annotations</li>
<li>create rs object </li>
<li>How are the rs's labels generated? Judging from the result, they come from template.labels</li>
</ol>
<p>deployment_controller.go#getNewReplicaSet<br><figure class="highlight go"><table><tr><td class="gutter"><pre><div class="line">1</div><div class="line">2</div><div class="line">3</div><div class="line">4</div><div class="line">5</div><div class="line">6</div><div class="line">7</div><div class="line">8</div><div class="line">9</div><div class="line">10</div><div class="line">11</div><div class="line">12</div></pre></td><td class="code"><pre><div class="line">newRS := extensions.ReplicaSet{</div><div class="line"> ObjectMeta: api.ObjectMeta{</div><div class="line"> <span class="comment">// Make the name deterministic, to ensure idempotence</span></div><div class="line"> Name: deployment.Name + <span class="string">"-"</span> + fmt.Sprintf(<span class="string">"%d"</span>, podTemplateSpecHash),</div><div class="line"> Namespace: namespace,</div><div class="line"> },</div><div class="line"> Spec: extensions.ReplicaSetSpec{</div><div class="line"> Replicas: <span class="number">0</span>,</div><div class="line"> Selector: newRSSelector,</div><div class="line"> Template: newRSTemplate,</div><div class="line"> },</div><div class="line">}</div></pre></td></tr></table></figure></p>
<ul>
<li>How does an rs create pods? </li>
</ul>
<ol>
<li>desiredLabels = template.labels</li>
<li>desiredAnnotations = template.annotations + createBy annotation</li>
<li>pod.spec = template.spec</li>
</ol>
<figure class="highlight go"><table><tr><td class="gutter"><pre><div class="line">1</div><div class="line">2</div><div class="line">3</div><div class="line">4</div><div class="line">5</div><div class="line">6</div><div class="line">7</div></pre></td><td class="code"><pre><div class="line">pod := &api.Pod{</div><div class="line"> ObjectMeta: api.ObjectMeta{</div><div class="line"> Labels: desiredLabels,</div><div class="line"> Annotations: desiredAnnotations,</div><div class="line"> GenerateName: prefix,</div><div class="line"> },</div><div class="line">}</div></pre></td></tr></table></figure>
<ul>
<li><p>Hash generation: remove the hashKey from the template's metadata labels, then hash the template.</p>
<figure class="highlight go"><table><tr><td class="gutter"><pre><div class="line">1</div><div class="line">2</div><div class="line">3</div><div class="line">4</div><div class="line">5</div><div class="line">6</div><div class="line">7</div><div class="line">8</div></pre></td><td class="code"><pre><div class="line"><span class="function"><span class="keyword">func</span> <span class="title">GetPodTemplateSpecHash</span><span class="params">(rs extensions.ReplicaSet)</span> <span class="title">string</span></span> {</div><div class="line"> meta := rs.Spec.Template.ObjectMeta</div><div class="line"> meta.Labels = labelsutil.CloneAndRemoveLabel(meta.Labels, extensions.DefaultDeploymentUniqueLabelKey)</div><div class="line"> <span class="keyword">return</span> fmt.Sprintf(<span class="string">"%d"</span>, podutil.GetPodTemplateSpecHash(api.PodTemplateSpec{</div><div class="line"> ObjectMeta: meta,</div><div class="line"> Spec: rs.Spec.Template.Spec,</div><div class="line"> }))</div><div class="line">}</div></pre></td></tr></table></figure>
</li>
<li><p>The deployment's selector is generated from spec.template.metadata.labels; the implementation is in kubectl/run.go#Generate.</p>
</li>
</ul>
<h3 id="replicasets"><a href="#replicasets" class="headerlink" title="replicasets"></a>replicasets</h3><blockquote>
<p>Replica Set is the next-generation Replication Controller. The only difference between a Replica Set and a Replication Controller right now is the selector support. Replica Set supports the new set-based selector requirements as described in the labels user guide whereas a Replication Controller only supports equality-based selector requirements.</p>
</blockquote>
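<h3 id="persistent-volume-related"><a href="#persistent-volume-related" class="headerlink" title="Persistent-volume related"></a><a href="http://kubernetes.io/docs/user-guide/persistent-volumes/" target="_blank" rel="external">Persistent-volume related</a></h3><p>A PersistentVolume (PV) is a resource managed by k8s, and a PersistentVolumeClaim (PVC) is a user's request for PV resources. Usage goes through the following stages:</p>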
<ol>
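<li>Provisioning: the user creates a pv</li>
<li>Binding: after the user creates a pvc, the controller assigns a pv to it, setting pv.spec.ClaimRef = pvc</li>
<li>Using: the user's pod uses the pv</li>
<li>Releasing: the user deletes the pvc</li>
<li>Reclaiming: the pv is reclaimed, following one of several reclaim policies</li>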
</ol>
<p>PV phase:<br><figure class="highlight go"><table><tr><td class="gutter"><pre><div class="line">1</div><div class="line">2</div><div class="line">3</div><div class="line">4</div><div class="line">5</div><div class="line">6</div><div class="line">7</div><div class="line">8</div><div class="line">9</div><div class="line">10</div><div class="line">11</div><div class="line">12</div><div class="line">13</div></pre></td><td class="code"><pre><div class="line"><span class="comment">// used for PersistentVolumes that are not available</span></div><div class="line">VolumePending PersistentVolumePhase = <span class="string">"Pending"</span></div><div class="line"><span class="comment">// used for PersistentVolumes that are not yet bound</span></div><div class="line"><span class="comment">// Available volumes are held by the binder and matched to PersistentVolumeClaims</span></div><div class="line">VolumeAvailable PersistentVolumePhase = <span class="string">"Available"</span></div><div class="line"><span class="comment">// used for PersistentVolumes that are bound</span></div><div class="line">VolumeBound PersistentVolumePhase = <span class="string">"Bound"</span></div><div class="line"><span class="comment">// used for PersistentVolumes where the bound PersistentVolumeClaim was deleted</span></div><div class="line"><span class="comment">// released volumes must be recycled before becoming available again</span></div><div class="line"><span class="comment">// this phase is used by the persistent volume claim binder to signal to another process to reclaim the resource</span></div><div class="line">VolumeReleased PersistentVolumePhase = <span class="string">"Released"</span></div><div class="line"><span class="comment">// used for PersistentVolumes that failed to be correctly recycled or deleted after being released from a claim</span></div><div class="line">VolumeFailed PersistentVolumePhase = <span class="string">"Failed"</span></div></pre></td></tr></table></figure></p>
<p>PVC phase:<br><figure class="highlight go"><table><tr><td class="gutter"><pre><div class="line">1</div><div class="line">2</div><div class="line">3</div><div class="line">4</div></pre></td><td class="code"><pre><div class="line"><span class="comment">// used for PersistentVolumeClaims that are not yet bound</span></div><div class="line">ClaimPending PersistentVolumeClaimPhase = <span class="string">"Pending"</span></div><div class="line"><span class="comment">// used for PersistentVolumeClaims that are bound</span></div><div class="line">ClaimBound PersistentVolumeClaimPhase = <span class="string">"Bound"</span></div></pre></td></tr></table></figure></p>
<h4 id="persistent-volume-provisioner"><a href="#persistent-volume-provisioner" class="headerlink" title="persistent-volume-provisioner"></a>persistent-volume-provisioner</h4><ul>
<li>reconcileClaim: for a new claim, call the plugin#NewProvisioner interface to create a provisioner, then create the persistentVolume and bind it to the claim.</li>
</ul>
<figure class="highlight go"><table><tr><td class="gutter"><pre><div class="line">1</div><div class="line">2</div><div class="line">3</div><div class="line">4</div><div class="line">5</div><div class="line">6</div><div class="line">7</div><div class="line">8</div><div class="line">9</div></pre></td><td class="code"><pre><div class="line"><span class="keyword">if</span> claim.Annotations[pvProvisioningRequiredAnnotationKey] == pvProvisioningCompletedAnnotationValue</div><div class="line"> <span class="keyword">return</span></div><div class="line">provisioner = controller.newProvisioner()</div><div class="line">newVolume = provisioner.NewPersistentVolumeTemplate()</div><div class="line">newVolume.Spec.ClaimRef = claimRef</div><div class="line">newVolume.Annotations[pvProvisioningRequiredAnnotationKey] = <span class="string">"true"</span></div><div class="line">controller.client.CreatePersistentVolume(newVolume)</div><div class="line">claim.Annotations[pvProvisioningRequiredAnnotationKey] = pvProvisioningCompletedAnnotationValue</div><div class="line">controller.client.UpdatePersistentVolumeClaim(claim)</div></pre></td></tr></table></figure>
<ul>
<li>reconcileClaim: call provisioner#Provision to allocate the actual resource.</li>
</ul>
<figure class="highlight go"><table><tr><td class="gutter"><pre><div class="line">1</div><div class="line">2</div><div class="line">3</div><div class="line">4</div><div class="line">5</div><div class="line">6</div></pre></td><td class="code"><pre><div class="line"><span class="keyword">if</span> pv.Spec.ClaimRef == <span class="literal">nil</span> || pv.annotations[pvProvisioningRequiredAnnotationKey] == pvProvisioningCompletedAnnotationValue </div><div class="line"> <span class="keyword">return</span></div><div class="line">provisioner := controller.newProvisioner(controller.provisioner, claim, pv)</div><div class="line">provisioner.Provision(pv)</div><div class="line">pv.Annotations[pvProvisioningRequiredAnnotationKey] = pvProvisioningCompletedAnnotationValue</div><div class="line">controller.client.UpdatePersistentVolume(volumeClone)</div></pre></td></tr></table></figure>
<h4 id="persistent-volume-binder"><a href="#persistent-volume-binder" class="headerlink" title="persistent-volume-binder"></a>persistent-volume-binder</h4><ul>
<li>syncVolume: wait for the volume's provisioning to complete. When the volume moves from pending to Available and the claim is still pending, syncClaim is called to bind them.</li>
<li>syncClaim: wait for the claim's provisioning to complete. A pending claim is matched to a pv (one whose accessModes fit and that wastes the least capacity), bound to it, and moved to the Bound state.<figure class="highlight plain"><table><tr><td class="gutter"><pre><div class="line">1</div><div class="line">2</div><div class="line">3</div><div class="line">4</div><div class="line">5</div><div class="line">6</div><div class="line">7</div></pre></td><td class="code"><pre><div class="line">volume = findBestMatchForClaim(claim)</div><div class="line">claim.Spec.VolumeName = volume.Name</div><div class="line">binderClient.UpdatePersistentVolumeClaim(claim)</div><div class="line">claim.Status.Phase = api.ClaimBound</div><div class="line">claim.Status.AccessModes = volume.Spec.AccessModes</div><div class="line">claim.Status.Capacity = volume.Spec.Capacity</div><div class="line">binderClient.UpdatePersistentVolumeClaimStatus(claim)</div></pre></td></tr></table></figure>
</li>
</ul>
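<p>The matching rule described for syncClaim (access modes fit, least wasted capacity) can be sketched as follows. This is not the binder's actual findBestMatchForClaim: the <code>Volume</code> and <code>Claim</code> structs and <code>findBestMatch</code> are hypothetical, with capacity reduced to whole GiB.</p>

```go
package main

import "fmt"

// Hypothetical, reduced PV and PVC types.
type Volume struct {
	Name        string
	AccessModes []string
	CapacityGi  int
	Claimed     bool // already bound to another claim
}

type Claim struct {
	Name        string
	AccessModes []string
	RequestGi   int
}

// supports reports whether every wanted access mode is offered.
func supports(have, want []string) bool {
	offered := map[string]bool{}
	for _, m := range have {
		offered[m] = true
	}
	for _, m := range want {
		if !offered[m] {
			return false
		}
	}
	return true
}

// findBestMatch returns the smallest unclaimed volume that is large enough
// and supports the requested access modes.
func findBestMatch(claim Claim, volumes []Volume) (Volume, bool) {
	var best Volume
	found := false
	for _, v := range volumes {
		if v.Claimed || v.CapacityGi < claim.RequestGi || !supports(v.AccessModes, claim.AccessModes) {
			continue
		}
		if !found || v.CapacityGi < best.CapacityGi {
			best, found = v, true
		}
	}
	return best, found
}

func main() {
	claim := Claim{Name: "data", AccessModes: []string{"ReadWriteOnce"}, RequestGi: 5}
	volumes := []Volume{
		{Name: "small", AccessModes: []string{"ReadWriteOnce"}, CapacityGi: 2},
		{Name: "big", AccessModes: []string{"ReadWriteOnce"}, CapacityGi: 100},
		{Name: "medium", AccessModes: []string{"ReadWriteOnce"}, CapacityGi: 10},
	}
	v, ok := findBestMatch(claim, volumes)
	fmt.Println(v.Name, ok)
}
```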
<h4 id="persistent-volume-recycler"><a href="#persistent-volume-recycler" class="headerlink" title="persistent-volume-recycler"></a>persistent-volume-recycler</h4><p>If a persistentVolume is in the released state, its resources are reclaimed according to Spec.PersistentVolumeReclaimPolicy.</p>
<ul>
<li><p>PersistentVolumeReclaimRecycle: call the plugin's recycle function; the persistent volume then returns to the pending state and waits to be bound again.</p>
<figure class="highlight plain"><table><tr><td class="gutter"><pre><div class="line">1</div><div class="line">2</div><div class="line">3</div><div class="line">4</div></pre></td><td class="code"><pre><div class="line">volRecycler = plugin.NewRecycler(spec)</div><div class="line">volRecycler.Recycle()</div><div class="line">pv.Status.Phase = api.VolumePending</div><div class="line">recycler.client.UpdatePersistentVolumeStatus(pv)</div></pre></td></tr></table></figure>
</li>
<li><p>PersistentVolumeReclaimDelete: call the plugin's deleter to delete the underlying volume, then send a request to apiserver to delete the pv.</p>
<figure class="highlight plain"><table><tr><td class="gutter"><pre><div class="line">1</div><div class="line">2</div><div class="line">3</div></pre></td><td class="code"><pre><div class="line">deleter = plugin.NewDeleter(spec)</div><div class="line">deleter.Delete()</div><div class="line">recycler.client.DeletePersistentVolume(pv)</div></pre></td></tr></table></figure>
</li>
</ul>
<h3 id="service-account-controller"><a href="#service-account-controller" class="headerlink" title="service-account-controller"></a>service-account-controller</h3><p>Ensures that the "default" serviceAccount exists.</p>
<ul>
<li><p>serviceAccount</p>
<figure class="highlight go"><table><tr><td class="gutter"><pre><div class="line">1</div><div class="line">2</div><div class="line">3</div><div class="line">4</div><div class="line">5</div><div class="line">6</div><div class="line">7</div><div class="line">8</div></pre></td><td class="code"><pre><div class="line"><span class="keyword">type</span> ServiceAccount <span class="keyword">struct</span> {</div><div class="line"> TypeMeta <span class="string">`json:",inline" yaml:",inline"`</span></div><div class="line"> ObjectMeta <span class="string">`json:"metadata,omitempty" yaml:"metadata,omitempty"`</span></div><div class="line"></div><div class="line"> username <span class="keyword">string</span></div><div class="line"> securityContext ObjectReference <span class="comment">// (reference to a securityContext object)</span></div><div class="line"> secrets []ObjectReference <span class="comment">// (references to secret objects</span></div><div class="line">}</div></pre></td></tr></table></figure>
</li>
<li><p>Watch the "default" serviceAccount object; if it is deleted, recreate it.</p>
</li>
<li>Watch namespaces; when a namespace is added or updated and has no serviceAccount, create the "default" serviceAccount.</li>
</ul>
<h3 id="tokens-controller"><a href="#tokens-controller" class="headerlink" title="tokens-controller"></a>tokens-controller</h3><p>Maintains the mapping between serviceAccounts and secrets: one serviceAccount may correspond to several secrets, and each secret holds a token.</p>
<ul>
<li><p>Watch serviceAccounts: when a serviceAccount is added or updated and no secret is bound to it, create a secret and token and bind them. When a serviceAccount is deleted, delete the related secrets from apiserver.<br>Secret creation:</p>
<figure class="highlight go"><table><tr><td class="code"><pre><div class="line"><span class="comment">// Excerpt; error handling is omitted for brevity.</span></div><div class="line"><span class="comment">// Build a ServiceAccountToken secret annotated with the owning serviceAccount.</span></div><div class="line">secret := &api.Secret{</div><div class="line">    ObjectMeta: api.ObjectMeta{</div><div class="line">        Name: secret.Strategy.GenerateName(fmt.Sprintf(<span class="string">"%s-token-"</span>, serviceAccount.Name)),</div><div class="line">        Namespace: serviceAccount.Namespace,</div><div class="line">        Annotations: <span class="keyword">map</span>[<span class="keyword">string</span>]<span class="keyword">string</span>{</div><div class="line">            api.ServiceAccountNameKey: serviceAccount.Name,</div><div class="line">            api.ServiceAccountUIDKey: <span class="keyword">string</span>(serviceAccount.UID),</div><div class="line">        },</div><div class="line">    },</div><div class="line">    Type: api.SecretTypeServiceAccountToken,</div><div class="line">    Data: <span class="keyword">map</span>[<span class="keyword">string</span>][]<span class="keyword">byte</span>{},</div><div class="line">}</div><div class="line"><span class="comment">// Generate the token and fill in the secret data.</span></div><div class="line">token, err := e.token.GenerateToken(*serviceAccount, *secret)</div><div class="line">secret.Data[api.ServiceAccountTokenKey] = []<span class="keyword">byte</span>(token)</div><div class="line">secret.Data[api.ServiceAccountNamespaceKey] = []<span class="keyword">byte</span>(serviceAccount.Namespace)</div><div class="line">secret.Data[api.ServiceAccountRootCAKey] = e.rootCA</div><div class="line"><span class="comment">// Save the secret, then reference it from the live serviceAccount.</span></div><div class="line">e.client.Core().Secrets(serviceAccount.Namespace).Create(secret)</div><div class="line">liveServiceAccount.Secrets = <span class="built_in">append</span>(liveServiceAccount.Secrets, api.ObjectReference{Name: secret.Name})</div><div class="line">serviceAccounts.Update(liveServiceAccount)</div></pre></td></tr></table></figure>
</li>
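<li><p>Watches secrets. When a secret is added or updated and its serviceAccount cannot be found, the secret is deleted; if the secret has no token, a new token is generated. When a secret is deleted, it is removed from the serviceAccount's secret list and the serviceAccount is updated.</p>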
</li>
</ul>
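<p>The two watch loops above can be sketched with a small in-memory model. This is only an illustration: <code>Cluster</code>, <code>SyncServiceAccount</code>, <code>SyncSecret</code>, and the fixed <code>-token-x</code> name suffix are hypothetical stand-ins, not the real controller or API types.</p>

```go
package main

import "fmt"

// ServiceAccount and Secret are simplified stand-ins for the API types.
type ServiceAccount struct {
	Name    string
	Secrets []string // names of the token secrets bound to this account
}

type Secret struct {
	Name           string
	ServiceAccount string // name of the owning service account
	Token          string
}

// Cluster is a hypothetical in-memory replacement for the apiserver.
type Cluster struct {
	Accounts map[string]*ServiceAccount
	Secrets  map[string]*Secret
}

// SyncServiceAccount handles a serviceAccount add/update: if no existing
// secret is bound to the account, it creates one and binds it.
func (c *Cluster) SyncServiceAccount(name string) {
	sa, ok := c.Accounts[name]
	if !ok {
		return
	}
	for _, ref := range sa.Secrets {
		if _, exists := c.Secrets[ref]; exists {
			return // a token secret is already bound
		}
	}
	// The "-token-x" suffix is a placeholder for a generated name.
	s := &Secret{Name: name + "-token-x", ServiceAccount: name, Token: "token-for-" + name}
	c.Secrets[s.Name] = s
	sa.Secrets = append(sa.Secrets, s.Name)
}

// SyncSecret handles a secret add/update: orphaned secrets are deleted,
// and a missing token is regenerated.
func (c *Cluster) SyncSecret(name string) {
	s, ok := c.Secrets[name]
	if !ok {
		return
	}
	if _, found := c.Accounts[s.ServiceAccount]; !found {
		delete(c.Secrets, name)
		return
	}
	if s.Token == "" {
		s.Token = "token-for-" + s.ServiceAccount
	}
}

func main() {
	c := &Cluster{
		Accounts: map[string]*ServiceAccount{"default": {Name: "default"}},
		Secrets:  map[string]*Secret{},
	}
	c.SyncServiceAccount("default")
	fmt.Println(c.Accounts["default"].Secrets)

	c.Secrets["orphan"] = &Secret{Name: "orphan", ServiceAccount: "gone"}
	c.SyncSecret("orphan")
	fmt.Println(len(c.Secrets))
}
```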
<h2 id="数据结构"><a href="#数据结构" class="headerlink" title="Data structures"></a>Data structures</h2><h3 id="informer"><a href="#informer" class="headerlink" title="informer"></a>informer</h3><p>The informer is a framework for getting notified when resources in the apiserver change.</p>
<ul>
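<li>The user supplies a listWatcher, which syncs resources from the apiserver, and a ResourceHandler, whose Added/Updated/Deleted callbacks are invoked when a resource changes. Each controller only needs to implement the ResourceHandler logic.</li>
<li>Creating an informer creates a store and a controller. The store is a local cache of the latest resources; the controller fetches the latest resource state through the listWatcher, updates the store, and calls the ResourceHandler when a resource changes.</li>
<li>indexInformer: when the store is created, the user can pass in indexers to build secondary indexes.</li>
<li>sharedIndexInformer: an indexInformer that several controllers can share, so they reuse a single watch and cache for the same resource type.</li>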
</ul>
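<p>To make the store-plus-callback idea concrete, here is a minimal sketch using only the standard library. The <code>Event</code>, <code>ResourceHandler</code>, and <code>Informer</code> types below are simplified, hypothetical versions and do not match the real Kubernetes signatures; events are fed in by hand instead of coming from a listWatcher.</p>

```go
package main

import "fmt"

// Event is one change reported by a (hypothetical) listWatcher.
type Event struct {
	Deleted bool
	Key     string
	Value   string
}

// ResourceHandler holds the callbacks a controller implements.
type ResourceHandler struct {
	OnAdd    func(key, value string)
	OnUpdate func(key, oldValue, newValue string)
	OnDelete func(key, value string)
}

// Informer keeps a local cache (the store) and notifies the handler.
type Informer struct {
	store   map[string]string
	handler ResourceHandler
}

func NewInformer(h ResourceHandler) *Informer {
	return &Informer{store: map[string]string{}, handler: h}
}

// Handle updates the store first, then fires the matching callback.
// Whether an event is an add or an update is decided by the store contents.
func (i *Informer) Handle(e Event) {
	old, exists := i.store[e.Key]
	switch {
	case e.Deleted:
		if !exists {
			return
		}
		delete(i.store, e.Key)
		i.handler.OnDelete(e.Key, old)
	case exists:
		i.store[e.Key] = e.Value
		i.handler.OnUpdate(e.Key, old, e.Value)
	default:
		i.store[e.Key] = e.Value
		i.handler.OnAdd(e.Key, e.Value)
	}
}

// Get reads from the local cache without contacting the apiserver.
func (i *Informer) Get(key string) (string, bool) {
	v, ok := i.store[key]
	return v, ok
}

func main() {
	inf := NewInformer(ResourceHandler{
		OnAdd:    func(k, v string) { fmt.Println("add", k, v) },
		OnUpdate: func(k, o, n string) { fmt.Println("update", k, o, "->", n) },
		OnDelete: func(k, v string) { fmt.Println("delete", k, v) },
	})
	events := []Event{
		{Key: "rs/web", Value: "replicas=1"},
		{Key: "rs/web", Value: "replicas=3"},
		{Deleted: true, Key: "rs/web"},
	}
	for _, e := range events {
		inf.Handle(e)
	}
}
```

<p>The controller code only lives in the three callbacks; everything else is the same for every resource type, which is why the framework can be shared.</p>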
<h3 id="reflector"><a href="#reflector" class="headerlink" title="reflector"></a>reflector</h3><ul>
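<li>Uses the listWatcher to track changes to the data:<ol>
<li>client.list fetches all objects and calls deltaQueue#Replace (deletes stale data, adds the new data)</li>
<li>If a resync period is set, a sync event is sent periodically for every known key in the deltaQueue</li>
<li>Keeps watching for new events and puts each event into the deltaQueue</li>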
</ol>
</li>
<li>Puts events into the deltaQueue</li>
<li>The process function pops changed events, updates the store according to the event type, and triggers the registered callbacks</li>
</ul>
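<p>The list-then-watch flow can be sketched as follows. The <code>ListWatcher</code> interface, <code>FakeListWatcher</code>, and the plain slice used as the queue are assumptions made for this example; the periodic resync step is left out.</p>

```go
package main

import "fmt"

// Event is a simplified change notification.
type Event struct {
	Type string // "Sync", "Added", "Updated" or "Deleted"
	Key  string
}

// ListWatcher is a hypothetical, stripped-down version of the interface
// the reflector consumes.
type ListWatcher interface {
	List() []string      // keys of all objects that exist right now
	Watch() <-chan Event // changes that happen after the list
}

// Reflector copies what it lists and watches into a queue.
type Reflector struct {
	LW    ListWatcher
	Queue []Event
}

// Run performs one list followed by a watch until the channel is closed.
func (r *Reflector) Run() {
	// Step 1: list everything and record it as the initial state.
	for _, key := range r.LW.List() {
		r.Queue = append(r.Queue, Event{Type: "Sync", Key: key})
	}
	// Step 3: keep forwarding watch events into the queue.
	for e := range r.LW.Watch() {
		r.Queue = append(r.Queue, e)
	}
}

// FakeListWatcher replays canned data instead of talking to an apiserver.
type FakeListWatcher struct {
	Keys   []string
	Events []Event
}

func (f FakeListWatcher) List() []string { return f.Keys }

func (f FakeListWatcher) Watch() <-chan Event {
	ch := make(chan Event, len(f.Events))
	for _, e := range f.Events {
		ch <- e
	}
	close(ch)
	return ch
}

func main() {
	r := &Reflector{LW: FakeListWatcher{
		Keys:   []string{"pod/a", "pod/b"},
		Events: []Event{{Type: "Deleted", Key: "pod/a"}, {Type: "Added", Key: "pod/c"}},
	}}
	r.Run()
	for _, e := range r.Queue {
		fmt.Println(e.Type, e.Key)
	}
}
```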
<h3 id="workQueue"><a href="#workQueue" class="headerlink" title="workQueue"></a>workQueue</h3><ul>
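<li>A special FIFO: if the same object is pushed several times before it is popped, it comes out only once. When the informer decides an object needs syncing, it puts the object into the workQueue, and a worker runs the actual sync logic. A sync reconciles the object to its latest state, so one sync is enough no matter how many times the object was queued.</li>
<li>While an object is being processed it is recorded in the processing set; if it is added again during that time it is only marked in the dirty set and is re-queued after the worker finishes, so an object is handled by only one worker at a time.</li>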
</ul>
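<p>A minimal sketch of such a de-duplicating queue, assuming string keys and a non-blocking <code>Get</code>. The <code>Queue</code> type here is written for illustration and is not the real workqueue implementation, although it follows the same dirty/processing bookkeeping.</p>

```go
package main

import (
	"fmt"
	"sync"
)

// Queue is a FIFO that holds each key at most once and never hands the
// same key to two workers at the same time.
type Queue struct {
	mu         sync.Mutex
	queue      []string
	dirty      map[string]bool // keys that still need processing
	processing map[string]bool // keys currently held by a worker
}

func NewQueue() *Queue {
	return &Queue{dirty: map[string]bool{}, processing: map[string]bool{}}
}

// Add marks the key dirty; it is queued only if it is neither already
// waiting nor currently being processed.
func (q *Queue) Add(key string) {
	q.mu.Lock()
	defer q.mu.Unlock()
	if q.dirty[key] {
		return
	}
	q.dirty[key] = true
	if q.processing[key] {
		return // re-queued later by Done
	}
	q.queue = append(q.queue, key)
}

// Get hands the oldest key to a worker; ok is false if nothing is queued.
func (q *Queue) Get() (key string, ok bool) {
	q.mu.Lock()
	defer q.mu.Unlock()
	if len(q.queue) == 0 {
		return "", false
	}
	key = q.queue[0]
	q.queue = q.queue[1:]
	q.processing[key] = true
	delete(q.dirty, key)
	return key, true
}

// Done releases the key and re-queues it if it became dirty meanwhile.
func (q *Queue) Done(key string) {
	q.mu.Lock()
	defer q.mu.Unlock()
	delete(q.processing, key)
	if q.dirty[key] {
		q.queue = append(q.queue, key)
	}
}

func main() {
	q := NewQueue()
	q.Add("rs/web")
	q.Add("rs/web") // duplicate, ignored
	key, _ := q.Get()
	q.Add("rs/web") // arrives while a worker holds the key
	_, ok := q.Get()
	fmt.Println(key, ok) // the second Get finds nothing to hand out
	q.Done(key)
	key, ok = q.Get()
	fmt.Println(key, ok) // the key is available again after Done
}
```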
<h3 id="DeltaQueue"><a href="#DeltaQueue" class="headerlink" title="DeltaQueue"></a>DeltaQueue</h3><ul>
<li><p>Similar to a FIFO queue, except that popping an object returns every operation recorded for that object since it was queued</p>
<figure class="highlight go"><table><tr><td class="code"><pre><div class="line">DeltaQueue.add(a)</div><div class="line">DeltaQueue.add(b)</div><div class="line">DeltaQueue.add(b)</div><div class="line">DeltaQueue.<span class="built_in">delete</span>(a)</div><div class="line"><span class="comment">// get returns the oldest object together with all of its pending deltas</span></div><div class="line">item, delta = DeltaQueue.get()</div><div class="line">item == a</div><div class="line">delta == [ADD, DELETE]</div></pre></td></tr></table></figure>
</li>
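<li><p>replace method: queues a sync delta for every object in the new list and a delete delta for every known object that is missing from the list</p>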
</li>
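<li>hasSynced<br>All objects produced by replace have been popped, so the corresponding store holds a complete view (possible bug in the implementation? Deleted elements do not appear to be taken into account)</li>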
</ul>
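<p>The behaviour above can be reproduced with a small toy queue. <code>DeltaQueue</code> below is a simplified sketch keyed by strings: its <code>Replace</code> only queues sync deltas for the listed keys (it does not emit deletes for missing objects), and <code>HasSynced</code> simply counts how many of the initially queued keys are still waiting to be popped.</p>

```go
package main

import "fmt"

type DeltaType string

const (
	Added   DeltaType = "ADD"
	Deleted DeltaType = "DELETE"
	Sync    DeltaType = "SYNC"
)

// DeltaQueue keeps keys in FIFO order and accumulates deltas per key.
type DeltaQueue struct {
	queue        []string               // keys in first-seen order
	items        map[string][]DeltaType // pending deltas for each key
	populated    bool                   // true once anything was queued
	initialCount int                    // keys from the first Replace not yet popped
}

func NewDeltaQueue() *DeltaQueue {
	return &DeltaQueue{items: map[string][]DeltaType{}}
}

// enqueue appends a delta; the key joins the FIFO only on its first delta.
func (q *DeltaQueue) enqueue(key string, t DeltaType) {
	if _, pending := q.items[key]; !pending {
		q.queue = append(q.queue, key)
	}
	q.items[key] = append(q.items[key], t)
}

func (q *DeltaQueue) Add(key string)    { q.populated = true; q.enqueue(key, Added) }
func (q *DeltaQueue) Delete(key string) { q.populated = true; q.enqueue(key, Deleted) }

// Replace queues a Sync delta for every listed key. On the first call it
// records how many keys must be popped before the view is complete.
func (q *DeltaQueue) Replace(keys []string) {
	for _, k := range keys {
		q.enqueue(k, Sync)
	}
	if !q.populated {
		q.populated = true
		q.initialCount = len(q.queue)
	}
}

// Pop returns the oldest key and every delta recorded for it.
func (q *DeltaQueue) Pop() (string, []DeltaType, bool) {
	if len(q.queue) == 0 {
		return "", nil, false
	}
	key := q.queue[0]
	q.queue = q.queue[1:]
	deltas := q.items[key]
	delete(q.items, key)
	if q.initialCount > 0 {
		q.initialCount--
	}
	return key, deltas, true
}

// HasSynced reports whether the initial batch has been fully popped.
func (q *DeltaQueue) HasSynced() bool {
	return q.populated && q.initialCount == 0
}

func main() {
	q := NewDeltaQueue()
	q.Add("a")
	q.Add("b")
	q.Add("b")
	q.Delete("a")
	key, deltas, _ := q.Pop()
	fmt.Println(key, deltas) // prints: a [ADD DELETE]
}
```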
<h3 id="store"><a href="#store" class="headerlink" title="store"></a>store</h3><ul>
<li><p>Provides basic object storage through add/get/delete interfaces; the underlying implementation relies on ThreadSafeStore</p>
<figure class="highlight go"><table><tr><td class="code"><pre><div class="line"><span class="comment">// cache responsibilities are limited to:</span></div><div class="line"><span class="comment">// 1. Computing keys for objects via keyFunc</span></div><div class="line"><span class="comment">// 2. Invoking methods of a ThreadSafeStorage interface</span></div><div class="line"><span class="keyword">type</span> cache <span class="keyword">struct</span> {</div><div class="line">    <span class="comment">// cacheStorage bears the burden of thread safety for the cache</span></div><div class="line">    cacheStorage ThreadSafeStore</div><div class="line">    <span class="comment">// keyFunc is used to make the key for objects stored in and retrieved from items, and</span></div><div class="line">    <span class="comment">// should be deterministic.</span></div><div class="line">    keyFunc KeyFunc</div><div class="line">}</div></pre></td></tr></table></figure>
</li>
<li><p>threadSafeStore can basically be regarded as a thread-safe map. Its indexers provide secondary indexing, which seems to see little use in the actual system</p>
</li>
</ul>
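<p>As a rough illustration of a thread-safe map with one secondary index, consider the sketch below. The <code>Pod</code> type, the namespace index, and the method names are assumptions chosen for the example; the real ThreadSafeStore is generic over objects and supports arbitrary user-supplied index functions.</p>

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// Pod is a simplified object; only the fields needed for keys and indexing.
type Pod struct {
	Namespace string
	Name      string
}

// keyFunc computes the deterministic store key for an object.
func keyFunc(p Pod) string { return p.Namespace + "/" + p.Name }

// ThreadSafeStore is a locked map plus a namespace -> keys index.
type ThreadSafeStore struct {
	mu    sync.RWMutex
	items map[string]Pod
	index map[string]map[string]bool
}

func NewStore() *ThreadSafeStore {
	return &ThreadSafeStore{items: map[string]Pod{}, index: map[string]map[string]bool{}}
}

// Add inserts or overwrites the object and records it in the index.
func (s *ThreadSafeStore) Add(p Pod) {
	s.mu.Lock()
	defer s.mu.Unlock()
	key := keyFunc(p)
	s.items[key] = p
	if s.index[p.Namespace] == nil {
		s.index[p.Namespace] = map[string]bool{}
	}
	s.index[p.Namespace][key] = true
}

// Get looks an object up by key.
func (s *ThreadSafeStore) Get(key string) (Pod, bool) {
	s.mu.RLock()
	defer s.mu.RUnlock()
	p, ok := s.items[key]
	return p, ok
}

// Delete removes the object and its index entry.
func (s *ThreadSafeStore) Delete(key string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	p, ok := s.items[key]
	if !ok {
		return
	}
	delete(s.items, key)
	delete(s.index[p.Namespace], key)
}

// ByNamespace answers a query from the index instead of scanning all items.
func (s *ThreadSafeStore) ByNamespace(ns string) []string {
	s.mu.RLock()
	defer s.mu.RUnlock()
	keys := make([]string, 0, len(s.index[ns]))
	for k := range s.index[ns] {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	return keys
}

func main() {
	s := NewStore()
	s.Add(Pod{Namespace: "default", Name: "web-1"})
	s.Add(Pod{Namespace: "default", Name: "web-2"})
	s.Add(Pod{Namespace: "kube-system", Name: "dns"})
	s.Delete("default/web-1")
	fmt.Println(s.ByNamespace("default")) // prints: [default/web-2]
}
```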
<h3 id="generation-amp-amp-observedGeneration"><a href="#generation-amp-amp-observedGeneration" class="headerlink" title="generation && observedGeneration"></a>generation && observedGeneration</h3><p>An object's generation is 1 when it is created, and it is generally incremented whenever the spec changes. For rs, this is implemented in registry/replicaset/strategy.go#PrepareForCreate/PrepareForUpdate.<br>status.observedGeneration: for rs, syncRS sets observedGeneration to the current generation.<br>After the deployment updates the rs spec, it waits until generation == status.observedGeneration before taking the next step, which keeps the two resources in sync. When the two are equal, the rs controller has received the spec change.</p>
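<p>The handshake can be modelled in a few lines. This is a sketch with a hypothetical flattened <code>ReplicaSet</code> struct; <code>UpdateSpec</code> plays the role of the apiserver strategy, <code>Sync</code> the role of the rs controller, and <code>Observed</code> the check the deployment controller waits on.</p>

```go
package main

import "fmt"

// ReplicaSet is flattened for brevity: Replicas stands in for the spec,
// ObservedGeneration for the status.
type ReplicaSet struct {
	Generation         int64
	Replicas           int
	ObservedGeneration int64
}

// NewReplicaSet mimics creation: generation starts at 1.
func NewReplicaSet(replicas int) *ReplicaSet {
	return &ReplicaSet{Generation: 1, Replicas: replicas}
}

// UpdateSpec mimics the apiserver side: generation is bumped only when
// the spec actually changes.
func (rs *ReplicaSet) UpdateSpec(replicas int) {
	if rs.Replicas != replicas {
		rs.Replicas = replicas
		rs.Generation++
	}
}

// Sync mimics the rs controller: after acting on the spec it records
// the generation it has seen.
func (rs *ReplicaSet) Sync() {
	rs.ObservedGeneration = rs.Generation
}

// Observed is what a deployment-like caller checks before moving on.
func (rs *ReplicaSet) Observed() bool {
	return rs.ObservedGeneration == rs.Generation
}

func main() {
	rs := NewReplicaSet(1)
	rs.Sync()
	rs.UpdateSpec(3)
	fmt.Println(rs.Generation, rs.Observed()) // prints: 2 false
	rs.Sync()
	fmt.Println(rs.Generation, rs.Observed()) // prints: 2 true
}
```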
</div>
<footer class="article-footer">
<ul class="article-tag-list"><li class="article-tag-list-item"><a class="article-tag-list-link" href="/tags/controller/">controller</a></li><li class="article-tag-list-item"><a class="article-tag-list-link" href="/tags/k8s/">k8s</a></li></ul>
</footer>
</div>
</article>
</section>
</div>
<footer id="footer">
<div class="outer">
<div id="footer-info" class="inner">
© 2016 sandflee<br>
Powered by <a href="http://hexo.io/" target="_blank">Hexo</a>
</div>
</div>
</footer>
</div>
<nav id="mobile-nav">
<a href="/" class="mobile-nav-link">Home</a>
<a href="/archives" class="mobile-nav-link">Archives</a>
</nav>
<script src="//ajax.googleapis.com/ajax/libs/jquery/2.0.3/jquery.min.js"></script>
<link rel="stylesheet" href="/fancybox/jquery.fancybox.css">
<script src="/fancybox/jquery.fancybox.pack.js"></script>
<script src="/js/script.js"></script>
</div>
</body>
</html>