<?xml version="1.0" encoding="utf-8"?>
<search>
<entry>
<title>Installing and Using Jenkins</title>
<url>/2019/12/28/jenkins-1/</url>
<content><![CDATA[<h3 id="jenkins的安装"><a href="#jenkins的安装" class="headerlink" title="jenkins的安装"></a>Installing Jenkins</h3><ol>
<li>Installation environment<br> CentOS 8.0.1905<br> user: jenkins; group: jenkins</li>
<li>Installation steps<br> a. Download the installation package from the official site: <a href="https://jenkins.io/zh/download/">https://jenkins.io/zh/download/</a><br> b. Install the package <figure class="highlight bash"><table><tr><td class="code"><pre><span class="line">sudo yum install -y jenkins-2.204.5-1.1.noarch.rpm</span><br></pre></td></tr></table></figure> c. Start Jenkins <figure class="highlight bash"><table><tr><td class="code"><pre><span class="line">systemctl start jenkins</span><br></pre></td></tr></table></figure></li>
<li><p>Configuration<br> a. Add a user<br> <a href="https://jenkins.io/zh/doc/book/installing/">https://jenkins.io/zh/doc/book/installing/</a><br> b. Change the plugin update site<br> <a href="http://192.168.226.131:8080/pluginManager/advanced">http://192.168.226.131:8080/pluginManager/advanced</a><br> Switch it to a mirror hosted in China; this example uses the Tsinghua University open-source mirror<br> <a href="https://mirrors.tuna.tsinghua.edu.cn/jenkins/updates/update-center.json">https://mirrors.tuna.tsinghua.edu.cn/jenkins/updates/update-center.json</a><br> <center>
<img src="https://www.zhangbohan.com.cn/images/jenkins/jenkins%20install.png" alt="Changing the plugin update site to a mirror in China" />
</center><br> c. Install the required plugins </p>
<ul>
<li>locale</li>
<li>maven</li>
<li>git</li>
<li>…</li>
</ul>
</li>
</ol>
<h3 id="jenkins的使用"><a href="#jenkins的使用" class="headerlink" title="jenkins的使用"></a>Using Jenkins</h3><ol>
<li>Create a new Maven project<center>
<img src="https://www.zhangbohan.com.cn/images/jenkins/jenkins%20create%20new%20item.png" alt="Creating a new Maven project" />
</center></li>
<li>Configure the project<center>
<img src="https://www.zhangbohan.com.cn/images/jenkins/jenkins%20create%20new%20item1.png" alt="Project configuration" />
</center></li>
<li>Check the project's build status <center>
<img src="https://www.zhangbohan.com.cn/images/jenkins/jenkins%20create%20new%20item2.png" alt="Project build status" />
</center></li>
</ol>
]]></content>
<categories>
<category>jenkins</category>
</categories>
<tags>
<tag>java</tag>
<tag>CI/CD</tag>
<tag>jenkins</tag>
</tags>
</entry>
<entry>
<title>Setting Up a Local Git Repository</title>
<url>/2020/03/29/git-using/</url>
<content><![CDATA[<h3 id="在centos下搭建本地git"><a href="#在centos下搭建本地git" class="headerlink" title="在centos下搭建本地git"></a>Setting up a local Git server on CentOS</h3><ol>
<li>Install Git<figure class="highlight bash"><table><tr><td class="code"><pre><span class="line">yum install git</span><br></pre></td></tr></table></figure></li>
<li>Initialize the repository <figure class="highlight bash"><table><tr><td class="code"><pre><span class="line">git init --bare cloud-eureka.git</span><br></pre></td></tr></table></figure>
This creates the cloud-eureka.git directory inside the current directory.</li>
<li>Create a git user for Git connections<figure class="highlight bash"><table><tr><td class="code"><pre><span class="line"><span class="comment"># Add the group git</span></span><br><span class="line">groupadd git</span><br><span class="line"><span class="comment"># Add the user git (-p expects an already-hashed password, so set the password with passwd instead)</span></span><br><span class="line">useradd git -g git</span><br><span class="line">passwd git</span><br></pre></td></tr></table></figure></li>
<li>Configure the public key <figure class="highlight bash"><table><tr><td class="code"><pre><span class="line"><span class="built_in">cd</span> ~/.ssh</span><br><span class="line"><span class="comment"># If there is no key pair yet (id_rsa and id_rsa.pub), generate one:</span></span><br><span class="line">ssh-keygen -t rsa</span><br><span class="line">su git</span><br><span class="line"><span class="built_in">cd</span> /home/git</span><br><span class="line"><span class="built_in">ls</span> -a</span><br><span class="line"><span class="comment"># First check for existing SSH files: if a .ssh directory exists, look inside it for an authorized_keys file; if neither exists:</span></span><br><span class="line"><span class="comment"># Create the .ssh directory</span></span><br><span class="line"><span class="built_in">mkdir</span> .ssh</span><br><span class="line"><span class="comment"># Create the authorized_keys file inside .ssh:</span></span><br><span class="line"><span class="built_in">touch</span> .ssh/authorized_keys</span><br><span class="line"><span class="comment"># Paste the public key of the client machine into authorized_keys to enable password-less login</span></span><br></pre></td></tr></table></figure></li>
<li>Disable the regular shell for the git user<figure class="highlight bash"><table><tr><td class="code"><pre><span class="line"> <span class="comment"># Disable the regular bash shell</span></span><br><span class="line">vim /etc/passwd</span><br></pre></td></tr></table></figure>
<figure class="highlight vim"><table><tr><td class="code"><pre><span class="line"># Change</span><br><span class="line"> gi<span class="variable">t:x</span>:<span class="number">1001</span>:<span class="number">1001</span>::/home/git:/bin/bash</span><br><span class="line"># to</span><br><span class="line"> gi<span class="variable">t:x</span>:<span class="number">1001</span>:<span class="number">1001</span>::/home/git:/usr/bin/git-<span class="keyword">shell</span></span><br></pre></td></tr></table></figure></li>
<li>Give the git user ownership of the repository for remote access<figure class="highlight bash"><table><tr><td class="code"><pre><span class="line"><span class="comment"># Run in the directory that contains the repository</span></span><br><span class="line"><span class="built_in">chown</span> -R git:git cloud-eureka.git</span><br></pre></td></tr></table></figure></li>
<li>Access the repository<br> git@192.168.226.130:/opt/project/cloud-eureka.git </li>
</ol>
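<p>The public-key step above leaves the handling of authorized_keys implicit. The following is a minimal sketch of one way to install a client public key idempotently, run against a throwaway directory rather than the real /home/git. The helper name install_pubkey and the placeholder key are assumptions made for illustration; 700 and 600 are the conventional permissions for .ssh and authorized_keys.</p>

```shell
#!/bin/sh
set -eu

# install_pubkey HOME_DIR KEY_LINE  (hypothetical helper, not part of git or OpenSSH)
# Appends KEY_LINE to HOME_DIR/.ssh/authorized_keys unless it is already
# present, then applies the conventional 700 / 600 permissions.
install_pubkey() {
  ssh_dir="$1/.ssh"
  auth_file="$ssh_dir/authorized_keys"

  mkdir -p "$ssh_dir"
  touch "$auth_file"
  # -q quiet, -x whole-line match, -F fixed string (no regex)
  if ! grep -qxF -- "$2" "$auth_file"; then
    printf '%s\n' "$2" >> "$auth_file"
  fi
  chmod 700 "$ssh_dir"
  chmod 600 "$auth_file"
}

# Demonstration in a temporary directory instead of the real /home/git.
demo_home="$(mktemp -d)"
demo_key="ssh-rsa AAAAB3NzaC1yc2E-placeholder user@client"  # placeholder, not a real key

install_pubkey "$demo_home" "$demo_key"
install_pubkey "$demo_home" "$demo_key"  # second call adds nothing

ls -l "$demo_home/.ssh/authorized_keys"
```

<p>On the server, the same function would presumably be called as the git user, with /home/git as the first argument and the contents of the client's id_rsa.pub as the second.</p>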
]]></content>
<categories>
<category>git</category>
</categories>
<tags>
<tag>CI/CD</tag>
<tag>git</tag>
</tags>
</entry>
<entry>
<title>Notes on Publishing a Project to Maven Central</title>
<url>/2020/01/08/push2maven/</url>
<content><![CDATA[<h2 id="说明"><a href="#说明" class="headerlink" title="说明"></a>Overview</h2><p>These notes record the process of publishing my own project to the Maven Central Repository.</p>
<h2 id="记录"><a href="#记录" class="headerlink" title="记录"></a>Steps</h2><ol>
<li><p>Register a Sonatype account (to apply for upload permission)<br>Sign-up page: <a href="https://issues.sonatype.org/secure/Signup!default.jspa">https://issues.sonatype.org/secure/Signup!default.jspa</a></p>
</li>
<li><p>Log in to Sonatype and complete the initial setup</p>
</li>
<li><p>Create a new issue and resolve the problems raised on it<br>Example: <a href="https://issues.sonatype.org/browse/OSSRH-54353">https://issues.sonatype.org/browse/OSSRH-54353</a></p>
</li>
<li><p>Create a key pair and publish it<br> Install gpg, create a new key pair (entering a name, email address, and passphrase), and publish the public key to the key servers</p>
<figure class="highlight cmd"><table><tr><td class="code"><pre><span class="line"># List the keys</span><br><span class="line">gpg --list-keys</span><br><span class="line"># Publish the public key (replace key_id with your own key ID)</span><br><span class="line">gpg --keyserver http://keyserver.ubuntu.com:<span class="number">11371</span> --send-keys key_id</span><br><span class="line">gpg --keyserver http://pool.sks-keyservers.<span class="built_in">net</span>:<span class="number">11371</span> --send-keys key_id</span><br><span class="line">gpg --keyserver http://keys.gnupg.<span class="built_in">net</span>:<span class="number">11371</span> --send-keys key_id</span><br><span class="line">gpg --keyserver http://keys.openpgp.org:<span class="number">11371</span> --send-keys key_id</span><br></pre></td></tr></table></figure>
</li>
<li><p>Configure Maven</p>
<ol>
<li><p>Edit settings.xml</p>
<figure class="highlight xml"><table><tr><td class="code"><pre><span class="line"><span class="tag"><<span class="name">server</span>></span></span><br><span class="line"> <span class="tag"><<span class="name">id</span>></span>ossrh<span class="tag"></<span class="name">id</span>></span></span><br><span class="line"> <span class="tag"><<span class="name">username</span>></span>your Sonatype username<span class="tag"></<span class="name">username</span>></span></span><br><span class="line"> <span class="tag"><<span class="name">password</span>></span>your Sonatype password<span class="tag"></<span class="name">password</span>></span></span><br><span class="line"> <span class="tag"></<span class="name">server</span>></span></span><br></pre></td></tr></table></figure>
</li>
<li><p>Edit the project's pom.xml</p>
<font color="red">Note: the id of the snapshotRepository and repository elements must match the id of the server entry configured above.</font>
<figure class="highlight xml"><table><tr><td class="code"><pre><span class="line"><span class="comment"><!-- Project license --></span></span><br><span class="line"><span class="tag"><<span class="name">licenses</span>></span></span><br><span class="line"> <span class="tag"><<span class="name">license</span>></span></span><br><span class="line"> <span class="tag"><<span class="name">name</span>></span>The Apache Software License, Version 2.0<span class="tag"></<span class="name">name</span>></span></span><br><span class="line"> <span class="tag"><<span class="name">url</span>></span>http://www.apache.org/licenses/LICENSE-2.0.txt<span class="tag"></<span class="name">url</span>></span></span><br><span class="line"> <span class="tag"><<span class="name">distribution</span>></span>actable<span class="tag"></<span class="name">distribution</span>></span></span><br><span class="line"> <span class="tag"></<span class="name">license</span>></span></span><br><span class="line"><span class="tag"></<span class="name">licenses</span>></span></span><br><span class="line"></span><br><span class="line"><span class="comment"><!-- Developer information --></span></span><br><span class="line"><span class="tag"><<span class="name">developers</span>></span></span><br><span class="line"> <span class="tag"><<span class="name">developer</span>></span></span><br><span class="line"> <span class="tag"><<span class="name">name</span>></span>example<span class="tag"></<span class="name">name</span>></span></span><br><span class="line"> <span class="tag"><<span class="name">email</span>></span>example@outlook.com<span class="tag"></<span class="name">email</span>></span></span><br><span class="line"> <span class="tag"></<span class="name">developer</span>></span></span><br><span class="line"><span class="tag"></<span class="name">developers</span>></span></span><br><span class="line"></span><br><span class="line"><span class="comment"><!-- Project source-control URL --></span></span><br><span class="line"><span class="tag"><<span 
class="name">scm</span>></span></span><br><span class="line"> <span class="tag"><<span class="name">url</span>></span>https://github.com/Bpazy/Id<span class="tag"></<span class="name">url</span>></span></span><br><span class="line"><span class="tag"></<span class="name">scm</span>></span></span><br><span class="line"><span class="tag"><<span class="name">profiles</span>></span></span><br><span class="line"> <span class="tag"><<span class="name">profile</span>></span></span><br><span class="line"> <span class="tag"><<span class="name">id</span>></span>release<span class="tag"></<span class="name">id</span>></span></span><br><span class="line"> <span class="tag"><<span class="name">properties</span>></span></span><br><span class="line"> <span class="tag"><<span class="name">profile.env</span>></span>prod<span class="tag"></<span class="name">profile.env</span>></span></span><br><span class="line"> <span class="tag"></<span class="name">properties</span>></span></span><br><span class="line"> <span class="tag"><<span class="name">build</span>></span></span><br><span class="line"> <span class="tag"><<span class="name">resources</span>></span></span><br><span class="line"> <span class="tag"><<span class="name">resource</span>></span></span><br><span class="line"> <span class="tag"><<span class="name">directory</span>></span>src/main/java<span class="tag"></<span class="name">directory</span>></span></span><br><span class="line"> <span class="tag"></<span class="name">resource</span>></span></span><br><span class="line"> <span class="tag"></<span class="name">resources</span>></span></span><br><span class="line"> <span class="tag"><<span class="name">plugins</span>></span></span><br><span class="line"> <span class="comment"><!-- Source --></span></span><br><span class="line"> <span class="tag"><<span class="name">plugin</span>></span></span><br><span class="line"> <span class="tag"><<span class="name">groupId</span>></span>org.apache.maven.plugins<span class="tag"></<span 
class="name">groupId</span>></span></span><br><span class="line"> <span class="tag"><<span class="name">artifactId</span>></span>maven-source-plugin<span class="tag"></<span class="name">artifactId</span>></span></span><br><span class="line"> <span class="tag"><<span class="name">version</span>></span>2.2.1<span class="tag"></<span class="name">version</span>></span></span><br><span class="line"> <span class="tag"><<span class="name">executions</span>></span></span><br><span class="line"> <span class="tag"><<span class="name">execution</span>></span></span><br><span class="line"> <span class="tag"><<span class="name">phase</span>></span>package<span class="tag"></<span class="name">phase</span>></span></span><br><span class="line"> <span class="tag"><<span class="name">goals</span>></span></span><br><span class="line"> <span class="tag"><<span class="name">goal</span>></span>jar-no-fork<span class="tag"></<span class="name">goal</span>></span></span><br><span class="line"> <span class="tag"></<span class="name">goals</span>></span></span><br><span class="line"> <span class="tag"></<span class="name">execution</span>></span></span><br><span class="line"> <span class="tag"></<span class="name">executions</span>></span></span><br><span class="line"> <span class="tag"></<span class="name">plugin</span>></span></span><br><span class="line"> <span class="comment"><!-- Javadoc --></span></span><br><span class="line"> <span class="tag"><<span class="name">plugin</span>></span></span><br><span class="line"> <span class="tag"><<span class="name">groupId</span>></span>org.apache.maven.plugins<span class="tag"></<span class="name">groupId</span>></span></span><br><span class="line"> <span class="tag"><<span class="name">artifactId</span>></span>maven-javadoc-plugin<span class="tag"></<span class="name">artifactId</span>></span></span><br><span class="line"> <span class="tag"><<span class="name">version</span>></span>2.9.1<span class="tag"></<span 
class="name">version</span>></span></span><br><span class="line"> <span class="tag"><<span class="name">configuration</span>></span></span><br><span class="line"> <span class="tag"><<span class="name">show</span>></span>private<span class="tag"></<span class="name">show</span>></span></span><br><span class="line"> <span class="tag"><<span class="name">nohelp</span>></span>true<span class="tag"></<span class="name">nohelp</span>></span></span><br><span class="line"> <span class="tag"><<span class="name">charset</span>></span>UTF-8<span class="tag"></<span class="name">charset</span>></span></span><br><span class="line"> <span class="tag"><<span class="name">encoding</span>></span>UTF-8<span class="tag"></<span class="name">encoding</span>></span></span><br><span class="line"> <span class="tag"><<span class="name">docencoding</span>></span>UTF-8<span class="tag"></<span class="name">docencoding</span>></span></span><br><span class="line"> <span class="tag"><<span class="name">additionalparam</span>></span>-Xdoclint:none<span class="tag"></<span class="name">additionalparam</span>></span></span><br><span class="line"> <span class="comment"><!-- TODO: temporary workaround for build errors caused by non-conforming Javadoc; remove the additionalparam line once the Javadoc has been cleaned up --></span></span><br><span class="line"> <span class="tag"></<span class="name">configuration</span>></span></span><br><span class="line"> <span class="tag"><<span class="name">executions</span>></span></span><br><span class="line"> <span class="tag"><<span class="name">execution</span>></span></span><br><span class="line"> <span class="tag"><<span class="name">phase</span>></span>package<span class="tag"></<span class="name">phase</span>></span></span><br><span class="line"> <span class="tag"><<span class="name">goals</span>></span></span><br><span class="line"> <span class="tag"><<span class="name">goal</span>></span>jar<span class="tag"></<span class="name">goal</span>></span></span><br><span class="line"> <span class="tag"></<span class="name">goals</span>></span></span><br><span 
class="line"> <span class="tag"></<span class="name">execution</span>></span></span><br><span class="line"> <span class="tag"></<span class="name">executions</span>></span></span><br><span class="line"> <span class="tag"></<span class="name">plugin</span>></span></span><br><span class="line"> <span class="comment"><!-- GPG --></span></span><br><span class="line"> <span class="tag"><<span class="name">plugin</span>></span></span><br><span class="line"> <span class="tag"><<span class="name">groupId</span>></span>org.apache.maven.plugins<span class="tag"></<span class="name">groupId</span>></span></span><br><span class="line"> <span class="tag"><<span class="name">artifactId</span>></span>maven-gpg-plugin<span class="tag"></<span class="name">artifactId</span>></span></span><br><span class="line"> <span class="tag"><<span class="name">version</span>></span>1.5<span class="tag"></<span class="name">version</span>></span></span><br><span class="line"> <span class="tag"><<span class="name">executions</span>></span></span><br><span class="line"> <span class="tag"><<span class="name">execution</span>></span></span><br><span class="line"> <span class="tag"><<span class="name">phase</span>></span>verify<span class="tag"></<span class="name">phase</span>></span></span><br><span class="line"> <span class="tag"><<span class="name">goals</span>></span></span><br><span class="line"> <span class="tag"><<span class="name">goal</span>></span>sign<span class="tag"></<span class="name">goal</span>></span></span><br><span class="line"> <span class="tag"></<span class="name">goals</span>></span></span><br><span class="line"> <span class="tag"></<span class="name">execution</span>></span></span><br><span class="line"> <span class="tag"></<span class="name">executions</span>></span></span><br><span class="line"> <span class="tag"></<span class="name">plugin</span>></span></span><br><span class="line"></span><br><span class="line"> <span class="comment"><!--Compiler 
--></span></span><br><span class="line"> <span class="tag"><<span class="name">plugin</span>></span></span><br><span class="line"> <span class="tag"><<span class="name">groupId</span>></span>org.apache.maven.plugins<span class="tag"></<span class="name">groupId</span>></span></span><br><span class="line"> <span class="tag"><<span class="name">artifactId</span>></span>maven-compiler-plugin<span class="tag"></<span class="name">artifactId</span>></span></span><br><span class="line"> <span class="tag"><<span class="name">version</span>></span>3.0<span class="tag"></<span class="name">version</span>></span></span><br><span class="line"> <span class="tag"><<span class="name">configuration</span>></span></span><br><span class="line"> <span class="tag"><<span class="name">source</span>></span>1.8<span class="tag"></<span class="name">source</span>></span></span><br><span class="line"> <span class="tag"><<span class="name">target</span>></span>1.8<span class="tag"></<span class="name">target</span>></span></span><br><span class="line"> <span class="tag"><<span class="name">fork</span>></span>true<span class="tag"></<span class="name">fork</span>></span></span><br><span class="line"> <span class="tag"><<span class="name">verbose</span>></span>true<span class="tag"></<span class="name">verbose</span>></span></span><br><span class="line"> <span class="tag"><<span class="name">encoding</span>></span>UTF-8<span class="tag"></<span class="name">encoding</span>></span></span><br><span class="line"> <span class="tag"><<span class="name">showWarnings</span>></span>false<span class="tag"></<span class="name">showWarnings</span>></span></span><br><span class="line"> <span class="tag"></<span class="name">configuration</span>></span></span><br><span class="line"> <span class="tag"></<span class="name">plugin</span>></span></span><br><span class="line"> <span class="comment"><!--Release --></span></span><br><span class="line"> <span class="tag"><<span 
class="name">plugin</span>></span></span><br><span class="line"> <span class="tag"><<span class="name">groupId</span>></span>org.apache.maven.plugins<span class="tag"></<span class="name">groupId</span>></span></span><br><span class="line"> <span class="tag"><<span class="name">artifactId</span>></span>maven-release-plugin<span class="tag"></<span class="name">artifactId</span>></span></span><br><span class="line"> <span class="tag"><<span class="name">version</span>></span>2.5.3<span class="tag"></<span class="name">version</span>></span></span><br><span class="line"> <span class="tag"></<span class="name">plugin</span>></span></span><br><span class="line"> <span class="tag"></<span class="name">plugins</span>></span></span><br><span class="line"> <span class="tag"></<span class="name">build</span>></span></span><br><span class="line"> <span class="tag"><<span class="name">distributionManagement</span>></span></span><br><span class="line"> <span class="tag"><<span class="name">snapshotRepository</span>></span></span><br><span class="line"> <span class="tag"><<span class="name">id</span>></span>ossrh<span class="tag"></<span class="name">id</span>></span></span><br><span class="line"> <span class="tag"><<span class="name">name</span>></span>Sonatype Nexus Snapshots<span class="tag"></<span class="name">name</span>></span></span><br><span class="line"> <span class="tag"><<span class="name">url</span>></span>https://oss.sonatype.org/content/repositories/snapshots/<span class="tag"></<span class="name">url</span>></span></span><br><span class="line"> <span class="tag"></<span class="name">snapshotRepository</span>></span></span><br><span class="line"> <span class="tag"><<span class="name">repository</span>></span></span><br><span class="line"> <span class="tag"><<span class="name">id</span>></span>ossrh<span class="tag"></<span class="name">id</span>></span></span><br><span class="line"> <span class="tag"><<span class="name">name</span>></span>Nexus Release 
Repository<span class="tag"></<span class="name">name</span>></span></span><br><span class="line"> <span class="tag"><<span class="name">url</span>></span>https://oss.sonatype.org/service/local/staging/deploy/maven2/<span class="tag"></<span class="name">url</span>></span></span><br><span class="line"> <span class="tag"></<span class="name">repository</span>></span></span><br><span class="line"> <span class="tag"></<span class="name">distributionManagement</span>></span></span><br><span class="line"> <span class="tag"></<span class="name">profile</span>></span></span><br><span class="line"><span class="tag"></<span class="name">profiles</span>></span></span><br></pre></td></tr></table></figure>
</li>
</ol>
</li>
<li><p>Package and deploy</p>
<figure class="highlight cmd"><table><tr><td class="code"><pre><span class="line">mvn clean deploy -P release</span><br></pre></td></tr></table></figure>
</li>
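<li><p>Release the uploaded library</p>
<ol>
<li>Log in to <a href="https://oss.sonatype.org/#stagingRepositories">https://oss.sonatype.org/#stagingRepositories</a> and find the package in the staging repositories</li>
<li>Select the package and click Close; fix any problems reported during closing, then rebuild and upload again until the close succeeds</li>
<li>Select the package and click Release</li>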
</ol>
</li>
<li><p>Log in to Sonatype, reply on the issue that the project has been released, and wait for review</p>
</li>
</ol>
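<p>The key-publishing step above sends a key by its key_id, which has to be read off the gpg --list-keys output. The following is a minimal sketch of pulling it out of gpg's machine-readable --with-colons listing, where field 5 of each pub record holds the key ID. The sample records are made up for illustration, and extract_key_ids is a hypothetical helper name.</p>

```shell
#!/bin/sh
set -eu

# extract_key_ids: reads `gpg --list-keys --with-colons` output on stdin and
# prints the key ID (field 5) of every public-key ("pub") record.
extract_key_ids() {
  awk -F: '$1 == "pub" { print $5 }'
}

# Made-up sample in the colon-delimited format, so the sketch runs without gpg.
sample_listing='pub:u:4096:1:0123456789ABCDEF:1578441600:::u:::scESC:
fpr:::::::::AAAABBBBCCCCDDDDEEEEFFFF0123456789ABCDEF:
sub:u:4096:1:FEDCBA9876543210:1578441600::::::e:'

key_id="$(printf '%s\n' "$sample_listing" | extract_key_ids)"
echo "key id: $key_id"
```

<p>With a real keyring, the helper would be fed from gpg directly, for example gpg --list-keys --with-colons | extract_key_ids, and the result passed to --send-keys.</p>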
]]></content>
<categories>
<category>maven</category>
</categories>
<tags>
<tag>java</tag>
<tag>maven</tag>
<tag>jar</tag>
</tags>
</entry>
<entry>
<title>Using C/C++ from Python via Boost.Python</title>
<url>/2020/01/15/pythonwithc-1/</url>
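<content><![CDATA[<h3 id="安装boost环境"><a href="#安装boost环境" class="headerlink" title="安装boost环境"></a>Setting up Boost</h3><p>Download Boost from <a href="https://www.boost.org/">https://www.boost.org/</a>, extract it, and set the environment variables; that is enough for everyday use in C/C++.<br>Boost.Python, however, requires Boost to be compiled. </p>
<p>Write a user-config.jam file that sets the Python interpreter path, the header directory, and the library directory. It is located at c:/users/[user name]/user-config.jam </p>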
<figure class="highlight plaintext"><table><tr><td class="code"><pre><span class="line">import toolset : using ;</span><br><span class="line">using python</span><br><span class="line">: 3.7</span><br><span class="line">: D:/Applications/python/python37/python.exe </span><br><span class="line">: D:/Applications/python/python37/include</span><br><span class="line">: D:/Applications/python/python37/libs</span><br><span class="line">: <toolset>gcc</span><br><span class="line">;</span><br></pre></td></tr></table></figure>
<figure class="highlight cmd"><table><tr><td class="code"><pre><span class="line">rem Generate b2.exe and related files</span><br><span class="line">bootstrap.bat gcc</span><br><span class="line">rem Build the Boost.Python dll and dll.a files for the configured Python version into stage/lib</span><br><span class="line">b2 --with-python --toolset=gcc architecture=x86 address-model=<span class="number">64</span> link=shared</span><br></pre></td></tr></table></figure>
<h3 id="编写并编译c-程序"><a href="#编写并编译c-程序" class="headerlink" title="编写并编译c++程序"></a>Writing and compiling the C++ program</h3><p>Use Boost.Python to export the C++ module you need; see <a href="https://wiki.python.org/moin/boost.python/module">https://wiki.python.org/moin/boost.python/module</a> for usage.<br>Note that the my_module in BOOST_PYTHON_MODULE(my_module) must match the file name.<br>Compile the C++ program into a dll file.</p>
<h3 id="python使用c-编译后的dll文件"><a href="#python使用c-编译后的dll文件" class="headerlink" title="python使用c++编译后的dll文件"></a>Using the compiled C++ dll from Python</h3><p>Rename the dll built with Boost.Python to .pyd (Python's extension-module file type).<br>Make it importable from Python, either with sys.path.append or by copying it into the directory Python runs from. Note that besides the .pyd file, the Boost.Python dll built for the matching Python version must also be made available. </p>
<figure class="highlight python"><table><tr><td class="code"><pre><span class="line"><span class="keyword">import</span> my_module</span><br></pre></td></tr></table></figure>
]]></content>
<categories>
<category>c/c++</category>
</categories>
<tags>
<tag>python</tag>
<tag>c/c++</tag>
<tag>boost</tag>
</tags>
</entry>
<entry>
<title>Exploring the Spring Boot Source: Auto-Configuration</title>
<url>/2023/02/26/springboot%E6%BA%90%E7%A0%81%E6%8E%A2%E6%9E%90/</url>
<content><![CDATA[<h1 id=""><a href="#" class="headerlink" title=" "></a> </h1><p>Complex scenarios → conventions (frameworks) → configuration according to specific rules</p>
<blockquote>
<p>JavaWeb (Servlet + Tomcat): tedious, so the stack evolved<br>Spring MVC<br>Spring Boot</p>
</blockquote>
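<h2 id="Spring开源框架"><a href="#Spring开源框架" class="headerlink" title="Spring开源框架"></a>The Spring open-source framework</h2><p> Goal: simplify enterprise development (2003~2020)<br>To reduce the complexity of Java development, Spring adopts the following four key strategies:</p>
<ol>
<li>Lightweight, minimally invasive programming based on POJOs</li>
<li>Loose coupling through IoC, dependency injection (DI), and programming to interfaces</li>
<li>Declarative programming based on aspects (AOP) and conventions</li>
<li>Reducing boilerplate code through aspects and templates</li>
</ol>
<h3 id="创建项目流程"><a href="#创建项目流程" class="headerlink" title="创建项目流程"></a>Project creation workflow</h3><ol>
<li>Create a project and import a large set of dependencies</li>
<li>Configure web.xml</li>
<li>Configure the Spring files</li>
<li>Write code and test it</li>
<li>Configure Tomcat</li>
<li>Deploy</li>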
</ol>
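<h2 id="Springboot"><a href="#Springboot" class="headerlink" title="Springboot"></a>Spring Boot</h2><p>Auto-configuration</p>
<h3 id="主要优点"><a href="#主要优点" class="headerlink" title="主要优点"></a>Main advantages</h3><ul>
<li>A faster way for all Spring developers to get started</li>
<li>Works out of the box, with default configurations that simplify project setup</li>
<li>Embedded containers simplify web projects</li>
<li>No code generation and no requirement for XML configuration</li>
</ul>
<h3 id="一些问题"><a href="#一些问题" class="headerlink" title="一些问题"></a>Some problems</h3><p> Spring Boot (and indeed Spring as a whole) relies heavily on the "convention over configuration" design paradigm: many mechanisms (configuration discovery, method naming, and so on) depend on implicit conventions. If you are not familiar with these conventions, you end up "programming by luck": you tweak something here and adjust something there, and the program works even though you do not know why; or it stops working, and you spend a great deal of time debugging or asking others for help.<br>There is also an opposing programming maxim:<br> Explicit is better than implicit.<br>Spring has a project that addresses this problem: <a href="https://spring.io/blog/2018/10/02/the-evolution-of-spring-fu">Kofu</a></p>
<h3 id="创建项目流程-1"><a href="#创建项目流程-1" class="headerlink" title="创建项目流程"></a>Project creation workflow</h3><p> Code must be written in the same package as the main application class or in one of its sub-packages. Why?</p>
<ol>
<li>Create a project and import a few starters (each of which bundles many dependencies)</li>
<li>Write code, then test and run it</li>
<li>Deploy</li>
</ol>
<p>Meta-annotations: annotations that annotate other annotations<br>@Target(ElementType.TYPE)<br>@Retention(RetentionPolicy.RUNTIME)<br>@Documented<br>@Inherited</p>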
<h3 id="源码解读"><a href="#源码解读" class="headerlink" title="源码解读"></a>Reading the source</h3><h4 id="pom依赖部分"><a href="#pom依赖部分" class="headerlink" title="pom依赖部分"></a>The pom dependencies</h4><figure class="highlight plaintext"><table><tr><td class="code"><pre><span class="line"> <!-- The parent declared in the project --></span><br><span class="line"> <parent></span><br><span class="line"> <groupId>org.springframework.boot</groupId></span><br><span class="line"> <artifactId>spring-boot-starter-parent</artifactId></span><br><span class="line"> <version>1.5.6.RELEASE</version></span><br><span class="line"> <relativePath/> <!-- lookup parent from repository --></span><br><span class="line"> </parent></span><br><span class="line"> <!-- pom.xml of spring-boot-starter-parent --></span><br><span class="line"> <!-- Plugins and resource filtering --></span><br><span class="line"> <parent></span><br><span class="line"> <groupId>org.springframework.boot</groupId></span><br><span class="line"> <artifactId>spring-boot-dependencies</artifactId></span><br><span class="line"> <version>1.5.6.RELEASE</version></span><br><span class="line"> <relativePath>../../spring-boot-dependencies</relativePath></span><br><span class="line"></parent></span><br><span class="line"> <build></span><br><span class="line"> <!-- Turn on filtering by default for application properties --></span><br><span class="line"> <resources></span><br><span class="line"> <resource></span><br><span class="line"> <directory>${basedir}/src/main/resources</directory></span><br><span class="line"> <filtering>true</filtering></span><br><span class="line"> <includes></span><br><span class="line"> <!-- yml is equivalent to yaml --></span><br><span class="line"> <include>**/application*.yml</include></span><br><span class="line"> <include>**/application*.yaml</include></span><br><span class="line"> <include>**/application*.properties</include></span><br><span class="line"> </includes></span><br><span class="line"> </resource></span><br><span class="line"> <resource></span><br><span class="line"> 
<directory>${basedir}/src/main/resources</directory></span><br><span class="line"> <excludes></span><br><span class="line"> <exclude>**/application*.yml</exclude></span><br><span class="line"> <exclude>**/application*.yaml</exclude></span><br><span class="line"> <exclude>**/application*.properties</exclude></span><br><span class="line"> </excludes></span><br><span class="line"> </resource></span><br><span class="line"> ...</span><br><span class="line"> </resources></span><br><span class="line"> ...</span><br><span class="line"> </build></span><br><span class="line"> <!-- spring-boot-dependencies的pom.xml --></span><br><span class="line"> <!-- 版本控制中心 --></span><br><span class="line"> <properties></span><br><span class="line"> <!-- Dependency versions --></span><br><span class="line"> <activemq.version>5.14.5</activemq.version></span><br><span class="line"> <antlr2.version>2.7.7</antlr2.version></span><br><span class="line"> <appengine-sdk.version>1.9.54</appengine-sdk.version></span><br><span class="line"> <artemis.version>1.5.5</artemis.version></span><br><span class="line"> <aspectj.version>1.8.10</aspectj.version></span><br><span class="line"> <assertj.version>2.6.0</assertj.version></span><br><span class="line"> <atomikos.version>3.9.3</atomikos.version></span><br><span class="line"> ...</span><br><span class="line"> </properties></span><br></pre></td></tr></table></figure>
<h4 id="java代码部分"><a href="#java代码部分" class="headerlink" title="java代码部分"></a>Java code</h4><p>A minimal Spring Boot project:</p>
<figure class="highlight plaintext"><table><tr><td class="code"><pre><span class="line">@SpringBootApplication</span><br><span class="line">public class StudyApplication {</span><br><span class="line"> public static void main(String[] args) {</span><br><span class="line"> SpringApplication.run(StudyApplication.class, args);</span><br><span class="line"> }</span><br><span class="line">}</span><br></pre></td></tr></table></figure>
<p>As you can see, there are two main pieces: the annotation SpringBootApplication and the launcher class SpringApplication.</p>
<p>Here is the source of the SpringBootApplication annotation:</p>
<figure class="highlight plaintext"><table><tr><td class="code"><pre><span class="line">@Target(ElementType.TYPE)</span><br><span class="line">@Retention(RetentionPolicy.RUNTIME)</span><br><span class="line">@Documented</span><br><span class="line">@Inherited</span><br><span class="line">@SpringBootConfiguration</span><br><span class="line">@EnableAutoConfiguration</span><br><span class="line">@ComponentScan(excludeFilters = {</span><br><span class="line"> @Filter(type = FilterType.CUSTOM, classes = TypeExcludeFilter.class),</span><br><span class="line"> @Filter(type = FilterType.CUSTOM, classes = AutoConfigurationExcludeFilter.class) })</span><br><span class="line">public @interface SpringBootApplication {</span><br><span class="line"> ...</span><br><span class="line">}</span><br></pre></td></tr></table></figure>
<p>Setting the meta-annotations aside, what remains is:</p>
<ul>
<li>@SpringBootConfiguration</li>
<li>@EnableAutoConfiguration</li>
<li>@ComponentScan</li>
</ul>
<ol>
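<li><p>@SpringBootConfiguration<br>@SpringBootConfiguration is meta-annotated with @Configuration, which is in turn meta-annotated with @Component.<br>The main application class annotated with @SpringBootConfiguration is therefore itself a component, and its job is to bootstrap the application.</p>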
</li>
<li><p>@EnableAutoConfiguration</p>
<figure class="highlight plaintext"><table><tr><td class="code"><pre><span class="line"> // EnableAutoConfiguration</span><br><span class="line"> @AutoConfigurationPackage</span><br><span class="line"> @Import(AutoConfigurationImportSelector.class)</span><br><span class="line"> // AutoConfigurationPackage</span><br><span class="line"> @Import(AutoConfigurationPackages.Registrar.class)</span><br><span class="line"> // AutoConfigurationPackages.Registrar</span><br><span class="line"> static class Registrar implements ImportBeanDefinitionRegistrar, DeterminableImports {</span><br><span class="line"></span><br><span class="line"> @Override</span><br><span class="line"> public void registerBeanDefinitions(AnnotationMetadata metadata,</span><br><span class="line"> BeanDefinitionRegistry registry) {</span><br><span class="line"> register(registry, new PackageImport(metadata).getPackageName());</span><br><span class="line"> }</span><br><span class="line"></span><br><span class="line"> @Override</span><br><span class="line"> public Set<Object> determineImports(AnnotationMetadata metadata) {</span><br><span class="line"> return Collections.singleton(new PackageImport(metadata));</span><br><span class="line"> }</span><br><span class="line">}</span><br></pre></td></tr></table></figure>
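<p>This static inner class obtains the package of the main application class, which is why <code>code must be written in the same package as the main application class or in one of its sub-packages</code>.</p>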
<figure class="highlight plaintext"><table><tr><td class="code"><pre><span class="line"> // AutoConfigurationImportSelector</span><br><span class="line"> @Override</span><br><span class="line">public String[] selectImports(AnnotationMetadata annotationMetadata) {</span><br><span class="line"> if (!isEnabled(annotationMetadata)) {</span><br><span class="line"> return NO_IMPORTS;</span><br><span class="line"> }</span><br><span class="line"> AutoConfigurationMetadata autoConfigurationMetadata = AutoConfigurationMetadataLoader</span><br><span class="line"> .loadMetadata(this.beanClassLoader);</span><br><span class="line"> AutoConfigurationEntry autoConfigurationEntry = getAutoConfigurationEntry(</span><br><span class="line"> autoConfigurationMetadata, annotationMetadata);</span><br><span class="line"> return StringUtils.toStringArray(autoConfigurationEntry.getConfigurations());</span><br><span class="line">}</span><br><span class="line"> // getAutoConfigurationEntry() 获取自动配置的实体</span><br><span class="line"> protected AutoConfigurationEntry getAutoConfigurationEntry(</span><br><span class="line"> AutoConfigurationMetadata autoConfigurationMetadata,</span><br><span class="line"> AnnotationMetadata annotationMetadata) {</span><br><span class="line"> if (!isEnabled(annotationMetadata)) {</span><br><span class="line"> return EMPTY_ENTRY;</span><br><span class="line"> }</span><br><span class="line"> AnnotationAttributes attributes = getAttributes(annotationMetadata);</span><br><span class="line"> List<String> configurations = getCandidateConfigurations(annotationMetadata,</span><br><span class="line"> attributes);</span><br><span class="line"> configurations = removeDuplicates(configurations);</span><br><span class="line"> Set<String> exclusions = getExclusions(annotationMetadata, attributes);</span><br><span class="line"> checkExcludedClasses(configurations, exclusions);</span><br><span class="line"> configurations.removeAll(exclusions);</span><br><span class="line"> 
configurations = filter(configurations, autoConfigurationMetadata);</span><br><span class="line"> fireAutoConfigurationImportEvents(configurations, exclusions);</span><br><span class="line"> return new AutoConfigurationEntry(configurations, exclusions);</span><br><span class="line">}</span><br><span class="line"> // getCandidateConfigurations() 获取候选的配置</span><br><span class="line"> protected List<String> getCandidateConfigurations(AnnotationMetadata metadata,</span><br><span class="line"> AnnotationAttributes attributes) {</span><br><span class="line"> List<String> configurations = SpringFactoriesLoader.loadFactoryNames(</span><br><span class="line"> getSpringFactoriesLoaderFactoryClass(), getBeanClassLoader());</span><br><span class="line"> Assert.notEmpty(configurations,</span><br><span class="line"> "No auto configuration classes found in META-INF/spring.factories. If you "</span><br><span class="line"> + "are using a custom packaging, make sure that file is correct.");</span><br><span class="line"> return configurations;</span><br><span class="line">}</span><br><span class="line"> // loadFactoryNames()</span><br><span class="line"> public static List<String> loadFactoryNames(Class<?> factoryClass, @Nullable ClassLoader classLoader) {</span><br><span class="line"> String factoryClassName = factoryClass.getName();</span><br><span class="line"> return loadSpringFactories(classLoader).getOrDefault(factoryClassName, Collections.emptyList());</span><br><span class="line">}</span><br><span class="line"> // loadSpringFactories()</span><br><span class="line"> public static final String FACTORIES_RESOURCE_LOCATION = "META-INF/spring.factories";</span><br><span class="line"> </span><br><span class="line"> private static Map<String, List<String>> loadSpringFactories(@Nullable ClassLoader classLoader) {</span><br><span class="line"> MultiValueMap<String, String> result = cache.get(classLoader);</span><br><span class="line"> if (result != null) {</span><br><span 
class="line"> return result;</span><br><span class="line"> }</span><br><span class="line"></span><br><span class="line"> try {</span><br><span class="line"> Enumeration<URL> urls = (classLoader != null ?</span><br><span class="line"> classLoader.getResources(FACTORIES_RESOURCE_LOCATION) :</span><br><span class="line"> ClassLoader.getSystemResources(FACTORIES_RESOURCE_LOCATION));</span><br><span class="line"> result = new LinkedMultiValueMap<>();</span><br><span class="line"> while (urls.hasMoreElements()) {</span><br><span class="line"> URL url = urls.nextElement();</span><br><span class="line"> UrlResource resource = new UrlResource(url);</span><br><span class="line"> Properties properties = PropertiesLoaderUtils.loadProperties(resource);</span><br><span class="line"> for (Map.Entry<?, ?> entry : properties.entrySet()) {</span><br><span class="line"> String factoryClassName = ((String) entry.getKey()).trim();</span><br><span class="line"> for (String factoryName : StringUtils.commaDelimitedListToStringArray((String) entry.getValue())) {</span><br><span class="line"> result.add(factoryClassName, factoryName.trim());</span><br><span class="line"> }</span><br><span class="line"> }</span><br><span class="line"> }</span><br><span class="line"> cache.put(classLoader, result);</span><br><span class="line"> return result;</span><br><span class="line"> }</span><br><span class="line"> catch (IOException ex) {</span><br><span class="line"> throw new IllegalArgumentException("Unable to load factories from location [" +</span><br><span class="line"> FACTORIES_RESOURCE_LOCATION + "]", ex);</span><br><span class="line"> }</span><br><span class="line">}</span><br></pre></td></tr></table></figure>
<p>Digging down layer by layer, we arrive at META-INF/spring.factories. Let's open that file:<br><a href="https://www.zhangbohan.com.cn/2020/08/02/theory/design/using/java/framework/springboot-src-1/spring_factories.png"><img src="https://www.zhangbohan.com.cn/images/spring_factories.png" alt="img"></a></p>
<figure class="highlight plaintext"><table><tr><td class="code"><pre><span class="line"># Initializers </span><br><span class="line">org.springframework.context.ApplicationContextInitializer=\</span><br><span class="line">org.springframework.boot.autoconfigure.SharedMetadataReaderFactoryContextInitializer,\</span><br><span class="line">org.springframework.boot.autoconfigure.logging.ConditionEvaluationReportLoggingListener</span><br><span class="line"></span><br><span class="line"># Application Listeners </span><br><span class="line">org.springframework.context.ApplicationListener=\</span><br><span class="line">org.springframework.boot.autoconfigure.BackgroundPreinitializer</span><br><span class="line"></span><br><span class="line"># Auto Configuration Import Listeners</span><br><span class="line">org.springframework.boot.autoconfigure.AutoConfigurationImportListener=\</span><br><span class="line">org.springframework.boot.autoconfigure.condition.ConditionEvaluationReportAutoConfigurationImportListener</span><br><span class="line"></span><br><span class="line"># Auto Configuration Import Filters</span><br><span class="line">org.springframework.boot.autoconfigure.AutoConfigurationImportFilter=\</span><br><span class="line">org.springframework.boot.autoconfigure.condition.OnBeanCondition,\</span><br><span class="line">org.springframework.boot.autoconfigure.condition.OnClassCondition,\</span><br><span class="line">org.springframework.boot.autoconfigure.condition.OnWebApplicationCondition</span><br><span class="line"></span><br><span class="line"># Auto Configure</span><br><span class="line">org.springframework.boot.autoconfigure.EnableAutoConfiguration=\</span><br><span class="line">org.springframework.boot.autoconfigure.admin.SpringApplicationAdminJmxAutoConfiguration,\</span><br><span class="line">org.springframework.boot.autoconfigure.aop.AopAutoConfiguration,\</span><br><span 
class="line">org.springframework.boot.autoconfigure.amqp.RabbitAutoConfiguration,\</span><br><span class="line">org.springframework.boot.autoconfigure.batch.BatchAutoConfiguration,\</span><br><span class="line">...</span><br></pre></td></tr></table></figure>
<p>Open any of the auto-configuration classes listed there:</p>
<figure class="highlight plaintext"><table><tr><td class="code"><pre><span class="line">// AopAutoConfiguration</span><br><span class="line">@Configuration</span><br><span class="line">@ConditionalOnClass({ EnableAspectJAutoProxy.class, Aspect.class, Advice.class,</span><br><span class="line"> AnnotatedElement.class })</span><br><span class="line">@ConditionalOnProperty(prefix = "spring.aop", name = "auto", havingValue = "true", matchIfMissing = true)</span><br><span class="line">public class AopAutoConfiguration {</span><br><span class="line"></span><br><span class="line"> @Configuration</span><br><span class="line"> @EnableAspectJAutoProxy(proxyTargetClass = false)</span><br><span class="line"> @ConditionalOnProperty(prefix = "spring.aop", name = "proxy-target-class", havingValue = "false", matchIfMissing = false)</span><br><span class="line"> public static class JdkDynamicAutoProxyConfiguration {</span><br><span class="line"></span><br><span class="line"> }</span><br><span class="line"></span><br><span class="line"> @Configuration</span><br><span class="line"> @EnableAspectJAutoProxy(proxyTargetClass = true)</span><br><span class="line"> @ConditionalOnProperty(prefix = "spring.aop", name = "proxy-target-class", havingValue = "true", matchIfMissing = true)</span><br><span class="line"> public static class CglibAutoProxyConfiguration {</span><br><span class="line"></span><br><span class="line"> }</span><br><span class="line"></span><br><span class="line">}</span><br><span class="line">// RabbitAutoConfiguration</span><br><span class="line">@Configuration</span><br><span class="line">@ConditionalOnClass({ RabbitTemplate.class, Channel.class })</span><br><span class="line">@EnableConfigurationProperties(RabbitProperties.class)</span><br><span class="line">@Import(RabbitAnnotationDrivenConfiguration.class)</span><br><span class="line">public class RabbitAutoConfiguration {</span><br><span class="line"> ...</span><br><span class="line">}</span><br><span 
class="line">// RabbitProperties</span><br><span class="line">@ConfigurationProperties(prefix = "spring.rabbitmq")</span><br><span class="line">public class RabbitProperties {</span><br><span class="line"> ...</span><br><span class="line">}</span><br></pre></td></tr></table></figure>
<ul>
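<li><p>@ConditionalOnXxx: conditional annotations. The configuration only takes effect when the condition holds, for example when a class is on the classpath or a property has a given value.</p>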
</li>
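<li><p>@EnableConfigurationProperties together with @ConfigurationProperties: binds external configuration (for example the spring.rabbitmq.* keys) to a properties class automatically.</p>
<h3 id="整理"><a href="#整理" class="headerlink" title="整理"></a>Summary</h3><p><img src="https://www.zhangbohan.com.cn/images/springboot源码阅读.svg" alt="img"></p>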
</li>
</ul>
</li>
</ol>
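<p>The lookup that loadSpringFactories() performs can be modeled outside of Spring. The following is a minimal Python sketch, not Spring's implementation: it assumes a simplified properties format (comments starting with #, backslash line continuations, comma-separated values), and the helper name load_factories and the org.example class names are made up for illustration. It shows how entries found in several spring.factories files end up merged under the same key.</p>

```python
from collections import defaultdict


def load_factories(documents):
    """Merge spring.factories-style texts into {key: [class names]}.

    Hypothetical helper: a simplified model of the merge, not Spring's code.
    """
    result = defaultdict(list)
    for text in documents:
        pending = ""
        for raw in text.splitlines():
            line = raw.strip()
            if not pending and (not line or line.startswith("#")):
                continue  # skip blank lines and comments between entries
            if line.endswith("\\"):
                pending += line[:-1]  # continuation: join with the next line
                continue
            entry, pending = pending + line, ""
            key, _, value = entry.partition("=")
            for name in value.split(","):
                if name.strip():
                    result[key.strip()].append(name.strip())
    return dict(result)


# Two sample "files", as if found in two different jars (made-up class names).
JAR_A = r"""
# Auto Configure
org.example.EnableAutoConfiguration=\
org.example.aop.AopAutoConfiguration,\
org.example.amqp.RabbitAutoConfiguration
"""
JAR_B = "org.example.EnableAutoConfiguration=org.example.batch.BatchAutoConfiguration"

factories = load_factories([JAR_A, JAR_B])
print(factories["org.example.EnableAutoConfiguration"])
```

<p>Every jar on the classpath can contribute its own file, which is how a third-party starter can register auto-configuration classes without the application listing them anywhere.</p>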
]]></content>
<categories>
<category>java</category>
</categories>
</entry>
<entry>
<title>An introduction to web attack techniques and to using Burp Suite</title>
<url>/2023/02/26/web%E6%94%BB%E5%87%BB%E6%96%B9%E5%BC%8F%E7%9B%B8%E5%85%B3%E7%9F%A5%E8%AF%86%E7%AE%80%E4%BB%8B%E4%BB%A5%E5%8F%8ABurpsuite%E5%B7%A5%E5%85%B7%E7%9A%84%E4%BD%BF%E7%94%A8/</url>
<content><![CDATA[<blockquote>
<p><strong>Disclaimer: this article is for study, research, and discussion only. Never use it for anything illegal.</strong></p>
</blockquote>
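<h2 id="常见名词及其解释"><a href="#常见名词及其解释" class="headerlink" title="常见名词及其解释"></a>Common terms explained</h2><ol>
<li><p>Penetration</p>
<p>Penetration testing is a simulated intrusion into a target system, carried out by highly skilled and trustworthy security professionals using the attack techniques that hackers commonly use.</p>
<p>Its purpose is to dig out and expose the weaknesses of a system so that the people who manage it understand the threats it faces.</p>
<p>Penetration testing is often an important part of a risk assessment and supplies it with essential raw reference data.</p>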
</li>
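<li><p>Privilege escalation<br>Privilege escalation is the process of starting from a low-privileged user and using a vulnerability to obtain the highest privileges. Suppose you have obtained a system user but cannot do certain things with it, and you wonder how to get those permissions: raising that user's privileges through a vulnerability is called privilege escalation.</p>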
</li>
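<li><p>0day <a href="https://zhuanlan.zhihu.com/p/30044629">https://zhuanlan.zhihu.com/p/30044629</a><br>In the security community, a 0day usually refers to an exploit for a vulnerability that has no patch yet. Whoever provides the exploit is usually the discoverer of the vulnerability or the first person to publish the details of how to exploit it.</p>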
</li>
</ol>
<h2 id="一些简单的攻击方式相关知识简介"><a href="#一些简单的攻击方式相关知识简介" class="headerlink" title="一些简单的攻击方式相关知识简介"></a>A brief introduction to some simple attack techniques</h2><p><a href="http://image.zhangbohan.com.cn/pic/md/学习方向.jpg"><img src="https://www.zhangbohan.com.cn/images/%E5%AD%A6%E4%B9%A0%E6%96%B9%E5%90%91.jpg" alt="Learning roadmap"></a></p>
<p>This section introduces a few of the most basic web attack techniques.</p>
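<h4 id="XSS注入"><a href="#XSS注入" class="headerlink" title="XSS注入"></a>XSS injection</h4><p><a href="https://www.freebuf.com/vuls/334263.html">https://www.freebuf.com/vuls/334263.html</a></p>
<h4 id="拒接服务攻击"><a href="#拒接服务攻击" class="headerlink" title="拒接服务攻击"></a>Denial-of-service attacks</h4><p>Put simply, any behavior that prevents legitimate users from reaching a normal network service counts as a denial-of-service attack. The goal is explicit: to block legitimate users' access to network resources and thereby serve the attacker's own ends. DoS and DDoS are both denial-of-service attacks, but they differ. A DDoS attack relies on many "zombie hosts" (machines the attacker has compromised or can use indirectly) to send the victim large volumes of seemingly legitimate packets, congesting the network or exhausting server resources until service is denied. Once a distributed attack is launched, attack packets pour toward the victim like a flood and drown out the packets of legitimate users, who can then no longer reach the server. For this reason such attacks are also called "flood attacks".</p>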
<h5 id="DOS攻击、DDOS攻击、CC攻击"><a href="#DOS攻击、DDOS攻击、CC攻击" class="headerlink" title="DOS攻击、DDOS攻击、CC攻击"></a>DoS, DDoS, and CC attacks</h5><p><a href="https://www.cloudflare.com/zh-cn/learning/ddos/what-is-a-ddos-attack/">https://www.cloudflare.com/zh-cn/learning/ddos/what-is-a-ddos-attack/</a></p>
<h5 id="zip炸弹"><a href="#zip炸弹" class="headerlink" title="zip炸弹"></a>Zip bombs</h5><p><a href="https://www.cntonan.com/pc/computer/show-8-854-1.html">https://www.cntonan.com/pc/computer/show-8-854-1.html</a></p>
<h5 id="ARP协议欺诈"><a href="#ARP协议欺诈" class="headerlink" title="ARP协议欺诈"></a>ARP spoofing</h5><p><a href="https://blog.csdn.net/weixin_46560512/article/details/123325604">https://blog.csdn.net/weixin_46560512/article/details/123325604</a></p>
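<h4 id="SQL注入"><a href="#SQL注入" class="headerlink" title="SQL注入"></a>SQL injection</h4><p><a href="https://blog.csdn.net/qq_44159028/article/details/114325805">https://blog.csdn.net/qq_44159028/article/details/114325805</a><br>As B/S (Browser/Server) application development has grown, so has the number of programmers writing such applications. Because the barrier to entry is low and skill and experience vary widely, a sizable share of programmers do not validate user input, which leaves the application vulnerable. A user can submit a fragment of a database query and, from the results the program returns, obtain data they want to know. This is SQL injection. For example, if the page scripts on the target server (ASP or PHP) do not check strictly enough the data they exchange with the database, an attacker can craft special input to expose database contents, including user information, administrator passwords, and other sensitive data, and use it to break into the site on the remote server.</p>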
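<p>The mechanism described above can be reproduced in a few lines. This is a minimal sketch using Python's built-in sqlite3 module with an in-memory database; the users table, the sample accounts, and the function names are made up for illustration. It contrasts a query built by string concatenation with a parameterized query.</p>

```python
import sqlite3


def make_db():
    """Create a throwaway in-memory database with two sample accounts."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, password TEXT)")
    conn.executemany(
        "INSERT INTO users VALUES (?, ?)",
        [("admin", "s3cret"), ("alice", "pw")],
    )
    return conn


def login_unsafe(conn, name, password):
    # Vulnerable: user input becomes part of the SQL text itself.
    query = (
        "SELECT name FROM users WHERE name = '%s' AND password = '%s'"
        % (name, password)
    )
    return conn.execute(query).fetchall()


def login_safe(conn, name, password):
    # Parameterized: the driver passes the values separately from the SQL.
    query = "SELECT name FROM users WHERE name = ? AND password = ?"
    return conn.execute(query, (name, password)).fetchall()


conn = make_db()
payload = "admin' --"  # closes the string literal, then comments out the rest
print(login_unsafe(conn, payload, "wrong"))  # the password check is bypassed
print(login_safe(conn, payload, "wrong"))    # no row matches
```

<p>With the concatenated query, the payload turns the password condition into a comment, so the admin row comes back without the password. With placeholders, the same input is treated as a plain (and nonexistent) user name.</p>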
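<h4 id="越权"><a href="#越权" class="headerlink" title="越权"></a>Broken access control</h4><p> <a href="https://www.freebuf.com/vuls/313396.html">https://www.freebuf.com/vuls/313396.html</a></p>
<p><strong>Horizontal privilege escalation</strong>: the attacker tries to access resources that belong to another user with the same level of privilege.</p>
<p><strong>Vertical privilege escalation</strong>: the attacker uses a low-privileged account to reach functions reserved for high-privileged accounts.</p>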
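<p>Both cases come down to a missing server-side check. The sketch below is a hypothetical illustration, not taken from any framework: the User and Order types, the role names, and both functions are assumptions. The first check guards against horizontal access (someone else's resource), the second against vertical access (a function above one's role).</p>

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class User:
    user_id: int
    role: str  # "user" or "admin" in this sketch


@dataclass(frozen=True)
class Order:
    order_id: int
    owner_id: int


def can_view_order(user, order):
    # Horizontal check: a regular user may only see their own orders.
    return user.role == "admin" or order.owner_id == user.user_id


def can_delete_user(user):
    # Vertical check: only administrators may call this function at all.
    return user.role == "admin"


alice = User(user_id=1, role="user")
bob = User(user_id=2, role="user")
root = User(user_id=99, role="admin")
bobs_order = Order(order_id=1001, owner_id=2)

print(can_view_order(alice, bobs_order))  # alice is not the owner
print(can_delete_user(alice))             # alice is not an admin
```

<p>The point of the sketch is that the decision uses the authenticated user on the server, never an identifier supplied in the request, such as an order id or user id that the client can change.</p>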
<h4 id="CSRF攻击"><a href="#CSRF攻击" class="headerlink" title="CSRF攻击"></a>CSRF attacks</h4><p><a href="https://baijiahao.baidu.com/s?id=1727601372183126511&wfr=spider&for=pc">https://baijiahao.baidu.com/s?id=1727601372183126511&wfr=spider&for=pc</a></p>
<h3 id="Burpsuite工具使用"><a href="#Burpsuite工具使用" class="headerlink" title="Burpsuite工具使用"></a>Using Burp Suite</h3><p><a href="https://www.jianshu.com/p/ee2ce74cb5e5">https://www.jianshu.com/p/ee2ce74cb5e5</a></p>
]]></content>
</entry>
<entry>
<title>下馆子记录</title>
<url>/2023/06/17/%E4%B8%8B%E9%A6%86%E5%AD%90%E8%AE%B0%E5%BD%95/</url>
<content><![CDATA[<ol>
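<li>Near the office. No reviews here; having somewhere to eat is good enough:<br>Large private rooms: 龙船人鱼头泡饼·江湖菜(温泉店), 潇湘府(环保店), 潇湘楼(温泉店)<br>三结义烧鸽子·烧烤融合菜(温泉路店)<br>城都成串串香(创客小镇店)</li>
<li>A little farther away:<br>小吊梨汤(融科店):<br>The setting is really nice. It is in the 科资讯大厦 building, and the same floor has a bookstore and a musical fountain, all very well designed. You can read while you queue for a table. See the photo: <center>
<img src="https://www.zhangbohan.com.cn/images/eating/rongke-bookshop.jpg" alt="Rongke bookstore" />
</center></li>
<li>Into the city:<br> 丰泽园饭庄(王府商厦店):<br> The food is excellent. As long as you skip the extravagant dishes (braised sea cucumber with scallion and the like), the price is acceptable, not much more than an ordinary meal out.<br> 北平楼(牡丹园店):<br> Not as good as 丰泽园, but the roast duck is decent.</li>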
</ol>
]]></content>
<categories>
<category>吃喝玩乐</category>
</categories>
<tags>
<tag>吃喝玩乐</tag>
</tags>
</entry>
<entry>
<title>Domain-Driven Design: crunching knowledge</title>
<url>/2023/02/26/%E9%A2%86%E5%9F%9F%E9%A9%B1%E5%8A%A8%E8%AE%BE%E8%AE%A1-%E6%B6%88%E5%8C%96%E7%9F%A5%E8%AF%86/</url>
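<content><![CDATA[<h2 id="运用领域模型"><a href="#运用领域模型" class="headerlink" title="运用领域模型"></a>Putting the domain model to work</h2><p>The roles a model plays in domain-driven design:</p>
<ol>
<li>The model and the heart of the design shape each other.</li>
<li>The model is the backbone of the common language used by all team members.</li>
<li>The model is distilled knowledge.</li>
</ol>
<h3 id="消化知识"><a href="#消化知识" class="headerlink" title="消化知识"></a>Crunching knowledge</h3><h4 id="有效建模的要素"><a href="#有效建模的要素" class="headerlink" title="有效建模的要素"></a>Ingredients of effective modeling</h4><ol>
<li>Bind the model and the implementation.</li>
<li>Cultivate a language based on the model, so that everyone involved understands what the others mean.</li>
<li>Develop a knowledge-rich model: objects have behavior and enforce rules, and the model captures knowledge of various kinds.</li>
<li>Distill the model: as it matures, add important new concepts and remove those that are no longer used or do not matter.</li>
<li>Brainstorm and experiment: knowledge crunching turns the team's knowledge into a valuable model.</li>
</ol>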
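<h4 id="知识消化"><a href="#知识消化" class="headerlink" title="知识消化"></a>Knowledge crunching</h4><p>Typically a team of developers and domain experts works together, led by the developers.<br>Knowledge crunchers: effective domain modelers.<br>Raw material: the knowledge in the domain experts' heads.</p>
<p>Constant refinement of the domain model forces the developers to learn the important principles of the business instead of producing functions mechanically. The domain experts are forced to distill the important things they already know, and they gradually come to understand the conceptual rigor that a software project requires.</p>
<p>As the model improves, it also becomes a tool for organizing the information that flows through the project. The model focuses requirements analysis. It is never perfect; it keeps evolving.</p>
<h4 id="持续学习"><a href="#持续学习" class="headerlink" title="持续学习"></a>Continuous learning</h4><ol>
<li>Project knowledge is fragmented, scattered among people and documents, and it may be mixed with useless information.</li>
<li>Projects leak knowledge. People who have learned something may move on to other work; a reorganization may scatter the team; an outsourced subsystem may come back as code only… Whenever, for whatever reason, knowledge is not passed on by word of mouth, it is lost.</li>
<li>Highly productive teams grow their knowledge deliberately and keep learning.</li>
</ol>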
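<h4 id="知识丰富的设计"><a href="#知识丰富的设计" class="headerlink" title="知识丰富的设计"></a>Knowledge-rich design</h4><p>The model produced by knowledge crunching reflects a deep understanding of the domain. When the model changes, the developers refactor the implementation to express the change, and in this way the new knowledge is folded into the application.</p>
<p>Only through knowledge crunching in close collaboration with software experts are the rules clarified and fleshed out, the contradictions between them reconciled, and the useless ones removed.</p>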
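<p>As an illustration of a rule made explicit in the model, here is a small Python sketch. The shipping scenario, the class names OverbookingPolicy and Voyage, and the 110% limit are all assumptions chosen for the example; the notes above do not prescribe them. The idea is that the business rule gets its own named object rather than being buried in an if statement inside the booking code.</p>

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class OverbookingPolicy:
    """A business rule with a name: how far a voyage may be overbooked."""
    allowed_percent: int = 110  # hypothetical limit: 110% of capacity

    def is_allowed(self, booked, requested, capacity):
        # Integer arithmetic avoids floating-point surprises at the limit.
        return (booked + requested) * 100 <= capacity * self.allowed_percent


@dataclass
class Voyage:
    capacity: int
    booked: int = 0
    policy: OverbookingPolicy = field(default_factory=OverbookingPolicy)

    def book(self, size):
        """Accept the booking only if the policy allows it."""
        if not self.policy.is_allowed(self.booked, size, self.capacity):
            return False
        self.booked += size
        return True


voyage = Voyage(capacity=100)
print(voyage.book(100), voyage.book(10), voyage.book(1))
```

<p>When a domain expert changes the rule, only the policy object changes, and the code keeps using the same vocabulary as the conversation with the expert.</p>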
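<h4 id="深层模型"><a href="#深层模型" class="headerlink" title="深层模型"></a>Deep models</h4><p>As understanding of the domain and of the application's requirements deepens, the superficial model elements that seemed important at first are often discarded or seen from a different angle. Useful models seldom lie on the surface.</p>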
]]></content>
<tags>
<tag>读书笔记</tag>
</tags>
</entry>
<entry>
<title>Git命令</title>
<url>/2018/08/25/using/Git/</url>
<content><![CDATA[<h4 id="查看、添加、提交、删除、找回,重置修改文件"><a href="#查看、添加、提交、删除、找回,重置修改文件" class="headerlink" title="查看、添加、提交、删除、找回,重置修改文件"></a>查看、添加、提交、删除、找回,重置修改文件</h4><p>git help \<command> # 显示command的help<br>git show # 显示某次提交的内容 git show $id<br>git co — \<file> # 抛弃工作区修改<br>git co . # 抛弃工作区修改<br>git add \<file> # 将工作文件修改提交到本地暂存区<br>git add . # 将所有修改过的工作文件提交暂存区<br>git rm \<file> # 从版本库中删除文件<br>git rm \<file> —cached # 从版本库中删除文件,但不删除文件<br>git reset \<file> # 从暂存区恢复到工作文件<br>git reset — . # 从暂存区恢复到工作文件<br>git reset —hard # 恢复最近一次提交过的状态,即放弃上次提交后的所有本次修改<br>git ci \<file> git ci . git ci -a # 将git add, git rm和git ci等操作都合并在一起做<br>git ci -am “some comments”<br>git ci —amend # 修改最后一次提交记录<br>git revert \<$id> # 恢复某次提交的状态,恢复动作本身也创建次提交对象<br>git revert HEAD # 恢复最后一次提交的状态</p>
<h4 id="查看文件diff"><a href="#查看文件diff" class="headerlink" title="查看文件diff"></a>查看文件diff</h4><p>git diff \<file> # 比较当前文件和暂存区文件差异 git diff<br>git diff \<id1>\<id1>\<id2> # 比较两次提交之间的差异<br>git diff \<branch1>..\<branch2> # 在两个分支之间比较<br>git diff —staged # 比较暂存区和版本库差异<br>git diff —cached # 比较暂存区和版本库差异<br>git diff —stat # 仅仅比较统计信息 </p>
<h4 id="查看提交记录"><a href="#查看提交记录" class="headerlink" title="查看提交记录"></a>查看提交记录</h4><p>git log git log \<file> # 查看该文件每次提交记录<br>git log -p \<file> # 查看每次详细修改内容的diff<br>git log -p -2 # 查看最近两次详细修改内容的diff<br>git log —stat #查看提交统计信息<br>tig Mac上可以使用tig代替diff和log,brew install tig </p>
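<h4 id="Git-本地分支管理-查看、切换、创建和删除分支"><a href="#Git-本地分支管理-查看、切换、创建和删除分支" class="headerlink" title="Git 本地分支管理 查看、切换、创建和删除分支"></a>Local branch management: listing, switching, creating, and deleting branches</h4><p>git br -r # list remote-tracking branches<br>git br &lt;new_branch&gt; # create a new branch<br>git br -v # show the last commit on each branch<br>git br --merged # list branches already merged into the current branch<br>git br --no-merged # list branches not yet merged into the current branch<br>git co &lt;branch&gt; # switch to a branch<br>git co -b &lt;new_branch&gt; # create a new branch and switch to it<br>git co -b &lt;new_branch&gt; &lt;branch&gt; # create new_branch starting from branch<br>git co $id # check out a past commit in detached HEAD state; commits made there belong to no branch and become unreachable once you switch away<br>git co $id -b &lt;new_branch&gt; # check out a past commit and create a branch from it<br>git br -d &lt;branch&gt; # delete a branch<br>git br -D &lt;branch&gt; # force-delete a branch (required when the branch has not been merged) </p>
<h4 id="分支合并和rebase"><a href="#分支合并和rebase" class="headerlink" title="分支合并和rebase"></a>Merging and rebasing branches</h4><p>git merge &lt;branch&gt; # merge branch into the current branch<br>git merge origin/master --no-ff # do not fast-forward, so that a merge commit is always created<br>git rebase master &lt;branch&gt; # rebase branch onto master; equivalent to git co &lt;branch&gt; &amp;&amp; git rebase master. To fast-forward master afterwards, run git co master &amp;&amp; git merge &lt;branch&gt;</p>
<h4 id="Git补丁管理-方便在多台机器上开发同步时用"><a href="#Git补丁管理-方便在多台机器上开发同步时用" class="headerlink" title="Git补丁管理(方便在多台机器上开发同步时用)"></a>Patch management (handy for syncing work across several machines)</h4><p>git diff &gt; ../sync.patch # create a patch<br>git apply ../sync.patch # apply the patch<br>git apply --check ../sync.patch # test whether the patch would apply cleanly </p>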
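<h4 id="Git暂存管理"><a href="#Git暂存管理" class="headerlink" title="Git暂存管理"></a>Stash management</h4><p>git stash # stash the current changes<br>git stash list # list all stash entries<br>git stash apply # restore the stashed changes (the stash entry is kept)<br>git stash drop # delete the most recent stash entry </p>
<h4 id="Git远程分支管理"><a href="#Git远程分支管理" class="headerlink" title="Git远程分支管理"></a>Remote branch management</h4><p>git pull # fetch from the remote and merge the upstream branch into the current branch<br>git pull --no-ff # the same, but never fast-forward<br>git fetch origin # fetch updates from the remote<br>git merge origin/master # merge the remote master branch into the current local branch<br>git co --track origin/branch # create a local branch that tracks a remote branch<br>git co -b &lt;local_branch&gt; origin/&lt;remote_branch&gt; # create a local branch from a remote branch, same effect as above<br>git push # push; which branches are pushed depends on push.default (since Git 2.0 the default, simple, pushes only the current branch)<br>git push origin master # push the local master branch to the remote master branch<br>git push -u origin master # push local master to the remote and set it as upstream (creates the remote branch if missing; used to initialize a remote repository)<br>git push origin &lt;local_branch&gt; # create a remote branch; origin is the remote name<br>git push origin &lt;local_branch&gt;:&lt;remote_branch&gt; # create a remote branch under a different name<br>git push origin :&lt;remote_branch&gt; # delete a remote branch (delete the local one first with git br -d &lt;branch&gt; if it should go too) </p>
<h4 id="Git远程仓库管理"><a href="#Git远程仓库管理" class="headerlink" title="Git远程仓库管理"></a>Remote repository management</h4><p>GitHub<br>git remote -v # show remote names and URLs<br>git remote show origin # show the state of the remote repository<br>git remote add origin git@github.com:robbin/robbin_site.git # add a remote<br>git remote set-url origin git@github.com:robbin/robbin_site.git # change the URL of an existing remote<br>git remote rm &lt;repository&gt; # remove a remote </p>
<h4 id="创建远程仓库"><a href="#创建远程仓库" class="headerlink" title="创建远程仓库"></a>Creating a remote repository</h4><p>git clone --bare robbin_site robbin_site.git # create a bare repository from a project that has history<br>scp -r my_project.git git@git.csdn.net:~ # upload the bare repository to the server<br>mkdir robbin_site.git &amp;&amp; cd robbin_site.git &amp;&amp; git --bare init # create a bare repository on the server<br>git remote add origin git@github.com:robbin/robbin_site.git # set the remote URL<br>git push -u origin master # first push from the client<br>git push -u origin develop # first push of the local develop branch to the remote develop branch, with tracking<br>git remote set-head origin master # point origin/HEAD at the master branch<br>Tracking between local and remote branches can also be set explicitly (the old --set-upstream option is no longer supported; use --set-upstream-to):<br>git branch --set-upstream-to=origin/master master<br>git branch --set-upstream-to=origin/develop develop </p>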
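<h4 id="reflog"><a href="#reflog" class="headerlink" title="reflog"></a>reflog</h4><p>git reflog is the command for managing the reflog, the mechanism Git uses to record how references change, such as changes to a branch or to HEAD.<br>When no reference is given, git reflog lists the reflog of HEAD.<br>HEAD@{0} is the current value of HEAD, and HEAD@{3} is the value HEAD had three changes ago.<br>Git records the changes in the reflog file of HEAD, at .git/logs/HEAD; the reflog files of branches live in subdirectories of .git/logs/refs. </p>
<h4 id="特殊符号"><a href="#特殊符号" class="headerlink" title="特殊符号"></a>Special symbols</h4><p>^ denotes the parent commit. When a commit has several parents, a number after ^ selects which one: ^ is the same as ^1.<br>~&lt;n&gt; is the same as &lt;n&gt; consecutive ^. </p>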
]]></content>
<categories>
<category>git</category>
</categories>
<tags>
<tag>git</tag>
<tag>tools</tag>
</tags>
</entry>
<entry>
<title>正则表达式应用(持续更新)</title>
<url>/2018/08/28/using/regex-using/</url>
<content><![CDATA[<h4 id="编程过程中使用的正则记录"><a href="#编程过程中使用的正则记录" class="headerlink" title="编程过程中使用的正则记录"></a>编程过程中使用的正则记录</h4><figure class="highlight text"><table><tr><td class="code"><pre><span class="line">b*[^:b#/]+.*$ 统计代码行数(不包括以# / 开头的 亦不包括空行)</span><br><span class="line">\d(9|[0-7])\d{4} 中国邮政编码</span><br><span class="line">^((13[0-9])|(15[^4,\\D])|(18[0,5-9]))\\d{8}$ 手机号</span><br><span class="line">^([a-z0-9A-Z]+[-|\\.]?)+[a-z0-9A-Z]@([a-z0-9A-Z]+(-[a-z0-9A-Z]+)?\\.)+[a-zA-Z]{2,}$ 邮箱</span><br><span class="line">^[\u4e00-\u9fa5]{0,}$ 汉字</span><br><span class="line">http(s)?://([\\w-]+\\.)+[\\w-]+(/[\\w- ./?%&=])? url</span><br><span class="line">(((\d{1,2})|(1\d{2})|(2[0-4]\d)|(25[0-5]))\.){3}((\d{1,2})|(1\d{2})|(2[0-4]\d)|(25[0-5])) ipv4</span><br><span class="line"><!-{2,}.*-{2,}> html注释</span><br><span class="line">[1-8]\d{5}((18)|(19)|(20))?\d{2}[0-1]\d[0-3]\d{4}[\dX]? 中华人民共和国身份证号码</span><br><span class="line">^[1-9]\d{3}-(0[1-9]|1[0-2])-(0[1-9]|[1-2][0-9]|3[0-1])$ 日期</span><br><span class="line">^(20|21|22|23|[0-1]\d):[0-5]\d:[0-5]\d$ 时间</span><br></pre></td></tr></table></figure>
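<p>Patterns like these are easiest to trust once they have been run against a few inputs. The following is a small sketch using Python's re module with the IPv4, date, and time patterns from the list above; the helper name is_valid is made up for illustration. Note that the date pattern only checks the shape of the string, so an impossible date such as February 31 still passes.</p>

```python
import re

# Patterns copied from the list above.
IPV4 = re.compile(
    r"(((\d{1,2})|(1\d{2})|(2[0-4]\d)|(25[0-5]))\.){3}"
    r"((\d{1,2})|(1\d{2})|(2[0-4]\d)|(25[0-5]))"
)
DATE = re.compile(r"^[1-9]\d{3}-(0[1-9]|1[0-2])-(0[1-9]|[1-2][0-9]|3[0-1])$")
TIME = re.compile(r"^(20|21|22|23|[0-1]\d):[0-5]\d:[0-5]\d$")


def is_valid(pattern, text):
    """Return True only if the whole string matches the pattern."""
    return pattern.fullmatch(text) is not None


for pattern, sample in [
    (IPV4, "192.168.1.1"),
    (IPV4, "256.1.1.1"),
    (DATE, "2023-02-26"),
    (DATE, "2023-13-01"),
    (TIME, "23:59:59"),
    (TIME, "24:00:00"),
]:
    print(sample, is_valid(pattern, sample))
```

<p>fullmatch is used because the IPv4 pattern has no ^ and $ anchors of its own; with search it would also accept a longer string that merely contains an address.</p>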
]]></content>
<categories>
<category>regular</category>
</categories>
<tags>
<tag>tools</tag>
<tag>regular</tag>
<tag>regex</tag>
</tags>
</entry>
<entry>
<title>Regular Expression 正则表达式</title>
<url>/2018/08/25/theory/regex/</url>
<content><![CDATA[<p>Regular Expression(正则表达式) 简称Regex<br>在Javascript中<br> g(global)表示全局 i表示不区分大小写<br> .可以匹配任何一个单位的字符<br> []用于定义字符集合 ^取非</p>
<h4 id="元字符"><a href="#元字符" class="headerlink" title="元字符"></a>元字符</h4><div class="table-container">
<table>
<thead>
<tr>
<th>Metacharacter</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>[\b]</td>
<td>Backspace</td>
</tr>
<tr>
<td>\n</td>
<td>Line feed</td>
</tr>
<tr>
<td>\f</td>
<td>Form feed</td>
</tr>
<tr>
<td>\r</td>
<td>Carriage return</td>
</tr>
<tr>
<td>\t</td>
<td>Tab</td>
</tr>
<tr>
<td>\v</td>
<td>Vertical tab</td>
</tr>
<tr>
<td>\r\n</td>
<td>Carriage return + line feed; the line ending on some operating systems (notably Windows), whereas Unix and Linux use \n</td>
</tr>
<tr>
<td>\d</td>
<td>Any digit: [0-9]</td>
</tr>
<tr>
<td>\D</td>
<td>Any non-digit: [^0-9]</td>
</tr>
<tr>
<td>\w</td>
<td>[a-zA-Z0-9_]</td>
</tr>
<tr>
<td>\W</td>
<td>[^a-zA-Z0-9_]</td>
</tr>
<tr>
<td>\s</td>
<td>Any whitespace character: [\f\n\r\t\v] (most engines also include the space character)</td>
</tr>
<tr>
<td>\S</td>
<td>Any non-whitespace character: [^\f\n\r\t\v]</td>
</tr>
<tr>
<td>\xXX</td>
<td>XX is a hexadecimal number</td>
</tr>
<tr>
<td>\0XX</td>
<td>XX is an octal number</td>
</tr>
</tbody>
</table>
</div>
<h4 id="POSIX字符类-javascript不支持"><a href="#POSIX字符类-javascript不支持" class="headerlink" title="POSIX字符类 javascript不支持"></a>POSIX character classes (not supported in JavaScript)</h4><div class="table-container">
<table>
<thead>
<tr>
<th>POSIX class</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>[:alnum:]</td>
<td>[a-zA-Z0-9], that is, \w without the underscore</td>
</tr>
<tr>
<td>[:upper:]</td>
<td>[A-Z]</td>
</tr>
<tr>
<td>[:alpha:]</td>
<td>[a-zA-Z]</td>
</tr>
<tr>
<td>[:blank:]</td>
<td>[ \t] (note: this includes the space)</td>
</tr>
<tr>
<td>[:xdigit:]</td>
<td>Any hexadecimal digit: [a-fA-F0-9]</td>
</tr>
<tr>
<td>[:cntrl:]</td>
<td>ASCII control characters: 0-31 plus 127</td>
</tr>
<tr>
<td>[:digit:]</td>
<td>\d</td>
</tr>
<tr>
<td>[:graph:]</td>
<td>Same as [:print:] but without the space</td>
</tr>
<tr>
<td>[:lower:]</td>
<td>[a-z]</td>
</tr>
<tr>
<td>[:print:]</td>
<td>Any printable character</td>
</tr>
<tr>
<td>[:punct:]</td>
<td>Any [:graph:] character that is not in [:alnum:]</td>
</tr>
<tr>
<td>[:space:]</td>
<td>[\f\r\n\t\v ] (note: this includes the space)</td>
</tr>
</tbody>
</table>
</div>
<div class="table-container">
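<table>
<thead>
<tr>
<th>Symbol</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>+</td>
<td>One or more repetitions</td>
</tr>
<tr>
<td>+?</td>
<td>Lazy version of +</td>
</tr>
<tr>
<td>*</td>
<td>Zero or more repetitions</td>
</tr>
<tr>
<td>*?</td>
<td>Lazy version of *</td>
</tr>
<tr>
<td>?</td>
<td>Zero or one occurrence</td>
</tr>
<tr>
<td>{n}</td>
<td>Exactly n repetitions</td>
</tr>
<tr>
<td>{m,n}</td>
<td>At least m and at most n repetitions</td>
</tr>
<tr>
<td>{m,}</td>
<td>At least m repetitions</td>
</tr>
<tr>
<td>{m,}?</td>
<td>Lazy version of {m,}</td>
</tr>
<tr>
<td>\b</td>
<td>The start or end of a word (b for boundary)</td>
</tr>
<tr>
<td>\B</td>
<td>Any position that is not a word boundary</td>
</tr>
<tr>
<td>^</td>
<td>Start of the string</td>
</tr>
<tr>
<td>$</td>
<td>End of the string</td>
</tr>
<tr>
<td>(?m)</td>
<td>Placed at the start of the pattern to enable multiline mode; note that some languages do not support it</td>
</tr>
<tr>
<td>(xx)</td>
<td>Subexpression, treated as a single unit</td>
</tr>
</tbody>
</table>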
</div>
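<p>The difference between greedy and lazy quantifiers, and the effect of \b, is easiest to see on a concrete string. This is a short sketch using Python's re module; the sample strings are made up for illustration.</p>

```python
import re

text = 'say "one" and "two"'

# Greedy: .* runs to the last quote it can reach.
greedy = re.findall(r'".*"', text)

# Lazy: .*? stops at the first quote that lets the match succeed.
lazy = re.findall(r'".*?"', text)

# \b only matches at word boundaries, so "concat" and "cats" are skipped.
words = re.findall(r"\bcat\b", "cat concat cats cat.")

# {m,n} bounds the repetition count; fullmatch requires the whole string.
two_to_four = re.fullmatch(r"\d{2,4}", "12345")

print(greedy)
print(lazy)
print(words)
print(two_to_four)
```

<p>The greedy pattern returns a single match that spans both quoted words, while the lazy pattern returns each quoted word separately.</p>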
<h4 id="回溯引用-backreference"><a href="#回溯引用-backreference" class="headerlink" title="回溯引用 backreference"></a>Backreferences</h4><p>In replacement strings, JavaScript uses $ instead of \ (for example, $1 rather than \1)</p>
<div class="table-container">
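<table>
<thead>
<tr>
<th>Symbol</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>\1, \2, …, \n</td>
<td>The 1st subexpression, the 2nd subexpression, …, the nth subexpression</td>
</tr>
<tr>
<td>\0</td>
<td>The entire regular expression</td>
</tr>
<tr>
<td>\E</td>
<td>Ends a \L or \U conversion</td>
</tr>
<tr>
<td>\l</td>
<td>Converts the next character to lowercase</td>
</tr>
<tr>
<td>\L</td>
<td>Converts everything between \L and \E to lowercase</td>
</tr>
<tr>
<td>\u</td>
<td>Converts the next character to uppercase</td>
</tr>
<tr>
<td>\U</td>
<td>Converts everything between \U and \E to uppercase</td>
</tr>
</tbody>
</table>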
</div>
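<p>Backreferences can be tried out with Python's re module, as in the sketch below. The function names are made up for illustration. Python's re does not implement the \u, \U, \l, and \L case conversions in replacement strings, so the second function uses a replacement callable instead, which is a substitute technique and not the syntax from the table.</p>

```python
import re


def collapse_repeats(text):
    """Replace a run of the same word ("is is is") with a single copy."""
    # \1 inside the pattern must match exactly what group 1 captured.
    return re.sub(r"\b(\w+)(\s+\1\b)+", r"\1", text)


def capitalize_words(text):
    """Uppercase the first letter of every word via a replacement function."""
    return re.sub(r"\b(\w)", lambda m: m.group(1).upper(), text)


print(collapse_repeats("this is is a a test"))
print(capitalize_words("hello big world"))
```

<p>In the replacement string of collapse_repeats, \1 refers to the text captured by the first group, which is the same role that $1 plays in JavaScript.</p>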
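<p>Notes:<br>1. Java, Perl, PHP, and .NET support lookbehind<br>2. ColdFusion does not support lookbehind, and neither did JavaScript before ES2018<br>Lookahead: (?=x) requires x to follow but does not consume it, so the result does not include x<br> Example: .+(?=:) applied to https: matches https<br>Lookbehind: (?&lt;=x) requires x to precede but does not consume it, so the result does not include x<br> Example: (?&lt;=\$)\d+ applied to $400 matches 400<br>Note: a lookahead can be of any length (it may use .+), while in most engines a lookbehind must have a fixed length<br> (?=) positive lookahead (?&lt;=) positive lookbehind<br> (?!) negative lookahead (?&lt;!) negative lookbehind </p>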
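<p>The lookaround rules above can be checked with Python's re module, which supports lookahead and fixed-width lookbehind. The sample strings in this sketch are made up for illustration.</p>

```python
import re

# Lookahead: match everything up to, but not including, the colon.
scheme = re.search(r".+(?=:)", "https://example.com").group()

# Lookbehind: match the digits that follow a dollar sign, without the sign.
amount = re.search(r"(?<=\$)\d+", "price: $400").group()

# Negative lookahead: whole numbers that are not followed by a percent sign.
not_percent = re.findall(r"\b\d+\b(?!%)", "30% of 200 items")

# Python's re rejects a lookbehind whose length is not fixed.
try:
    re.compile(r"(?<=\$+)\d+")
    variable_width_ok = True
except re.error:
    variable_width_ok = False

print(scheme, amount, not_percent, variable_width_ok)
```

<p>The last check shows the fixed-length restriction in practice: the pattern with \$+ inside the lookbehind does not even compile in Python's re.</p>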
<p>MySQL and Java 1.4 <font color="red">do not support</font> conditionals.<br>(?(backreference)true-regex)<br>(?(backreference)true-regex|false-regex)<br>If the referenced group took part in the match, true-regex is applied; otherwise false-regex is applied.</p>
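<p>As a sketch of the conditional syntax, Python's re module accepts the (?(1)yes|no) form. The phone-number format below is an assumption chosen for illustration: if an opening parenthesis was captured, a closing one is required, otherwise a hyphen is required.</p>

```python
import re

# Group 1 optionally captures "(".
# (?(1)\)|-) means: if group 1 matched, require ")", otherwise require "-".
pattern = re.compile(r"^(\()?\d{3}(?(1)\)|-)\d{3}-\d{4}$")

samples = ["(123)456-7890", "123-456-7890", "(123-456-7890", "123)456-7890"]
results = {s: bool(pattern.match(s)) for s in samples}

for sample, ok in results.items():
    print(sample, ok)
```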
]]></content>
<categories>
<category>regular</category>
</categories>
<tags>
<tag>tools</tag>
<tag>regular</tag>
<tag>regex</tag>
<tag>code</tag>
</tags>
</entry>
<entry>
<title>Data Mining--Web Crawlers--Scraping Ajax Data</title>
<url>/2019/02/22/machine%20learning/clawler/clawler-5/</url>
<content><![CDATA[<h3 id="Ajax"><a href="#Ajax" class="headerlink" title="Ajax"></a>Ajax</h3><p>Ajax, short for Asynchronous JavaScript and XML, is a technique for exchanging data with the server and updating part of a web page without reloading the page or changing its URL.</p>
<h3 id="爬取今日头条街拍"><a href="#爬取今日头条街拍" class="headerlink" title="爬取今日头条街拍"></a>Scraping street-snap (街拍) images from Toutiao</h3><figure class="highlight python"><table><tr><td class="code"><pre><span class="line"><span class="comment">#!/usr/bin/env python3</span></span><br><span class="line"><span class="comment"># -*- coding: utf-8 -*-</span></span><br><span class="line"><span class="string">""" </span></span><br><span class="line"><span class="string">@author zhangbohan.dell@gmail.com</span></span><br><span class="line"><span class="string">@function:</span></span><br><span class="line"><span class="string">@create 2019/2/25 11:12</span></span><br><span class="line"><span class="string">"""</span></span><br><span class="line"><span class="keyword">import</span> os</span><br><span class="line"><span class="keyword">from</span> hashlib <span class="keyword">import</span> md5</span><br><span class="line"><span class="keyword">from</span> multiprocessing.pool <span class="keyword">import</span> Pool</span><br><span class="line"></span><br><span class="line"><span class="keyword">import</span> requests</span><br><span class="line"><span class="keyword">from</span> urllib.parse <span class="keyword">import</span> urlencode</span><br><span class="line"></span><br><span class="line"></span><br><span class="line"><span class="keyword">def</span> <span class="title function_">get_page</span>(<span class="params">offset</span>):</span><br><span class="line">    <span class="comment"># Query string of the Ajax search endpoint; offset selects the page</span></span><br><span class="line">    params = {</span><br><span class="line">        <span class="string">'aid'</span>: <span class="string">'24'</span>,</span><br><span class="line">        <span class="string">'app_name'</span>: <span class="string">'web_search'</span>,</span><br><span class="line">        <span class="string">'offset'</span>: offset,</span><br><span class="line">        <span class="string">'format'</span>: <span class="string">'json'</span>,</span><br><span class="line">        <span 
class="string">'keyword'</span>: <span class="string">'街拍'</span>,</span><br><span class="line">        <span class="string">'autoload'</span>: <span class="string">'true'</span>,</span><br><span class="line">        <span class="string">'count'</span>: <span class="string">'20'</span>,</span><br><span class="line">        <span class="string">'en_qc'</span>: <span class="string">'1'</span>,</span><br><span class="line">        <span class="string">'cur_tab'</span>: <span class="string">'1'</span>,</span><br><span class="line">        <span class="string">'from'</span>: <span class="string">'search_tab'</span>,</span><br><span class="line">        <span class="string">'pd'</span>: <span class="string">'synthesis'</span></span><br><span class="line">    }</span><br><span class="line">    url = <span class="string">'https://www.toutiao.com/api/search/content/?'</span> + urlencode(params)</span><br><span class="line">    <span class="keyword">try</span>:</span><br><span class="line">        response = requests.get(url)</span><br><span class="line">        <span class="keyword">if</span> response.status_code == <span class="number">200</span>:</span><br><span class="line">            <span class="keyword">return</span> response.json()</span><br><span class="line">    <span class="keyword">except</span> requests.ConnectionError:</span><br><span class="line">        <span class="keyword">pass</span></span><br><span class="line">    <span class="keyword">return</span> <span class="literal">None</span></span><br><span class="line"></span><br><span class="line"></span><br><span class="line"><span class="keyword">def</span> <span class="title function_">get_image</span>(<span class="params">json</span>):</span><br><span class="line">    <span class="comment"># json is None when the request failed</span></span><br><span class="line">    <span class="keyword">if</span> json <span class="keyword">and</span> json.get(<span class="string">'data'</span>):</span><br><span class="line">        <span class="keyword">for</span> item <span class="keyword">in</span> json.get(<span class="string">'data'</span>):</span><br><span class="line">            title = item.get(<span class="string">'title'</span>)</span><br><span class="line">            images = item.get(<span 
class="string">'image_list'</span>)</span><br><span class="line">            <span class="keyword">if</span> <span class="keyword">not</span> title <span class="keyword">or</span> <span class="keyword">not</span> images:</span><br><span class="line">                <span class="keyword">continue</span>  <span class="comment"># skip entries without a title or images</span></span><br><span class="line">            <span class="keyword">for</span> image <span class="keyword">in</span> images:</span><br><span class="line">                <span class="keyword">yield</span> {</span><br><span class="line">                    <span class="string">'image'</span>: image.get(<span class="string">'url'</span>),</span><br><span class="line">                    <span class="string">'title'</span>: title</span><br><span class="line">                }</span><br><span class="line"></span><br><span class="line"></span><br><span class="line"><span class="keyword">def</span> <span class="title function_">save_image</span>(<span class="params">item</span>):</span><br><span class="line">    <span class="comment"># exist_ok avoids a race when several worker processes create the same folder</span></span><br><span class="line">    os.makedirs(item.get(<span class="string">'title'</span>), exist_ok=<span class="literal">True</span>)</span><br><span class="line">    <span class="keyword">try</span>:</span><br><span class="line">        response = requests.get(item.get(<span class="string">'image'</span>))</span><br><span class="line">        <span class="keyword">if</span> response.status_code == <span class="number">200</span>:</span><br><span class="line">            <span class="comment"># Name the file after the MD5 of its content so duplicates are skipped</span></span><br><span class="line">            file_path = <span class="string">'{0}/{1}.{2}'</span>.<span class="built_in">format</span>(item.get(<span class="string">'title'</span>), md5(response.content).hexdigest(), <span class="string">'jpg'</span>)</span><br><span class="line">            <span class="keyword">if</span> <span class="keyword">not</span> os.path.exists(file_path):</span><br><span class="line">                <span class="keyword">with</span> <span class="built_in">open</span>(file_path, <span class="string">'wb'</span>) <span class="keyword">as</span> f:</span><br><span class="line">                    f.write(response.content)</span><br><span class="line">            <span class="keyword">else</span>:</span><br><span class="line">                <span class="built_in">print</span>(<span class="string">'Already Downloaded'</span>, file_path)</span><br><span class="line">    <span 
class="keyword">except</span> requests.ConnectionError:</span><br><span class="line">        <span class="built_in">print</span>(<span class="string">'Failed to save Image'</span>)</span><br><span class="line"></span><br><span class="line"></span><br><span class="line"><span class="keyword">def</span> <span class="title function_">main</span>(<span class="params">offset</span>):</span><br><span class="line">    json = get_page(offset)</span><br><span class="line">    <span class="keyword">for</span> item <span class="keyword">in</span> get_image(json):</span><br><span class="line">        <span class="built_in">print</span>(item)</span><br><span class="line">        save_image(item)</span><br><span class="line"></span><br><span class="line"></span><br><span class="line">GROUP_START = <span class="number">1</span></span><br><span class="line">GROUP_END = <span class="number">20</span></span><br><span class="line"></span><br><span class="line"><span class="keyword">if</span> __name__ == <span class="string">'__main__'</span>:</span><br><span class="line">    pool = Pool()</span><br><span class="line">    <span class="comment"># One offset per result page: 20, 40, ..., 400</span></span><br><span class="line">    groups = ([x * <span class="number">20</span> <span class="keyword">for</span> x <span class="keyword">in</span> <span class="built_in">range</span>(GROUP_START, GROUP_END + <span class="number">1</span>)])</span><br><span class="line">    pool.<span class="built_in">map</span>(main, groups)</span><br><span class="line">    pool.close()</span><br><span class="line">    pool.join()</span><br><span class="line"></span><br></pre></td></tr></table></figure>
<h3 id="参考资料"><a href="#参考资料" class="headerlink" title="参考资料"></a>References</h3><p><a href="http://www.cuiqingcai.com">Cui Qingcai</a>'s book 《python3网络爬虫开发实战》 (Python 3 Web Crawler Development in Practice)</p>
]]></content>
<categories>
<category>crawler</category>
</categories>
<tags>
<tag>python</tag>
<tag>web crawler</tag>
<tag>data mining</tag>
</tags>
</entry>
<entry>
<title>Data Mining--Web Crawlers--Scraping Dynamically Rendered Pages</title>
<url>/2019/02/25/machine%20learning/clawler/clawler-6/</url>
<content><![CDATA[<h3 id="selenium的使用"><a href="#selenium的使用" class="headerlink" title="selenium的使用"></a>Using selenium</h3><p><a href="https://selenium-python.readthedocs.io/">Official documentation</a><br>selenium is a browser automation and testing tool. It can drive a browser to perform specific actions and can also return the source of the page the browser is currently rendering.</p>
<h4 id="等待条件以及其含义"><a href="#等待条件以及其含义" class="headerlink" title="等待条件以及其含义"></a>Wait conditions and their meanings</h4><p><a href="https://selenium-python.readthedocs.io/api.html#module-selenium.webdriver.support.expected_conditions">Official documentation</a></p>
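<figure class="highlight python"><table><tr><td class="code"><pre><span class="line"><span class="keyword">from</span> selenium.webdriver.common.by <span class="keyword">import</span> By</span><br><span class="line"><span class="keyword">from</span> selenium.webdriver.support <span class="keyword">import</span> expected_conditions <span class="keyword">as</span> EC</span><br><span class="line"><span class="keyword">from</span> selenium.webdriver.support.ui <span class="keyword">import</span> WebDriverWait</span><br><span class="line">wait = WebDriverWait(browser, <span class="number">1</span>)  <span class="comment"># browser is an existing WebDriver instance</span></span><br><span class="line">wait.until(EC.presence_of_element_located((By.ID, <span class="string">'content_left'</span>)))</span><br><span class="line"><span class="comment"># the argument passed to until() is the wait condition</span></span><br></pre></td></tr></table></figure>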
<div class="table-container">
<table>
<thead>
<tr>
<th>Wait condition</th>
<th>Meaning</th>
</tr>
</thead>
<tbody>
<tr>
<td>title_is</td>
<td>The title equals the given text</td>
</tr>
<tr>
<td>title_contains</td>
<td>The title contains the given text</td>
</tr>
<tr>
<td>presence_of_element_located</td>
<td>The node is present in the DOM; pass a locator tuple such as (By.ID, 'p')</td>
</tr>
<tr>
<td>visibility_of_element_located</td>
<td>The node is visible; pass a locator tuple</td>
</tr>
<tr>
<td>visibility_of</td>
<td>The node is visible; pass the node object</td>
</tr>
<tr>
<td>presence_of_all_elements_located</td>
<td>All matching nodes are present in the DOM</td>
</tr>
<tr>
<td>text_to_be_present_in_element</td>
<td>The text of a node contains the given text</td>
</tr>
<tr>
<td>text_to_be_present_in_element_value</td>
<td>The value of a node contains the given text</td>
</tr>
<tr>
<td>frame_to_be_available_and_switch_to_it</td>
<td>The frame has loaded; the driver switches to it</td>
</tr>
<tr>
<td>invisibility_of_element_located</td>
<td>The node is not visible</td>
</tr>
<tr>
<td>element_to_be_clickable</td>
<td>The node is clickable</td>
</tr>
<tr>
<td>staleness_of</td>
<td>The node is no longer attached to the DOM; useful for detecting that the page has refreshed</td>
</tr>
<tr>
<td>element_to_be_selected</td>
<td>The node is selected; pass the node object</td>
</tr>
<tr>
<td>element_located_to_be_selected</td>
<td>The node is selected; pass a locator tuple</td>
</tr>
<tr>
<td>element_selection_state_to_be</td>
<td>Pass a node object and an expected state; returns True if they match, otherwise False</td>
</tr>
<tr>
<td>element_located_selection_state_to_be</td>
<td>Pass a locator tuple and an expected state; returns True if they match, otherwise False</td>
</tr>
<tr>
<td>alert_is_present</td>
<td>An alert is present</td>
</tr>
</tbody>
</table>
</div>
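<p>Each wait condition is a callable that receives the driver and returns a truthy value once the condition holds. The sketch below illustrates that polling idea in plain Python. It is a simplified stand-in, not selenium's actual implementation: until, FakeDriver, and this title_is are hypothetical names written for this example so that it runs without a browser.</p>

```python
import time


def until(driver, condition, timeout=1.0, poll_interval=0.1):
    """Call condition(driver) repeatedly until it returns a truthy value."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition(driver)
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError("condition was not met within the timeout")
        time.sleep(poll_interval)


class FakeDriver:
    """Hypothetical stand-in for a browser whose title changes on the third poll."""

    def __init__(self):
        self.polls = 0

    @property
    def title(self):
        self.polls += 1
        return "Results" if self.polls >= 3 else "Loading"


def title_is(expected):
    # Same shape as the conditions in the table: a factory returning a callable.
    return lambda driver: driver.title == expected


driver = FakeDriver()
found = until(driver, title_is("Results"))
print(found, driver.polls)
```

<p>With the real library, the equivalent call is wait.until(EC.title_is('Results')), where wait is a WebDriverWait built around an actual browser.</p>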
]]></content>
<categories>
<category>crawler</category>
</categories>
<tags>
<tag>python</tag>
<tag>web crawler</tag>
<tag>data mining</tag>
</tags>
</entry>
<entry>
<title>Data Mining--Web Crawlers--Preparation</title>
<url>/2019/02/21/machine%20learning/clawler/crawler-1/</url>
<content><![CDATA[<h3 id="python环境的准备工作"><a href="#python环境的准备工作" class="headerlink" title="python环境的准备工作"></a>Preparing the Python environment</h3><p>This series uses Windows as the operating system, with Anaconda3 as the Python runtime and package manager.<br>Anaconda <a href="https://www.anaconda.com/">official website</a><br>If the download is too slow, you can use the <a href="https://mirrors.tuna.tsinghua.edu.cn/anaconda/archive/">Tsinghua University mirror</a> (<a href="https://mirrors.tuna.tsinghua.edu.cn/help/anaconda/">usage instructions</a>).</p>
<h3 id="请求库的安装"><a href="#请求库的安装" class="headerlink" title="请求库的安装"></a>Installing request libraries</h3><h4 id="requests库"><a href="#requests库" class="headerlink" title="requests库"></a>requests</h4><p>A blocking HTTP request library.</p>
<figure class="highlight cmd"><table><tr><td class="code"><pre><span class="line">pip3 install requests</span><br></pre></td></tr></table></figure>
<h4 id="selenuim库"><a href="#selenuim库" class="headerlink" title="selenuim库"></a>selenium</h4><p>selenium is a browser automation and testing tool that can drive a browser to perform specific actions such as clicking and scrolling.</p>
<figure class="highlight cmd"><table><tr><td class="code"><pre><span class="line">pip3 install selenium</span><br></pre></td></tr></table></figure>
<h4 id="Google-Chrome-以及其驱动-ChromeDriver"><a href="#Google-Chrome-以及其驱动-ChromeDriver" class="headerlink" title="Google Chrome 以及其驱动 ChromeDriver"></a>Google Chrome and its driver, ChromeDriver</h4><p>Google Chrome <a href="https://chrome.en.softonic.com/">download</a>, ChromeDriver <a href="https://chromedriver.storage.googleapis.com/index.html">download</a><br><font color="red">Note: the ChromeDriver version must match the installed Google Chrome version.</font><br>Add the downloaded chromedriver to the PATH environment variable, then run it from the command line to check that it works:</p>
<figure class="highlight cmd"><table><tr><td class="code"><pre><span class="line">chromedriver</span><br></pre></td></tr></table></figure>
<p>Output similar to the following means it is working:</p>
<figure class="highlight text"><table><tr><td class="code"><pre><span class="line">Starting ChromeDriver 72.0.3626.69 (3c16f8a135abc0d4da2dff33804db79b849a7c38) on port 9515</span><br><span class="line">Only local connections are allowed.</span><br><span class="line">Please protect ports used by ChromeDriver and related test frameworks to prevent access by malicious code</span><br></pre></td></tr></table></figure>
<p>Test it from Python:</p>
<figure class="highlight python"><table><tr><td class="code"><pre><span class="line"><span class="keyword">from</span> selenium <span class="keyword">import</span> webdriver </span><br><span class="line">browser = webdriver.Chrome()</span><br></pre></td></tr></table></figure>
<p>Running this opens a new Chrome window.</p>
<h4 id="Firefox以及其驱动-GeckoDriver"><a href="#Firefox以及其驱动-GeckoDriver" class="headerlink" title="Firefox以及其驱动 GeckoDriver"></a>Firefox and its driver, GeckoDriver</h4><p>Firefox <a href="http://www.firefox.com.cn/">download</a>, GeckoDriver <a href="https://github.com/mozilla/geckodriver/releases">download</a><br><font color="red">Note: the geckodriver version must match the installed Firefox version; the geckodriver download page lists the version requirements. If geckodriver does not work, try reinstalling the latest Firefox and restarting the computer.</font><br>Add the downloaded geckodriver to the PATH environment variable, then run it from the command line to check that it works:</p>
<figure class="highlight cmd"><table><tr><td class="code"><pre><span class="line">geckodriver</span><br></pre></td></tr></table></figure>
<p>Test it from Python:</p>
<figure class="highlight python"><table><tr><td class="code"><pre><span class="line"><span class="keyword">from</span> selenium <span class="keyword">import</span> webdriver</span><br><span class="line">browser = webdriver.Firefox()</span><br></pre></td></tr></table></figure>
<p>Running this opens a new Firefox window.</p>
<h4 id="PhantomJS"><a href="#PhantomJS" class="headerlink" title="PhantomJS"></a>PhantomJS</h4><p>PhantomJS is a headless, scriptable WebKit browser engine (<a href="http://phantomjs.org/download.html">download</a>).<br>Add phantomjs.exe from the bin folder of the download to the PATH environment variable, then run it from the command line to check that it works:</p>
<figure class="highlight cmd"><table><tr><td class="code"><pre><span class="line">phantomjs</span><br></pre></td></tr></table></figure>
<p>The following prompt means it is available:</p>
<figure class="highlight cmd"><table><tr><td class="code"><pre><span class="line">phantomjs></span><br></pre></td></tr></table></figure>
<p>Test it from Python:</p>
<figure class="highlight python"><table><tr><td class="code"><pre><span class="line"><span class="keyword">from</span> selenium <span class="keyword">import</span> webdriver</span><br><span class="line"></span><br><span class="line">browser = webdriver.PhantomJS()</span><br><span class="line">browser.get(<span class="string">"http://www.baidu.com"</span>)</span><br><span class="line"><span class="built_in">print</span>(browser.current_url)</span><br></pre></td></tr></table></figure>
<p>PhantomJS does not open a new window, but it is actually running in the background.</p>
<h4 id="aiohttp"><a href="#aiohttp" class="headerlink" title="aiohttp"></a>aiohttp</h4><p>An asynchronous HTTP client and web server library.</p>
<figure class="highlight cmd"><table><tr><td class="code"><pre><span class="line">pip3 install aiohttp</span><br></pre></td></tr></table></figure>
<p>The official documentation also recommends installing cchardet (a character-encoding detection library) and aiodns (a library that speeds up DNS resolution):</p>
<figure class="highlight cmd"><table><tr><td class="code"><pre><span class="line">pip3 install cchardet aiodns</span><br></pre></td></tr></table></figure>
<h3 id="解析库的安装"><a href="#解析库的安装" class="headerlink" title="解析库的安装"></a>Installing parsing libraries</h3><h4 id="lxml"><a href="#lxml" class="headerlink" title="lxml"></a>lxml</h4><p>Parses both HTML and XML, supports XPath queries, and is very fast.</p>
<figure class="highlight cmd"><table><tr><td class="code"><pre><span class="line">pip3 install lxml</span><br></pre></td></tr></table></figure>
<h4 id="Beautiful-Soup"><a href="#Beautiful-Soup" class="headerlink" title="Beautiful Soup"></a>Beautiful Soup</h4><figure class="highlight cmd"><table><tr><td class="code"><pre><span class="line">pip3 install beautifulsoup4</span><br></pre></td></tr></table></figure>
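<h4 id="pyquery"><a href="#pyquery" class="headerlink" title="pyquery"></a>pyquery</h4><p>pyquery is another powerful web page parsing tool. It offers a jQuery-like syntax for parsing HTML documents and supports CSS selectors, which makes it very convenient to use.</p>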
<figure class="highlight cmd"><table><tr><td class="code"><pre><span class="line">pip3 install pyquery</span><br></pre></td></tr></table></figure>
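<h4 id="tesserocr"><a href="#tesserocr" class="headerlink" title="tesserocr"></a>tesserocr</h4><p>OCR (optical character recognition), used here to recognize all kinds of CAPTCHAs. tesserocr is a Python wrapper around tesseract, so you need to <a href="https://digi.bib.uni-mannheim.de/tesseract/">install tesseract</a> first. Builds with dev in the name are unstable development versions; builds without dev are stable releases.<br>On Linux:</p>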
<figure class="highlight cmd"><table><tr><td class="code"><pre><span class="line">pip3 install tesserocr pillow </span><br></pre></td></tr></table></figure>
<p>On Windows:<br><a href="https://github.com/simonflueckiger/tesserocr-windows_build/releases">Download page for the .whl installer</a></p>
<figure class="highlight cmd"><table><tr><td class="code"><pre><span class="line">pip install tesserocr-<span class="number">2</span>.<span class="number">4</span>.<span class="number">0</span>-cp36-cp36m-win_amd64.whl</span><br></pre></td></tr></table></figure>
<p>Save the <a href="https://raw.githubusercontent.com/Python3WebSpider/TestTess/master/image.png">test image</a> locally, then test tesseract and tesserocr in turn to check that each can recognize the text:</p>
<figure class="highlight cmd"><table><tr><td class="code"><pre><span class="line">tesseract image.png result -l eng && <span class="built_in">type</span> result.txt</span><br><span class="line"># Output</span><br><span class="line"># Tesseract Open Source OCR Engine v3.<span class="number">05</span>.<span class="number">01</span> with Leptonica</span><br><span class="line"># Python3WebSpider</span><br></pre></td></tr></table></figure>
<figure class="highlight python"><table><tr><td class="code"><pre><span class="line"><span class="keyword">import</span> tesserocr</span><br><span class="line"><span class="keyword">from</span> PIL <span class="keyword">import</span> Image</span><br><span class="line">image = Image.<span class="built_in">open</span>(<span class="string">'D:\\temp\\test\\image.png'</span>)</span><br><span class="line"><span class="built_in">print</span>(tesserocr.image_to_text(image))</span><br></pre></td></tr></table></figure>
<h3 id="数据库-略"><a href="#数据库-略" class="headerlink" title="数据库(略)"></a>Databases (omitted)</h3><h3 id="存储库"><a href="#存储库" class="headerlink" title="存储库"></a>Storage libraries</h3><h4 id="PyMysql"><a href="#PyMysql" class="headerlink" title="PyMysql"></a>PyMySQL</h4><figure class="highlight cmd"><table><tr><td class="code"><pre><span class="line">pip3 install pymysql</span><br></pre></td></tr></table></figure>
<h4 id="PyMongo"><a href="#PyMongo" class="headerlink" title="PyMongo"></a>PyMongo</h4><figure class="highlight cmd"><table><tr><td class="code"><pre><span class="line">pip3 install pymongo</span><br></pre></td></tr></table></figure>
<h4 id="redis-py"><a href="#redis-py" class="headerlink" title="redis-py"></a>redis-py</h4><figure class="highlight cmd"><table><tr><td class="code"><pre><span class="line">pip3 install redis</span><br></pre></td></tr></table></figure>
<h3 id="WEB库"><a href="#WEB库" class="headerlink" title="WEB库"></a>Web libraries</h3><h4 id="flask"><a href="#flask" class="headerlink" title="flask"></a>flask</h4><figure class="highlight cmd"><table><tr><td class="code"><pre><span class="line">pip3 install flask</span><br></pre></td></tr></table></figure>
<h4 id="Tornado"><a href="#Tornado" class="headerlink" title="Tornado"></a>Tornado</h4><p>Tornado is a web framework with asynchronous support. By using non-blocking I/O it can sustain thousands of open connections, which makes it very efficient.</p>
<figure class="highlight cmd"><table><tr><td class="code"><pre><span class="line">pip3 install tornado</span><br></pre></td></tr></table></figure>
<h3 id="App爬取相关库-暂时放弃"><a href="#App爬取相关库-暂时放弃" class="headerlink" title="App爬取相关库(暂时放弃)"></a>App-scraping libraries (skipped for now)</h3><h3 id="爬虫框架"><a href="#爬虫框架" class="headerlink" title="爬虫框架"></a>Crawler frameworks</h3><h4 id="pyspider"><a href="#pyspider" class="headerlink" title="pyspider"></a>pyspider</h4><p>pyspider ships with a powerful web UI, a script editor, a task monitor, a project manager, and a result processor. It supports several database backends and message queues, and it can also crawl pages rendered by JavaScript.<br>JavaScript rendering in pyspider depends on PhantomJS, so PhantomJS must be installed as well.</p>
<figure class="highlight cmd"><table><tr><td class="code"><pre><span class="line">pip3 install pyspider</span><br></pre></td></tr></table></figure>
<h4 id="Scrapy"><a href="#Scrapy" class="headerlink" title="Scrapy"></a>Scrapy</h4><figure class="highlight cmd"><table><tr><td class="code"><pre><span class="line">pip3 install scrapy</span><br></pre></td></tr></table></figure>
<h3 id="参考资料"><a href="#参考资料" class="headerlink" title="参考资料"></a>References</h3><p><a href="http://www.cuiqingcai.com">Cui Qingcai</a>'s book 《python3网络爬虫开发实战》 (Python 3 Web Crawler Development in Practice)</p>
]]></content>
<categories>
<category>crawler</category>
</categories>
<tags>
<tag>python</tag>
<tag>web crawler</tag>
<tag>data mining</tag>
</tags>
</entry>
<entry>
<title>Data Mining--Web Crawlers--Parsing Libraries</title>
<url>/2019/02/22/machine%20learning/clawler/clawler-4/</url>
<content><![CDATA[<h3 id="XPath"><a href="#XPath" class="headerlink" title="XPath"></a>XPath</h3><p>XPath, short for XML Path Language, is a language for locating information in XML documents. It works equally well for searching HTML documents.<br>XPath <a href="http://www.w3school.com.cn/xpath/index.asp">usage</a>, lxml <a href="http://lxml.de/">usage</a></p>
<h4 id="概览"><a href="#概览" class="headerlink" title="概览"></a>Overview</h4><p>Common rules</p>
<div class="table-container">
<table>
<thead>
<tr>
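<th>Expression</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>nodename</td>
<td>Selects all nodes with this name</td>
</tr>
<tr>
<td>/</td>
<td>Selects direct children of the current node</td>
</tr>
<tr>
<td>//</td>
<td>Selects descendants of the current node</td>
</tr>
<tr>
<td>.</td>
<td>Selects the current node</td>
</tr>
<tr>
<td>..</td>
<td>Selects the parent of the current node</td>
</tr>
<tr>
<td>@</td>
<td>Selects attributes</td>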
</tr>
</tbody>
</table>
</div>
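<p>A minimal sketch of these rules. To stay dependency-free it uses the standard library's xml.etree.ElementTree, which implements only a subset of XPath; lxml accepts the same expressions and more. The sample markup is invented.</p>

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring(
    '<ul>'
    '<li class="item-0"><a href="link1.html">first</a></li>'
    '<li class="item-1"><a href="link2.html">second</a></li>'
    '</ul>'
)

# .//a : every a element that is a descendant of the current node
links = [a.text for a in doc.findall(".//a")]

# ./li : direct li children only
item_count = len(doc.findall("./li"))

# [@href='...'] filters on an attribute; /.. then steps up to the parent
parent = doc.find(".//a[@href='link2.html']/..")
parent_class = parent.get("class")

print(links, item_count, parent_class)
```

<p>ElementTree cannot return attribute values directly from a path ending in @href, which is why the attribute is read with get() here.</p>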
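<p>Operators</p>
<div class="table-container">
<table>
<thead>
<tr>
<th>Operator</th>
<th>Description</th>
<th>Example</th>
<th>Return value</th>
</tr>
</thead>
<tbody>
<tr>
<td>or</td>
<td>Logical or</td>
<td></td>
<td></td>
</tr>
<tr>
<td>and</td>
<td>Logical and</td>
<td></td>
<td></td>
</tr>
<tr>
<td>mod</td>
<td>Remainder (modulo)</td>
<td></td>
<td></td>
</tr>
<tr>
<td>|</td>
<td>Union of two node-sets</td>
<td>//book | //cd</td>
<td>A node-set containing all book and cd elements</td>
</tr>
<tr>
<td>+</td>
<td>Addition</td>
<td></td>
<td></td>
</tr>
<tr>
<td>-</td>
<td>Subtraction</td>
<td></td>
<td></td>
</tr>
<tr>
<td>*</td>
<td>Multiplication</td>
<td></td>
<td></td>
</tr>
<tr>
<td>div</td>
<td>Division</td>
<td></td>
<td></td>
</tr>
<tr>
<td>=</td>
<td>Equal to</td>
<td></td>
<td></td>
</tr>
<tr>
<td>!=</td>
<td>Not equal to</td>
<td></td>
<td></td>
</tr>
<tr>
<td>&lt;</td>
<td>Less than</td>
<td></td>
<td></td>
</tr>
<tr>
<td>&gt;</td>
<td>Greater than</td>
<td></td>
<td></td>
</tr>
<tr>
<td>&gt;=</td>
<td>Greater than or equal to</td>
<td></td>
<td></td>
</tr>
<tr>
<td>&lt;=</td>
<td>Less than or equal to</td>
<td></td>
<td></td>
</tr>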
</tbody>
</table>
</div>
<h3 id="Beautiful-soup"><a href="#Beautiful-soup" class="headerlink" title="Beautiful soup"></a>Beautiful Soup</h3><p>Beautiful Soup is a Python library for parsing HTML and XML.</p>
<h4 id="解析器"><a href="#解析器" class="headerlink" title="解析器"></a>Parsers</h4><p>Beautiful Soup relies on an underlying parser to do the actual parsing. The supported parsers are:</p>
<div class="table-container">
<table>
<thead>
<tr>
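<th>Parser</th>
<th>Usage</th>
<th>Advantages</th>
<th>Disadvantages</th>
</tr>
</thead>
<tbody>
<tr>
<td>Python standard library</td>
<td>BeautifulSoup(markup, "html.parser")</td>
<td>Built into Python, moderate speed, tolerant of malformed documents</td>
<td>Poor tolerance of malformed documents in versions before Python 2.7.3 and 3.2.2</td>
</tr>
<tr>
<td>lxml HTML parser</td>
<td>BeautifulSoup(markup, "lxml")</td>
<td>Fast, tolerant of malformed documents</td>
<td>Requires a C library</td>
</tr>
<tr>
<td>lxml XML parser</td>
<td>BeautifulSoup(markup, "xml")</td>
<td>Fast; the only parser that supports XML</td>
<td>Requires a C library</td>
</tr>
<tr>
<td>html5lib</td>
<td>BeautifulSoup(markup, "html5lib")</td>
<td>Best tolerance of malformed documents; parses the document the way a browser does; produces HTML5 documents</td>
<td>Slow; pure Python, with no external C extension</td>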
</tr>
</tbody>
</table>
</div>
<h3 id="puquery"><a href="#puquery" class="headerlink" title="puquery"></a>pyquery</h3><p>Elements are selected in much the same way as in jQuery.</p>
<h3 id="参考资料"><a href="#参考资料" class="headerlink" title="参考资料"></a>References</h3><p><a href="http://www.cuiqingcai.com">Cui Qingcai</a>'s book 《python3网络爬虫开发实战》 (Python 3 Web Crawler Development in Practice)</p>
]]></content>
<categories>
<category>crawler</category>
</categories>
<tags>
<tag>python</tag>
<tag>web crawler</tag>
<tag>data mining</tag>
</tags>
</entry>
<entry>
<title>Data Mining--Web Crawlers--Crawler Basics</title>
<url>/2019/02/21/machine%20learning/clawler/crawler-2/</url>
<content><![CDATA[<h3 id="HTTP基本原理"><a href="#HTTP基本原理" class="headerlink" title="HTTP基本原理"></a>HTTP fundamentals</h3><h4 id="URI和URL"><a href="#URI和URL" class="headerlink" title="URI和URL"></a>URI and URL</h4><p>A URI (Uniform Resource Identifier) identifies a resource. A URL (Uniform Resource Locator) identifies a resource by saying where to find it. A URN (Uniform Resource Name) only names a resource without specifying how to locate it. URLs and URNs are both subsets of URIs.</p>
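<p>To make the structure of a URL concrete, here is a small sketch that splits one into its components with the standard library. The URL itself is just an example value.</p>

```python
from urllib.parse import urlsplit

# Example URL, chosen only for illustration.
url = "https://github.com/favicon.ico?size=32#top"
parts = urlsplit(url)

print(parts.scheme)    # protocol used to reach the resource
print(parts.netloc)    # host (and optional port)
print(parts.path)      # location of the resource on that host
print(parts.query)     # extra parameters
print(parts.fragment)  # position inside the resource
```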
<h4 id="超文本hypertext"><a href="#超文本hypertext" class="headerlink" title="超文本hypertext"></a>Hypertext</h4><p>Hypertext is text organized as a network, in which hyperlinks connect pieces of information stored in different places. A web page is one form of hypertext.</p>
<h4 id="http和https"><a href="#http和https" class="headerlink" title="http和https"></a>HTTP and HTTPS</h4><p>HTTP (Hypertext Transfer Protocol) is the protocol used to transfer hypertext data from the network to the local browser.<br>HTTPS (Hypertext Transfer Protocol over Secure Socket Layer) is a security-oriented HTTP channel: HTTP with an SSL layer added underneath.</p>
<h4 id="http请求过程"><a href="#http请求过程" class="headerlink" title="http请求过程"></a>The HTTP request process</h4><div class="table-container">
<table>
<thead>
<tr>
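<th>Column</th>
<th>Meaning</th>
<th>Notes</th>
</tr>
</thead>
<tbody>
<tr>
<td>Name</td>
<td>Name of the request</td>
<td>Usually the last segment of the URL</td>
</tr>
<tr>
<td>Status</td>
<td>Status code of the response</td>
<td></td>
</tr>
<tr>
<td>Type</td>
<td>Type of the requested document</td>
<td></td>
</tr>
<tr>
<td>Initiator</td>
<td>Source of the request</td>
<td>Marks which object or process initiated the request</td>
</tr>
<tr>
<td>Size</td>
<td>Size of the file downloaded from the server and of the requested resource</td>
<td></td>
</tr>
<tr>
<td>Time</td>
<td>Total time from sending the request to receiving the response</td>
<td></td>
</tr>
<tr>
<td>Waterfall</td>
<td>Visual waterfall of the network requests</td>
<td></td>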
</tbody>
</table>
</div>
<h4 id="请求"><a href="#请求" class="headerlink" title="请求"></a>Requests</h4><p>A request has four parts: the request method (Request Method), the request URL (Request URL), the request headers (Request Headers), and the request body (Request Body).</p>
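<p>The four parts can be seen on a request object before anything is sent. This sketch builds one with the standard library's urllib.request.Request; the URL, form fields, and User-Agent string are placeholder values, and no network call is made.</p>

```python
from urllib.parse import urlencode
from urllib.request import Request

# Request body: form fields encoded as bytes (placeholder values).
body = urlencode({"name": "germey", "age": "22"}).encode("utf-8")

request = Request(
    url="http://httpbin.org/post",                  # request URL
    data=body,                                      # request body
    headers={"User-Agent": "example-crawler/0.1"},  # request headers
    method="POST",                                  # request method
)

method = request.get_method()
full_url = request.full_url
# Request normalizes header names with str.capitalize(), hence "User-agent".
user_agent = request.get_header("User-agent")

print(method, full_url, user_agent, request.data)
```

<p>Passing this object to urllib.request.urlopen would actually send it; that step is left out here on purpose.</p>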
<h5 id="请求方法"><a href="#请求方法" class="headerlink" title="请求方法"></a>Request methods</h5><p>The two most common request methods are GET and POST. Others include PUT, DELETE, OPTIONS, CONNECT, and TRACE.</p>
<div class="table-container">
<table>
<thead>
<tr>
<th>Method</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>GET</td>
<td>Requests a page and returns its content</td>
</tr>
<tr>
<td>HEAD</td>