Commit
Showing 25 changed files with 226 additions and 100 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,2 @@ | ||
eclipse.preferences.version=1 | ||
encoding//testdata/doccn/dongxiaoutf8-2.txt=UTF-8 |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,2 +1,2 @@ | ||
/utils/ | ||
/preprocess/ | ||
/gui/ |
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,2 +1,57 @@ | ||
1 8.0% testdata\python\stu1_demo.py testdata\python\stu1_lprcmd.py | ||
from stanford:http://moss.stanford.edu/results/874773796 Fri Oct 25 19:19:17 CST 2019 | ||
1 99.51535% dongxiao-2.doc dongxiaogbk.txt | ||
2 92.47312% gumingzhu-2.doc zhucuiyun_2.doc | ||
3 91.408936% wangmeng-2.doc zhucuiyun_2.doc | ||
4 87.63636% dongxiao-2.docx dongxiaoutf8-2.txt | ||
5 84.717606% gumingzhu-2.doc wangmeng-2.doc | ||
6 84.310844% dongxiao-2.doc dongxiao-2.pdf | ||
7 84.168015% dongxiao-2.doc dongxiaoutf8-2.txt | ||
8 83.870964% dongxiao-2.pdf dongxiaogbk.txt | ||
9 83.68336% dongxiaogbk.txt dongxiaoutf8-2.txt | ||
10 82.954544% dongxiao-2.docx dongxiaogbk.txt | ||
11 82.552505% dongxiao-2.doc dongxiao-2.docx | ||
12 75.74404% lijie-2.doc wangmeng-2.doc | ||
13 74.96063% gumingzhu-2.doc wuchangqing-2.doc | ||
14 71.703705% dongxiao-2.pdf dongxiaoutf8-2.txt | ||
15 71.49254% dongxiao-2.docx dongxiao-2.pdf | ||
16 69.92366% wuchangqing-2.doc zhucuiyun_2.doc | ||
17 68.584076% lijie-2.doc zhucuiyun_2.doc | ||
18 65.61151% wangmeng-2.doc wuchangqing-2.doc | ||
19 65.12301% gumingzhu-2.doc lijie-2.doc | ||
20 57.454544% dongxiaogbk.txt meitao-2.doc | ||
21 57.246376% dongxiao-2.doc meitao-2.doc | ||
22 52.258064% lijie-2.doc wuchangqing-2.doc | ||
23 50.757576% dongxiao-2.docx meitao-2.doc | ||
24 50.284416% dongxiao-2.pdf meitao-2.doc | ||
25 48.87218% makai��2.doc wangxuan_2.doc.doc | ||
26 48.45869% dongxiaoutf8-2.txt meitao-2.doc | ||
27 46.67074% liuchuanyang-2.doc tangwenpeng-2.doc | ||
28 41.64096% heliwen_2.doc liufan_2.doc | ||
29 40.54834% liufan_2.doc wangchunming_2.doc | ||
30 38.75061% gechunlong-2.doc hanchao_2.doc | ||
31 36.930233% luxiang-2.doc tangwenpeng-2.doc | ||
32 36.89095% jiangfeng-2.doc lijie-2.doc | ||
33 35.925926% weixiao-2.doc yinxu-2.doc | ||
34 35.424637% liuchuanyang-2.doc wuliangchao-2.doc | ||
35 35.039577% gechunlong-2.doc yinxu-2.doc | ||
36 34.839073% gechunlong-2.doc weixiao-2.doc | ||
37 34.325184% wangmeng-2.doc wuliangchao-2.doc | ||
38 34.069096% guozhiquan -2.doc wuliangchao-2.doc | ||
39 33.98907% wuliangchao-2.doc zhucuiyun_2.doc | ||
40 32.858547% tangwenpeng-2.doc xuqiwei-2.doc | ||
41 32.557137% tangwenpeng-2.doc wangchen-2.doc | ||
42 32.296955% liuchuanyang-2.doc yinxu-2.doc | ||
43 32.073547% lijie-2.doc wuliangchao-2.doc | ||
44 32.070206% gechunlong-2.doc wangchen-2.doc | ||
45 32.058823% jiangfeng-2.doc yinpeiyan_2.doc | ||
46 31.946404% sunxiaolei-2.doc wangchunming_2.doc | ||
47 31.471535% gumingzhu-2.doc wuliangchao-2.doc | ||
48 30.698889% sunxiaolei-2.doc yinxu-2.doc | ||
49 30.651136% liuchuanyang-2.doc xuqiwei-2.doc | ||
50 30.63007% heliwen_2.doc wangchunming_2.doc | ||
51 30.559345% liuchuanyang-2.doc weixiao-2.doc | ||
52 30.494392% wangchen-2.doc xuqiwei-2.doc | ||
53 30.429863% tangwenming-2.doc xuqiwei-2.doc | ||
54 30.424183% tangwenming-2.doc wangchen-2.doc | ||
55 30.095451% sunxiaolei-2.doc tangwenpeng-2.doc | ||
56 30.065361% guozhiquan -2.doc liuchuanyang-2.doc | ||
from fh Sun Dec 01 18:57:44 CST 2019 |
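The result lines above appear to follow a fixed layout: a rank, a similarity percentage, then the two compared file names. A minimal sketch of reading one such line, assuming that layout; the class `ResultLine` and its fields are hypothetical and not part of this commit. Because some file names contain spaces (for example `guozhiquan -2.doc`), the sketch keeps the file pair as one unsplit string rather than guessing where the first name ends.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ResultLine {
    // rank, percentage followed by '%', then the rest of the line (the file pair).
    private static final Pattern LINE =
            Pattern.compile("^(\\d+)\\s+(\\d+(?:\\.\\d+)?)%\\s+(.+)$");

    public final int rank;
    public final double percent;
    public final String files; // both file names, left unsplit (assumption: names may contain spaces)

    private ResultLine(int rank, double percent, String files) {
        this.rank = rank;
        this.percent = percent;
        this.files = files;
    }

    // Returns the parsed line, or null for lines that do not match
    // (such as the "from ..." timestamp lines).
    public static ResultLine parse(String line) {
        Matcher m = LINE.matcher(line.trim());
        if (!m.matches()) {
            return null;
        }
        return new ResultLine(
                Integer.parseInt(m.group(1)),
                Double.parseDouble(m.group(2)),
                m.group(3));
    }

    public static void main(String[] args) {
        ResultLine r = parse("3 91.408936% wangmeng-2.doc zhucuiyun_2.doc");
        System.out.println(r.rank + " " + r.percent + " " + r.files);
    }
}
```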
This file was deleted.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,54 @@ | ||
package preprocess.plag.edu; | ||
|
||
import java.util.List; | ||
|
||
import com.hankcs.hanlp.HanLP; | ||
import com.hankcs.hanlp.dictionary.CustomDictionary; | ||
import com.hankcs.hanlp.seg.common.Term; | ||
import com.hankcs.hanlp.tokenizer.NotionalTokenizer; | ||
|
||
public class Tokenizer { | ||
// Segment the input string and return the tokens joined by the given separator | |
public static String segment(String text,String sep) { | ||
StringBuilder sb = new StringBuilder(); | ||
HanLP.Config.Normalization = true; // (Traditional -> Simplified, full-width -> half-width, uppercase -> lowercase) | |
List<Term> tokens = NotionalTokenizer.segment(text); // segment and remove stop words | |
for(Term token : tokens) { | ||
sb.append(token.word+sep); | ||
} | ||
return sb.toString(); | ||
} | ||
|
||
public static void main(String[] args) { | ||
// TODO Auto-generated method stub | ||
HanLP.Config.Normalization = true; // (Traditional -> Simplified, full-width -> half-width, uppercase -> lowercase) | |
CustomDictionary.insert("爱听4G", "nz 1000"); | ||
String text = "i am from china.小区居民有的反对喂养流浪猫,而有的居民却”赞成“喂养这些小宝贝,i will go back Home,我愛聽4G"; | ||
System.out.println(text); | ||
// Precise segmentation | |
List<Term> tokens = HanLP.segment(text); | ||
System.out.println(tokens); // the stop-word dictionary is at data/dictionary/stopwords.txt and can be edited | |
for (Term token : tokens) { | ||
System.out.print("("+token.word+","+token.offset+","+token.length()+")"); | ||
|
||
} | ||
System.out.println(); | ||
// Automatically removes stop words; the tokens' positions in the original text are lost | |
tokens = NotionalTokenizer.segment(text); | ||
System.out.println(tokens); // the stop-word dictionary is at data/dictionary/stopwords.txt and can be edited | |
for (Term token : tokens) { | ||
System.out.print("("+token.word+","+token.offset+","+token.length()+")"); | ||
|
||
} | ||
System.out.println(); | ||
// Automatic sentence splitting + stop-word removal | |
for (List<Term> sentence : NotionalTokenizer.seg2sentence(text)) | ||
{ | ||
System.out.println(sentence); | ||
} | ||
// English stop words are removed as well | |
String str = Tokenizer.segment(text," "); | ||
System.out.println(str); | ||
} | ||
|
||
} |
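`Tokenizer.segment` above appends the separator after every token, so the returned string ends with a trailing separator. A minimal sketch of a join that avoids this, assuming the tokens are already available as plain strings; the class `TokenJoiner` and its `join` method are hypothetical and not part of this commit, and HanLP is left out so the snippet runs on its own.

```java
import java.util.Arrays;
import java.util.List;
import java.util.StringJoiner;

public class TokenJoiner {
    // Joins the tokens with sep between them; no separator is added after the last token.
    public static String join(List<String> tokens, String sep) {
        StringJoiner joiner = new StringJoiner(sep);
        for (String token : tokens) {
            joiner.add(token);
        }
        return joiner.toString();
    }

    public static void main(String[] args) {
        // Sample tokens chosen for illustration only.
        List<String> tokens = Arrays.asList("china", "小区", "居民");
        System.out.println(join(tokens, " "));
    }
}
```

Whether the trailing separator matters depends on how the segmented string is consumed downstream, which the commit does not show.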
Binary file not shown.
Binary file not shown.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,41 @@ | ||
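Term definitions ([…] English […] source) 12x2 | |
1. Software testing (Software Testing). Source: https://zh.wikipedia.org/wiki/%E8%BD%AF%E4%BB%B6%E6%B5%8B%E8%AF%95 Definition: […] also to measure and evaluate software quality […] P18 | |
2. Unit testing. Source: http://www.igsgroup.com.cn/common/ISTQB%E8%BD%AF%E4%BB%B6%E6%B5%8B%E8%AF%95%E4%B8%93%E4%B8%9A%E6%9C%AF%E8%AF%AD%E5%AF%B9%E7%85%A7%E8%A1%A8v2.1.pdf | |
Definition: based on the detailed design specification, test cases are designed for all important control paths within a module in order to find errors inside the module. P94 | |
3. Integration testing. Source: same as above. Definition: on the basis of unit testing, all program modules are tested in an ordered, incremental way to check the interface relationships between program units or components so that they meet the requirements. P25 | |
4. System testing. Source: same as above. Definition: testing performed on the integrated software and hardware system. P26 | |
5. Acceptance testing. Source: same as above. Definition: testing and review performed against the acceptance documents signed by both parties, according to the project requirements and the contract. P26 | |
6. Functional testing. Source: same as above. Definition: functional testing verifies each function of the product: following the functional test cases, it tests item by item and checks whether the product provides the functions the user requires. Source: http://baike.baidu.com/view/651435.htm | |
7. Black-box testing. Source: same as above. Definition: testing performed without knowledge of the software's internal structure. P26 | |
8. White-box testing. Source: same as above. Definition: testing performed with knowledge of the software's internal structure. P26 | |
9. Performance testing. Source: same as above. Definition: checks the runtime performance of the software within the integrated system. P135 | |
10. Alpha testing (α testing). Definition: testing of a software product that is about to be released. P158 | |
11. CMM: Capability Maturity Model for Software. http://baike.baidu.com/view/8110.htm Definition: a description of the stages of development a software organization goes through in defining, implementing, measuring, controlling and improving its software process. http://baike.baidu.com/view/8110.htm | |
12. ISO9000: quality management system. Definition: all the international standards drawn up by TC176 (the technical committee on quality management systems). http://baike.baidu.com/view/9486.htm | |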
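Short-answer questions: (2x12) | |
1 What is the difference between black-box testing and white-box testing? Which errors are easier to find with black-box testing? Which errors are easier to find with white-box testing? Give 2 of each. | |
Black-box testing is done without knowing the internal structure of the software; white-box testing is done knowing the internal structure of the software. | |
Black-box testing tends to find: 1) whether any functions are incorrect or missing; 2) at the interfaces, whether input is accepted correctly and whether correct results are output. | |
White-box testing tends to find: 1) that every decision in the program is exercised at least once for both its "true" and its "false" outcome; 2) that loops are executed at their boundaries and within their operating limits. | |
http://zhidao.baidu.com/question/13988876.html | |
2 What are the differences and connections between integration testing and system testing? | |
P132 Integration testing targets the interfaces between modules; system testing targets the whole system; both integration testing and system testing use black-box testing. | |
Essay questions: (52) | |
1 (10) […] the waterfall model; drawing on a concrete project you have worked on, answer the following questions: | |
Waterfall model: feasibility study and planning, requirements analysis, design, coding, testing, operation and maintenance. http://baike.baidu.com/view/551037.htm | |
(1) Which phases did the actual project go through? (First briefly describe the project.) | |
We built a ticket-booking system. At first the teacher said to do a feasibility study and analysis first […] asked the teacher what was needed (requirements analysis) […] then came up with a corresponding idea (design), started coding (coding) […] checked that nothing reported an error and that it could run (testing). | |
(2) As a programmer, […] which 3 phases do you consider most important? Explain why. | |
Requirements analysis, design, coding. […] | |
|
||
2 (12) Give 2 different definitions of software testing, point out how they differ, and say which one you prefer and why. | |
First: P18 Bill Hetzel: the purpose of testing is not only to find software defects and errors, but also to measure and evaluate software quality in order to improve it. | |
Second: P18 Glenford J. Myers: testing is to demonstrate that a program has errors, not to demonstrate that it has none. | |
The second is somewhat one-sided. […] because software certainly has errors […] no bug […] so I prefer the second. | |
|
||
3 (30) Draw the V-model and state at which phase software testing work begins. Drawing on a concrete project, say which testing phases you went through in the actual project and which types of testing you did (functional, black-box/white-box, etc.) […] Which testing phase do you consider most important, and why? P30 | |
User requirements, requirements analysis, high-level design, detailed design, coding, unit testing, integration testing, system testing, acceptance testing | |
For the ticket-booking system: for each class, […] first checked whether there were errors […] (unit testing); […] calls between parent and child classes, to see whether they could be invoked (integration testing); […] a test of booking a ticket, to see whether it succeeded (system testing); […] the teacher […] (acceptance testing) | |
White-box testing: […] | |
Black-box testing: […] | |
I think user requirements are the most important, because the earlier an error is found, the smaller the loss. |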