add image compare
fanghon committed Dec 25, 2019
1 parent fcd1e09 commit bc5666b
Showing 48 changed files with 236 additions and 72 deletions.
1 change: 1 addition & 0 deletions .classpath
@@ -12,5 +12,6 @@
<classpathentry kind="lib" path="lib/substance-5.3.jar"/>
<classpathentry kind="lib" path="lib/jplag-2.12.1-SNAPSHOT-jar-with-dependencies.jar"/>
<classpathentry kind="lib" path="lib/hanlp-portable-1.7.5.jar"/>
<classpathentry kind="lib" path="lib/jimagehash3.0.jar"/>
<classpathentry kind="output" path="bin"/>
</classpath>
11 changes: 8 additions & 3 deletions README.md
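@@ -1,5 +1,5 @@
# antiplag: similarity checking software for program code and document assignments
The software checks and compares the similarity between assignments that students submit electronically. It can compare and analyze text similarity for multiple programming languages (such as java, c/c++, python) and for Chinese and English, Simplified and Traditional Chinese documents (such as lab reports) in multiple formats (txt, doc, docx, pdf, etc.), and it outputs the highly similar documents to help detect plagiarism among students.
The software checks and compares the similarity between assignments that students submit electronically. It can compare and analyze text similarity for multiple programming languages (such as java, c/c++, python) and for Chinese and English, Simplified and Traditional Chinese documents in multiple formats (txt, doc, docx, pdf, etc.), as well as the similarity of images in multiple formats (png, jpg, gif, bmp, etc.), and it outputs the highly similar documents and images to help detect plagiarism among students.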

## Requirements
[jdk11](https://www.oracle.com/technetwork/java/javase/downloads/jdk11-downloads-5066655.html)
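@@ -14,7 +14,7 @@
![Main program window](./maingui.png)

## How it works
The main techniques used by the system are string similarity comparison algorithms, lexical and syntactic parsing of code, and word segmentation from natural language processing (nlp).
The main techniques used by the system are string similarity comparison algorithms, lexical and syntactic parsing of code, word segmentation from natural language processing (nlp), and image similarity comparison algorithms.

Similarity comparison of program text is based on 3 open systems:
* The first is the web-service-based [MOSS system](http://theory.stanford.edu/~aiken/moss/) (a system made available by Stanford University that supports code similarity comparison for multiple programming languages);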
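@@ -33,6 +33,10 @@

The second is based on jplag's GST algorithm, with its functionality extended: the added "doc" language type can compute similarity for various kinds of documents and provides web-based visual comparison.

Image similarity comparison is based on the [JImageHash project](https://github.com/KilianB/JImageHash).

It mainly uses the image pHash fingerprint similarity comparison algorithm.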
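
To make the pHash idea concrete, the following is a minimal stand-alone sketch of a DCT-based perceptual hash. It is not the JImageHash implementation and does not use its API; the class name `PHashSketch`, the 32x32 working size, the 8x8 low-frequency block, and the mean-based thresholding are all assumptions chosen for illustration.

```java
import java.awt.Graphics2D;
import java.awt.RenderingHints;
import java.awt.image.BufferedImage;

/** Illustrative DCT-based perceptual hash; NOT the JImageHash implementation. */
final class PHashSketch {
    private static final int SIZE = 32; // side length of the downscaled grayscale image (assumed)
    private static final int LOW = 8;   // side length of the low-frequency DCT block kept (assumed)

    /** Returns a 64-bit fingerprint built from the lowest DCT frequencies. */
    static long hash(BufferedImage src) {
        // Step 1: shrink to SIZE x SIZE and convert to grayscale.
        BufferedImage small = new BufferedImage(SIZE, SIZE, BufferedImage.TYPE_BYTE_GRAY);
        Graphics2D g = small.createGraphics();
        g.setRenderingHint(RenderingHints.KEY_INTERPOLATION,
                RenderingHints.VALUE_INTERPOLATION_BILINEAR);
        g.drawImage(src, 0, 0, SIZE, SIZE, null);
        g.dispose();

        // Step 2: unnormalized 2-D DCT-II, computed only for the LOW x LOW lowest frequencies.
        double[][] dct = new double[LOW][LOW];
        for (int v = 0; v < LOW; v++) {
            for (int u = 0; u < LOW; u++) {
                double sum = 0;
                for (int y = 0; y < SIZE; y++) {
                    for (int x = 0; x < SIZE; x++) {
                        sum += small.getRaster().getSample(x, y, 0)
                                * Math.cos((2 * x + 1) * u * Math.PI / (2.0 * SIZE))
                                * Math.cos((2 * y + 1) * v * Math.PI / (2.0 * SIZE));
                    }
                }
                dct[v][u] = sum;
            }
        }

        // Step 3: mean of the kept coefficients, leaving out the DC term at [0][0].
        double total = 0;
        for (int v = 0; v < LOW; v++) {
            for (int u = 0; u < LOW; u++) {
                total += dct[v][u];
            }
        }
        double mean = (total - dct[0][0]) / (LOW * LOW - 1);

        // Step 4: one bit per coefficient: 1 if it is above the mean, otherwise 0.
        long bits = 0L;
        for (int v = 0; v < LOW; v++) {
            for (int u = 0; u < LOW; u++) {
                bits = (bits << 1) | (dct[v][u] > mean ? 1L : 0L);
            }
        }
        return bits;
    }

    /** Similarity in [0, 1]: 1 minus the normalized Hamming distance of two fingerprints. */
    static double similarity(long a, long b) {
        return 1.0 - Long.bitCount(a ^ b) / 64.0;
    }
}
```

Two images could then be compared with `PHashSketch.similarity(PHashSketch.hash(imgA), PHashSketch.hash(imgB))`, where each image is loaded with `javax.imageio.ImageIO.read`. Multiplying the result by 100 gives a percentage that can be checked against a threshold.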

### References:
1. [Software Plagiarism Detection Techniques:A Comparative Study](http://www.ijcsit.com/docs/Volume%205/vol5issue04/ijcsit2014050441.pdf)
2. [JPlag: Finding plagiarisms among a set of programs](http://page.mi.fu-berlin.de/prechelt/Biblio/jplagTR.pdf)
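@@ -42,7 +46,7 @@
## TODO
1. Integrate jplag into the system. Done.
2. Support duplicate checking of html and jsp file code.
3. Support duplicate checking of image files.
3. Support duplicate checking of image files. Done.
4. Develop a web version of the assignment duplicate-checking software.
5. Support storing past assignment documents and database-backed duplicate checking.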

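@@ -51,5 +55,6 @@
## Update log
1. 2019.12.1: Switched to hanlp as the word-segmentation component, added duplicate checking for the text of pdf and html files, fixed several bugs, and released v2.8.6.
2. 2019.12.3: Extended jplag with a "doc" language type, implementing similarity computation and visual comparison for document text in multiple formats. Updated the usage help and test data; released v2.8.8.
3. 2019.12.25: Implemented image similarity comparison using pHash, covering images in multiple formats. Updated the test data and documentation; released v3.0.0.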


6 changes: 3 additions & 3 deletions bin/.gitignore
@@ -1,5 +1,5 @@
/preprocess/
/gui/
/jplag/
/utils/
/shingle/
/utils/
/gui/
/imghash/
Binary file modified bin/gui/plag/edu/PlagGUI$1.class
Binary file not shown.
Binary file modified bin/gui/plag/edu/PlagGUI$2.class
Binary file not shown.
Binary file modified bin/gui/plag/edu/PlagGUI$3.class
Binary file not shown.
Binary file modified bin/gui/plag/edu/PlagGUI$4.class
Binary file not shown.
Binary file modified bin/gui/plag/edu/PlagGUI$5.class
Binary file not shown.
Binary file modified bin/gui/plag/edu/PlagGUI$6.class
Binary file not shown.
Binary file modified bin/gui/plag/edu/PlagGUI$7.class
Binary file not shown.
Binary file modified bin/gui/plag/edu/PlagGUI$8.class
Binary file not shown.
Binary file modified bin/gui/plag/edu/PlagGUI$9.class
Binary file not shown.
Binary file modified bin/gui/plag/edu/PlagGUI.class
Binary file not shown.
Binary file added bin/jplag/doc/DocToken.class
Binary file not shown.
Binary file added bin/jplag/doc/Language.class
Binary file not shown.
Binary file added bin/jplag/doc/Parser.class
Binary file not shown.
Binary file added bin/jplag/doc/TokenStructure.class
Binary file not shown.
Binary file added bin/jplag/options/CommandLineOptionsExt.class
Binary file not shown.
Binary file modified bin/shingle/plag/edu/ShingleSim.class
Binary file not shown.
6 changes: 6 additions & 0 deletions help.txt
@@ -27,6 +27,12 @@ docx
[The help.txt content was mis-decoded in this view (it appears to be GBK-encoded Chinese) and cannot be reliably recovered. Legible fragments: the context lines mention Jplag and its "doc" and "text" language types; the six added lines form a numbered section 3 with steps (1) to (4), the first of which refers to the imgs directory under testdata.]
Binary file added lib/jimagehash3.0.jar
Binary file not shown.
Binary file modified maingui.png
64 changes: 1 addition & 63 deletions out.txt
@@ -1,63 +1 @@
1 99.51535% dongxiao-2.doc dongxiaogbk.txt
2 92.47312% gumingzhu-2.doc zhucuiyun_2.doc
3 91.408936% wangmeng-2.doc zhucuiyun_2.doc
4 87.63636% dongxiao-2.docx dongxiaoutf8-2.txt
5 84.765625% dongxiao-2.docx dongxiao-2.html
6 84.717606% gumingzhu-2.doc wangmeng-2.doc
7 84.310844% dongxiao-2.doc dongxiao-2.pdf
8 84.168015% dongxiao-2.doc dongxiaoutf8-2.txt
9 83.870964% dongxiao-2.pdf dongxiaogbk.txt
10 83.68336% dongxiaogbk.txt dongxiaoutf8-2.txt
11 83.14176% dongxiao-2.html dongxiaoutf8-2.txt
12 82.954544% dongxiao-2.docx dongxiaogbk.txt
13 82.552505% dongxiao-2.doc dongxiao-2.docx
14 75.74404% lijie-2.doc wangmeng-2.doc
15 74.96063% gumingzhu-2.doc wuchangqing-2.doc
16 71.703705% dongxiao-2.pdf dongxiaoutf8-2.txt
17 71.49254% dongxiao-2.docx dongxiao-2.pdf
18 70.34036% dongxiao-2.html dongxiaogbk.txt
19 70.0% dongxiao-2.doc dongxiao-2.html
20 69.92366% wuchangqing-2.doc zhucuiyun_2.doc
21 68.584076% lijie-2.doc zhucuiyun_2.doc
22 65.61151% wangmeng-2.doc wuchangqing-2.doc
23 65.12301% gumingzhu-2.doc lijie-2.doc
24 60.869564% dongxiao-2.html dongxiao-2.pdf
25 57.454544% dongxiaogbk.txt meitao-2.doc
26 57.246376% dongxiao-2.doc meitao-2.doc
27 52.258064% lijie-2.doc wuchangqing-2.doc
28 50.757576% dongxiao-2.docx meitao-2.doc
29 50.284416% dongxiao-2.pdf meitao-2.doc
30 48.87218% makai��2.doc wangxuan_2.doc.doc
31 48.45869% dongxiaoutf8-2.txt meitao-2.doc
32 46.67074% liuchuanyang-2.doc tangwenpeng-2.doc
33 41.878174% dongxiao-2.html meitao-2.doc
34 41.64096% heliwen_2.doc liufan_2.doc
35 40.54834% liufan_2.doc wangchunming_2.doc
36 38.75061% gechunlong-2.doc hanchao_2.doc
37 36.930233% luxiang-2.doc tangwenpeng-2.doc
38 36.89095% jiangfeng-2.doc lijie-2.doc
39 35.925926% weixiao-2.doc yinxu-2.doc
40 35.424637% liuchuanyang-2.doc wuliangchao-2.doc
41 35.039577% gechunlong-2.doc yinxu-2.doc
42 34.839073% gechunlong-2.doc weixiao-2.doc
43 34.325184% wangmeng-2.doc wuliangchao-2.doc
44 34.069096% guozhiquan -2.doc wuliangchao-2.doc
45 33.98907% wuliangchao-2.doc zhucuiyun_2.doc
46 32.858547% tangwenpeng-2.doc xuqiwei-2.doc
47 32.557137% tangwenpeng-2.doc wangchen-2.doc
48 32.296955% liuchuanyang-2.doc yinxu-2.doc
49 32.073547% lijie-2.doc wuliangchao-2.doc
50 32.070206% gechunlong-2.doc wangchen-2.doc
51 32.058823% jiangfeng-2.doc yinpeiyan_2.doc
52 31.946404% sunxiaolei-2.doc wangchunming_2.doc
53 31.471535% gumingzhu-2.doc wuliangchao-2.doc
54 30.698889% sunxiaolei-2.doc yinxu-2.doc
55 30.651136% liuchuanyang-2.doc xuqiwei-2.doc
56 30.63007% heliwen_2.doc wangchunming_2.doc
57 30.559345% liuchuanyang-2.doc weixiao-2.doc
58 30.494392% wangchen-2.doc xuqiwei-2.doc
59 30.429863% tangwenming-2.doc xuqiwei-2.doc
60 30.424183% tangwenming-2.doc wangchen-2.doc
61 30.095451% sunxiaolei-2.doc tangwenpeng-2.doc
62 30.065361% guozhiquan -2.doc liuchuanyang-2.doc
from fh Mon Dec 02 19:18:34 CST 2019
from fh Wed Dec 25 20:57:21 CST 2019
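
The result lines previously stored in out.txt each hold a rank, a similarity percentage, and two file names. The following sketch shows one way such a ranked list could be produced from pairwise scores. It is not code from this commit: the class `PairRankingSketch`, the `rank` method, and the rule of keeping pairs whose score is greater than or equal to the threshold are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.function.ToDoubleBiFunction;

/** Hypothetical helper that ranks file pairs by similarity; not code from this commit. */
final class PairRankingSketch {
    /** One compared pair together with its similarity in percent. */
    private static final class Scored {
        final String first;
        final String second;
        final double percent;

        Scored(String first, String second, double percent) {
            this.first = first;
            this.second = second;
            this.percent = percent;
        }
    }

    /**
     * Compares every unordered pair of names, keeps pairs scoring at least
     * thresholdPercent (assumed rule), and returns lines of the form
     * "rank percent% first second", highest similarity first.
     */
    static List<String> rank(List<String> names, double thresholdPercent,
                             ToDoubleBiFunction<String, String> similarity) {
        List<Scored> kept = new ArrayList<>();
        for (int i = 0; i < names.size(); i++) {
            for (int j = i + 1; j < names.size(); j++) {
                double p = similarity.applyAsDouble(names.get(i), names.get(j));
                if (p >= thresholdPercent) {
                    kept.add(new Scored(names.get(i), names.get(j), p));
                }
            }
        }
        // Sort so that the most similar pair comes first.
        kept.sort(Comparator.comparingDouble((Scored s) -> s.percent).reversed());

        List<String> lines = new ArrayList<>();
        for (int r = 0; r < kept.size(); r++) {
            Scored s = kept.get(r);
            lines.add((r + 1) + " " + s.percent + "% " + s.first + " " + s.second);
        }
        return lines;
    }
}
```

The similarity function is passed in, so the same ranking step would work for text scores and for image fingerprint scores alike.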
38 changes: 36 additions & 2 deletions src/gui/plag/edu/PlagGUI.java
@@ -34,6 +34,7 @@
import org.jvnet.substance.SubstanceLookAndFeel;
import org.jvnet.substance.skin.BusinessBlueSteelSkin;

import imghash.plag.edu.ImageSim;
import moss.plag.edu.Http;

import shingle.plag.edu.ShingleSim;
@@ -53,6 +54,7 @@ public class PlagGUI extends JFrame {


WinCMD cmd;
private JRadioButton radBntImage;
/**
* Launch the application.
*/
@@ -125,6 +127,7 @@ public void stateChanged(ChangeEvent arg0) {
&& combLang!=null){
combMethod.setEnabled(true); // enable the algorithm selector
combLang.setEnabled(true);
txtThreshold.setText("50");
}
}
});
@@ -140,16 +143,38 @@ public void stateChanged(ChangeEvent arg0) {
if(radBntText.isSelected()){
combMethod.setEnabled(false); // disable the algorithm selector
combLang.setEnabled(false);
txtThreshold.setText("50");
}
}
});
radBntText.setBounds(158, 26, 98, 23);
radBntText.setBounds(106, 26, 98, 23);
ButtonGroup rbGroup = new ButtonGroup();
rbGroup.add(radBntProgram);
rbGroup.add(radBntText);

panel.add(radBntText);

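radBntImage = new JRadioButton("\u56FE\u7247"); // label text: "图片" (Image)
radBntImage.addChangeListener(new ChangeListener() {
public void stateChanged(ChangeEvent e) {
// The image radio button is selected
if(radBntImage.isSelected()){
combMethod.setEnabled(false); // disable the algorithm selector
combLang.setEnabled(false);
txtThreshold.setText("80"); // threshold value shown when images are selected
}
}

});
// Tooltip text: "支持图片类型:png jpeg gif" (Supported image types: png jpeg gif)
radBntImage.setToolTipText("\u652F\u6301\u56FE\u7247\u7C7B\u578B\uFF1Apng jpeg gif");
radBntImage.setBounds(201, 26, 98, 23);
panel.add(radBntImage);

// g1 groups all three radio buttons so that only one can be selected
// (radBntProgram and radBntText were also added to rbGroup above).
ButtonGroup g1=new ButtonGroup();
g1.add(radBntProgram);
g1.add(radBntText);
g1.add(radBntImage);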

JPanel panel_1 = new JPanel();
panel_1.setBorder(new TitledBorder(null, "\u53C2\u6570", TitledBorder.LEADING, TitledBorder.TOP, null, null));
panel_1.setBounds(51, 181, 305, 95);
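@@ -256,7 +281,16 @@ public void actionPerformed(ActionEvent arg0) {
}else if(res>0){
// Message: "Finished; no results met the threshold. Try lowering the similarity threshold."
JOptionPane.showMessageDialog(PlagGUI.this, "执行完毕,未发现符合限值要求的结果,可以尝试调低相似度限值");
}
}
}else if(radBntImage.isSelected()){ // compare images
String[] args = new String[2];
args[0] = path;
args[1] = threshold;
ImageSim.main(args) ; // run the comparison
// Message: "Finished; please check the results. If the results are empty, try lowering the similarity threshold."
JOptionPane.showMessageDialog(PlagGUI.this, "执行完毕,请查看结果。\r\n如果结果为空,可以尝试调低相似度限值");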



}


}else{