DrDupLex is a novel clone detector based on the index of abstract syntax trees. Please, see the following paper for details:
Zdenek Tronicek, Indexing source code and clone detection, Information and Software Technology, Volume 144, 2022, 106805, ISSN 0950-5849.
To compile the code, you need to install maven
and run the following
command (see also compile.bat
):
mvn clean compile assembly:single
The run and output of DrDupLex are controlled by a configuration file, which can contain the following configuration parameters:
- level specifies the granularity of the index; accepted values are "method" and "statement".
- compressed specifies whether the index is compressed or not; accepted values are "true" and "false".
- persistent specifies whether the index is built in main memory or on secondary storage; accepted values are "true" and "false".
- minSize specifies the minimum number of lines; for example, if minSize is 5, the code fragment must have at least 5 lines to be reported.
- ignoreUnaryAtLiterals specifies how the unary plus and minus are treated; accepted values are "true" and "false".
- ignoreAnnotations specifies whether annotations in code are taken into account; accepted values are "true" and "false".
- batchFileSize specifies how many files are processed before the index in memory is merged with the persistent index.
Example:
- index = simplified
- level = method
- compressed = true
- persistent = false
- sourceDir = /research/BigCloneBench
- minSize = 20
- outputFile = bigclonebench.xml
To run the clone detector with test.properties
configuration file,
use the following command:
java -jar target/DrDupLex-1.0-jar-with-dependencies.jar test.properties
See also examples in the evaluation
folder.