[Step 1] Run the comment_bug_relation_raw_data_process.ipynb notebook under Raw Code Comment Processing to read the raw code-comment files. The notebook generates JSON files that are used to collect the ASTs of the corresponding Java code.
[Step 2] Prepare the ASTs from the JSON files using a Java IDE (e.g. IntelliJ, Eclipse). The Java project, named ast_creation_Java_project, is in the AST Creation IntelliJ folder. It takes two input files, test_bug.json and test_nobug.json. You can generate these inputs in Step 1 or use the copies provided in the AST Creation IntelliJ folder. The project writes the ASTs to two new JSON files, ast_test_bug.json and ast_test_nobug.json.
[Step 3] Run comment_bug_relation_processing_simple_ast.ipynb under PKL file creating using AST to process the ASTs into pkl files, which are used to measure the inconsistency between code and its corresponding comments. The notebook reads two JSON files containing the AST information, ast_test_bug.json and ast_test_nobug.json. You can generate them by running the previous steps or use the copies in the PKL file creating using AST folder. It creates two pkl files, one for the bug data and one for the non-bug data.
[Step 4] Run siamese-run-Code_Comment.ipynb under Model Run to calculate the inconsistency between code and its corresponding comments using a pre-trained Siamese neural network. The notebook takes the pre-trained RNN-LSTM model, the previously generated pkl files, and the benchmark data pkl files as input. All prerequisite files are in the Model Run folder. Run the notebook on the bug and non-bug pkl files separately to collect their coherence values. This step produces a list of coherence values for the targeted pairs and requires a GPU.
[Step 5] Run Result_Visualization.ipynb under the Result visualization section to draw diagrams of the results.
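
Because each step consumes files written by the previous one, a quick pre-flight check can catch a missing or truncated JSON file before a notebook fails partway through. The following is a minimal sketch, not part of the repository: the helper name find_problems and the EXPECTED_INPUTS table are hypothetical, the folder names are assumed to sit directly under the repository root, and each file is assumed to be a single JSON document (if the notebooks write one JSON object per line, the parse check would need adjusting).

```python
import json
from pathlib import Path

# Input files named in Steps 2 and 3. The folder layout is an assumption:
# both folders are expected directly under the repository root.
EXPECTED_INPUTS = {
    "AST Creation IntelliJ": ["test_bug.json", "test_nobug.json"],
    "PKL file creating using AST": ["ast_test_bug.json", "ast_test_nobug.json"],
}


def find_problems(repo_root, expected=EXPECTED_INPUTS):
    """Return a list of problem descriptions; an empty list means every
    expected file exists and parses as a single JSON document."""
    problems = []
    for folder, names in expected.items():
        for name in names:
            path = Path(repo_root) / folder / name
            if not path.is_file():
                problems.append(f"missing: {path}")
                continue
            try:
                with path.open(encoding="utf-8") as handle:
                    json.load(handle)
            except (json.JSONDecodeError, UnicodeDecodeError) as err:
                problems.append(f"not valid JSON: {path} ({err})")
    return problems


if __name__ == "__main__":
    # Run from the repository root; prints nothing when all inputs look usable.
    for problem in find_problems("."):
        print(problem)
```

The check only covers the JSON files named in Steps 2 and 3. The pkl file names are not listed above, so they are left out of the table rather than guessed.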