Code for the paper "Pathways to Leverage Transcompiler based Data Augmentation for Cross-Language Clone Detection" (ICPC 2023).
To set up the project, install the dependencies:
- Run the following scripts:
  - install_server.sh
  - srcml_dep.sh
- Install the Python packages listed in requirements.txt
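As a rough sketch, the two setup scripts can be run in order from the repository root. The function name `run_setup_scripts` is made up for illustration, and the sketch assumes both files are ordinary bash scripts; depending on what they install, they may need elevated privileges.

```shell
# Sketch only: assumes the current directory is the repository root and that
# both files are plain bash scripts. run_setup_scripts is a hypothetical helper.
run_setup_scripts() {
    local script
    for script in install_server.sh srcml_dep.sh; do
        # Stop early if a script is missing instead of continuing half-configured.
        if [ ! -f "$script" ]; then
            echo "missing: $script" >&2
            return 1
        fi
        echo "running $script"
        bash "$script" || return 1
    done
}
```

After pasting the function into a shell, call `run_setup_scripts`; it stops at the first script that is missing or fails.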
To train or test the models, use a virtual environment and install the packages listed in requirements.txt. For model-specific instructions, refer to the repository of the target model.
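One possible way to create such an environment is with Python's built-in `venv` module. The directory name `.venv` and the helper name `create_env` are assumptions made for this sketch, not something the repository prescribes, and the activation path shown applies to Linux and macOS.

```shell
# Sketch only: ".venv" and create_env are hypothetical names.
create_env() {
    local env_dir="${1:-.venv}"
    # Create the environment with the standard-library venv module.
    python3 -m venv "$env_dir" || return 1
    # Activate it in the current shell (Linux/macOS layout).
    . "$env_dir/bin/activate"
    # Install the project requirements if the file is present.
    if [ -f requirements.txt ]; then
        python -m pip install -r requirements.txt
    fi
}
```

Define the function in the current shell (for example by pasting it or sourcing a file that contains it) so the activation persists, then call `create_env` from the repository root.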
Detailed instructions for setting up ANTLR and TransCoder can be found in the following files:
- setup_antlr.txt
- setup_transcoder.txt
To generate clone pairs and datasets, follow these steps:
- Run the provided notebooks sequentially, making sure their dependencies are installed first
- Extract features using ANTLR (the extraction code is in the CLCDSA repo)
- Generate clone pairs with the method provided in the CLCDSA repo (requires Java)
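The notebook step above could be automated along the following lines. This is a sketch under assumptions: it presumes Jupyter is installed in the active environment and that the notebook file names sort into the intended execution order, which the repository does not state. The helper name `run_notebooks` and its directory argument are hypothetical.

```shell
# Sketch only: run_notebooks is a hypothetical helper; it assumes the
# alphabetical order of the file names matches the intended run order.
run_notebooks() {
    local nb_dir="${1:-.}"
    local nb
    for nb in "$nb_dir"/*.ipynb; do
        # An unmatched glob stays literal, so check that the file really exists.
        [ -e "$nb" ] || { echo "no notebooks found in $nb_dir" >&2; return 1; }
        echo "executing $nb"
        # Execute the notebook and write the outputs back into the same file.
        jupyter nbconvert --to notebook --execute --inplace "$nb" || return 1
    done
}
```

Calling `run_notebooks path/to/notebooks` stops at the first notebook that fails, so a later notebook never runs on incomplete output from an earlier one.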
For more information, refer to the respective repositories and documentation.
- Pre-trained Model
- Cross-Language Clone Detection Models
- Graph Matching Network for Single-Language Clones
- Parsers
Subroto Nag Pinku, subroto.npi@usask.ca