The dataset of large case studies on mutants' similarity with bugs

This repository provides the dataset associated with our accepted papers on fault seeding and the evaluation of mutation testing techniques. The dataset is designed to support empirical studies of the syntactic and semantic properties of artificially seeded faults (mutants) generated by mutation testing tools (PIT, μBERT, iBIR, DeepMutation) and of real bugs (Defects4J).

Papers

Syntactic Vs. Semantic similarity of Artificial and Real Faults in Mutation Testing Studies
On Comparing Mutation Testing Tools through Learning-based Mutant Selection

Deep Learning Approach

The source code of Cerebro, the deep learning approach employed for mutant selection, is available here.

Simulation

The source code for running the simulation, which helps identify the semantic and syntactic correlation between bugs and mutants, is available in the Simutate repository.

Citation

To cite our papers or the dataset, please use the BibTeX entries available in cite.bib.

Structure of the Dataset

Important Note: Because GitHub restricts file sizes, the dataset files are compressed into zip archives of at most 100 MB each.
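As a convenience, the archives can be unpacked in bulk after cloning. The following is a minimal sketch, assuming that each archive is an ordinary standalone .zip file (not one part of a multi-part split archive) and that its contents should be unpacked next to it. The function name extract_archives is hypothetical and not part of this repository.

```python
import zipfile
from pathlib import Path


def extract_archives(dataset_root: Path) -> list[Path]:
    """Extract every .zip file found under dataset_root.

    Assumption: each archive is a standalone zip file. Its contents are
    unpacked into the directory that contains the archive.
    Returns the archives that were extracted, in sorted order.
    """
    extracted = []
    for archive in sorted(dataset_root.rglob("*.zip")):
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(archive.parent)
        extracted.append(archive)
    return extracted
```

Calling extract_archives(Path("path/to/your/clone")) unpacks everything in place; the path is a placeholder for the location of your local clone.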

The dataset is organized as follows:

The source code of all the 595 Defects4J bugs considered in our study is available in the projects_source_code_buggy directory.
The source code of the fixes for all the bugs is available in the projects_source_code_fixed directory. These fixed versions were used as the basis for mutation.
The details of the fixes for all the bugs, together with their tests, are available in the fixes_for_all_bugs_with_tests directory.
The statements modified in the bug fixes are available in the changed_lines_to_fix_bugs directory.
The details of the tests that fail for the bugs are available in the groundtruth_bugs_failing_tests directory.
The mutants generated by the mutation testing tools μBERT, iBIR, and DeepMutation are available in the mutants_generated_via_CodeBERT, mutants_generated_via_iBIR, and mutants_generated_via_DeepMutation directories, respectively.
The details of the operators employed by the mutation testing tools are available in the mutation_operators_employed_by_mutation_testing_tools directory.
The details of all the tests failed by the mutants generated via μBERT and DeepMutation are available in the failed_tests_by_CodeBERT_mutants and failed_tests_by_DeepMutation_mutants directories, respectively.
The details and scores of the semantic comparison between the bugs and the mutants generated by μBERT, DeepMutation, and iBIR are available in the semantic_similarity_between_bugs_and_CodeBERT_mutants, semantic_similarity_between_bugs_and_DeepMutation_mutants, and semantic_similarity_between_bugs_and_iBIR_mutants directories, respectively.
The details and scores of the syntactic comparison between the bugs and the mutants generated by μBERT, DeepMutation, and iBIR are available in the syntactic_similarity_between_bugs_and_CodeBERT_mutants, syntactic_similarity_between_bugs_and_DeepMutation_mutants, and syntactic_similarity_between_bugs_and_iBIR_mutants directories, respectively.
Among all the mutants generated by the mutation testing tools, the details of those predicted as subsuming (and non-subsuming) by Cerebro, the deep learning based mutant selection approach, in the n-fold cross-evaluation of its training are available in the Cerebro_predicted_mutants directory.
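The per-tool directories listed above follow a regular naming pattern, with μBERT artifacts stored under the CodeBERT suffix. The sketch below encodes that pattern and checks which of the listed directories are missing from a local clone. The helper names (tool_directories, missing_directories) are hypothetical and not part of this repository; the directory names are taken from the list above.

```python
from pathlib import Path

# Directory-name suffix used for each tool, as listed above.
# Note that the μBERT artifacts are stored under the "CodeBERT" suffix.
TOOL_SUFFIX = {"μBERT": "CodeBERT", "iBIR": "iBIR", "DeepMutation": "DeepMutation"}

# The list above names failed-test directories only for these suffixes.
SUFFIXES_WITH_FAILED_TESTS = {"CodeBERT", "DeepMutation"}

# Directories that are not specific to one tool.
COMMON_DIRECTORIES = [
    "projects_source_code_buggy",
    "projects_source_code_fixed",
    "fixes_for_all_bugs_with_tests",
    "changed_lines_to_fix_bugs",
    "groundtruth_bugs_failing_tests",
    "mutation_operators_employed_by_mutation_testing_tools",
    "Cerebro_predicted_mutants",
]


def tool_directories(tool: str) -> dict[str, str]:
    """Return the directory names that hold the artifacts of one tool."""
    suffix = TOOL_SUFFIX[tool]
    dirs = {
        "mutants": f"mutants_generated_via_{suffix}",
        "semantic_similarity": f"semantic_similarity_between_bugs_and_{suffix}_mutants",
        "syntactic_similarity": f"syntactic_similarity_between_bugs_and_{suffix}_mutants",
    }
    if suffix in SUFFIXES_WITH_FAILED_TESTS:
        dirs["failed_tests"] = f"failed_tests_by_{suffix}_mutants"
    return dirs


def missing_directories(dataset_root: Path) -> list[str]:
    """List the expected directories that do not exist under dataset_root."""
    expected = list(COMMON_DIRECTORIES)
    for tool in TOOL_SUFFIX:
        expected.extend(tool_directories(tool).values())
    return [name for name in expected if not (dataset_root / name).is_dir()]
```

An empty result from missing_directories(Path("path/to/your/clone")) suggests that the clone contains every directory described above; the path is a placeholder.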

Feel free to explore and use the dataset in your research and testing evaluations. If you have any questions or need further clarification, please reach out. Thank you for your interest and collaboration.
