This repository contains the data for the PhD thesis "Understanding Confusion in Code Reviews" by Felipe Ebert and for all papers related to it. It provides three gold standard datasets with Android code review information and one replication package for confusion identification, including the classification models.
- dataset-confusion-code-reviews - this folder contains the two gold standard sets of general and inline comments from Android, labeled as confusion or no confusion. This dataset was partially published at ICSME'2017 in the paper "Confusion Detection in Code Reviews".
- dataset-confusion-in-context - this folder contains the gold standard set for confusion in context, covering the reasons for confusion in code reviews, its impacts, and the strategies developers use to cope with it. This dataset was published at SANER'2019 in the paper "Confusion in Code Reviews: Reasons, Impacts and Coping Strategies".
- dataset-questions-intentions - this folder contains the gold standard set with the communicative intentions of developers' questions in code reviews. This dataset was published at ICSME'2018 in the paper "Communicative Intention in Code Review Questions".
- confusion-detection - this folder contains the replication package of the confusion identification studies and the classification models.
- Felipe Ebert (Federal University of Pernambuco - Brazil, Eindhoven University of Technology - The Netherlands)
- Fernando Castor (Federal University of Pernambuco - Brazil)
- Nicole Novielli (University of Bari - Italy)
- Alexander Serebrenik (Eindhoven University of Technology - The Netherlands)