Datasets curated by the SCANL lab.
The grammar_patterns_data directory contains the dataset from:
- Christian D. Newman, Reem S. AlSuhaibani, Michael J. Decker, Anthony Peruma, Dishant Kaushik, Mohamed Wiem Mkaouer, Emily Hill, On the generation, structure, and semantics of grammar patterns in source code identifiers, Journal of Systems and Software, 2020, 110740, ISSN 0164-1212, https://doi.org/10.1016/j.jss.2020.110740. (http://www.sciencedirect.com/science/article/pii/S0164121220301680)
The abbreviation_expansions_data directory contains the dataset from:
- C. D. Newman, M. J. Decker, R. S. AlSuhaibani, A. Peruma, D. Kaushik and E. Hill, "An Empirical Study of Abbreviations and Expansions in Software Artifacts," 2019 IEEE International Conference on Software Maintenance and Evolution (ICSME), Cleveland, OH, USA, 2019, pp. 269-279, doi: 10.1109/ICSME.2019.00040.
- C. D. Newman, M. J. Decker, R. S. AlSuhaibani, A. Peruma, D. Kaushik and E. Hill, "An Open Dataset of Abbreviations and Expansions," 2019 IEEE International Conference on Software Maintenance and Evolution (ICSME), Cleveland, OH, USA, 2019, pp. 280-280, doi: 10.1109/ICSME.2019.00041.