This project extends the default completion engine in Pharo 9 with N-gram-based sorting strategies. Sorting with N-grams ranks completion candidates by how likely they are to be used in a given context, based on source code history loaded from https://gitlab.inria.fr/rmod-public/2019-sourcecodedata.
In order to load the repo and its dependencies, execute:
Metacello new
    baseline: 'CompletionSorting';
    repository: 'github://myroslavarm/CompletionSorting/';
    load.
NOTE: This is still in development and only works on Pharo 9. However, you should be able to use the Frequency sorter without any problems.
The Frequency sorter uses the unigram approach: we count how many times each token occurs in the history and sort the completion candidates by that count.
The Bigram sorter sorts the results by the probability of each candidate occurring after the previous word in the history, i.e. (occurrences of the previous token followed by the token being completed) / (total occurrences of the previous token).
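The ratio above is the usual maximum-likelihood bigram estimate. As a sketch, it can be written with w_{i-1} for the previous token in the history, w_i for the token being completed, and C(·) for an occurrence count. These symbols are notation chosen here for illustration, not names taken from the project:

```latex
P(w_i \mid w_{i-1}) = \frac{C(w_{i-1}\, w_i)}{C(w_{i-1})}
```

For instance, with purely hypothetical counts: if `self` occurs 100 times in the history and is followed by `assert:` in 25 of those cases, then the candidate `assert:` gets the score 25 / 100 = 0.25 when completing after `self`. Candidates are then ordered from the highest score to the lowest.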
The bigram model is trained using the Pharo N-gram library https://github.com/pharo-ai/NgramModel.
- the Frequency-based sorter is available and is now fast enough for interactive code completion
- the Bigram sorter is also available (it causes pauses of a few milliseconds, but it is not critically slow)
- Settings -> Code Completion -> Sorter
- choose FrequencyCompletionSorter or BigramCompletionSorter