Commit

add default encoding for counting words as UTF-8
filyp committed Nov 23, 2021
1 parent b660a04 commit 35a5b86
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion autocorrect/word_count.py
@@ -14,7 +14,7 @@ def get_words(filename, lang, encd):
             yield from re.findall(word_regex, line)
 
 
-def count_words(src_filename, lang, encd=None, out_filename="word_count.json"):
+def count_words(src_filename, lang, encd="utf-8", out_filename="word_count.json"):
     words = get_words(src_filename, lang, encd)
     counts = Counter(words)
     # make output file human readable
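
Why the new default matters can be illustrated with a small sketch. It assumes that `get_words` passes `encd` straight through to `open()`, which the hunk does not show. Under that assumption, `encd=None` makes Python decode the file with the platform's locale encoding, so the same UTF-8 corpus can yield different word counts on different machines. The names `count_words_sketch`, `demo`, and `WORD_REGEX` are hypothetical stand-ins, not code from the repository, and `latin-1` is used only to simulate a non-UTF-8 locale default.

```python
import os
import re
import tempfile
from collections import Counter

# Hypothetical stand-in for the project's per-language word regex.
WORD_REGEX = r"\w+"


def count_words_sketch(path, encd="utf-8"):
    """Count words in a text file, decoding it with `encd`."""
    counts = Counter()
    # With encoding=None, open() would fall back to the locale's encoding.
    with open(path, encoding=encd) as file:
        for line in file:
            counts.update(re.findall(WORD_REGEX, line))
    return counts


def demo():
    # Write a tiny UTF-8 corpus that contains a non-ASCII word twice.
    with tempfile.NamedTemporaryFile(
        "w", encoding="utf-8", suffix=".txt", delete=False
    ) as tmp:
        tmp.write("café au lait\ncafé\n")
        path = tmp.name
    try:
        good = count_words_sketch(path)  # decoded as UTF-8 (the new default)
        # Simulate a machine whose default encoding is not UTF-8.
        bad = count_words_sketch(path, encd="latin-1")
    finally:
        os.remove(path)
    return good, bad


if __name__ == "__main__":
    good, bad = demo()
    print(good["café"])   # count when the file is decoded as UTF-8
    print("café" in bad)  # whether the word survives the wrong decoding
```

Callers that really do need the locale encoding can still opt in by passing `encd=None` explicitly, so the change only affects the default.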
