GitHub - christopher-chandler/bulk_generate_japanese_vocab_frequency: An Anki add-on for adding the word frequency to the Japanese words in a specific deck.

Bulk Generate Japanese Vocab Frequency

Add frequency to Japanese vocab cards

Table of Contents

About The Project
- Built With
Getting Started
- Prerequisites
- Installation
Usage
Roadmap
Updates
Contributing
License
Contact
Acknowledgments

About The Project

This add-on generates the frequency ranking for the Japanese vocab field you specify. The ranking is taken from the same frequency database as the Rikai-Sama (JMdict) Firefox add-on.

Built With

This project is written in pure Python and uses the following Anki libraries:

anki~=2.1.54
aqt~=2.1.54

Getting Started

To get started, install the add-on through Anki. You can also install it from the files provided on GitHub.

Prerequisites

You must set the variables in config.json so that the add-on can access the dictionaries. Absolute paths are preferable to relative paths.

The dictionaries needed for the word type and frequency look-ups are included in this add-on. They are stored in the add-on's folder: Anki2/addons21/1004691625/databases

freq_dict.db
jmdict.db
slice_of_life.db

If they are not included, they can also be downloaded from the GitHub page of this add-on.

{
    "0_jm_dict": "",
    "0_freq_dict": "",
    "01_note_type": "Japanese",
    "02_vocab_input_field": "Target Word",
    "03_frequency_output_field": "Frequency Ranking",
    "03_word_type_output_field": "Word Type",
    "04_overwrite_destination_field": true
}

0_freq_dict: The path of the frequency dictionary; used to look up frequency data.
0_jm_dict: The path of the Japanese dictionary; used to look up word type data. It ships with the add-on and is located in the project directory. You only need to copy the path of the bundled dictionary or of one you provide yourself.
01_note_type: The name of the note type that contains the vocab whose frequency should be determined.
02_vocab_input_field: The field that contains the target word.
03_frequency_output_field: The field where the frequency ranking should be saved.
03_word_type_output_field: The field where grammatical data about the word should be saved.
04_overwrite_destination_field: Whether to overwrite the existing value of the output field (set to true by default).
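Before running the add-on, it can help to sanity-check the configuration. The following is a minimal sketch, not part of the add-on: the function name check_config is hypothetical, and it only checks what the description above states (all keys present, dictionary paths set and preferably absolute).

```python
from pathlib import Path

# Keys taken from the config.json shown above.
REQUIRED_KEYS = (
    "0_jm_dict",
    "0_freq_dict",
    "01_note_type",
    "02_vocab_input_field",
    "03_frequency_output_field",
    "03_word_type_output_field",
    "04_overwrite_destination_field",
)


def check_config(config: dict) -> list:
    """Return a list of human-readable problems (hypothetical helper)."""
    problems = []
    for key in REQUIRED_KEYS:
        if key not in config:
            problems.append(f"missing key: {key}")
    # The two dictionary paths must be filled in; absolute paths are recommended.
    for key in ("0_jm_dict", "0_freq_dict"):
        value = config.get(key, "")
        if not value:
            problems.append(f"{key} is empty; set it to the path of the database")
        elif not Path(value).is_absolute():
            problems.append(f"{key} is a relative path; an absolute path is recommended")
    return problems
```

An empty list means the configuration passed these checks. It does not confirm that the database files exist or can be read.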

Installation

There are two ways to install this add-on:

Add the add-on via Anki using the code 1004691625.
Install the add-on from the files provided on GitHub.

Usage

For example, if you have a card with the vocab 頬杖, the frequency ranking "14962" will be added to the chosen destination field.

What does this frequency number mean?

If a word has a frequency of 1563, it means that 1562 words are more frequent than it.
Frequency number 1-5000 = very common, 5001-10000 = common, 10001-20000 = rare, 20001+ = very rare
Frequencies are based on an analysis of 5000+ novels. Naturally, frequencies based on other media (such as newspapers) might vary.
Not all words have frequency information. It is possible for multiple words to share the same frequency.
More information about the Rikai-Sama add-on and its frequency ranking can be found on its site at http://rikaisama.sourceforge.net/
You can also refer to the original project: https://ankiweb.net/shared/info/1612642956
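The bands above can be expressed as a small helper. This is an illustrative sketch only; the function name frequency_band is hypothetical and the add-on itself stores just the number.

```python
def frequency_band(rank: int) -> str:
    """Map a frequency ranking to the bands listed above."""
    if rank < 1:
        raise ValueError("frequency rankings start at 1")
    if rank <= 5000:
        return "very common"
    if rank <= 10000:
        return "common"
    if rank <= 20000:
        return "rare"
    return "very rare"


def words_more_frequent(rank: int) -> int:
    """A ranking of N means N - 1 words are more frequent."""
    return rank - 1
```

For the ranking 14962 from the usage example, frequency_band returns "rare". For a ranking of 1563, words_more_frequent returns 1562.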

Vocab field format

The following are ignored when looking up the frequency of a vocab item:

HTML tags (so the vocab field can be colored or formatted);
Parentheses and their contents (e.g., a kanji's wrong readings);
Leading/trailing whitespace.
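The clean-up rules above can be sketched with regular expressions. This is an assumption about how such normalization could be done, not the add-on's actual code. The function name clean_vocab is hypothetical, and handling full-width parentheses (（ ）) is an extra assumption beyond what the list states.

```python
import re

# Anything between angle brackets is treated as an HTML tag.
_TAG_RE = re.compile(r"<[^>]+>")
# ASCII parentheses, plus full-width ones (an assumption), with their contents.
_PAREN_RE = re.compile(r"\([^)]*\)|（[^）]*）")


def clean_vocab(field: str) -> str:
    """Strip HTML tags, parenthesized text, and surrounding whitespace."""
    without_tags = _TAG_RE.sub("", field)
    without_parens = _PAREN_RE.sub("", without_tags)
    return without_parens.strip()
```

With this sketch, a colored field such as `<font color="#ff0000">方</font> (かた)` is reduced to 方 before the look-up.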

Example

Frequency Data

Open the Anki add-on and enter the data in the necessary fields.

Go to the cards that correspond to the data you entered.

Select the cards whose frequency fields should be filled.

Then open the Edit menu in the browser and click Bulk Generate Japanese Frequency.

If information for a word cannot be found, it will be filled with UNK for unknown.
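The UNK fallback can be illustrated with a small look-up function. This is a sketch under loud assumptions: the table name freq and its columns word and frequency are hypothetical and are not taken from freq_dict.db, whose real schema is not documented here. The example uses an in-memory SQLite database so that it is self-contained.

```python
import sqlite3

UNKNOWN = "UNK"


def lookup_frequency(conn: sqlite3.Connection, word: str) -> str:
    """Return the frequency ranking as text, or UNK if the word is absent."""
    # Hypothetical schema: freq(word TEXT, frequency INTEGER).
    row = conn.execute(
        "SELECT frequency FROM freq WHERE word = ?", (word,)
    ).fetchone()
    return str(row[0]) if row else UNKNOWN


# Build a tiny stand-in database for demonstration purposes.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE freq (word TEXT PRIMARY KEY, frequency INTEGER)")
conn.execute("INSERT INTO freq VALUES (?, ?)", ("頬杖", 14962))
conn.commit()
```

Calling lookup_frequency(conn, "頬杖") returns the stored ranking as a string, while a word that is not in the table yields "UNK".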

Remove HTML

Consider a colored vocab field with extra information in parentheses. In this case, only 方 is used as the vocab term.

Add Word Type Data

The Bulk Generate Word Type Data function adds data from JMdict to a field of your choice. It can be thought of as a dictionary look-up with a limited interface. To find out what the abbreviations mean, consult the JMdict site.

If the config file is set up correctly, this works like the previous function, but the output is different.

Select the field where the word type should be saved. In this case, the field is called Word Type.
Click on this field and, in the pop-up that appears, select Bulk Generate Word Type Data.

The following pop-ups should appear:

If you re-examine the Word Type field, you will see that it is now filled with the JMdict information:

If information for a word cannot be found, it will be filled with UNK for unknown.

Roadmap

Add JMdict support

Updates

v1.0.2

Added a function so that word type data can be added to a field alongside the frequency data.
Restructured the config file to include the new 03_word_type_output_field.

See the open issues for a full list of proposed features (and known issues).

Contributing

Contributions are what make the open source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.

If you have a suggestion that would make this better, please fork the repo and create a pull request. You can also simply open an issue with the tag "enhancement". Don't forget to give the project a star! Thanks again!

Fork the Project
Create your Feature Branch (git checkout -b feature/AmazingFeature)
Commit your Changes (git commit -m 'Add some AmazingFeature')
Push to the Branch (git push origin feature/AmazingFeature)
Open a Pull Request

License

The licensing information of the project upon which this add-on is based:

"For those asking for the License permission, it's WTFPL. So please feel free to modify/re-upload a better version whenever you like."

Therefore, this project is distributed under the MIT License. See LICENSE.txt for more information.

Contact

Christopher Chandler - christopher.chandler@outlook.de

Project Link: https://github.com/christopher-chandler/Bulk_Generate_Japanese_Vocab_Frequency

Acknowledgments

@author: Myxoma (original creator)
