A simple set of tools to automatically generate HSK worksheets as CSV and PDF files.
Install all Python dependencies with:

    poetry install

You will also need Typst installed and available on your PATH to generate PDFs.
All commands listed below should be issued from this project's root folder unless otherwise stated.
This application uses the excellent AllSet Learning Chinese Vocabulary Wiki as its default crawler data source. Other crawlers can be implemented as long as they generate the following fields for each scraped item:
| Name | Type | Description |
|---|---|---|
| id | int | Sequential ID of each item in a given category |
| category | str | Name of the item category |
| chinese | str | Chinese word |
| pinyin | str | Pinyin representation |
| english | str | English translation |
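
To make the field contract in the table concrete, the output of a custom crawler could be modelled as a plain dataclass. This is only a minimal sketch, not the project's actual item definition: the names `VocabularyItem` and `REQUIRED_FIELDS` are hypothetical, and the sample values are made up for illustration. Only the five field names and types come from the table.

```python
from dataclasses import asdict, dataclass, fields


@dataclass
class VocabularyItem:
    """One scraped vocabulary entry (hypothetical name; fields follow the table)."""

    id: int        # sequential ID of the item within its category
    category: str  # name of the item category
    chinese: str   # Chinese word
    pinyin: str    # Pinyin representation
    english: str   # English translation


# Column names, in the order they are declared on the dataclass.
REQUIRED_FIELDS = [f.name for f in fields(VocabularyItem)]

# Illustrative sample only; the category label is an assumption.
example = VocabularyItem(id=1, category="Nouns", chinese="人", pinyin="rén", english="person")
row = asdict(example)  # plain dict, ready to be written out as one CSV row
```

Scrapy accepts dataclass instances as items, so a spider could yield objects like this directly. Whether the bundled crawler is written this way is not stated here.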
To extract HSK 1 vocabulary to a CSV file at `./output/hsk_1.csv`, run:

    scrapy crawl AllSetLearning -a hsk=1 -O ./output/hsk_1.csv

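
Before generating a worksheet, it can be useful to check that the exported file contains the expected columns. The following is a minimal sketch using only the standard library. It assumes the CSV is UTF-8 encoded and starts with a header row named after the fields listed earlier; `load_vocabulary` and `EXPECTED_COLUMNS` are hypothetical names, not part of this project.

```python
import csv
from pathlib import Path

# Column names taken from the crawler field table.
EXPECTED_COLUMNS = {"id", "category", "chinese", "pinyin", "english"}


def load_vocabulary(csv_path):
    """Read an exported vocabulary CSV and return its rows as dictionaries.

    Raises ValueError if any expected column is missing from the header.
    """
    with Path(csv_path).open(newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        missing = EXPECTED_COLUMNS - set(reader.fieldnames or [])
        if missing:
            raise ValueError(f"CSV is missing columns: {sorted(missing)}")
        return list(reader)
```

For example, `rows = load_vocabulary("./output/hsk_1.csv")` would return one dictionary per vocabulary item, with every value read as a string.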
To generate an HSK 1 PDF worksheet at `./output/hsk_1.pdf` from a CSV vocabulary file located at `./output/hsk_1.csv`, run:

    typst compile template/main.typ output/hsk_1.pdf \
        --root . \
        --font-path font \
        --input hsk="1" \
        --input csv_file_path="../output/hsk_1.csv"

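
If worksheets need to be generated from a script, the same invocation can be assembled in Python. This is a rough sketch rather than part of the project: `build_typst_command` and `compile_worksheet` are hypothetical helpers, and the `hsk_<level>` file naming for levels other than 1 is an assumption extrapolated from the HSK 1 example. The CSV path is kept in the same `../output/...` form as the command above, which suggests it is resolved relative to the template file.

```python
import subprocess


def build_typst_command(level, csv_path=None, pdf_path=None):
    """Return the argument list for the typst invocation, for a given HSK level."""
    # Default paths follow the HSK 1 example; other levels are an assumption.
    csv_path = csv_path or f"../output/hsk_{level}.csv"
    pdf_path = pdf_path or f"output/hsk_{level}.pdf"
    return [
        "typst", "compile", "template/main.typ", pdf_path,
        "--root", ".",
        "--font-path", "font",
        "--input", f"hsk={level}",
        "--input", f"csv_file_path={csv_path}",
    ]


def compile_worksheet(level):
    """Run typst from the project root; raises CalledProcessError on failure."""
    subprocess.run(build_typst_command(level), check=True)
```

Calling `compile_worksheet(1)` from the project root should be equivalent to the shell command, provided Typst is on the PATH and the CSV file already exists.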
The resulting file should look like this: