
Conversion from other data formats

Gergely Sarkozi edited this page Mar 15, 2024 · 1 revision

Normally you don't write quizzes in the quiz JSON data format by hand; instead, you convert them from some other format. This page contains some conversion scripts that can serve as starting points, along with tips on how to create a new conversion script easily.

Existing conversion scripts

These conversion scripts are not perfect; they just get the job done. It's easy to break them by feeding them data they are not prepared to handle, but they can still be useful starting points for more robust conversion tools.

From Markdown list-based answers

Example input:

# My sample quiz

## How much is $6*9$?

- $54$
- **$42$**
- $420$
- $69$

## Another question

- **Alpha**
- Bravo
- Charlie

Python conversion script:

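import json


def convert(input_file: str, title: str, output_file: str) -> None:
    with open(input_file, "r", encoding="utf-8") as f:
        lines = f.readlines()

    questions = []
    current_question = None
    for line in lines:
        if line.startswith("##"):
            # A level-2 heading starts a new question; store the previous one first.
            if current_question:
                questions.append(current_question)
            current_question = {"type": "single-choice", "question": line[2:].strip(), "choices": []}
        elif line.startswith("-") and current_question is not None:
            # A list item is an answer choice; a bold item (**...**) is the correct one.
            # List items that appear before the first question are ignored.
            line = line[1:].lstrip()
            if line.startswith("**"):
                correct = True
                line = line.replace("**", "")
            else:
                correct = False
            current_question["choices"].append({"content": line.strip(), "correct": correct})

    # Store the last question, but only if the input contained one at all.
    if current_question:
        questions.append(current_question)
    data = {"title": title, "questions": questions}

    with open(output_file, "w", encoding="utf-8") as f:
        json.dump(data, f, indent=2)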


if __name__ == "__main__":
    convert("in.txt", "TODO hardcoded title", "out.json")
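The script above takes the quiz title as a hardcoded argument, even though the example input starts with a level-1 heading. One possible way to remove that hardcoding is sketched below. The helper name `extract_title` and the fallback value are hypothetical, and the sketch assumes that the first line starting with a single `#` is the quiz title.

```python
def extract_title(markdown_text: str, default: str = "Untitled quiz") -> str:
    """Return the text of the first level-1 heading, or `default` if there is none.

    Hypothetical helper: it assumes the title is written as '# Title'.
    """
    for line in markdown_text.splitlines():
        # Accept '# Title' but skip '## Question' and deeper headings.
        if line.startswith("#") and not line.startswith("##"):
            return line[1:].strip()
    return default


sample = "# My sample quiz\n\n## How much is $6*9$?\n\n- $54$\n- **$42$**\n"
print(extract_title(sample))  # prints "My sample quiz"
```

The returned string could then be passed as the `title` argument of `convert`.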

Creating a new conversion script from scratch

Large language models (LLMs) are generally very good at handling structured data, so you can ask one to write a script (e.g. in Python) that converts between data formats. The resulting scripts usually need some fixing before they work, but they can be a great starting point for a functional conversion tool. Just give the LLM an example of the input and of the output data format, and it will produce an almost working script.
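Because generated scripts tend to need fixing, a quick sanity check on their output can catch obvious mistakes early. The following is a minimal sketch, not an official validator: the function name `find_problems` is hypothetical, the field names are taken from the Markdown conversion script on this page, and the rule that a `single-choice` question has exactly one correct choice is an assumption.

```python
def find_problems(data: dict) -> list[str]:
    """Return human-readable problems found in converted quiz data.

    Assumed structure (from the script on this page, not from a format spec):
    {"title": ..., "questions": [{"type", "question", "choices": [{"content", "correct"}]}]}
    """
    problems = []
    if not data.get("title"):
        problems.append("missing or empty title")
    questions = data.get("questions") or []
    if not questions:
        problems.append("no questions")
    for i, q in enumerate(questions, start=1):
        if not q.get("question"):
            problems.append(f"question {i}: empty question text")
        choices = q.get("choices") or []
        correct_count = sum(1 for c in choices if c.get("correct"))
        # Assumption: a single-choice question has exactly one correct choice.
        if q.get("type") == "single-choice" and correct_count != 1:
            problems.append(f"question {i}: expected 1 correct choice, found {correct_count}")
    return problems


quiz = {
    "title": "My sample quiz",
    "questions": [
        {
            "type": "single-choice",
            "question": "Another question",
            "choices": [
                {"content": "Alpha", "correct": True},
                {"content": "Bravo", "correct": False},
            ],
        }
    ],
}
print(find_problems(quiz))  # prints []
```

Running such a check on the JSON produced by a freshly generated script gives a short list of things to fix before the quiz is used.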