ADAMS workflows for dealing with LLM datasets etc.
- compare-translated-alpaca.flow - displays original and translated records (Alpaca JSON format) for comparison and lets you decide whether to keep or discard each one (i.e., cleaning up a dataset).
- kb_json_to_csv.flow - converts the JSON files generated from a kb-eval script into a CSV file for analysis (example docker image).
- kb_eval.flow - generates statistics from outputs generated by the kb_eval script
- whisper_edit_distance.flow - computes the edit distance between ground truth and whisper output
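
For readers unfamiliar with the metric behind whisper_edit_distance.flow, the following is a minimal Python sketch of Levenshtein edit distance. It is only an illustration: the flow itself is an ADAMS workflow, and this code does not claim to mirror how it computes the distance (for example, whether it compares characters or words, or normalizes the text first). The function name `edit_distance` and the sample strings are hypothetical.

```python
def edit_distance(reference, hypothesis):
    """Levenshtein distance between two sequences.

    Works on strings (character level) or lists of tokens (word level).
    Illustrative only; not taken from the ADAMS flow.
    """
    # previous[j] holds the distance between the reference prefix processed
    # so far and the first j items of the hypothesis.
    previous = list(range(len(hypothesis) + 1))
    for i, ref_item in enumerate(reference, start=1):
        current = [i]  # deleting all i reference items so far
        for j, hyp_item in enumerate(hypothesis, start=1):
            cost = 0 if ref_item == hyp_item else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


if __name__ == "__main__":
    ground_truth = "the cat sat on the mat"  # hypothetical ground truth
    transcript = "the cat sat on a mat"      # hypothetical whisper output
    print(edit_distance(ground_truth, transcript))                  # character level
    print(edit_distance(ground_truth.split(), transcript.split()))  # word level
```

Passing token lists rather than raw strings gives a word-level distance, which is the usual basis for a word error rate.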