nightingal3/rule_induction
Setup

This project uses my generic conda environment. You can create the env from general_environment.yml (it contains some extraneous packages, since it is my general-purpose environment). You will also need an OpenAI API key to run experiments.

conda env create -f general_environment.yml
export PYTHONPATH=<root directory of this repo>
export OPENAI_API_KEY=<your key here>
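
For example, assuming the repo is cloned to ~/rule_induction (both the path and the conda env name below are illustrative; check general_environment.yml for the actual env name):

conda env create -f general_environment.yml
conda activate general  # assumption: substitute the env name defined in general_environment.yml
export PYTHONPATH=~/rule_induction
export OPENAI_API_KEY=<your key here>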

Running experiments

To run experiments on OpenAI models, use src/prompt_openai.py; to run experiments on open-source models, use src/prompt_open_llms.py. The two scripts take the same arguments, so they are abbreviated below as src/prompt_XXX.py. Note that src/prompt_open_llms.py assumes a model server is already running on tir/babel.
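
For instance, a base run of the OpenAI script on the functions dataset looks like this (gpt-3.5-turbo is purely illustrative; pass any model name the script supports):

python src/prompt_openai.py --model gpt-3.5-turbo --dataset functions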

After each experiment runs, the accuracy is logged and a CSV file with the prompts and answers is saved to ./logs by default, under an autogenerated file name (you can change this with --output). You can also use ./slurm_scripts/compile_results.py to compile all the results in a directory into a single CSV. If you hit Ctrl+C while an experiment is running, the intermediate results are saved as well.
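
As a sketch, compiling a directory of per-run logs might look like the following (the exact command-line interface of compile_results.py is an assumption on my part; check the script for its actual arguments):

python slurm_scripts/compile_results.py ./logs  # assumed: takes the directory of result CSVs as an argument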

Toy experiments

There are synthetic experiments in the functions and colours domains.

Functions: you can reproduce the three general approaches as follows:

Base prompt (in-context examples only): python src/prompt_XXX.py --model <model_name> --dataset functions

With ground-truth instruction: python src/prompt_XXX.py --model <model_name> --dataset functions --prompt_type full_grammar

With self-induced instruction: python src/prompt_XXX.py --model <model_name> --dataset functions --prompt_type grammar_induction --hyp_reranking_method <method> --num_hyps 5

Colours: the commands are the same as for functions, with colours substituted as the dataset and the additional --use_min_cover flag:

Base prompt (in-context examples only): python src/prompt_XXX.py --model <model_name> --dataset colours --use_min_cover

With ground-truth instruction: python src/prompt_XXX.py --model <model_name> --dataset colours --use_min_cover --prompt_type full_grammar

With self-induced instruction: python src/prompt_XXX.py --model <model_name> --dataset colours --use_min_cover --prompt_type grammar_induction --hyp_reranking_method <method> --num_hyps 5

Language translation

Kalamang: the Kalamang experiments can be run from the mtob directory. Please set up a separate environment following the instructions in mtob, as it is not compatible with the base environment. Hypothesis selection has now been added. The experimental settings are as follows:

cd mtob/baselines

(Note: the TGI-type model is based on an internal framework. To use Hugging Face versions of the Llama models, refer to the options in main.py.)

Base prompt (in-context examples only): python main.py --use_reference_sentences --model_name <model_name> --direction <ek|ke> --temperature 0.05 --output_dir <output_dir>

With ground-truth instruction: python main.py --use_reference_sentences --use_reference_wordlist --use_reference_grammar_sketch --model_name <model_name> --direction <ek|ke> --temperature 0.05 --output_dir <output_dir>

With self-induced instruction: python main.py --induce_wordlist --use_induced_grammar --grammar_sketch_path ../resources/kalamang_grammar_sketch_<model_name>.txt --num_hyps 5 --model_name <model_name> --direction <ek|ke> --temperature 0.05 --output_dir <output_dir>
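
For example, a base-prompt run translating Kalamang to English (direction ke; in the mtob setup, ek and ke denote English-to-Kalamang and Kalamang-to-English respectively) might look like this, where the model name and output directory are illustrative:

python main.py --use_reference_sentences --model_name gpt-4 --direction ke --temperature 0.05 --output_dir outputs/kalamang_base_ke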
