The table below lists all program-of-thoughts (PoT) results, in which each model answers by generating a program:

Model | Params | GSM8K | TheoremQA |
---|---|---|---|
ChatGPT | ? | 76.3 | 35.6 |
Codex | 175B | 71.6 | 23.9 |
GPT-3 | 175B | 60.4 | 16.6 |
PaLM | 540B | 51.3 | - |
PaLM-Coder | 540B | 50.9 | - |
codegen-mono | 15B | 12.7 | 11.8 |
codet5+ | 15B | 12.5 | 11.6 |
xgen | 7B | 11.0 | 11.4 |
codegen-multi | 15B | 8.2 | 10.2 |
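
For readers unfamiliar with the setup, the following is a minimal sketch of how a program-of-thoughts prediction might be scored: the generated program is executed and the value of an answer variable is compared with the reference. It is not the evaluation code behind the numbers above. The names `run_generated_program`, `is_correct`, the `ans` variable convention, the tolerance, and the sample question are all assumptions made for illustration.

```python
import math


def run_generated_program(program: str, answer_var: str = "ans"):
    """Execute generated code in a fresh namespace and return the answer variable.

    Returns None if the program raises or never sets the variable.
    NOTE: exec() is unsafe on untrusted model output; a real harness
    would run this in a sandboxed subprocess with a timeout.
    """
    namespace = {}
    try:
        exec(program, namespace)
    except Exception:
        return None
    return namespace.get(answer_var)


def is_correct(prediction, reference: float, rel_tol: float = 1e-3) -> bool:
    """Compare a numeric prediction with the reference (tolerance is an assumption)."""
    if isinstance(prediction, bool) or not isinstance(prediction, (int, float)):
        return False
    return math.isclose(prediction, reference, rel_tol=rel_tol)


# Hypothetical model output for the made-up question:
# "Pencils cost $1.20 for 3. How much do 10 pencils cost?"
generated = """
price_per_pencil = 1.20 / 3
ans = price_per_pencil * 10
"""

prediction = run_generated_program(generated)
print(is_correct(prediction, 4.0))
```

Accuracy over a benchmark would then be the fraction of questions for which `is_correct` returns `True`. Treating a crashing or non-numeric program as wrong is one possible design choice; other harnesses may retry or fall back to a different answer-extraction strategy.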