Actions: openai/evals

Showing runs from all workflows
511 workflow runs

Updates for Solvers
Run new evals #2197: Pull request #1461 synchronize by JunShern
January 26, 2024 08:25 1m 56s jun/solvers-update
Updates for Solvers
Run unit tests #1603: Pull request #1461 synchronize by JunShern
January 26, 2024 08:25 2m 14s jun/solvers-update
Updates for Solvers
Run new evals #2196: Pull request #1461 opened by JunShern
January 26, 2024 08:22 2m 11s jun/solvers-update
Updates for Solvers
Run unit tests #1602: Pull request #1461 opened by JunShern
January 26, 2024 08:22 2m 28s jun/solvers-update
Logged spec now includes overridden args (#1460)
Run unit tests #1601: Commit 3040d6f pushed by JunShern
January 26, 2024 07:12 2m 2s main
Add run_id to final_report from LocalRecorder (#1452)
Run unit tests #1600: Commit cf002f2 pushed by JunShern
January 26, 2024 07:07 2m 7s main
Logged spec now includes overridden args
Run unit tests #1599: Pull request #1460 opened by ojaffe
January 17, 2024 09:39 1m 57s ojaffe:ollie/logging_fix
Fix formatting/typing so pre-commit hooks pass (#1451)
Run unit tests #1596: Commit c66b5c1 pushed by etr2460
January 10, 2024 16:25 3m 34s main
icelandic gec eval (#1400)
Run unit tests #1595: Commit 105c2b9 pushed by etr2460
January 10, 2024 16:23 2m 11s main
Add eval yaml for Theory of Mind eval (#1453)
Run unit tests #1593: Commit 877d555 pushed by JunShern
January 9, 2024 02:34 2m 9s main
Add eval yaml for Theory of Mind eval
Run new evals #2192: Pull request #1453 opened by ojaffe
January 8, 2024 10:37 1m 54s ojaffe:ollie/tom_fix
Add eval yaml for Theory of Mind eval
Run unit tests #1592: Pull request #1453 opened by ojaffe
January 8, 2024 10:37 1m 57s ojaffe:ollie/tom_fix
Improve MMMU performance with prompt engineering (#1450)
Run unit tests #1588: Commit 2981e65 pushed by etr2460
January 3, 2024 18:20 2m 14s main
Improve MMMU performance with prompt engineering
Run unit tests #1587: Pull request #1450 opened by etr2460
January 3, 2024 18:15 2m 6s erik/mmmu-tuning
Add eval japanese prime minister (#1422)
Run unit tests #1586: Commit f1bb7cb pushed by etr2460
January 3, 2024 16:49 2m 1s main
Solve #1394 (#1395)
Run unit tests #1585: Commit 10b02c6 pushed by logankilpatrick
January 3, 2024 16:48 2m 13s main
Add a recorder for function calls (#1389)
Run unit tests #1584: Commit 0647721 pushed by logankilpatrick
January 3, 2024 16:46 2m 31s main
Add gpt-3.5-turbo-16k support to ctx len getter (#1388)
Run unit tests #1583: Commit bbe26f8 pushed by logankilpatrick
January 3, 2024 16:45 2m 12s main
Fixed parameter incorrect (#1378)
Run unit tests #1582: Commit 1dd2ea2 pushed by logankilpatrick
January 3, 2024 16:43 2m 16s main
Log model and usage stats in record.sampling
Run unit tests #1581: Pull request #1449 opened by JunShern
January 3, 2024 04:12 2m 8s jun/log-token-counts