Actions: openai/evals

Showing runs from all workflows
503 workflow runs

Randomly select MMMU answer when none is returned from the model (#1447)
Run unit tests #1580: Commit ded9382 pushed by etr2460
December 24, 2023 19:23 2m 8s main
Randomly select MMMU answer when none is returned from the model
Run unit tests #1579: Pull request #1447 opened by etr2460
December 24, 2023 05:40 2m 1s erik/mmmu-random
Change wrong kwargs name (#1435)
Run unit tests #1578: Commit 02f35cc pushed by etr2460
December 21, 2023 17:46 2m 12s main
Fix Pydantic warning on data_test run (#1445)
Run unit tests #1577: Commit dd38662 pushed by etr2460
December 21, 2023 17:41 2m 4s main
Fix small typo in oaieval run function (#1438)
Run unit tests #1576: Commit 23ae8ab pushed by etr2460
December 21, 2023 17:40 2m 5s main
Fix Pydantic warning on data_test run
Run unit tests #1575: Pull request #1445 opened by inwaves
December 21, 2023 10:44 2m 2s inwaves:fix/PydanticWarningOnTestRun
Release 2.0.0 (#1444)
Run unit tests #1574: Commit 311e91e pushed by etr2460
December 21, 2023 01:37 2m 21s main
Add MMMU evals and runner (#1442)
Run unit tests #1573: Commit f20c305 pushed by etr2460
December 21, 2023 01:09 3m 5s main
Release 2.0.0
Run unit tests #1572: Pull request #1444 opened by etr2460
December 21, 2023 01:04 2m 7s release/2.0.0
Add MMMU evals and runner
Run new evals #2189: Pull request #1442 synchronize by etr2460
December 21, 2023 00:19 2m 27s erik/mmmu
Add MMMU evals and runner
Run unit tests #1571: Pull request #1442 synchronize by etr2460
December 21, 2023 00:19 2m 7s erik/mmmu
Use the API key for testing evals in CI (#1443)
Run unit tests #1570: Commit d30262c pushed by etr2460
December 21, 2023 00:18 3m 39s main
Use the API key for testing evals in CI
Run unit tests #1569: Pull request #1443 opened by etr2460
December 20, 2023 23:44 2m 15s erik/test-eval-api-key
Add MMMU evals and runner
Run new evals #2188: Pull request #1442 synchronize by etr2460
December 20, 2023 22:36 1m 52s erik/mmmu
Add MMMU evals and runner
Run unit tests #1568: Pull request #1442 synchronize by etr2460
December 20, 2023 22:36 2m 12s erik/mmmu
Add MMMU evals and runner
Run unit tests #1567: Pull request #1442 opened by etr2460
December 20, 2023 22:17 2m 10s erik/mmmu
Add MMMU evals and runner
Run new evals #2187: Pull request #1442 opened by etr2460
December 20, 2023 22:17 1m 55s erik/mmmu
Add complete list of errors to MakeMeSay utils (#1436)
Run unit tests #1566: Commit 4fcbed2 pushed by etr2460
December 20, 2023 17:58 2m 10s main
Run tests on all commits to main (#1441)
Run unit tests #1565: Commit f4dc762 pushed by etr2460
December 20, 2023 17:55 2m 10s main
Run tests on all commits to main
Run unit tests #1564: Pull request #1441 opened by etr2460
December 20, 2023 17:52 2m 7s erik/main-tests
Fix branch tests with empty API Key
Run unit tests #1563: Pull request #1440 opened by etr2460
December 20, 2023 17:34 2m 1s etr2460:erik/fix-branch-tests
Fix small typo in oaieval run function
Run unit tests #1561: Pull request #1438 opened by inwaves
December 20, 2023 12:04 2m 37s inwaves:fix/CompletionArgsTypo
Add complete list of errors to MakeMeSay utils
Run unit tests #1560: Pull request #1436 opened by inwaves
December 19, 2023 21:51 6h 0m 27s inwaves:fix/MakeMePayErrorHandling
Add eval japanese prime minister
Run unit tests #1559: Pull request #1422 synchronize by return-nil
December 16, 2023 02:07 2m 41s return-nil:japanese_prime_minister