list_tasks() return less than shown on the website #1222

Open
xieleo5 opened this issue Feb 25, 2023 · 2 comments
Labels
bug, serverside (These issues are present in the REST API and not fixable by the Python package.)

Comments

xieleo5 commented Feb 25, 2023

Description

I'm trying to list all the tasks in the OpenML database. I tried task_list = openml.tasks.list_tasks(), but it only returns a list of length 46779. The official OpenML website shows 261.0k tasks. Is there an API that can help me get all of these tasks?

I also tried passing a task type, e.g. task_list = openml.tasks.list_tasks(openml.tasks.TaskType.SUPERVISED_REGRESSION), but the returned tasks are still fewer than the filtered results on the website. I get only 3939 supervised classification tasks where the website shows 4345, and only 2600 supervised regression tasks where the website shows 19459.

Steps/Code to Reproduce

import openml
task_list = openml.tasks.list_tasks()
print(task_list)

Expected Results

task_list contains the task_id and info for all 261.0k tasks.

Actual Results

It only contains 46779 tasks.
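
One way to rule out client-side truncation is to page through the listing explicitly and count what comes back. The sketch below is a minimal illustration, not the library's own mechanism: collect_all and fetch_openml_task_ids are hypothetical helper names, and the adapter assumes that list_tasks accepts offset and size and returns either a dict keyed by task id or a DataFrame indexed by task id, depending on the installed version. If the server itself reports fewer tasks, paging will not change the total.

```python
from typing import Callable, List, Sequence


def collect_all(fetch_page: Callable[[int, int], Sequence], page_size: int = 1000) -> List:
    """Call fetch_page(offset, size) repeatedly until a short or empty page comes back."""
    items: List = []
    offset = 0
    while True:
        page = list(fetch_page(offset, page_size))
        items.extend(page)
        if len(page) < page_size:
            # A page shorter than requested means the listing is exhausted.
            return items
        offset += page_size


def fetch_openml_task_ids(offset: int, size: int) -> List[int]:
    """Hypothetical adapter around openml.tasks.list_tasks (needs network access)."""
    import openml  # imported lazily so collect_all works without the package

    result = openml.tasks.list_tasks(offset=offset, size=size)
    # Assumption: a dict keyed by task id (older releases) or a DataFrame
    # indexed by task id (newer releases).
    return list(result.keys()) if isinstance(result, dict) else list(result.index)
```

Calling len(collect_all(fetch_openml_task_ids)) would then give a count to compare against the 46779 reported above.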

LennartPurucker (Contributor) commented

Heyho,

Thanks for pointing this out.

This might be a server problem, or the numbers on the website might be wrong.
The API, which list_tasks calls, also only returns 2600 entries (https://api.openml.org/api/v1/json/task/list/type/2).

@PGijsbers do you know more about this?
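
For anyone who wants to repeat the check against the endpoint quoted above without going through the Python package, here is a rough sketch using only the standard library. It assumes the JSON response is shaped like {"tasks": {"task": [...]}} and that type/2 corresponds to supervised regression (the 2600 figure matches the regression count in the report); count_tasks and fetch_payload are hypothetical helper names.

```python
import json
from urllib.request import urlopen

# URL taken from the comment above; type/2 is assumed to be supervised regression.
TASK_LIST_URL = "https://api.openml.org/api/v1/json/task/list/type/2"


def count_tasks(payload: dict) -> int:
    """Count listed tasks, assuming the layout {"tasks": {"task": [...]}}."""
    entries = payload.get("tasks", {}).get("task", [])
    # Defensive assumption: a single result might be serialized as one object.
    if isinstance(entries, dict):
        return 1
    return len(entries)


def fetch_payload(url: str = TASK_LIST_URL, timeout: float = 60.0) -> dict:
    """Download and decode the task listing (needs network access)."""
    with urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Running count_tasks(fetch_payload()) gives the server-side count directly, which helps separate a package problem from a REST API problem.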

PGijsbers (Collaborator) commented Mar 1, 2023

No, I was under the impression that the website internally uses the same API to get its data (plus Elasticsearch), so based on that I can't explain the discrepancy. @joaquinvanschoren ?

LennartPurucker added the bug and serverside labels Apr 16, 2023