Add .get_task method to Schedulers - APS v4 #953

Open
1 task done
HK-Mattew opened this issue Aug 10, 2024 · 10 comments


@HK-Mattew
Contributor

Things to check first

  • I have searched the existing issues and didn't find my feature already requested there

Feature description

Hello,

My suggestion is to add the .get_task(task_id=...) method to the Schedulers.

Use case

I ran into a situation where I needed to pass a Task instance directly to the .add_job method so that the job would use an existing task configuration.

I could fetch all tasks with .get_tasks, but then I would have to filter the list every time just to get one specific task. That is not a great fit for my use case, and I believe a dedicated method would be useful to others as well.
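For reference, the filtering workaround described above can be sketched roughly as follows. `find_task` is a hypothetical helper, not part of APScheduler; it assumes only that `scheduler.get_tasks()` returns objects with an `id` attribute, as the printed `Task(id='task1', ...)` output later in this thread suggests. `FakeTask` and `FakeScheduler` are stand-ins so the snippet runs without a data store.

```python
from dataclasses import dataclass
from typing import Any, Optional


def find_task(scheduler: Any, task_id: str) -> Optional[Any]:
    """Return the task whose ``id`` equals ``task_id``, or None if absent.

    Hypothetical helper (not an APScheduler API): it relies only on
    ``scheduler.get_tasks()`` returning objects with an ``id`` attribute.
    """
    return next(
        (task for task in scheduler.get_tasks() if task.id == task_id),
        None,
    )


# Stand-ins for illustration only; a real scheduler would be used instead.
@dataclass
class FakeTask:
    id: str
    max_running_jobs: int = 1


class FakeScheduler:
    def __init__(self, tasks):
        self._tasks = list(tasks)

    def get_tasks(self):
        # Return a copy, the way a data-store query would return a new list.
        return list(self._tasks)


demo_scheduler = FakeScheduler(
    [FakeTask("task1", max_running_jobs=5), FakeTask("task2")]
)
found = find_task(demo_scheduler, "task1")
missing = find_task(demo_scheduler, "nope")
```

With a real scheduler, the result would presumably be passed along as `scheduler.add_job(find_task(scheduler, 'task1'))`. Every lookup still loads the full task list, which is the overhead a built-in `.get_task` would avoid.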

@agronholm
Owner

I'll consider this, but I'm curious as to why you would need to pass a Task instance to add_job(). Could you explain that?

@HK-Mattew
Contributor Author

> I'll consider this, but I'm curious as to why you would need to pass a Task instance to add_job(). Could you explain that?

Because .add_job itself calls .configure_task internally. When I pass the task ID to .add_job, it overwrites some of the configuration I previously set on the task.

However, passing the Task instance does not overwrite my configuration.

I did not report this as a bug, because I am not sure if this is a bug or if it is actually expected behavior.

@agronholm
Owner

Passing the task ID to add_job() should never overwrite any task configuration. Can you give me a reproducible example where you demonstrate such behavior?

@HK-Mattew
Contributor Author

> Passing the task ID to add_job() should never overwrite any task configuration. Can you give me a reproducible example where you demonstrate such behavior?

Good to know,

I'll reproduce this now

@HK-Mattew
Contributor Author

> Passing the task ID to add_job() should never overwrite any task configuration. Can you give me a reproducible example where you demonstrate such behavior?

Here is the sample code:

from apscheduler import Scheduler, SchedulerRole
from apscheduler.executors.async_ import AsyncJobExecutor
from apscheduler.executors.thread import ThreadPoolJobExecutor
from apscheduler.executors.subprocess import ProcessPoolJobExecutor
from apscheduler.datastores.mongodb import MongoDBDataStore
import config



scheduler_web_configs = dict(
    data_store=MongoDBDataStore(
        client_or_uri=config.MONGO_DB_URI,
        database=config.MONGO_DB_NAME
    ),
    role=SchedulerRole.scheduler,
    max_concurrent_jobs=100,
    job_executors={
        'async': AsyncJobExecutor(),
        'threadpool': ThreadPoolJobExecutor(),
        'processpool': ProcessPoolJobExecutor(),
    }
)



def func_to_task_1():
    ...


with Scheduler(**scheduler_web_configs) as scheduler:
    
    scheduler.configure_task(
        func_or_task_id='task1',
        func=func_to_task_1,
        job_executor='async',
        max_running_jobs=5
    )

    print(scheduler.get_tasks())

    """
    [print result]

    [Task(id='task1', func='__main__:func_to_task_1', job_executor='async',
    max_running_jobs=5, misfire_grace_time=None, metadata={}, running_jobs=0)]
    """


    scheduler.add_job(
        func_or_task_id='task1'
    )

    print(scheduler.get_tasks())

    """
    [print result]
    [Task(id='task1', func='__main__:func_to_task_1', job_executor='threadpool',
    max_running_jobs=1, misfire_grace_time=None, metadata={}, running_jobs=0)]
    """

In the output you can see that the .add_job method overwrote some of my task settings: job_executor went from 'async' to 'threadpool', and max_running_jobs went from 5 to 1.

@agronholm mentioned this issue Aug 11, 2024
@agronholm
Owner

Ok, I understand the problem now, and it's a design issue. I'll have to refactor the add_task() data store method.

@mmmcorpsvit

@agronholm, sorry to ask, but are there any updates on the fix?

@agronholm
Owner

I'm making some progress once in a while, but it seems that every time I fix something, I uncover another problem. The rabbit hole is deep ☹️
I'll get it done Soon(tm). But I have people in other projects constantly asking for updates, not just APScheduler...

@agronholm
Owner

The hard work on AnyIO's next release is done, so I can focus on this now. Getting incremental updates to task configuration is the crux of the problem here. I'm still working on a solution to that.

@agronholm
Owner

Sorry for the delay. I'm having a bit of trouble fixing AsyncScheduler.configure_task() to work with the data stores in a sane way, so I'm currently experimenting with different ways of implementing full and partial task updates. This might take some time, unfortunately.
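To illustrate the distinction between full and partial task updates mentioned above, here is a minimal sketch. It is not APScheduler's implementation: `TaskConfig`, `full_update`, `partial_update`, and the `UNSET` sentinel are all hypothetical names, and the defaults (`'threadpool'`, `1`) are only chosen to mirror the values printed in the reproduction earlier in this thread.

```python
from dataclasses import dataclass, replace


class _Unset:
    """Sentinel type meaning "the caller did not pass this argument"."""

    def __repr__(self) -> str:
        return "UNSET"


UNSET = _Unset()


# Hypothetical stand-in for a stored task configuration.
@dataclass(frozen=True)
class TaskConfig:
    id: str
    job_executor: str = "threadpool"
    max_running_jobs: int = 1


def full_update(existing, job_executor="threadpool", max_running_jobs=1):
    """Write every field, so omitted arguments fall back to the defaults."""
    return replace(
        existing,
        job_executor=job_executor,
        max_running_jobs=max_running_jobs,
    )


def partial_update(existing, job_executor=UNSET, max_running_jobs=UNSET):
    """Write only the fields that the caller explicitly provided."""
    candidates = {
        "job_executor": job_executor,
        "max_running_jobs": max_running_jobs,
    }
    changes = {k: v for k, v in candidates.items() if v is not UNSET}
    return replace(existing, **changes)


configured = TaskConfig("task1", job_executor="async", max_running_jobs=5)

# Omitting arguments in a full update silently resets the stored settings.
after_full = full_update(configured)

# A partial update with no arguments leaves the stored settings alone.
after_partial = partial_update(configured)

# A partial update touches only the field that was actually passed.
after_partial_change = partial_update(configured, max_running_jobs=10)
```

In this sketch the full update reproduces the symptom reported above (settings falling back to defaults), while the sentinel-based partial update preserves them. How the real data stores should distinguish the two cases is the open design question in this issue.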
