
3.11.0 Using redis as jobstore will cause a large number of abnormal processes and eventually cause memory overflow #996

caiwenju opened this issue Nov 29, 2024 · 24 comments

@caiwenju

Things to check first

  • I have checked that my issue does not already have a solution in the FAQ

  • I have searched the existing issues and didn't find my bug already reported there

  • I have checked that my bug is still present in the latest release

Version

3.11.0

What happened?

I start a Flask service with gunicorn, using APScheduler 3.11.0 with Redis as the jobstore. After add_job, a large number of abnormal processes appear, eventually causing the server to run out of memory.

How can we reproduce the bug?

from datetime import datetime
from apscheduler.schedulers.background import BackgroundScheduler
from apscheduler.jobstores.redis import RedisJobStore
from apscheduler.executors.pool import ThreadPoolExecutor, ProcessPoolExecutor
from apscheduler.triggers.cron import CronTrigger
from flask_apscheduler import APScheduler

APS_TZ = "Asia/Shanghai"
APS_JDE_COALESCE = True
APS_JDE_MAX_INSTANCES = 3
APS_JDE_MFG_TIME = 5
APS_JST_RDS_DB = 12
APS_JST_RDS_HOST = "192.168.0.17"
APS_JST_RDS_PORT = 46379
APS_JST_RDS_PWD = "Redis#2022@Cms"
APS_JDE_JITTER = 15

# Executor configuration
executors = {
    'default': ThreadPoolExecutor(max_workers=10),
    'processpool': ProcessPoolExecutor(max_workers=5)
}

# Job defaults
job_defaults = {
    'coalesce': APS_JDE_COALESCE,
    'max_instances': APS_JDE_MAX_INSTANCES,
    "misfire_grace_time": APS_JDE_MFG_TIME
}

# Redis jobstore configuration
redis_jobstore = RedisJobStore(db=APS_JST_RDS_DB,
                               host=APS_JST_RDS_HOST,
                               port=APS_JST_RDS_PORT,
                               password=APS_JST_RDS_PWD)

# jobstores dict configuration
jobstores = {'redis': redis_jobstore}

# BackgroundScheduler configuration
bg_scheduler = BackgroundScheduler(jobstores=jobstores, executors=executors, job_defaults=job_defaults)

# APScheduler instance from Flask-APScheduler
scheduler = APScheduler(scheduler=bg_scheduler)

# Scheduled task function
def task_1():
    print("Task 1 is running")

# Add job
scheduler.add_job(
    id="1",  # scheduled job ID
    func=task_1,
    replace_existing=True,
    trigger=CronTrigger.from_crontab("* * * * *", timezone=APS_TZ),
    jobstore='redis',  # use the redis jobstore
    executor="processpool",  # use the processpool executor
    misfire_grace_time=APS_JDE_MFG_TIME,
    jitter=APS_JDE_JITTER
)

# Add job
scheduler.add_job(
    id="14",  # scheduled job ID
    func=task_1,
    replace_existing=True,
    trigger=CronTrigger.from_crontab("* * * * *", timezone=APS_TZ),
    jobstore='redis',  # use the redis jobstore
    executor="processpool",  # use the processpool executor
    misfire_grace_time=APS_JDE_MFG_TIME,
    jitter=APS_JDE_JITTER
)

# Add job
scheduler.add_job(
    id="111111111",  # scheduled job ID
    func=task_1,
    replace_existing=True,
    trigger=CronTrigger.from_crontab("* * * * *", timezone=APS_TZ),
    jobstore='redis',  # use the redis jobstore
    executor="processpool",  # use the processpool executor
    misfire_grace_time=APS_JDE_MFG_TIME,
    jitter=APS_JDE_JITTER
)

# Add job
scheduler.add_job(
    id="1111",  # scheduled job ID
    func=task_1,
    replace_existing=True,
    trigger=CronTrigger.from_crontab("* * * * *", timezone=APS_TZ),
    jobstore='redis',  # use the redis jobstore
    executor="processpool",  # use the processpool executor
    misfire_grace_time=APS_JDE_MFG_TIME,
    jitter=APS_JDE_JITTER
)

# Add job
scheduler.add_job(
    id="11",  # scheduled job ID
    func=task_1,
    replace_existing=True,
    trigger=CronTrigger.from_crontab("* * * * *", timezone=APS_TZ),
    jobstore='redis',  # use the redis jobstore
    executor="processpool",  # use the processpool executor
    misfire_grace_time=APS_JDE_MFG_TIME,
    jitter=APS_JDE_JITTER
)

# Add job
scheduler.add_job(
    id="12",  # scheduled job ID
    func=task_1,
    replace_existing=True,
    trigger=CronTrigger.from_crontab("* * * * *", timezone=APS_TZ),
    jobstore='redis',  # use the redis jobstore
    executor="processpool",  # use the processpool executor
    misfire_grace_time=APS_JDE_MFG_TIME,
    jitter=APS_JDE_JITTER
)

# Start the scheduler
scheduler.start()

@caiwenju caiwenju added the bug label Nov 29, 2024
@agronholm
Owner

Try again in English please.

@caiwenju
Author

root 9434 8777 9 13:21 pts/0 00:00:03 /usr/local/bin/python -c from multiprocessing.spawn import spawn_main; spawn_main(tracker_fd=6, pipe_handle=28) --multiprocessing-fork

@agronholm
Owner

Still not English. Unless you rewrite your issue in English, I will close it. I cannot read Chinese.

@caiwenju caiwenju changed the title 3.11.0 使用redis作为jobstore会出现大量的异常进程,最终导致内存溢出 3.11.0 Using redis as jobstore will cause a large number of abnormal processes and eventually cause memory overflow Nov 29, 2024
@caiwenju
Author

I start a Flask service with gunicorn, using APScheduler 3.11.0 with Redis as the jobstore. After add_job, a large number of abnormal processes appear, eventually causing the server to run out of memory.

@agronholm
Owner

By "abnormal" processes, do you mean the processes from ProcessPoolExecutor? Abnormal how?

@caiwenju
Author

This is what the abnormal process looks like: root 9434 8777 9 13:21 pts/0 00:00:03 /usr/local/bin/python -c from multiprocessing.spawn import spawn_main; spawn_main(tracker_fd=6, pipe_handle=28) --multiprocessing-fork
There are a large number of such processes, and they keep increasing indefinitely until the server's memory is exhausted.

@caiwenju
Author

I'll give you a screenshot of my usage

@agronholm
Owner

There should only be a maximum of 5 subprocesses, as defined by the configuration of ProcessPoolExecutor. Can you create a minimal working example that reproduces the issue?

@caiwenju
Author

[screenshots of the process list]

@caiwenju
Author

I'll try to reproduce it with a minimal example and send it to you later.

@caiwenju
Author

from datetime import datetime
from apscheduler.schedulers.background import BackgroundScheduler
from apscheduler.jobstores.redis import RedisJobStore
from apscheduler.executors.pool import ThreadPoolExecutor, ProcessPoolExecutor
from apscheduler.triggers.cron import CronTrigger
from flask_apscheduler import APScheduler
from flask import Flask, jsonify

app = Flask(__name__)

APS_TZ = "Asia/Shanghai"
APS_JDE_COALESCE = True
APS_JDE_MAX_INSTANCES = 3
APS_JDE_MFG_TIME = 5
APS_JST_RDS_DB = 12
APS_JST_RDS_HOST = "192.168.0.17"
APS_JST_RDS_PORT = 46379
APS_JST_RDS_PWD = "Redis#2022@Cms"
APS_JDE_JITTER = 15

# Executor configuration
executors = {
    'default': ThreadPoolExecutor(max_workers=10),
    'processpool': ProcessPoolExecutor(max_workers=5)
}

# Job defaults
job_defaults = {
    'coalesce': APS_JDE_COALESCE,
    'max_instances': APS_JDE_MAX_INSTANCES,
    "misfire_grace_time": APS_JDE_MFG_TIME
}

# Redis jobstore configuration
redis_jobstore = RedisJobStore(db=APS_JST_RDS_DB,
                               host=APS_JST_RDS_HOST,
                               port=APS_JST_RDS_PORT,
                               password=APS_JST_RDS_PWD)

# jobstores dict configuration
jobstores = {'redis': redis_jobstore}
bg_scheduler = BackgroundScheduler()
app.config.update({
    "SCHEDULER_JOBSTORES": jobstores,
    "SCHEDULER_EXECUTORS": executors,
    "SCHEDULER_JOB_DEFAULTS": job_defaults,
    "SCHEDULER_TIMEZONE": APS_TZ
})
scheduler = APScheduler(scheduler=bg_scheduler, app=app)

def heart_beat():
    print(f"now: {datetime.now()}")

if not scheduler.get_job(id=f"{heart_beat.__name__}"):
    scheduler.add_job(
        id=f"{heart_beat.__name__}",
        replace_existing=True,
        func=heart_beat,
        trigger=CronTrigger.from_crontab("*/1 * * * *", timezone=APS_TZ),
        jobstore='redis',
    )
else:
    print(scheduler.get_job(id=f"{heart_beat.__name__}"))

scheduler.start()

def task1():
    pass

@app.route('/', methods=['GET'])
def hello():
    import time
    scheduler.add_job(
        id=str(int(time.time())),  # use the timestamp as the job ID
        func=task1,
        replace_existing=True,
        trigger=CronTrigger.from_crontab("* * * * *", timezone=APS_TZ),
        start_date="2024-11-28 00:00:00",
        end_date="2024-11-29 00:00:00",
        jobstore='redis',
        executor="processpool",
        misfire_grace_time=APS_JDE_MFG_TIME,
        jitter=APS_JDE_JITTER)
    return jsonify(message="Hello, World!")

if __name__ == "__main__":
    app.run(host='0.0.0.0')

  1. Write the above code to app.py and replace the Redis settings with your own.

  2. Write the following code into gunicorn_conf.py, in the same directory as app.py, and start with gunicorn app:app -c gunicorn_conf.py
    workers = 10
    threads = 5
    backlog = 512
    x_forwarded_for_header = 'X-FORWARDED-FOR'
    reload = True

  3. Make the request: curl http://0.0.0.0:8000/

  4. If you run ps -ef to view the processes, a large number of /usr/local/bin/python -c from multiprocessing.spawn import spawn_main; spawn_main(tracker_fd=13, pipe_handle=28) --multiprocessing-fork processes appear.

@agronholm
Owner

Is the problem only reproducible with Flask and all those environment variables?

@caiwenju
Author

I'm not sure, because running scheduler.start() on its own ends immediately; I just start it in Flask and execute the add_job in the request handler.

@caiwenju
Author

[screenshot]

@agronholm
Owner

I'm not sure, because running scheduler.start() on its own ends immediately; I just start it in Flask and execute the add_job in the request handler.

That's because you're using BackgroundScheduler. If you used BlockingScheduler instead, it would keep running.

@caiwenju
Author

caiwenju commented Nov 29, 2024

I have verified that it has nothing to do with Flask. I will send you the code.

  1. Write the following code to p.py
from datetime import datetime
from apscheduler.schedulers.background import BackgroundScheduler, BlockingScheduler
from apscheduler.jobstores.redis import RedisJobStore
from apscheduler.executors.pool import ThreadPoolExecutor, ProcessPoolExecutor
from apscheduler.triggers.cron import CronTrigger
from flask_apscheduler import APScheduler
from flask import Flask, jsonify

app = Flask(__name__)

APS_TZ = "Asia/Shanghai"
APS_JDE_COALESCE = True
APS_JDE_MAX_INSTANCES = 3
APS_JDE_MFG_TIME = 5
APS_JST_RDS_DB = 12
APS_JST_RDS_HOST = "192.168.0.17"
APS_JST_RDS_PORT = 46379
APS_JST_RDS_PWD = "Redis#2022@Cms"
APS_JDE_JITTER = 15


executors = {
    'default': ThreadPoolExecutor(max_workers=10),
    'processpool': ProcessPoolExecutor(max_workers=5)
}


job_defaults = {
    'coalesce': APS_JDE_COALESCE,
    'max_instances': APS_JDE_MAX_INSTANCES,
    "misfire_grace_time": APS_JDE_MFG_TIME
}


redis_jobstore = RedisJobStore(db=APS_JST_RDS_DB,
                               host=APS_JST_RDS_HOST,
                               port=APS_JST_RDS_PORT,
                               password=APS_JST_RDS_PWD)


jobstores = {'redis': redis_jobstore}
bg_scheduler = BlockingScheduler()
app.config.update({
    "SCHEDULER_JOBSTORES": jobstores,
    "SCHEDULER_EXECUTORS": executors,
    "SCHEDULER_JOB_DEFAULTS": job_defaults,
    "SCHEDULER_TIMEZONE": APS_TZ
})
scheduler = APScheduler(scheduler=bg_scheduler, app=app)


def heart_beat():
    print(f"now: {datetime.now()}")

if not scheduler.get_job(id=f"{heart_beat.__name__}"):
    scheduler.add_job(
        id=f"{heart_beat.__name__}",
        replace_existing=True,
        func=heart_beat,
        trigger=CronTrigger.from_crontab("*/1 * * * *", timezone=APS_TZ),
        jobstore='redis',
    )
else:
    print(scheduler.get_job(id=f"{heart_beat.__name__}"))

if __name__ == "__main__":
    scheduler.start()
  2. Write the following code to pp.py
from p import scheduler
from apscheduler.triggers.cron import CronTrigger
APS_TZ = "Asia/Shanghai"
APS_JDE_COALESCE = True
APS_JDE_MAX_INSTANCES = 3
APS_JDE_MFG_TIME = 5
APS_JST_RDS_DB = 12
APS_JST_RDS_HOST = "192.168.0.17"
APS_JST_RDS_PORT = 46379
APS_JST_RDS_PWD = "Redis#2022@Cms"
APS_JDE_JITTER = 15
def task1():
    pass

import time
scheduler.add_job(
    id=str(int(time.time())),  # use the timestamp as the job ID
    func=task1,
    replace_existing=True,
    trigger=CronTrigger.from_crontab("* * * * *", timezone=APS_TZ),
    start_date="2024-11-28 00:00:00",
    end_date="2024-11-29 00:00:00",
    jobstore='redis',
    executor="processpool",
    misfire_grace_time=APS_JDE_MFG_TIME,
    jitter=APS_JDE_JITTER)
  3. Run the python p.py command
  4. Run the python pp.py command
  5. Run ps -ef to see a large number of processes

@agronholm
Owner

Have you tested with only the default memory job store? Your issue title says that using specifically the redis job store causes this problem.

@caiwenju
Author

Version 3.10.4 is OK.

@caiwenju
Author

There is no problem without using Redis, and version 3.10.4 does not have this problem with Redis either.

@HK-Mattew
Contributor

1. Write the above code to app.py and replace the Redis settings with your own.

  2. Write the following code into gunicorn_conf.py, in the same directory as app.py, and start with gunicorn app:app -c gunicorn_conf.py
    workers = 10
    threads = 5
    backlog = 512
    x_forwarded_for_header = 'X-FORWARDED-FOR'
    reload = True

It seems to me that the problem is the way you are starting your Flask application.

Notice that in the gunicorn settings you have set gunicorn to use 10 workers. This means that 10 copies of your application will be started, along with 10 schedulers.

Since max_workers is set to 5 in your ProcessPoolExecutor, starting 10 workers of your Flask application means 5 × 10 = 50 scheduler subprocesses in total.
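One way to avoid that multiplication (a common pattern, not something proposed in this thread; the lock path is a made-up example) is to let only the gunicorn worker that wins an exclusive file lock start the scheduler:

```python
# Each gunicorn worker imports the app module, so module-level scheduler setup
# runs once per worker. An exclusive non-blocking file lock ensures only one
# worker actually starts a scheduler; the lock is released when that process
# exits, so a stale lock file is harmless.
import fcntl


def try_acquire_scheduler_lock(path="/tmp/aps_scheduler.lock"):
    """Return an open handle if this process won the lock, else None."""
    fh = open(path, "w")
    try:
        # LOCK_NB makes the attempt non-blocking; losers get BlockingIOError
        # instead of waiting for the lock holder to exit.
        fcntl.flock(fh, fcntl.LOCK_EX | fcntl.LOCK_NB)
        return fh  # keep the handle open for the lifetime of the process
    except BlockingIOError:
        fh.close()
        return None


lock = try_acquire_scheduler_lock()
if lock is not None:
    print("this worker would start the scheduler")
else:
    print("another worker already holds the lock")
```

Separate file descriptors are treated independently by flock, so even a second attempt from the same process is denied while the first handle stays open.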

@agronholm
Owner

The example still relies on Flask. Flask-apscheduler is not supported here. The example isn't exactly minimal either.
I made my own example, and I'm not seeing more than 2 subprocesses even if I let the script run for a while.

@caiwenju
Author

caiwenju commented Dec 2, 2024

The example still relies on Flask. Flask-apscheduler is not supported here. The example isn't exactly minimal either. I made my own example, and I'm not seeing more than 2 subprocesses even if I let the script run for a while.

OK, maybe the problem is with Flask. For now I have gone back to version 3.10.4, which does not have this problem. Also, using the strace command I could see that APScheduler's connection to Redis was blocked and then waiting for a timeout.

@caiwenju
Author

caiwenju commented Dec 2, 2024

1. Write the above code to app.py and replace the Redis settings with your own.

  2. Write the following code into gunicorn_conf.py, in the same directory as app.py, and start with gunicorn app:app -c gunicorn_conf.py
    workers = 10
    threads = 5
    backlog = 512
    x_forwarded_for_header = 'X-FORWARDED-FOR'
    reload = True

It seems to me that the problem is the way you are starting your Flask application.

Notice that in the gunicorn settings you have set gunicorn to use 10 workers. This means that 10 copies of your application will be started, along with 10 schedulers.

Since max_workers is set to 5 in your ProcessPoolExecutor, starting 10 workers of your Flask application means 5 × 10 = 50 scheduler subprocesses in total.

No, it keeps adding processes indefinitely, far more than 50.

@agronholm
Owner

agronholm commented Dec 2, 2024

Why is my example not reproducing your problem then? I asked to test without Flask (and flask-apscheduler) to eliminate them as a source of problems. APScheduler 3.11 switched to using spawn instead of fork as the subprocess spawning method, but that should not cause problems unless you're creating more subprocesses as import side effects.
