Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

15692: Introduce background jobs #16927

Merged
merged 38 commits into from
Jul 30, 2024

Conversation

alehaa
Copy link
Contributor

@alehaa alehaa commented Jul 16, 2024

Fixes: #15692

This PR includes three new background job class to be used in NetBox and by NetBox plugins. They handle the necessary logic for starting and stopping jobs, including exception handling and rescheduling of recurring jobs.

This PR also implements the use of this framework for existing data source and script jobs, eliminating duplicate code and ensuring consistency.

alehaa added 10 commits June 17, 2024 23:02
A new abstract class can be used to implement job function classes. It
handles the necessary logic for starting and stopping jobs, including
exception handling and rescheduling of recurring jobs.

This commit also includes the migration of data source jobs to the new
framework.
Using the 'import_string()' utility from Django allows the job script
class to be simplified, as module imports no longer need to avoid loops.
This should make it easier to queue and maintain jobs.
Instead of maintaining two separate job execution logics, the same job
is now used for both background and interactive execution.
The independent implementations of interactive and background script
execution have been merged into a single BackgroundJob implementation.
A new abstract class can be used to implement job function classes that
specialize in scheduling. These use the same logic as regular
BackgroundJobs, but ensure that they are only scheduled once at any given
time.
A new abstract class can be used to implement job function classes that
specialize in system background tasks (e.g. synchronization or
housekeeping). In addition to the features of the BackgroundJob and
ScheduledJob classes, these implement additional logic to not need to be
bound to an existing NetBox object and to setup job schedules on plugin
load instead of an interactive request.
@jeremystretch jeremystretch added this to the v4.1 milestone Jul 17, 2024
Copy link
Member

@jeremystretch jeremystretch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alehaa thank you for such a thorough and well-documented PR! Apologies again for the delay in review. At a high level, this all looks doable. I just have a few questions about the implementation details.

netbox/core/management/commands/syncdatasource.py Outdated Show resolved Hide resolved
netbox/core/models/jobs.py Outdated Show resolved Hide resolved
netbox/utilities/jobs.py Outdated Show resolved Hide resolved
netbox/utilities/jobs.py Outdated Show resolved Hide resolved
netbox/utilities/jobs.py Outdated Show resolved Hide resolved
alehaa and others added 10 commits July 24, 2024 09:48
This partially reverts commit db591d4. The 'run_now' parameter of
'enqueue()' remains, as its being used by following commits.
Because scripts are already linked through the Job Instance field, the
name is displayed twice. Removing this reduces redundancy and opens up
the possibility of simplifying the BackgroundJob framework in future
commits.
Instead of using separate classes, the logic of ScheduledJob is now
merged into the generic BackgroundJob class. This allows reusing the
same logic, but dynamically deciding whether to enqueue the same job
once or multiple times.
Instead of defining individual names on enqueue, BackgroundJob classes
can now set a job name in their meta class. This is equivalent to other
Django classes and NetBox scripts.
@alehaa
Copy link
Contributor Author

alehaa commented Jul 24, 2024

@jeremystretch I‘m a bit confused about your commits. You requested changes, I‘ve worked a day on them and just before uploading commits apear in this PR. 🧐

@alehaa alehaa marked this pull request as draft July 24, 2024 20:59
@jeremystretch
Copy link
Member

@alehaa apologies for the confusion; it had been several days with no response so I assumed you did not have time to make the changes. I've undertaken the work we discussed above as well as some general cleanup. We need to have all the v4.1 work merged by end of week.

@jeremystretch
Copy link
Member

@alehaa do you want to handle renaming ScheduledJob from here, or should I just wrap it up?

@alehaa
Copy link
Contributor Author

alehaa commented Jul 24, 2024

@jeremystretch no worries, but I’d like to keep it up from here. I see some of your code handles parts I‘ve not finished yet. Please give me till tomorrow to integrate your changes into mine and I‘ll upload the combination from both of our sources.

ChoiceSet was moved to core in 40572b5.
System jobs usually perform low-priority background tasks and therefore
can use a different queue than 'default', which is used for regular jobs
related to specific objects.
As the job's name is set by enqueue(), it must not be passed in handle()
to avoid duplicate kwargs with the same name.
Not only can a job's interval change, but so can the time at which it is
scheduled to run. If a specific scheduled time is set, it will also be
checked against the current job schedule. If there are any changes, the
job is rescheduled with the new time.
Instead of using a class method for run(), a regular method is used for
this purpose. This gives the possibility to add more convenience methods
in the future, e.g. for interacting with the job object or for logging,
as implemented for scripts.
netbox/utilities/jobs.py Outdated Show resolved Hide resolved
@alehaa
Copy link
Contributor Author

alehaa commented Jul 25, 2024

@jeremystretch I've finished my work and added everything to this PR and its ready for review now. Please note that I did need to do a force push, as some of our works did conflict and I didn't want to open a third PR for this issue. Therefore you might need to to a git fetch followed by a git reset --hard origin/15692-background-jobs to get the current state on your local machine.

Despite the open tasks and some bug fixing, I've

  • integrated additional tests for the BackgroundJob framework
  • did some cleanup of legacy code in 309ad29
  • implemented to use queue low for system background jobs by default
  • switched run() from a @classmethod to a regular method to simplify implementation and give options for possible future extensions

Also note the open comment on the setup() method above, maybe you can take a look at it? Feel free to add or request changes to the code as you see fit.

@alehaa alehaa marked this pull request as ready for review July 25, 2024 16:24
This commit reverts commits 4880d81 and 0b15ecf. Using the database
'connection_created' signal for job registration feels a little wrong at
this point, as it would trigger registration very often. However, the
background job framework is prepared for this use case and can be used
by plugins once the auto-registration of jobs is solved.
netbox/core/models/data.py Show resolved Hide resolved
# your logic goes here
```

You can schedule the background job from within your code (e.g. from a model's `save()` method or a view) by calling `MyTestJob.enqueue()`. This method passes through all arguments to `Job.enqueue()`. However, no `name` argument must be passed, as the background job name will be used instead.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

might be good to give an example here of calling enqueue

netbox/extras/api/views.py Outdated Show resolved Hide resolved
netbox/extras/management/commands/runscript.py Outdated Show resolved Hide resolved
netbox/core/models/jobs.py Show resolved Hide resolved
netbox/utilities/tests/test_jobs.py Outdated Show resolved Hide resolved
object_type = ObjectType.objects.get_for_model(instance, for_concrete_model=False)
rq_queue_name = get_queue_for_model(object_type.model)
if schedule_at and immediate:
raise ValueError("enqueue() cannot be called with values for both schedule_at and immediate.")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be internationalized with _(

Defining names for background jobs was disabled with fb75389. The
preceeding changes in 257976d did forget the management command.
@arthanson
Copy link
Collaborator

BTW @alehaa IMHO excellent work on this, both the writeup in the issue and the PR itself is very high quality and impressive. Nice work!

@jeremystretch
Copy link
Member

jeremystretch commented Jul 30, 2024

Seems like we're just about there, however I still have one concern: As it stands, I expect the average user will have difficulty distinguishing between the Job database model and the BackgroundJob runner class. I'd like to rename BackgroundJob to JobRunner for clarity. IMO this better conveys the relationship between classes which embody the programming logic and the database records which track its execution.

@jeremystretch jeremystretch merged commit d3a3a6b into netbox-community:feature Jul 30, 2024
3 checks passed
@jeremystretch
Copy link
Member

Thanks again for all your work on this @alehaa!

@alehaa alehaa deleted the 15692-background-jobs branch July 30, 2024 19:33
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Aug 30, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants