Fix vanishing thread data between deserialising and performing an ActiveJob instance #36
Open
caius wants to merge 2 commits into Betterment:main from caius:cd/activejob_base_execute
I managed to dig up, in an old archive repo, the original comment thread pertaining to the choice of `job.perform_now` vs `ActiveJob::Base.execute(job_data)` in this custom job wrapper class.

Basically, it's because of the `delegate_missing_to :job` call (above), and how we wanted to support defining things like `max_attempts` and `reschedule_at` (and other DJ-specific interfaces) on the ActiveJob class, and have the wrapper be able to call into the ActiveJob class in order to expose those interfaces to the `Delayed::Worker`.

And the implementation choice was made here to ensure that `job` is the same object instance as the thing that ultimately gets performed within `execute(job_data)`. Otherwise, you'd have one instance of the ActiveJob (`@job`, memoized below) responding to the worker, and a different instance deserialized within ActiveJob's `.execute` call.

So the implementation here was based on the assumption that the internal implementation of `execute` looks something like this:
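A toy sketch of that assumed deserialize-then-perform shape (`FakeActiveJob` is illustrative, not the real Rails source or the snippet from the original thread):

```ruby
# Toy stand-in for ActiveJob::Base, sketching the *assumed* shape of
# execute: deserialize a fresh instance, then perform that instance.
class FakeActiveJob
  attr_reader :arguments

  def initialize(job_data)
    @arguments = job_data.fetch("arguments")
  end

  # build a brand-new instance from the serialized payload
  def self.deserialize(job_data)
    new(job_data)
  end

  # the crucial shape: execute deserializes its OWN instance and
  # performs that one, not any instance the caller already holds
  def self.execute(job_data)
    job = deserialize(job_data)
    job.perform_now
  end

  def perform_now
    perform(*arguments)
  end

  def perform(n)
    n * 2
  end
end

FakeActiveJob.execute("arguments" => [21]) # => 42
```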
And we wanted to avoid calling `deserialize(job_data)` twice. Not just because it requires doing 2x the work, but because, in rare cases, you may actually want the legacy DelayedJob methods to be able to change their response depending on the result of the `perform`. I've only ever seen this use case once or twice, though. Usually, they are a pure function or return a static value, like this:
However, it's possible to make them react to the results of the `perform` method:
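A hypothetical sketch of that rarer pattern (illustrative names, not the original snippet): an ivar written by `perform` changes what the worker sees afterwards, which only works if the worker reads the same instance that ran `perform`.

```ruby
# Hypothetical job whose DJ-specific interface reacts to the outcome of
# perform via an ivar. This only works if the worker later calls
# reschedule_at on the SAME instance that ran perform.
class RateLimitAwareJob
  def perform(response)
    @rate_limited = (response == :too_many_requests)
  end

  def reschedule_at(current_time, _attempts)
    current_time + (@rate_limited ? 3600 : 60) # back off harder when limited
  end
end

job = RateLimitAwareJob.new
job.perform(:too_many_requests)
job.reschedule_at(0, 1) # => 3600, because the ivar survived on this instance
```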
And this breaks if `@job` in the wrapper (which the worker can see) is a completely different object from the deserialized job inside of `.execute`.

All of that said, I can see exactly what problem you are encountering. Because, if we deserialize the job outside of the `run_callbacks` block, callbacks that deal with setting up global state will swap out that global state from underneath the existing `job` instance. No bueno!

The way I can think to resolve this is to essentially pull everything back into the `run_callbacks` block, but expose an `attr_reader` so that the instantiated `@job` instance can be read back out by the worker.
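Roughly like this sketch of the idea (with toy stand-ins for `run_callbacks` and deserialization, not the real gem code):

```ruby
# Sketch of the proposed JobWrapper shape: deserialize INSIDE the
# run_callbacks block, and expose @job via attr_reader afterwards.
# run_callbacks and deserialize here are toy stand-ins.
class SketchJobWrapper
  attr_reader :job # the worker reads the very same instance back out

  def initialize(job_data)
    @job_data = job_data
  end

  def perform
    run_callbacks(:execute) do
      # any global state the callbacks set up is still in place
      # when this instance is built and performed
      @job = deserialize(@job_data)
      @job.perform_now
    end
  end

  private

  def run_callbacks(_kind)
    yield # stand-in for ActiveSupport::Callbacks
  end

  def deserialize(job_data)
    ToyJob.new(job_data.fetch("arguments"))
  end
end

# minimal job the wrapper can deserialize and perform
class ToyJob
  attr_reader :arguments

  def initialize(arguments)
    @arguments = arguments
  end

  def perform_now
    arguments.sum
  end
end
```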
This assumes that the worker doesn't need the job instance before it calls `perform`. If it does... maybe there's still a double instantiation that you'd have to do to make it happy (and then throw away the first instance you instantiated), but at the very least we can guarantee that any delegated method calls that occur after the job has completed will be delegated to the same instance of `@job` that actually just ran its `perform` method.

Does that make sense to you @caius?

(And, thank you for shining the flashlight on this very specific but very important piece of behavior. I learned a lot going down this rabbit hole, and I'm sorry I didn't get a chance to do it sooner!)
(Another choice we could make here would be to not support that very niche "read the ivar from the perform method to respond dynamically to the worker" kind of configuration method. Technically a breaking change, and I'm hesitant to make it if there's still a way to make the job instance internally consistent.)
Hey, sorry it's taken a while to get back to you on this.
Thanks for going code-spelunking in the history. It's much clearer now why these choices were made previously, and it makes sense why it's written the way it is given that context. I suspected this might be a case of DelayedJob meets ActiveJob, where they have subtly different behaviours that don't interlace nicely.
Ah, this is an interesting use case I'd not considered. In my mind callbacks are either pure, or respond based on data held outside the instance, as the Ruby objects are basically ephemeral within the worker's runloop. In some ways it's quite a neat idea though, and in theory the ivar could be included in `job_data` to persist through different attempts of the job as well. Hrm. :thinking_face:

I can definitely see benefits to trying to support both use cases here, but I also come down on the side of not wanting to make breaking changes if we can avoid them. The ivar-callback pattern seems like a useful option to maintain, even if rarely used, especially if you don't have (or need) somewhere outside the instance to store that data.
I started out agreeing, then thought about this some more and realised that if that would work, it should work now, since `JobWrapper#job` already lazily loads the job data. But then we wouldn't have seen the behaviour we did, so there must be something accessing `JobWrapper#job` before `perform` is invoked. If no messages were delegated to missing through the `JobWrapper`, the job wouldn't be loaded (ie, `ActiveJob::Base#deserialize` wouldn't run) before entering the callback block in `JobWrapper#perform`.
Looking at the worker class, we go through `Delayed::Runnable#start`, `Delayed::Worker#run!` and `Delayed::Worker#work_off`, then descend into iterating the reserved jobs, where `job` is a `Delayed::Job` instance and `job#payload_object` returns the `Delayed::JobWrapper` instance, which at this point hasn't yet loaded `@job`.
Presuming no Delayed callbacks are defined in this example, we call through `Worker#run_thread_callbacks` into `Worker#run_job`, passing the `Delayed::Job` instance, with `#payload_object` still not having loaded `@job` yet. `#run_job` then invokes any Delayed perform callbacks and calls into `Delayed::Worker#run`.
The first thing we do in `Worker#run` is generate a metadata hash by calling some methods on the `Delayed::Job` instance. Most of these are database columns, so they won't trigger the `JobWrapper` loading the `@job` ivar. We do, however, call `job.name`, which is `Delayed::Backend::Base#name`, and that checks whether `payload_object` `respond_to?`s a couple of things. It turns out `delegate_missing_to` in turn delegates `respond_to?` (via `respond_to_missing?`), which causes `@job` to be loaded to handle the `respond_to?` call. This means the job is deserialized before we get to `job.invoke_job` a couple of lines later in `Worker#run`.
I've not used `delegate_missing_to` before, so I wrote a little in-memory reproduction to prove to myself that was the case:
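A dependency-free sketch of the same mechanism (hypothetical classes, standing in for the original snippet; ActiveSupport's `delegate_missing_to` is built on `method_missing`/`respond_to_missing?`, which this reproduces by hand):

```ruby
# Hand-rolled equivalent of `delegate_missing_to :job`, showing that a
# respond_to? check alone is enough to force the lazy @job load.
class AppJob
  def perform
    :done
  end
end

class LazyWrapper
  def job
    @job ||= AppJob.new # stands in for lazy deserialization
  end

  def method_missing(name, *args, &block)
    job.public_send(name, *args, &block)
  end

  # respond_to? consults this for methods the wrapper doesn't define,
  # which forces the lazy `job` load above as a side effect
  def respond_to_missing?(name, include_private = false)
    job.respond_to?(name, include_private)
  end
end

j = LazyWrapper.new
j.instance_variable_get(:@job) # => nil; nothing loaded yet
j.respond_to?(:display_name)   # the kind of check #name makes on payload_object
j.instance_variable_get(:@job) # => an AppJob; respond_to? loaded it
```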
As you can see, after calling `j.name` the `JobWrapper` instance gains the `@job` instance, because `delegate_missing_to` is calling `job.respond_to?`, which in turn deserializes our application job instance. I then went and proved it was due to `respond_to_missing?` falling through.

I think we're at a point where the two use cases just don't mesh in a way we can reconcile: lazily deserializing the job instance outside of the callbacks means anything run during `deserialize` that affects state outside the job instance isn't guaranteed to persist through to the ActiveJob callbacks being invoked. And if we deserialize the job instance again, we break any ivar trickery that works with DelayedJob natively.
I'm half-tempted to suggest we shouldn't be doing anything that operates outside the job instance in `#deserialize` in our application code. Currently we've worked around this issue by moving the loading logic for `Current.services =` into a `before_perform` callback, which is working fine: the job gets deserialized before the ActiveJob callbacks are invoked, then the `before_perform` callback sets up the environment for us just before the `#perform` method is invoked.

I wonder if we could potentially move the ActiveJob callbacks up a level in Delayed, so job deserialization happens inside the callbacks even when Delayed callbacks are accessing the job instance. :thinking_face:
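A self-contained sketch of that `before_perform` workaround (`Current` here is a toy stand-in for `ActiveSupport::CurrentAttributes`, and all names are illustrative, not the application's actual code):

```ruby
# Sketch of the workaround: #deserialize touches only the job instance,
# and global state is set up just before #perform runs.
class Current
  class << self
    attr_accessor :services # toy stand-in for CurrentAttributes
  end
end

class WorkaroundJob
  attr_reader :arguments

  def self.deserialize(job_data)
    new(job_data.fetch("arguments")) # no global state touched here
  end

  def initialize(arguments)
    @arguments = arguments
  end

  # in real ActiveJob this would be a `before_perform do ... end` block
  def before_perform
    Current.services = { mailer: :fake_mailer }
  end

  def perform_now
    before_perform # runs after deserialize, immediately before perform
    perform(*arguments)
  end

  def perform(_account_id)
    Current.services.fetch(:mailer)
  end
end

WorkaroundJob.deserialize("arguments" => [42]).perform_now # => :fake_mailer
```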
We'd have to keep supporting non-ActiveJob jobs being scheduled, so it would require branching in `Delayed::Worker#run_thread_callbacks`, as that's the earliest point non-Delayed code could trigger job deserialization. And then the `JobWrapper` could just perform the job, knowing that the worker has already invoked the callbacks.

I'm not sure what the knock-on effect of an ActiveJob callback erroring at that point would be vs the current behaviour, though. I think currently that's handled by `Delayed::Worker#run` having a `rescue Exception`, which would be bypassed by moving it earlier. We'd potentially have to catch errors in `run_thread_callbacks` and invoke the error callbacks to maintain error-handling parity.
#run
metadata hash loading the job instance and document that any of the Delayed callbacks accessing the job can lead to ActiveJob's#deserialize
being called before the ActiveJob callbacks being executed, which can cause anything usingActiveSupport::CurrentAttributes
to be wiped between#deserialize
and#perform
. I suspect the correct answer for our use case is to use abefore_perform
callback as we are doing now, and treat#deserialize
as something that operates on the job instance and nothing else.What do you think about interleaving the
ActiveJob::Callbacks
earlier in the Worker, and having ActiveJob logic outside ofJobWrapper
@smudge?