
Redis Semaphore reaching bad state #49

Open
ben-axnick opened this issue Sep 16, 2016 · 4 comments
ben-axnick commented Sep 16, 2016

Hi there,

We're having some trouble with Redis Semaphore of late: we're no longer able to acquire locks on hundreds of keys. Looking closer, it seems to be because only the VERSION and EXISTS subkeys exist; AVAILABLE and GRABBED are nowhere to be seen:

irb(main):055:0> redis.keys("SEMAPHORE:search_index_lock:6938264*")
=> ["SEMAPHORE:search_index_lock:6938264:VERSION", "SEMAPHORE:search_index_lock:6938264:EXISTS"]

Calling lock causes lpop / blpop to come back empty-handed, and the whole thing fails:

irb(main):051:0> semaphore.lock(1) { puts "hello" }
=> false

This of course makes intuitive sense, since an existing semaphore should have a list of AVAILABLE or GRABBED tokens at all times.

Do you have any thoughts about how we might be getting to this state, or what can be done to resolve it? For now I'm thinking we'll roll with expiration, so that at least we get a reset after being stuck for a while.

UPDATE 2016-09-19:

This is still occurring after moving to an all-new keyspace with an expiration set. The keys I'm seeing have no expiration on them; ttl returns -1. When I test directly, everything works, so I'm potentially looking at some sort of race condition that causes the available list never to be created, or not to be repopulated properly once the semaphore is unlocked.

Since the ttl is not set on either the version or exists keys, I expect that the error must be occurring very early during the semaphore creation process.

One thing I'm noticing is that there's a small window between popping an available token and adding it to the GRABBED keys that could cause a semaphore to fail, but that doesn't seem to be the issue here, as the keys should have an expiration set at that point.
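For anyone following along, the pop-then-record sequence I mean can be sketched like this (against an in-memory stand-in for the two Redis lists; purely illustrative, not the gem's actual code):

```ruby
# In-memory stand-in for the AVAILABLE / GRABBED lists, for illustration only.
class FakeStore
  def initialize
    @lists = Hash.new { |h, k| h[k] = [] }
  end

  def lpop(key)
    @lists[key].shift
  end

  def rpush(key, val)
    @lists[key].push(val)
  end

  def llen(key)
    @lists[key].length
  end
end

store = FakeStore.new
store.rpush("AVAILABLE", 1)

# Step 1: pop a token off the available list.
token = store.lpop("AVAILABLE")

# --- the window: if the process dies right here, the token is in
# --- neither list, and the semaphore has silently lost capacity.

# Step 2: record the token as grabbed.
store.rpush("GRABBED", token) if token

puts store.llen("AVAILABLE")  # => 0
puts store.llen("GRABBED")    # => 1
```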

The other thing I'm noticing is that behaviour gets a little undefined when two entities try to create a semaphore at the same time: one starts creating while the other immediately starts consuming the semaphore, assuming it already exists and is completely set up. It's possible the problems are occurring there.
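To illustrate that creation race, an atomic SETNX-style create-if-absent would ensure only one entity ever does setup. A sketch against an in-memory stand-in (key names are borrowed from my keyspace; the method names and logic here are mine, not the gem's):

```ruby
# In-memory stand-in exposing SETNX-style "set only if absent" semantics.
class FakeStore
  def initialize
    @data = {}
    @mutex = Mutex.new
  end

  # Returns true only for the caller that actually created the key,
  # mirroring Redis SETNX.
  def set_nx(key, val)
    @mutex.synchronize do
      return false if @data.key?(key)
      @data[key] = val
      true
    end
  end

  def get(key)
    @data[key]
  end
end

store = FakeStore.new
winners = 0

# Two entities race to create the same semaphore; only the SETNX winner
# initializes the token list, so setup happens exactly once.
2.times do
  if store.set_nx("SEMAPHORE:search_index_lock:EXISTS", 1)
    store.set_nx("SEMAPHORE:search_index_lock:AVAILABLE", [1])
    winners += 1
  end
end

puts winners  # => 1
```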

UPDATE 2016-09-22:

Since a semaphore is overkill for my needs, I've opted for a somewhat simplistic mutex implementation instead. Problem solved, I guess.
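I won't share the whole implementation, but one common shape for such a mutex is set-if-absent with a TTL; a minimal sketch of that pattern against an in-memory stand-in (a real deployment would use Redis `SET key value NX PX ttl`; the helper names here are mine):

```ruby
# In-memory stand-in approximating Redis "SET key val NX PX ttl":
# acquire only if the key is absent or the previous holder has expired.
class FakeStore
  def initialize
    @data = {}
  end

  def set_nx_px(key, val, ttl_ms)
    now = Process.clock_gettime(Process::CLOCK_MONOTONIC, :millisecond)
    entry = @data[key]
    return false if entry && entry[:expires_at] > now
    @data[key] = { val: val, expires_at: now + ttl_ms }
    true
  end

  def del(key)
    @data.delete(key)
  end
end

# Run the block only if we win the lock; always release when done.
# The TTL bounds how long a crashed holder can wedge the lock.
def with_lock(store, key, ttl_ms: 5_000)
  return false unless store.set_nx_px(key, "owner", ttl_ms)
  begin
    yield
    true
  ensure
    store.del(key)
  end
end

store = FakeStore.new
got = with_lock(store, "search_index_lock") { puts "hello" }
puts got  # => true
```

The nice property over the semaphore for my use case: there is only one key, so there's no multi-key setup step to get half-done.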

@tbrammar

We're seeing the same behaviour, whereby every 2-3 days Semaphore reaches a bad state. We lock on average around 5,000-6,000 times a day. Eventually only the VERSION and EXISTS keys remain.

We tried setting expiration: 2.minutes, but as @bentheax mentions above, this seems to have little effect.

@thomasbalsloev

We have probably seen the same issue: no AVAILABLE key existed in Redis under the name we use, and I was unable to get a lock.
I solved it by manually pushing a token back onto the list (rails console):

2.2.0 :100 > c = Redis.current # get redis client instance
2.2.0 :101 > c.get('SEMAPHORE:package_builder:AVAILABLE')
=> nil
2.2.0 :102 > c.rpush('SEMAPHORE:package_builder:AVAILABLE', 1)
=> 1

Then I was able to get a lock for our named semaphore:
2.2.0 :001 > s = Redis::Semaphore.new(:package_builder)
2.2.0 :002 > s.available_count
=> 1
2.2.0 :003 > s.lock(5) {puts "hello"}
hello
=> nil

Hope this helps someone!
However, this is not a permanent fix.
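To automate that manual repair, something like the following could work. This is a sketch against an in-memory stand-in (with a real client you'd call llen/rpush the same way); the `repair` helper and the resource count of 1 are my assumptions, not part of the gem:

```ruby
# In-memory stand-in for the Redis lists, for illustration only.
class FakeStore
  def initialize
    @lists = Hash.new { |h, k| h[k] = [] }
  end

  def llen(key)
    @lists[key].length
  end

  def rpush(key, val)
    @lists[key].push(val)
  end
end

# Repush any tokens missing from both lists so the semaphore becomes
# acquirable again. `resources` is the semaphore's intended token count
# (1 for a mutex-style lock). Idempotent: a healthy semaphore is untouched.
def repair(store, name, resources)
  base = "SEMAPHORE:#{name}"
  held = store.llen("#{base}:AVAILABLE") + store.llen("#{base}:GRABBED")
  (resources - held).times { |i| store.rpush("#{base}:AVAILABLE", i + 1) }
end

store = FakeStore.new  # simulates the bad state: both lists empty
repair(store, "package_builder", 1)
puts store.llen("SEMAPHORE:package_builder:AVAILABLE")  # => 1
```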

@dv (Owner)

dv commented Sep 20, 2017

Is there any chance your Redis DB ever reached the maximum memory size, causing it to start to evict non-persisted keys?

@ben-axnick (Author)

ben-axnick commented Sep 21, 2017

I can't rule out the possibility, it was so long ago that I no longer have access to that data.
