Cluster in permanent failure state if all pods crash #24
Comments
It seems to be the same issue as I discovered here:
After deleting the pod that is stuck in permanent failure, the cluster still does not recover: all pods fail to join the Raft group because it doesn't exist yet. The only way to fix this at the moment is to delete the actual cluster request, let it terminate, and then recreate it.
Yes, currently if all pods crash while using a Raft cluster, it cannot recover: quorum has been lost, and there is no leader left to bootstrap the cluster.
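To illustrate the quorum point, here is a minimal sketch of the usual Raft majority rule. It is not taken from this project's code; the function names are hypothetical, and it assumes that a restarted pod with no persisted Raft state does not count as a live voter.

```python
def quorum_size(voters: int) -> int:
    """Smallest number of voters that forms a strict majority."""
    return voters // 2 + 1


def can_elect_leader(voters: int, alive: int) -> bool:
    """A leader can only be elected while a majority of voters is reachable."""
    return alive >= quorum_size(voters)


# A 3-node cluster tolerates the loss of one node, but not of all three.
print(can_elect_leader(3, 2))
print(can_elect_leader(3, 0))
```

With three voters the quorum is two, so losing every pod at once leaves nothing that can elect a leader, which matches the behaviour described in this thread.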
Is there really no way to fix a cluster once quorum is lost? Do you have to destroy all data and start from zero?
There is probably also an issue because it's currently not possible to use a PV for the store and/or Raft store: you can only create one PVC for all pods, due to #23.
I had the same problem, and I had to delete and recreate the whole thing several times before it suddenly worked. Is there something specific I can do to get it working the first time?
Wasn't resilience supposed to be the great benefit of all this? I came to the office this morning and found ALL
This is borderline ridiculous.
If all pods go down, the cluster never comes back up. (In this case I was testing on a single node, minikube, and the node crashed.)
The log will look permanently like this: