change_failure_domains_to_manual.py - Traceback error #160

Open
brionweka opened this issue Jun 21, 2023 · 2 comments

Comments

@brionweka

The following error occurred on a lab cluster running 4.0.5.14.

2023-06-21 18:49:13 LOG: Getting host-id from the current container on srv-35-02.lan
2023-06-21 18:49:23 LOG: Running 'weka debug manhole -s 0 getServerInfo' on srv-35-02.lan (10.233.105.0) via ssh, capturing output
2023-06-21 18:49:23 LOG: Waiting for host with srv-35-02.lan to become UP in 'weka cluster host'
2023-06-21 18:49:23 LOG: Running 'weka cluster host -J -F id=0' on srv-35-02.lan (10.233.105.0) via ssh, capturing output
2023-06-21 18:49:24 LOG: Validate the failure domain in the stable resources failure domain is FD_0, which means the container loaded properly with the right resources
2023-06-21 18:49:24 LOG: Running 'weka local resources --stable -J' on srv-35-02.lan (10.233.105.0) via ssh, capturing output
Traceback (most recent call last):
  File "change_failure_domains_to_manual.py", line 522, in <module>
    main()
  File "change_failure_domains_to_manual.py", line 499, in main
    upgrade(
  File "change_failure_domains_to_manual.py", line 513, in upgrade
    changed_hosts, skipped_hosts = change_failure_domains(
  File "change_failure_domains_to_manual.py", line 461, in change_failure_domains
    wait_for_unhealthy_cluster(timeout_secs=wait_unhealthy_timeout_secs)
  File "change_failure_domains_to_manual.py", line 64, in wait_for_unhealthy_cluster
    if time.time() - start >= timeout_secs:
TypeError: '>=' not supported between instances of 'float' and 'NoneType'
[root@srv-35-02 postinstall] 2023-06-21 18:49:26 $ python3.8 change_failure_domains_to_manual.py -i /root/.ssh/dev.pem
2023-06-21 18:49:44 LOG: Queried srv-35-02.lan: currently running with failure domain type USER (id: 0, name=FD_0)
2023-06-21 18:49:44 LOG: No need to change srv-35-02.lan, it already has manual failure domain called FD_0
2023-06-21 18:49:44 LOG: Queried srv-35-03.lan: currently running with failure domain type AUTO (id: 5, name=)
No rebuild is currently in progress
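
The traceback suggests that `timeout_secs` reached the comparison as `None`, so the elapsed-time float could not be compared against it. Below is a minimal sketch of a wait loop that guards against this. It is not the script's actual code: the function name `wait_for_condition`, the `check` callback, and the `DEFAULT_TIMEOUT_SECS` value are all hypothetical and used only for illustration.

```python
import time

# Hypothetical fallback; the real script's intended default is not known.
DEFAULT_TIMEOUT_SECS = 60.0


def wait_for_condition(check, timeout_secs=None, poll_interval_secs=0.01):
    """Poll check() until it returns True or the timeout elapses.

    Returns True if the condition was met, False if the wait timed out.
    """
    # Guard: replace a missing timeout with a default so the comparison
    # below never sees None.
    if timeout_secs is None:
        timeout_secs = DEFAULT_TIMEOUT_SECS

    start = time.monotonic()
    while True:
        if check():
            return True
        if time.monotonic() - start >= timeout_secs:
            return False
        time.sleep(poll_interval_secs)
```

With a guard like this, a caller that passes `timeout_secs=None` gets the fallback timeout instead of a `TypeError`. Whether a silent default or an explicit error is the better choice depends on how the script is meant to be invoked.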

@shazad-weka
Contributor

shazad-weka commented Jun 22, 2023

@IdoWeka can you fix? I believe this is your script.

@IdoWeka
Contributor

IdoWeka commented Jun 22, 2023

@shazad-weka I'll try to take a look.
