feat: adding python side retry mechanism #3354

Merged
merged 21 commits into from
Aug 26, 2024


germa89

@germa89 germa89 commented Aug 14, 2024

Description

Add a retry mechanism on the Python side. The assumption is that, because of network issues, a call did not reach the MAPDL instance, so it is safe to try it again.

My concern is triggering long-running calls (such as SOLVE) twice.
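To make the idea and the concern concrete, here is a minimal sketch of a retry decorator. It is not the implementation merged in this PR: `TransientConnectionError`, `retry_on_connection_error`, and the `skip` parameter are hypothetical names used only for illustration, and the exception class stands in for a transient gRPC failure such as `StatusCode.UNAVAILABLE`. Calls listed in `skip` are attempted only once, so a long command is never run twice.

```python
import functools
import time


class TransientConnectionError(Exception):
    """Hypothetical stand-in for a transient gRPC failure (e.g. UNAVAILABLE)."""


def retry_on_connection_error(max_attempts=3, initial_delay=0.01, backoff=2.0,
                              skip=("solve",)):
    """Retry a call that fails with a transient connection error.

    Functions whose name is in ``skip`` are attempted only once, so a
    long-running command is never triggered twice.
    """
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            attempts = 1 if func.__name__ in skip else max_attempts
            delay = initial_delay
            for attempt in range(1, attempts + 1):
                try:
                    return func(*args, **kwargs)
                except TransientConnectionError:
                    if attempt == attempts:
                        raise  # out of attempts: surface the original error
                    time.sleep(delay)
                    delay *= backoff  # wait longer before each new attempt
        return wrapper
    return decorator


calls = {"upload": 0, "solve": 0}


@retry_on_connection_error(max_attempts=3)
def upload():
    # Fails twice, then succeeds, to simulate a short network glitch.
    calls["upload"] += 1
    if calls["upload"] < 3:
        raise TransientConnectionError("connection refused")
    return "uploaded"


@retry_on_connection_error(max_attempts=3)
def solve():
    # Always fails; because "solve" is in ``skip`` it is attempted only once.
    calls["solve"] += 1
    raise TransientConnectionError("connection refused")
```

A name-based skip list is only one way to protect non-idempotent calls. The retry is safe only if the failed call really never reached the server, which the client cannot always tell.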

Issue linked

Related to #3342

Checklist

@germa89 germa89 requested a review from a team as a code owner August 14, 2024 17:04
@germa89 germa89 requested review from clatapie and pyansys-ci-bot and removed request for a team August 14, 2024 17:04
@ansys-reviewer-bot

Thanks for opening a Pull Request. If you want to perform a review write a comment saying:

@ansys-reviewer-bot review

@germa89 germa89 self-assigned this Aug 14, 2024
@github-actions github-actions bot added the new feature Request or proposal for a new feature label Aug 14, 2024
@github-actions github-actions bot added dependencies maintenance General maintenance of the repo (libraries, cicd, etc) labels Aug 14, 2024
@github-actions github-actions bot removed dependencies maintenance General maintenance of the repo (libraries, cicd, etc) labels Aug 19, 2024
@germa89

germa89 commented Aug 20, 2024

I feel I should ask for some advice here...

@koubaa @greschd @clatapie @ansys/pyansys-core please feel free to chip in, as much as you want, if you want :)


@SMoraisAnsys SMoraisAnsys left a comment

Your implementation seems fine, I just left a minor comment.

src/ansys/mapdl/core/mapdl_core.py (review thread resolved)

codecov bot commented Aug 21, 2024

Codecov Report

Attention: Patch coverage is 89.79592% with 5 lines in your changes missing coverage. Please review.

Project coverage is 87.41%. Comparing base (93dd176) to head (90793b3).
Report is 6 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3354      +/-   ##
==========================================
+ Coverage   87.13%   87.41%   +0.27%     
==========================================
  Files          55       55              
  Lines        9816     9843      +27     
==========================================
+ Hits         8553     8604      +51     
+ Misses       1263     1239      -24     

@germa89

germa89 commented Aug 22, 2024

@pyansys-ci-bot LGTM.


@pyansys-ci-bot pyansys-ci-bot left a comment

✅ Approving this PR because germa89 said so in here 😬

LGTM

@germa89 germa89 enabled auto-merge (squash) August 22, 2024 14:44
@germa89

germa89 commented Aug 22, 2024

The exception messages are now very clear:


Traceback (most recent call last):
  File "/Users/german.ayuso/pymapdl/src/ansys/mapdl/core/errors.py", line 318, in wrapper
    out = func(*args, **kwargs)
  File "/Users/german.ayuso/pymapdl/src/ansys/mapdl/core/mapdl_grpc.py", line 2498, in upload
    response = self._stub.UploadFile(chunks_generator)
  File "/Users/german.ayuso/pymapdl/.venv/lib/python3.10/site-packages/grpc/_channel.py", line 1518, in __call__
    return _end_unary_response_blocking(state, call, False, None)
  File "/Users/german.ayuso/pymapdl/.venv/lib/python3.10/site-packages/grpc/_channel.py", line 1005, in _end_unary_response_blocking
    raise _InactiveRpcError(state)  # pytype: disable=not-instantiable
grpc._channel._InactiveRpcError: <_InactiveRpcError of RPC that terminated with:
        status = StatusCode.UNAVAILABLE
        details = "failed to connect to all addresses; last error: UNKNOWN: ipv4:10.3.56.207:50053: Failed to connect to remote host: Connection refused"
        debug_error_string = "UNKNOWN:Error received from peer  {grpc_message:"failed to connect to all addresses; last error: UNKNOWN: ipv4:10.3.56.207:50053: Failed to connect to remote host: Connection refused", grpc_status:14, created_time:"2024-08-22T16:36:30.03004+02:00"}"
>

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/german.ayuso/pymapdl/tmp/crash.py", line 18, in <module>
    with mapdl.non_interactive:
  File "/Users/german.ayuso/pymapdl/src/ansys/mapdl/core/mapdl_core.py", line 1380, in __exit__
    self._parent()._flush_stored()
  File "/Users/german.ayuso/pymapdl/src/ansys/mapdl/core/mapdl_grpc.py", line 2028, in _flush_stored
    out = self.input(
  File "/Users/german.ayuso/pymapdl/src/ansys/mapdl/core/errors.py", line 318, in wrapper
    out = func(*args, **kwargs)
  File "/Users/german.ayuso/pymapdl/src/ansys/mapdl/core/mapdl_grpc.py", line 1785, in input
    filename = self._get_file_path(fname, progress_bar)
  File "/Users/german.ayuso/pymapdl/src/ansys/mapdl/core/mapdl_grpc.py", line 1939, in _get_file_path
    self.upload(ffullpath, progress_bar=progress_bar)
  File "/Users/german.ayuso/pymapdl/src/ansys/mapdl/core/errors.py", line 373, in wrapper
    handle_generic_grpc_error(error, func, args, kwargs, reason, suggestion)
  File "/Users/german.ayuso/pymapdl/src/ansys/mapdl/core/errors.py", line 452, in handle_generic_grpc_error
    raise MapdlExitedError(msg)
ansys.mapdl.core.errors.MapdlExitedError: Error:
MAPDL server connection terminated unexpectedly while running:
  /var/folders/m7/qsr6z1m57rx8z_8xj8f3nm9r0000gp/T/tmp_jmpiuddomp.inp
called by:
  upload

Suggestions:
  MAPDL *might* have died because it executed a not-allowed command or ran out of memory.
  Check the MAPDL command output for more details.
  Open an issue on GitHub if you need assistance: https://github.com/ansys/pymapdl/issues
Error:
  failed to connect to all addresses; last error: UNKNOWN: ipv4:10.3.56.207:50053: Failed to connect to remote host: Connection refused
Full error:
<_InactiveRpcError of RPC that terminated with:
        status = StatusCode.UNAVAILABLE
        details = "failed to connect to all addresses; last error: UNKNOWN: ipv4:10.3.56.207:50053: Failed to connect to remote host: Connection refused"
        debug_error_string = "UNKNOWN:Error received from peer  {grpc_message:"failed to connect to all addresses; last error: UNKNOWN: ipv4:10.3.56.207:50053: Failed to connect to remote host: Connection refused", grpc_status:14, created_time:"2024-08-22T16:36:30.03004+02:00"}"
>
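The chained traceback above ("During handling of the above exception, another exception occurred") is what Python prints when a new exception is raised inside an `except` block. The following sketch shows one way a wrapper could turn a low-level RPC error into a descriptive one. It is an illustration, not the code in `errors.py`: `LowLevelRpcError`, `ServerExitedError`, `build_message`, and `protect_rpc` are hypothetical names standing in for the real classes and helpers.

```python
import functools


class LowLevelRpcError(Exception):
    """Hypothetical stand-in for grpc's _InactiveRpcError."""


class ServerExitedError(RuntimeError):
    """Hypothetical stand-in for MapdlExitedError."""


def build_message(func_name, reason, suggestion, error):
    """Compose a message with the same sections as the output above."""
    return (
        "Error:\n"
        f"{reason}\n"
        "called by:\n"
        f"  {func_name}\n\n"
        "Suggestions:\n"
        f"  {suggestion}\n"
        "Error:\n"
        f"  {error}"
    )


def protect_rpc(func):
    """Convert low-level RPC failures into a descriptive domain error."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except LowLevelRpcError as error:
            msg = build_message(
                func.__name__,
                "Server connection terminated unexpectedly.",
                "Check the server output for more details.",
                error,
            )
            # Raising inside the except block keeps the original error
            # attached as __context__, which produces the chained traceback.
            raise ServerExitedError(msg)
    return wrapper


@protect_rpc
def upload():
    # Simulates the failing RPC from the traceback above.
    raise LowLevelRpcError("failed to connect to all addresses")
```

Calling `upload()` raises `ServerExitedError`, and the original `LowLevelRpcError` remains reachable through the exception's `__context__` attribute.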

@germa89

germa89 commented Aug 22, 2024

@PipKat can you spot any issues in this message (surely there are some!)?

ansys.mapdl.core.errors.MapdlExitedError: Error:
MAPDL server connection terminated unexpectedly while running:
  /var/folders/m7/qsr6z1m57rx8z_8xj8f3nm9r0000gp/T/tmp_jmpiuddomp.inp
called by:
  upload

Suggestions:
  MAPDL *might* have died because it executed a not-allowed command or ran out of memory.
  Check the MAPDL command output for more details.
  Open an issue on GitHub if you need assistance: https://github.com/ansys/pymapdl/issues
Error:
  failed to connect to all addresses; last error: UNKNOWN: ipv4:10.3.56.207:50053: Failed to connect to remote host: Connection refused
Full error:
<_InactiveRpcError of RPC that terminated with:
        status = StatusCode.UNAVAILABLE
        details = "failed to connect to all addresses; last error: UNKNOWN: ipv4:10.3.56.207:50053: Failed to connect to remote host: Connection refused"
        debug_error_string = "UNKNOWN:Error received from peer  {grpc_message:"failed to connect to all addresses; last error: UNKNOWN: ipv4:10.3.56.207:50053: Failed to connect to remote host: Connection refused", grpc_status:14, created_time:"2024-08-22T16:36:30.03004+02:00"}"
>

@germa89

germa89 commented Aug 22, 2024

There are issues with CICD: it is taking forever to finish (75 minutes). It seems to get stuck after running this test.


Hello! 👋

Your PR is changing the image cache. So I am attaching the new image cache in a new commit.

This commit does not re-run the CICD workflows (since no changes are made in the codebase), so you will see the actions showing the status "Expected — Waiting for status to be reported". Do not worry. Your commit workflow is still running here 😄

You might want to rerun the tests to make sure that everything is passing. You can retrigger the CICD by sending an empty commit: `git commit -m "Empty comment to trigger CICD" --allow-empty`.

You will see this message every time your commit changes the image cache but you are not attaching the updated cache. 🤓

@germa89 germa89 merged commit 553bae7 into main Aug 26, 2024
57 of 58 checks passed
@germa89 germa89 deleted the feat/adding-python-side-retry-mechanism branch August 26, 2024 11:17
Labels
new feature Request or proposal for a new feature
5 participants