Segmentation Fault Code 11 with oshrun and shmem_finalize() on new openSHMEM system #6
Comments
On Mar 4, 2020, at 6:58 PM, ming17 ***@***.***> wrote:
> Currently trying to install openSHMEM with ucx on a new system, and running into an error involving shmem_finalize(). I have included the source code and error message below, but am unsure what is causing the error.
>
> What address is the segmentation fault (signal: 11) referring to? The code was run using oshrun and compiled using oshcc, as referenced in the installation guide.
Thanks for the report. I’ve never seen anything like this.
Need to do some discovery:
Does this happen with just 1 PE? If so, can you provide output from a run with the executable inside gdb?
Is the OPA in the hostname referring to Omnipath?
Tony
|
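On the question of which address a segmentation fault refers to: on POSIX systems, the address reported with signal 11 (SIGSEGV) is normally the memory address the process tried to access, delivered to a handler as `si_addr` in `siginfo_t`. It says where the bad access pointed, not which function performed it, which is why a gdb backtrace is still needed. The sketch below is not taken from osss-ucx or from the reporter's program; `probe_fault_address` and `on_segv` are hypothetical names, and it assumes a Linux-like system where low addresses such as 0x10 are unmapped.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stdint.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Write end of the pipe, used by the signal handler in the child. */
static int report_fd = -1;

/* SIGSEGV handler: si_addr holds the address whose access faulted. */
static void on_segv(int sig, siginfo_t *info, void *ctx)
{
    (void)sig;
    (void)ctx;
    uintptr_t addr = (uintptr_t)info->si_addr;
    /* write() and _exit() are async-signal-safe; printf() is not. */
    ssize_t n = write(report_fd, &addr, sizeof addr);
    (void)n;
    _exit(1);
}

/*
 * Hypothetical helper: fork a child that reads one byte from `target`
 * and report the fault address seen by the child's SIGSEGV handler.
 * Returns 0 and fills *fault_addr if the child faulted, -1 otherwise.
 */
int probe_fault_address(uintptr_t target, uintptr_t *fault_addr)
{
    assert(fault_addr != NULL);

    int fds[2];
    if (pipe(fds) != 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0) {                       /* child */
        close(fds[0]);
        report_fd = fds[1];

        struct sigaction sa;
        memset(&sa, 0, sizeof sa);
        sa.sa_sigaction = on_segv;
        sa.sa_flags = SA_SIGINFO;         /* request the siginfo_t argument */
        sigemptyset(&sa.sa_mask);
        sigaction(SIGSEGV, &sa, NULL);

        volatile char *p = (volatile char *)target;
        char c = *p;                      /* faults if target is unmapped */
        (void)c;
        _exit(0);                         /* no fault occurred */
    }

    /* parent: wait for the address reported by the child's handler */
    close(fds[1]);
    uintptr_t addr = 0;
    ssize_t n = read(fds[0], &addr, sizeof addr);
    close(fds[0]);
    waitpid(pid, NULL, 0);

    if (n != (ssize_t)sizeof addr)
        return -1;
    *fault_addr = addr;
    return 0;
}
```

Calling `probe_fault_address(0x10, &addr)` on a typical Linux system should leave `addr` equal to the address that was passed in. An address close to zero in a real crash report therefore usually points to a NULL or near-NULL pointer being dereferenced, possibly with a small field offset added.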
On Mar 4, 2020, at 7:03 PM, Tony Curtis ***@***.***> wrote:
Also, the SHMEM_INFO output from your program run looks really out of date: are you up to date with the repo at https://github.com/openshmem-org/osss-ucx?
Tony
|
This does happen with just 1 PE. I don't have sudo access for the system so I can't get the gdb output (some debuginfo libs are missing). OPA is referring to omnipath. I will check about the installation date. |
Hi, I'm working on this system with ming17 too. We do not have install permissions on the system, but sent this repo to one of the admins. It looks like he installed osss-ucx/1.0.2, which is the latest version on your releases page. Are you saying that we should configure from the master? |
On Mar 6, 2020, at 10:44 AM, Alex Johnson ***@***.***> wrote:
> Hi, I'm working on this system with ming17 too. We do not have install permissions on the system, but sent this repo to one of the admins. It looks like he installed osss-ucx/1.0.2, which is the latest version on your releases page. Are you saying that we should configure from the master?
Hi, yes, that is old. You should be tracking master. You don’t need any root privileges; you can just install things under your home directory.
Tony
|
Any update on this? I've been having this issue and can't find a solution. |
I have no context for the cause of this issue. |
Currently trying to install openSHMEM with ucx on a new system, and running into an error involving shmem_finalize(). I have included the source code and error message below, but am unsure what is causing the error.
What address is the segmentation fault (signal: 11) referring to? The code was run using oshrun and compiled using oshcc, as referenced in the installation guide.
Source Code:
Error: