Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[C++] S3FS can sometimes send sigpipe signals to the application #44695

Open
mderoy opened this issue Nov 11, 2024 · 7 comments
Open

[C++] S3FS can sometimes send sigpipe signals to the application #44695

mderoy opened this issue Nov 11, 2024 · 7 comments

Comments

@mderoy
Copy link

mderoy commented Nov 11, 2024

Our application uses many instances running arrow in parallel on linux. Occasionally we will get a SIGPIPE signal from our calls to the s3 filesystem (for both reads and writes). We believe these are coming from the S3 api thats being embedded by arrow. Since arrow is multithreaded, handling these signals around the scope of our arrow api calls is ineffective.

AWS exposes an option to install their own handler
in SDKOptions.HttpOptions.installSigPipeHandler
https://github.com/aws/aws-sdk-cpp/blob/a154acd5893e2b2c913844e312648149d12a12d1/src/aws-cpp-sdk-core/include/aws/core/Aws.h#L103
but arrow has no way to set this. S3GlobalOptions only exposes the s3 log level and number of event loop threads, and the rest of the options are kept private within arrow.

Arrow needs a way to pass this (and probably other AWS options). So I'm opening this bug report.

if the community is okay with adding the installSigPipeHandler option to S3GlobalOptions I'd be happy to contribute.... though I wonder why we don't open the floodgates to let the user set s3 options directly at will?

version: 13.0.0
platform: rhel 8

Component(s)

C++

@mderoy mderoy changed the title s3fs can sometimes send sigpipe signals to the application, which can cause issues s3fs can sometimes send sigpipe signals to the application Nov 11, 2024
@mderoy mderoy changed the title s3fs can sometimes send sigpipe signals to the application S3FS can sometimes send sigpipe signals to the application Nov 11, 2024
@assignUser
Copy link
Member

Hey, thanks for your report!

if the community is okay with adding the installSigPipeHandler option to S3GlobalOptions I'd be happy to contribute.... though I wonder why we don't open the floodgates to let the user set s3 options directly at will?

I looked at the original PR and I don't think there was a motivating reason outside of scope in reducing the number of exposed global options. Maybe @pitrou can comment? A PR would be appreciated either way but I think you can go ahead and open the flood gates :D

@pitrou
Copy link
Member

pitrou commented Nov 11, 2024

I'm curious, why don't you simply ignore SIGPIPE signals by yourself? It should be as easy as:

signal(SIGPIPE, SIG_IGN);

IMHO, it doesn't seem useful to expose an option for something you can trivially do yourself. Except perhaps to document the existence of this issue.

@assignUser
Copy link
Member

IMHO, it doesn't seem useful to expose an option for something you can trivially do yourself.

Is there a reason not to expose the options? I doubt that they change constantly and create a lot of maintenance work?

@pitrou pitrou changed the title S3FS can sometimes send sigpipe signals to the application [C++] S3FS can sometimes send sigpipe signals to the application Nov 12, 2024
@pitrou
Copy link
Member

pitrou commented Nov 12, 2024

Is there a reason not to expose the options? I doubt that they change constantly and create a lot of maintenance work?

Do you mean expose additional individual options in S3GlobalOptions, or let people pass a SDKOptions directly?

@assignUser
Copy link
Member

hm I guess allowing SDKOptions to be passed would be the solution that requires minimal maintenance and provides maximum features to users? Or do we gain anything from wrapping it in our own abstraction with limited features? It should work with forward declaration similar to how we do it with AWSCredentialsProvider without the need to expose aws.h?

@pitrou
Copy link
Member

pitrou commented Nov 14, 2024

hm I guess allowing SDKOptions to be passed would be the solution that requires minimal maintenance and provides maximum features to users? Or do we gain anything from wrapping it in our own abstraction with limited features?

Well, the problem is that allowing SDKOptions needs the user to install the right version of the SDK headers, and include them in their project.

So, if there are any important options there, we should provide an abstraction around them.

pitrou added a commit to pitrou/arrow that referenced this issue Nov 14, 2024
This is mostly convenience for people who don't want to install their own signal handler, though the SDK will also log incoming signals.
@assignUser
Copy link
Member

assignUser commented Nov 15, 2024

to install the right version

😱 ah yeah, that is a reason to wrap them ^^

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants