S3 assetstore doesn't work #754

Open
arjunrajlab opened this issue Jul 16, 2024 · 9 comments

@arjunrajlab
Collaborator

It appears that the file can be put onto the S3 assetstore, but for some reason it is not then recognized and converted to a large image.

@arjunrajlab arjunrajlab added the bug Something isn't working label Jul 16, 2024
@arjunrajlab arjunrajlab added this to the Alpha-Version milestone Jul 16, 2024
@manthey
Member

manthey commented Jul 16, 2024

Curiously, when we upload to an S3 bucket, it takes a small amount of time (maybe 0.9 seconds?) before we can read back the data. As soon as the upload is done, we try to make the uploaded file a large_image; this fails because we can't actually read it yet (we get a response saying the key doesn't exist), so the rest of the processing steps don't occur. If I manually hack in a wait loop until we can read the data, it does work. I'll try to convert that hack into something less hacky.

@arjunrajlab
Collaborator Author

I was looking it up, and it seems that S3 has an eventual consistency model that may be the culprit? It seems that you get immediate read-after-write consistency for a PUT of an object, but if you modify or delete an object, there can be a delay. Perhaps some modification, rather than the creation of a new object, is going on here?

@manthey
Member

manthey commented Jul 17, 2024

I'll have a PR in large_image shortly that checks that the size of the file reported by S3 matches the size we think we uploaded and waits if it doesn't. In my testing, the delay to and from S3 is certainly variable: sometimes it seems to be a few milliseconds, but for a little while it was getting close to a second.
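
For readers following along, here is a minimal sketch of that kind of size check. It is not the code from the large_image PR; `wait_for_size`, its parameters, and the default timeout are all hypothetical names and values chosen for illustration. The caller supplies `get_size`, a function that returns the size the store currently reports, or `None` if the key is not visible yet.

```python
import time


def wait_for_size(get_size, expected_size, timeout=5.0, interval=0.05,
                  sleep=time.sleep, clock=time.monotonic):
    """Poll get_size() until it reports expected_size or the timeout elapses.

    Returns True if the expected size was seen, False on timeout.
    get_size is an assumed caller-supplied callable; it should return None
    when the object cannot be read yet.
    """
    deadline = clock() + timeout
    while True:
        if get_size() == expected_size:
            return True
        if clock() >= deadline:
            return False
        # Short pause before asking the store again.
        sleep(interval)


# Hypothetical wiring with boto3 (not executed here):
#   def get_size():
#       try:
#           return client.head_object(Bucket=bucket, Key=key)["ContentLength"]
#       except botocore.exceptions.ClientError:
#           return None
```

Bounding the loop with a timeout means a genuinely missing key eventually surfaces as a failure instead of hanging the upload handler.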

@manthey
Member

manthey commented Jul 17, 2024

See girder/large_image#1579

@arjunrajlab
Collaborator Author

Awesome! So… it works-ish. If I upload a file, it still gets hung up at the same point:

image

Now, however, if I refresh the page, it goes through:

image

So I guess we need something on the front end to wait for girder/large_image? Will we need that in other parts of the interface, too?

@arjunrajlab
Collaborator Author

As a related question: let's say a file grew "old" and S3 was configured to move it into Glacier. Then, I guess we would have to initiate a retrieval request and poll periodically until the file came back. Would that be mostly front-end work, or can it be built into Girder for the most part? Just curious at this stage.

@manthey
Member

manthey commented Jul 18, 2024

It depends on how AWS exposes the file. If the Glacier storage is a different assetstore, then when the file is moved to Glacier, we'd need some trigger to update the file record in Mongo to point to the Glacier assetstore and to have the appropriate key in that assetstore. If it is the same file, then the main issue is that when you ask for a file in Glacier, it can take hours before it is available, so on the first query it will appear broken and will only work after those hours have passed.

@arjunrajlab
Collaborator Author

My impression is that it is the latter: the file looks to be there, but you need to wait for it to come back before you can actually get it. One option is to restore the files manually and just have something that warns the user "File in deep storage; request for it to be restored" or something like that.
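
As a rough sketch of how such a warning could be driven, the function below classifies an object from the dictionary that boto3's `head_object` returns. It assumes the object stays under the same key and that the response carries the `StorageClass` and `Restore` fields in the form S3 documents them; `restore_state` and the state names are hypothetical, and Intelligent-Tiering archive tiers are not handled.

```python
# Storage classes that need an explicit restore before the object can be read.
ARCHIVE_CLASSES = {"GLACIER", "DEEP_ARCHIVE"}


def restore_state(head):
    """Classify an S3 object from a head_object-style response dict.

    Returns one of (hypothetical state names):
      "available": readable right away
      "archived":  in deep storage, no restore requested yet
      "restoring": restore requested and still in progress
      "restored":  a temporary restored copy is readable
    """
    # head_object omits StorageClass for STANDARD objects.
    storage_class = head.get("StorageClass", "STANDARD")
    if storage_class not in ARCHIVE_CLASSES:
        return "available"
    restore = head.get("Restore")
    if restore is None:
        return "archived"
    if 'ongoing-request="true"' in restore:
        return "restoring"
    return "restored"


# Hypothetical wiring with boto3 (not executed here):
#   head = client.head_object(Bucket=bucket, Key=key)
#   if restore_state(head) == "archived":
#       client.restore_object(
#           Bucket=bucket, Key=key,
#           RestoreRequest={"Days": 1,
#                           "GlacierJobParameters": {"Tier": "Standard"}})
```

A check like this could live on the server side, with the front end only showing the "File in deep storage" message for the "archived" and "restoring" states.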

@arjunrajlab
Collaborator Author

@bruyeret I think we need to update the front end to wait a little to get the image back to address the above issue, thanks!

Projects
Status: Selected for development