
Upload: Auto adjust maxChunkSize until upload succeeds after HTTP-413 #4826

Closed
jkellerer wants to merge 2 commits into master from enh/handle-http-413

Conversation


@jkellerer commented Aug 6, 2022

Currently, when Nextcloud runs behind a server with an upload limit, uploading a large file (e.g. several hundred MB) will likely fail unless nextcloud.cfg is manually adjusted on every endpoint to limit maxChunkSize (an upload limit is usually not set to 1 GB, but more likely to 100 MB or lower).

This PR implements auto-adjustment of the maximum chunk size when receiving HTTP 413 (Request Entity Too Large) or a connection-close error after large uploads.

This is achieved with the following strategy:

  • The config is enhanced with a parameter failsafeMaxChunkSize (default 100 MB).
  • When HTTP 413 is received:
    • If requestSize > failsafeMaxChunkSize, the failsafe max size applies and the sync is retried immediately.
    • Otherwise maxChunkSize = requestSize / 2 and the sync is retried until it succeeds or minChunkSize is reached (default 1 MB).
  • A connection-close error is treated like HTTP 413 if requestSize > failsafeMaxChunkSize. HTTP usually requires that requests are fully consumed, but servers may close the connection early when too much data is sent. Since connection close is not a specific error, only large uploads are handled by this change.
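The adjustment rule above can be sketched roughly as follows (a minimal illustration; the constant and function names are invented for this sketch and are not the PR's actual identifiers):

```cpp
#include <cassert>
#include <cstdint>

// Defaults described above; values are in bytes (decimal MB).
constexpr int64_t kMinChunkSize = 1'000'000;            // 1 MB
constexpr int64_t kFailsafeMaxChunkSize = 100'000'000;  // 100 MB

// Next chunk size to try after an HTTP 413 for a request of
// `requestSize` bytes; returns -1 when minChunkSize is reached
// and no further reduction is possible.
int64_t nextChunkSizeAfter413(int64_t requestSize)
{
    if (requestSize > kFailsafeMaxChunkSize)
        return kFailsafeMaxChunkSize;  // failsafe applies, retry immediately
    const int64_t halved = requestSize / 2;
    return halved >= kMinChunkSize ? halved : -1;
}
```

For example, a 316 MB request would first be clamped to the 100 MB failsafe; a further 413 for the 100 MB request would then halve it to 50 MB, and so on down to minChunkSize.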

While failsafeMaxChunkSize can be adjusted in nextcloud.cfg or via the env var OWNCLOUD_FAILSAFE_MAX_CHUNK_SIZE, it has a reasonable default so that manual configuration isn't required. When set properly, it limits the number of retries needed and enables uploads in the cases where the HTTP connection is closed in response to sending too much data.

For cases where the upload limit in the web server is greater than 1 MB and a connection-close error is only received for uploads larger than 100 MB, the default values allow all uploads to succeed without any additional configuration.
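For illustration, the relevant nextcloud.cfg entries might look like this (values are in bytes; the first three keys exist today, while failsafeMaxChunkSize is the key this PR introduces — treat the exact values as an example, not a recommendation):

```
[General]
; initial chunk size: 10 MB
chunkSize=10000000
; lower bound for auto-sensing: 1 MB
minChunkSize=1000000
; upper bound: 1 GB
maxChunkSize=1000000000
; new key introduced by this PR, default: 100 MB
failsafeMaxChunkSize=100000000
```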

This should fix most cases reporting errors in this area like #4278 and #925.

For two concurrent uploads (~0.5 GB each) with the nginx frontend limited to 100 MB, this looks as follows (in an earlier implementation without failsafeMaxChunkSize):

2022-08-06 22:18:06:368 [ warning nextcloud.sync.credentials.webflow ./src/gui/creds/webflowcredentials.cpp:229 ]:	"Error transferring https://nextcloud/remote.php/dav/uploads/user/2789530175/0000000000000001 - server replied: Request Entity Too Large"
2022-08-06 22:18:06:369 [ info nextcloud.sync.networkjob.put ./src/libsync/propagateupload.cpp:87 ]:	PUT of "https://nextcloud/remote.php/dav/uploads/user/2789530175/0000000000000001" FINISHED WITH STATUS "UnknownContentError Server replied \"413 Request Entity Too Large\" to \"PUT https://nextcloud/remote.php/dav/uploads/user/2789530175/0000000000000001\"" QVariant(int, 413) QVariant(QString, "Request Entity Too Large")
2022-08-06 22:18:06:369 [ debug nextcloud.sync.propagator.upload ./src/libsync/propagateupload.cpp:662 ]	[ OCC::PropagateUploadFileCommon::commonErrorHandling ]:	"<html>\r\n<head><title>413 Request Entity Too Large</title></head>\r\n<body>\r\n<center><h1>413 Request Entity Too Large</h1></center>\r\n<hr><center>nginx/1.22.0</center>\r\n</body>\r\n</html>\r\n"
2022-08-06 22:18:06:369 [ info nextcloud.sync.database ./src/common/syncjournaldb.cpp:1798 ]:	Setting blacklist entry for "test.img.gz" 1 "Upload of 316 MB exceeds the max upload size allowed by the server. Reducing max upload size to 158 MB" 1659817086 0 1627988766 "" "" 0

2022-08-06 22:18:30:645 [ warning nextcloud.sync.credentials.webflow ./src/gui/creds/webflowcredentials.cpp:229 ]:	"Error transferring https://nextcloud/remote.php/dav/uploads/user/2750000573/0000000000000000 - server replied: Request Entity Too Large"
2022-08-06 22:18:30:645 [ info nextcloud.sync.networkjob.put ./src/libsync/propagateupload.cpp:87 ]:	PUT of "https://nextcloud/remote.php/dav/uploads/user/2750000573/0000000000000000" FINISHED WITH STATUS "UnknownContentError Server replied \"413 Request Entity Too Large\" to \"PUT https://nextcloud/remote.php/dav/uploads/user/2750000573/0000000000000000\"" QVariant(int, 413) QVariant(QString, "Request Entity Too Large")
2022-08-06 22:18:30:645 [ debug nextcloud.sync.propagator.upload ./src/libsync/propagateupload.cpp:662 ]	[ OCC::PropagateUploadFileCommon::commonErrorHandling ]:	"<html>\r\n<head><title>413 Request Entity Too Large</title></head>\r\n<body>\r\n<center><h1>413 Request Entity Too Large</h1></center>\r\n<hr><center>nginx/1.22.0</center>\r\n</body>\r\n</html>\r\n"
2022-08-06 22:18:30:646 [ info nextcloud.sync.database ./src/common/syncjournaldb.cpp:1798 ]:	Setting blacklist entry for "Test.ISO" 1 "Upload of 158 MB exceeds the max upload size allowed by the server. Reducing max upload size to 80 MB" 1659817110 0 1418024480 "" "" 0

2022-08-06 22:18:31:467 [ warning nextcloud.sync.credentials.webflow ./src/gui/creds/webflowcredentials.cpp:229 ]:	"Error transferring https://nextcloud/remote.php/dav/uploads/user/2789530175/0000000000000002 - server replied: Request Entity Too Large"
2022-08-06 22:18:31:467 [ info nextcloud.sync.networkjob.put ./src/libsync/propagateupload.cpp:87 ]:	PUT of "https://nextcloud/remote.php/dav/uploads/user/2789530175/0000000000000002" FINISHED WITH STATUS "UnknownContentError Server replied \"413 Request Entity Too Large\" to \"PUT https://nextcloud/remote.php/dav/uploads/user/2789530175/0000000000000002\"" QVariant(int, 413) QVariant(QString, "Request Entity Too Large")
2022-08-06 22:18:31:467 [ debug nextcloud.sync.propagator.upload ./src/libsync/propagateupload.cpp:662 ]	[ OCC::PropagateUploadFileCommon::commonErrorHandling ]:	"<html>\r\n<head><title>413 Request Entity Too Large</title></head>\r\n<body>\r\n<center><h1>413 Request Entity Too Large</h1></center>\r\n<hr><center>nginx/1.22.0</center>\r\n</body>\r\n</html>\r\n"
2022-08-06 22:18:31:467 [ info nextcloud.sync.database ./src/common/syncjournaldb.cpp:1798 ]:	Setting blacklist entry for "test.img.gz" 1 "Upload of 158 MB exceeds the max upload size allowed by the server. Reducing max upload size to 80 MB" 1659817111 0 1627988766 "" "" 0

After three attempts, the uploads succeed.

Hope this is accepted. Feedback is welcome.

@jkellerer force-pushed the enh/handle-http-413 branch 5 times, most recently from 67be42c to 9ef12fb on August 11, 2022 at 18:42
@kesselb added the labels "2. to review" and "enhancement" (enhancement of an already implemented feature/code) on Aug 26, 2022
@codecov

codecov bot commented Aug 26, 2022

Codecov Report

Merging #4826 (9ef12fb) into master (3dc583c) will decrease coverage by 3.61%.
Report is 80 commits behind head on master.
The diff coverage is 95.45%.

❗ Current head 9ef12fb differs from the pull request's most recent head 1cb6a19. Consider uploading reports for commit 1cb6a19 to get more accurate results.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #4826      +/-   ##
==========================================
- Coverage   60.79%   57.18%   -3.61%     
==========================================
  Files         145      138       -7     
  Lines       18836    17187    -1649     
==========================================
- Hits        11451     9829    -1622     
+ Misses       7385     7358      -27     
Files Coverage Δ
src/libsync/account.h 52.00% <100.00%> (+9.14%) ⬆️
src/libsync/configfile.cpp 26.25% <100.00%> (-1.78%) ⬇️
src/libsync/owncloudpropagator.cpp 86.32% <100.00%> (+1.79%) ⬆️
src/libsync/propagateuploadng.cpp 83.95% <100.00%> (-1.10%) ⬇️
src/libsync/propagateuploadv1.cpp 69.44% <100.00%> (ø)
src/libsync/syncfileitem.h 38.59% <100.00%> (-4.96%) ⬇️
src/libsync/syncoptions.h 100.00% <100.00%> (ø)
src/libsync/propagateupload.cpp 77.16% <93.33%> (+1.62%) ⬆️
src/libsync/syncoptions.cpp 71.73% <90.90%> (-2.73%) ⬇️

... and 94 files with indirect coverage changes

Outdated review threads (resolved): src/libsync/account.h (2), src/libsync/configfile.cpp (1), src/libsync/propagateupload.cpp (3)
@jkellerer
Author

Hi @allexzander, are you ok with the updates to your review or do you expect additional changes?

@kesselb
Contributor

kesselb commented Sep 17, 2022

Thanks for your pull request 👍

It's a good improvement to handle a 413 response properly. I wonder if it's possible to make this feature work without a new configuration option. If I read the existing code correctly, we have some calculation to increase the chunk size step by step. Could we store the last working chunk size and fall back to it on a 413 response? I guess there is nothing wrong with an additional configuration option, but we already have four for chunked uploads 🤔

@jkellerer
Author

Hi @kesselb, I see your point.

The reason for the config value is primarily to have a starting point for the auto-sensing that doesn't hurt performance too much (since both cases, connection close and HTTP 413, are handled the same way for uploads larger than failsafeMaxChunkSize).

What could be done is to have a built-in range for failsafeMaxChunkSize (e.g. from 100 MB down to 25 MB) that is evaluated on the first failures and then remembered in the KV store. If uploads still fail, auto-sensing would go down to minChunkSize, but keep that value only until the next restart and then start again from the remembered failsafeMaxChunkSize.

This ensures that performance cannot get too bad or stay bad when limits are changed again.

The only question is whether this should be reset at some point or stay in the DB forever.

@jkellerer
Author

Hi @kesselb, @allexzander,

Have updated the implementation as follows:

  • On client restart, the _min and _max chunk sizes always apply, until the first failure.
  • On the first failure, the remembered _failsafe chunk size applies and is used as the starting point for auto-sensing.
  • The sensed _failsafe chunk size is remembered down to LowestPossibleFailsafeMaxChunkSize (25 MB).
  • Lower limits than this are not persisted and always require auto-sensing after a client restart; this ensures a misconfiguration of the server cannot persist values that would seriously hurt upload performance.
  • In addition to the _failsafe size, the _initial chunk size is also stored, as this eliminates the initial upload with 10 MB that would otherwise happen with every new sync job.
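The persistence rule in the last bullets can be sketched like this (names are invented for illustration; only LowestPossibleFailsafeMaxChunkSize and the 25 MB floor come from the description above):

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Floor below which a sensed failsafe size is not written to the DB.
constexpr int64_t kLowestPossibleFailsafeMaxChunkSize = 25'000'000;  // 25 MB

// Returns the value to persist for a sensed failsafe chunk size, or
// std::nullopt when the value is below the floor and should only be
// kept for the current session (re-sensed after a client restart).
std::optional<int64_t> persistedFailsafeSize(int64_t sensedSize)
{
    if (sensedSize >= kLowestPossibleFailsafeMaxChunkSize)
        return sensedSize;  // remembered across client restarts
    return std::nullopt;    // session-only
}
```

This is what prevents a temporarily misconfigured server from pinning the client to a tiny chunk size forever.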

@jkellerer
Author

The latest commit ensures the _initial chunk size is persisted only when the chunk size is dynamic (and supported). However, I think this value is better kept as a runtime value in account() and should probably be handled in a separate PR. For reference, I have not removed it.

@allexzander
Contributor

@mgallien WDYT?

@jospoortvliet
Member

@mgallien as this is also causing clients to fail to sync, perhaps it'd be good to make it part of the fixes for reliable syncing?

@kesselb
Contributor

kesselb commented Dec 7, 2022

If it takes more time to finish this pull request, we could also revert #3600

maxChunkSize = 1000 is only possible when the server accepts such a big file.

@sveken

sveken commented Jan 20, 2023

If it takes more time to finish this pull request, we could also revert #3600

maxChunkSize = 1000 is only possible when the server accepts such a big file.

Reverting it while a better fix is worked on would be appreciated.

@jkellerer
Author

Btw, I'm happy to update the PR if anyone can find the time to review it. In the meantime, reducing maxChunkSize is a quick and safe fix.

Collaborator

@claucambra left a comment


Hello, we are really sorry this slipped through the cracks for so long -- this is a fantastic contribution, thank you.

Would you be willing to fix the merge conflicts and make this mergeable again? If not, we are happy to cherry-pick your commits and fix it ourselves.

@jkellerer
Author

@claucambra, will look into it this week.

@nextcloud-desktop-bot

AppImage file: nextcloud-PR-4826-09be2f7e8139a39a115396d77a8cf666b548bcb4-x86_64.AppImage

To test this change/fix, you can simply download the AppImage file above and test it.

Please make sure to quit your existing Nextcloud app and back up your data.

@l4kr

This comment was marked as resolved.

@vithusel

This comment was marked as resolved.

@jospoortvliet
Member

Hi team, what is needed to get this to the finishing line & get it merged?

@jkellerer
Author

jkellerer commented Dec 10, 2023

@jospoortvliet, what's missing are fixes for the unit tests. On the test systems I had access to (Windows Server 2016 x64, macOS x64), I had no test failures (for the parts that are supported; Win 2016 seems to be missing some, but the test that failed in CI was working fine there).

Another Windows 11 instance produced some other issues, but that ran on ARM64, so it is not the most reliable result. To be honest, I am not sure the changes made here affect the test failures, as the results vary quite a bit depending on the OS used.

Signed-off-by: Matthieu Gallien <matthieu_gallien@yahoo.fr>
@Hyp3rSoniX

Hyp3rSoniX commented Mar 30, 2024

Is it possible to revive this PR?

I had to fiddle around with the chunk size config on my fresh new system to get it to work. It seems like this PR would modify the parameters dynamically on the fly, which could solve most such problems where the reverse proxy (or whatever sits in front) doesn't like the out-of-the-box chunk size.

Adding the following configs helped me in my current case on mac:

chunkSize=10000000
minChunkSize=1000000
maxChunkSize=50000000
targetChunkUploadDuration=6000

@JasperTheMinecraftDev

Up! Please take a look at this, team!

@pktiuk

pktiuk commented May 20, 2024

@mgallien @allexzander Could you look at this PR? This fix is waiting too long.

@codefaux

This comment was marked as off-topic.

@Rello
Contributor

Rello commented Dec 3, 2024

Hello,

Thank you for your contribution to the Desktop Client with this pull request.
We are closing this request as it is outdated, no longer relevant (e.g., due to Qt 6), or does not align with the current roadmap.

We truly value your support and hope to see more contributions from you in the future!

Best regards,
Nextcloud Desktop Team

@Rello closed this Dec 3, 2024
@codefaux

codefaux commented Dec 3, 2024

Wait, this is for an active issue which had comments on it a few weeks ago that were ignored. The reason it was not accepted is that your TESTS ARE BROKEN, from what I can see. You're closing it because you can't fix it, or what?

Issue #4278 == Won't fix? Is this a dumb joke?

@Rello
Contributor

Rello commented Dec 3, 2024

Hello,
Thank you for reporting this. We will check if there is a misunderstanding here.
I will get back to you...

@skaskiewicz

That's why I migrated all clients to ownCloud.
Small things sometimes mean a lot.

@codefaux

codefaux commented Dec 3, 2024

small things sometimes mean a lot

I wouldn't count "basic file sync functionality" as a "small thing" personally.

@Rello
Contributor

Rello commented Dec 4, 2024

Hello,

please let me explain the reason for this particular closing, which was not properly documented in the closing note.

The feature is not isolated to one desktop client. It affects all clients: Web, Linux, Windows, macOS, macOS VFS, Android, iOS.
Every client has a somewhat different implementation, so we are aligning all of them by providing proper integration via server capabilities that will be used by all clients in the same way.

The server implementation was performed here:
nextcloud/server#47682

The corresponding desktop implementation will be tracked here
#7347

I hope this clears up the confusion, and I apologize for not putting this in the original closing post.

@jkellerer deleted the enh/handle-http-413 branch December 4, 2024 14:27
@xordspar0

#7347 tracks the use of this feature that was added to the server: nextcloud/server#48758

Instead of this PR's clever logic for guessing the max chunk size, clients can now simply ask the server how big the max chunk size is. Nice!
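For context, the server-side setting is exposed via the OCS capabilities endpoint; a response excerpt might look roughly like this (key names assumed from the linked server PR, shown abridged — verify against the actual server response):

```
{
  "ocs": {
    "data": {
      "capabilities": {
        "files": {
          "chunked_upload": {
            "max_size": 104857600,
            "max_parallel_count": 5
          }
        }
      }
    }
  }
}
```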

Successfully merging this pull request may close these issues.

"Connection closed" message when syncing files larger then +- 100Mb