Improve speed for initial sync with virtual files #4424
Comments
I'm experiencing a similar problem on macOS (around 200k files). In my humble opinion, synchronizing the full hierarchy is the key problem here. The typical end user doesn't need to have the full folder hierarchy saved and synchronized. A lazier approach (i.e. trigger on open and/or scan only the opened subfolders, not the whole depth but perhaps 1 or 2 levels below) would scale better and decrease the load on the NC server. |
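The lazier approach suggested above can be illustrated with a minimal sketch. It is not Nextcloud client code: `REMOTE_TREE`, `list_lazily` and `max_depth` are hypothetical names, and an in-memory dict stands in for the server's folder listing. The idea is only that opening a folder triggers discovery of that folder plus a limited number of levels below it, instead of the whole tree.

```python
from collections import deque

# Hypothetical in-memory stand-in for the server's folder tree:
# a dict is a folder (name -> child), None marks a file.
REMOTE_TREE = {
    "Documents": {
        "report.odt": None,
        "Projects": {
            "alpha": {"deep": {"notes.txt": None}},
        },
    },
    "Photos": {"2023": {"img001.jpg": None}},
}


def list_lazily(tree, opened_path, max_depth=2):
    """Return entries under opened_path, at most max_depth levels below it."""
    # Walk down to the folder the user opened.
    node = tree
    for part in [p for p in opened_path.split("/") if p]:
        node = node[part]

    found = []
    queue = deque([(opened_path.strip("/"), node, 0)])
    while queue:
        prefix, folder, depth = queue.popleft()
        for name, child in sorted(folder.items()):
            path = f"{prefix}/{name}" if prefix else name
            found.append(path)
            # Descend only while the child's contents stay within the limit.
            if isinstance(child, dict) and depth + 1 < max_depth:
                queue.append((path, child, depth + 1))
    return found
```

With `max_depth=2`, opening `Documents` discovers `Documents/Projects/alpha` but leaves everything below `alpha` untouched until the user navigates there.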
I'd like to add that restarting the sync, the client or the PC results in a complete restart of the process. Also, the sync doesn't seem to start immediately: it first counts all the files it will sync and only then starts syncing. The counting alone takes two days for me, and the sync isn't done after more than 10 days. At the very least, the sync should pick up where it left off. |
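"Picking up where it left off" could look roughly like the sketch below. It is an illustration under stated assumptions, not a description of how the Nextcloud client stores its state: `sync_with_checkpoint`, the JSON checkpoint file and the `stop_after` parameter (used only to simulate an interruption) are all hypothetical.

```python
import json
import os


def sync_with_checkpoint(paths, checkpoint_file, process, stop_after=None):
    """Process paths, skipping those recorded as done in checkpoint_file.

    Returns the number of paths handled in this run.
    """
    done = set()
    if os.path.exists(checkpoint_file):
        with open(checkpoint_file) as fh:
            done = set(json.load(fh))

    handled = 0
    for path in paths:
        if path in done:
            continue  # already handled in an earlier run
        if stop_after is not None and handled >= stop_after:
            break  # simulate a restart or lost connection
        process(path)
        done.add(path)
        handled += 1
        # Persist progress after every item so that a restart loses nothing.
        # A real implementation would batch these writes.
        with open(checkpoint_file, "w") as fh:
            json.dump(sorted(done), fh)
    return handled
```

After an interrupted run, calling the function again with the same checkpoint file only processes the paths that were not recorded yet.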
I have the same issue. The most annoying part is that I do not need the folders with all the little files available on my desktop. So it would be enough if I could say "do not sync this folder unless it is accessed by the user". I think the suggestion from @marcotrevisan (see #4464) also sounds promising. |
See #4918 (comment) for a description of a problem with the tray window related to the speed issues with the initial sync. |
I can confirm this issue. I started syncing virtual files (~1 million) on a new notebook. Then I read here about restarting the client software. Now I took some data to verify this behavior: So the best workaround would be a script that restarts the Nextcloud client every 30 minutes or so. 😜 But it would be great if this could be fixed. Server: 24.0.7 (docker) |
With the latest Nextcloud client, 3.7.3, an initial sync of ~150k files took <1 hour, where in the past it took a whole night with endless errors. |
@CWempe |
As I said in #3120 (comment), I'm still having the problem with Nextcloud 25 and desktop client 3.8.2. After 24 hours it had not yet finished counting the files to synchronize; then it lost the connection and restarted from scratch... About 2,000,000 files. |
I can also confirm that this issue persists with 3.9.0 and [Cloud] 26.0.2. |
@limatus Try with ownCloud Infinite Scale, 3.0 just got released, would expect 4x performance compared with oC10, |
@hodyroff thanks for the hint, but I do not intend to switch servers – the server was and is from NC! |
@claucambra is this a duplicate of #5692, or vice versa? |
They are different: this issue is related to the Windows VFS (normal sync engine), while #5692 is related to the macOS-specific sync engine in the file provider module. |
@allexzander: The problem is only with the initial sync. I'm using VFS on my personal server with success; it's working well. |
@allexzander if I sync the files via normal sync, the bottleneck seems to be the connection speed, which is understandable. Sadly, we mostly use virtual files, as there are simply too many files. It's similar to what @tomdereub mentioned: the initial sync needs days; thereafter, it’s fine. |
@allexzander For the sake of completeness, I'd like to add that what @tomdereub and others are describing also happens when a significant number of files are added to the Nextcloud account after the initial sync. As described by @CWempe in #4424 (comment), the sync speed decreases dramatically over time. |
As @PhilippSchlesinger said, after some time using VFS on that folder with about 500,000 files, I find it a pity that the whole folder tree keeps being synced. Every time somebody modifies a large number of files, a long sync starts. It seems impossible to me to deploy this for 30 people: it would put a heavy load on the server and on each computer.
Is this technically possible? And if yes, what do you (Nextcloud devs) think about it? |
I'd like to add that on macOS things are moving towards a FileProvider-based implementation, which will solve the issue by delegating a good part of the sync logic to macOS. IMHO, if there's no API like FileProvider on Windows, then the client itself should evolve towards a lazier approach... a "full sync" approach works against scalability, and in the long run it's a major limiting factor for broader adoption of Nextcloud. |
@tomdereub I'm in a very similar situation to yours and as a mitigation solution I ended up as follows:
In this way, server load is under control (push notifications won't wake up all clients every time) and the clients are snappy enough to work. The advantage is that, for heavily used folders, the NC client has all the files downloaded and ready; the disadvantage is that not all the users are comfortable with such setup. Hope it helps |
@marcotrevisan I'm actually trying Mountain Duck, and it seems to do everything I want with its "smart synchronization" mode. There is an option to index files or not.
|
Yes, but don't get drunk too fast, it has its own bugs (in Mac OS at least) :-D |
Hi. I can confirm this. We have extensively tested the "Duck" on Windows, and while the client does very well in terms of performance, there are many other issues around file locking, online detection, working with MS Office and so forth. Is there any progress to be expected on improving the initial VFS sync speed? We are migrating a lot of files to NC at the moment, and I am already afraid of starting the sync on our clients. At the moment the initial sync with about 100K files takes about 60 minutes. Regards Rob |
@allexzander @mgallien could you please give us some idea of the priority of this issue and the ways to solve it? Like "it's not the priority at the moment, so we don't know when it will be worked on", or "it's very complicated to solve, we have to rewrite the sync engine entirely, so it will take some time before we can work on it", or "only a few users are affected, so it's not a priority, most of our users don't have so much data"... As users, we need to know if there is some chance of getting VFS scalable in the short or mid term, or if we have to find other solutions. I don't want to see my company giving up on Nextcloud and the other open-source software we're using, and falling back on full Microsoft solutions. |
@joshtrichards: you added a label to this issue, what does that mean? Will somebody start working on it? |
From what I can see, looks like they began working on this about a week ago. |
This (#6461) is exactly what is needed for Windows too. |
Dear Nextcloud developers, @allexzander See #4918 for a description of a performance problem (PR intended to solve the problem in #5941) with the tray window. Solving this heavy issue could also pay off in improving the speed problems with initial sync. |
This first step of the initial sync is very hard on the server. Have a look at the CPU consumption of your server; I think that's the bottleneck. In my case I have an Intel i5-10210U with 6 cores dedicated to my server, and it uses almost 100% of all cores while doing this first scan of all files. I have about 700,000 files, and the scan takes between half an hour and an hour. So I'm not surprised that it takes so long on an RPi. |
@Rello hey there, do you have an estimate of when this will be done? We are planning to move away from our weird software solution built on top of the Windows built-in WebDAV, which has a lot of other issues and has officially been canceled already (still available, but they say it's not getting updates), so a switch will be needed as fast as possible. |
OneDrive takes a smarter approach by downloading the file and folder structure from the server first and instantly replicating it on the local system. This ends up being more efficient than how Nextcloud does it, where it downloads the entire structure first and only then starts creating it locally. It also looks like Nextcloud uses just one thread to handle both downloading and syncing, while OneDrive splits the work into two threads: one for downloading data into a buffer and another for reading from that buffer to create the local structure. This split approach helps OneDrive sync files faster. |
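The two-thread split described in this comment is a classic producer/consumer pipeline. The sketch below only illustrates that pattern; it makes no claim about how OneDrive or the Nextcloud client are actually implemented. `download_listing`, `create_placeholders` and `sync_pipelined` are hypothetical names, and appending to a list stands in for creating local placeholders.

```python
import queue
import threading

SENTINEL = object()  # marks the end of the listing


def download_listing(entries, buffer):
    """Producer: stands in for fetching the remote structure entry by entry."""
    for entry in entries:
        buffer.put(entry)  # blocks while the buffer is full
    buffer.put(SENTINEL)


def create_placeholders(buffer, created):
    """Consumer: stands in for creating local entries as soon as they arrive."""
    while True:
        entry = buffer.get()
        if entry is SENTINEL:
            break
        created.append(entry)


def sync_pipelined(entries, buffer_size=100):
    """Run download and local creation concurrently through a bounded buffer."""
    buffer = queue.Queue(maxsize=buffer_size)
    created = []
    producer = threading.Thread(target=download_listing, args=(entries, buffer))
    consumer = threading.Thread(target=create_placeholders, args=(buffer, created))
    producer.start()
    consumer.start()
    producer.join()
    consumer.join()
    return created
```

The bounded queue keeps memory use flat: the producer pauses when the consumer falls behind, and local creation starts as soon as the first entry is available rather than after the full listing has been downloaded.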
Feature description
When using virtual files, the first login after a new installation starts a syncing process that can take a very long time, depending on the number of files to synchronize.
In my case, I'm syncing around 700,000 files. My computer has already been up for 29 hours without a restart, and the sync process has now reached the 50% mark. I can see that the virtual files are created one by one, but it can be as slow as 2 per second. Two or more days until Nextcloud is usable is too much in my opinion.
It would be cool if there was any way to speed up the initial sync.
PS: This is related to #4421
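As a rough back-of-the-envelope check on the numbers in this description (about 700,000 files, as slow as 2 per second, 50% done after 29 hours), the small sketch below estimates total and remaining time. The helper names are hypothetical, and the remaining-time estimate assumes the average rate stays constant, which other comments in this thread suggest may be optimistic.

```python
def hours_to_sync(file_count, files_per_second):
    """Total hours needed at a fixed creation rate."""
    return file_count / files_per_second / 3600


def remaining_hours(elapsed_hours, fraction_done):
    """Hours left, assuming the average rate so far stays constant."""
    return elapsed_hours * (1 - fraction_done) / fraction_done


worst_case = hours_to_sync(700_000, 2)  # about 97 hours, roughly 4 days
left = remaining_hours(29, 0.5)         # 29 more hours at the observed pace
```

At a sustained 2 files per second, the full sync would take about four days; at the observed average pace (half done in 29 hours), it would take about 58 hours in total.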