Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Check preconditions in FileSystemResourceManager.write #103 #393

Merged
merged 2 commits into from
Nov 29, 2023

Conversation

HeikoKlare
Copy link
Contributor

@HeikoKlare HeikoKlare commented Mar 31, 2023

The FileSystemResourceManager.write() method fails when the target file has been deleted before the write method has been called. The method defines the existence of the file in the workspace as a precondition but does never check that, which makes the method fail when first accessing information that is not available after the deletion, in particular, its ResourceInfo. This is why the method failed when accessing the ResourceInfo`.

The FileSystemResourceManagerTest.testWriteFile() randomly fails because of this behavior. A concurrently running workspace refresh removes the file right before the write() method is called.

  • Make write() check its precondition, i.e., the existence of the target file to deterministically fail with an IllegalStateException
  • Make testWriteFile() wait for the refresh to be finished before executing write() to have deterministic test behavior
  • Add additional test case validating proper failure of write() when target file has been deleted before
  • Improve the write() method documentation to reflect the current behavior.

Fixes #103

@HeikoKlare HeikoKlare marked this pull request as ready for review March 31, 2023 09:45
@github-actions
Copy link
Contributor

github-actions bot commented Mar 31, 2023

Test Results

     591 files  +  18       591 suites  +18   59m 10s ⏱️ - 10m 31s
  3 838 tests +    1    3 833 ✔️ +    3    5 💤 ±0  0  - 2 
12 120 runs  +150  12 084 ✔️ +152  36 💤 ±0  0  - 2 

Results for commit e40518f. ± Comparison against base commit 05d6dea.

This pull request removes 1 and adds 2 tests. Note that renamed tests count towards both.
org.eclipse.core.tests.internal.localstore.FileSystemResourceManagerTest ‑ testWriteFile2
org.eclipse.core.tests.internal.localstore.FileSystemResourceManagerTest ‑ testWriteDeletedFile
org.eclipse.core.tests.internal.localstore.FileSystemResourceManagerTest ‑ testWriteFileNotInWorkspace

♻️ This comment has been updated with latest results.

*/
public void write(IFile target, InputStream content, IFileInfo fileInfo, int updateFlags, boolean append, IProgressMonitor monitor) throws CoreException {
SubMonitor subMonitor = SubMonitor.convert(monitor, 4);
try {
assertExists(target);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you consider throwing a CoreException to maintain the contract of the API, instead of an IllegalStateException?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jukzi Maybe this question is rather directed to you, since you worked on https://git.eclipse.org/r/c/platform/eclipse.platform.resources/+/176136

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. I have to admit that I am not sure whether I understood the contract correctly. I expected the existance of IFile target to be a precondition for executing the method, which is why an IllegalStateException would make sense to me if called with a non-existing target. I expected the condition regarding throwing a CoreException to refer to the static information in the given fileInfo or in the target, given that the latter is present.

But you are right that in certain situations the method will throw a different exception now, so I am completely open to throwing a CoreException instead. However, there is another test case that expects an IllegalStateException to be thrown when passing a non-existing target (see #103 (comment)).

Either way, as soon as we have agreed we may also improve the contract to clarify its meaning.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding an checked Exception instead of an RuntimeException would break API. However CoreException is already documented to be thrown ... so it would be OK to use it for already documented cases.
However i wonder if it even is according to the javadoc if any excpetion is thrown for an non existing file:
"If the force flag is false we only write the file if it does not exist" does "exists" here mean in workspace or in filesystem?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Either way, as soon as we have agreed we may also improve the contract to clarify its meaning.

Changing the contract is bad even if it is an improvement in some cases - there may be code that relies on the contract.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jukzi would be great to have your feedback on which kind of exception you would expect to be thrown here (in case you have an opinion on that). thanks!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The javadoc clearly specifies CoreException should be thrown if file is missing. However as far i remember the root for the IllegalStateException was that platform was not really sure if the file exists or not. A proper fix would be to avoid to get into such state - just don't know how exactly that is reached. Performancewise It might regression to add more costly checks in this method.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changing the contract is bad even if it is an improvement in some cases - there may be code that relies on the contract.

True. What I actually meant was that we should adapt the Javadoc so that it properly represents the existing behavior of the method, since the specification is currently not very precise.

The javadoc clearly specifies CoreException should be thrown if file is missing.

That is one possible interpretation, but I would say that it "clearly specifies" this. Throwing a CoreException is specified for the case that the force flag is false and the other mentioned conditions are not fulfilled. The Javadoc does not make a statement about the exception thrown in the case that the target is not existing. Actually, it defines this as a precondition, which is why I find an IllegalStateException appropriate here. In addition, the current code throws an IllegalStateException at several points where the resource info is null.

However as far i remember the root for the IllegalStateException was that platform was not really sure if the file exists or not. A proper fix would be to avoid to get into such state - just don't know how exactly that is reached.

The reason I saw so far is that the file is concurrently removed and a concurrent workspace refresh is performed, such that the file is not represented in the workspace tree anymore. So maybe a proper/better solution would be to acquire a workspace lock during the write operation or to even wrap it into a workspace operation. But I fear that will at least introduce a performance issue and may, in the worst case, even lead to deadlocks.

jukzi
jukzi previously requested changes Sep 27, 2023
Copy link
Contributor

@jukzi jukzi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i still don't see my concerns answered

*/
public void write(IFile target, InputStream content, IFileInfo fileInfo, int updateFlags, boolean append, IProgressMonitor monitor) throws CoreException {
SubMonitor subMonitor = SubMonitor.convert(monitor, 4);
try {
assertExists(target);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding an checked Exception instead of an RuntimeException would break API. However CoreException is already documented to be thrown ... so it would be OK to use it for already documented cases.
However i wonder if it even is according to the javadoc if any excpetion is thrown for an non existing file:
"If the force flag is false we only write the file if it does not exist" does "exists" here mean in workspace or in filesystem?

*/
public void write(IFile target, InputStream content, IFileInfo fileInfo, int updateFlags, boolean append, IProgressMonitor monitor) throws CoreException {
SubMonitor subMonitor = SubMonitor.convert(monitor, 4);
try {
assertExists(target);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Either way, as soon as we have agreed we may also improve the contract to clarify its meaning.

Changing the contract is bad even if it is an improvement in some cases - there may be code that relies on the contract.

*/
public void write(IFile target, InputStream content, IFileInfo fileInfo, int updateFlags, boolean append, IProgressMonitor monitor) throws CoreException {
SubMonitor subMonitor = SubMonitor.convert(monitor, 4);
try {
assertExists(target);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The javadoc clearly specifies CoreException should be thrown if file is missing. However as far i remember the root for the IllegalStateException was that platform was not really sure if the file exists or not. A proper fix would be to avoid to get into such state - just don't know how exactly that is reached. Performancewise It might regression to add more costly checks in this method.

@HeikoKlare HeikoKlare force-pushed the issue-103 branch 2 times, most recently from 5043b54 to bf87b99 Compare September 29, 2023 15:11
@HeikoKlare
Copy link
Contributor Author

Sometimes things are much simpler than they first appear 🙂

After the "transferStreams" the file should have exited for an instance. So the question remains how the resourceinfo can be null at this point.

The reason is quite simple: this (and all the other exceptions due to ResourceInfo being null) can happen if the ResourceInfo is null from the beginning, i.e., if the file does not exist when calling the method. The method documentation says that it requires the given target to exist in the workspace. However, this is never checked. And if you simply pass a file that does not exist, i.e., for which you call file.delete(true, null) before performing the write operation, it can fail at different points of execution (when first accessing the resource info). Just take the testWriteFile() and delete the file before callng write() and it will fail at exactly that line of code.

I invested quite some time to produce concurrent updates of the workspace that leads to the resource info becoming null during write execution, but it seems to be impossible, as the method is run inside a workspace operation, thus not allowing concurrent changes. The only way to achieve a concurrent change of the target file is to perform a modification from within that operation. This can, for example, be "achieved" by misusing the passed progress monitor, as, for example, its worked() method runs in the scope of the workspace operation.

So if I understand correctly, the fix for all the different places in write() throwing exceptions because of resource info being null is to simply check the method's precondition, i.e., to check target file for existance before executing the rest of the method. I have updated the PR accordingly.

@HeikoKlare HeikoKlare changed the title Improve handling concurrent workspace changes when storing file (#103) Check preconditions in FileSystemResourceManager.write #103 Sep 29, 2023
@HeikoKlare HeikoKlare force-pushed the issue-103 branch 3 times, most recently from 4a7a2b1 to 06ad2ba Compare October 13, 2023 13:26
@jukzi
Copy link
Contributor

jukzi commented Oct 16, 2023

The build fails - beside that have you tested, that checking the existence conditions twice does not lead to a performance degradation? I don't know a solution but it would feel better to have some sort of atomic operation by synchronization instead of double check - which may depend on timing.

@HeikoKlare
Copy link
Contributor Author

I forget to post the documentation of my recent update of this PR. Sorry for that, you can find it at the end of this comment.

The build fails - beside that have you tested, that checking the existence conditions twice does not lead to a performance degradation? I don't know a solution but it would feel better to have some sort of atomic operation by synchronization instead of double check - which may depend on timing.

I do not expect any performance degradation, because the existence check is only a null check for the ResourceInfo that is retrieved anyway. The checkExists call is only executed in the exceptional situation that the ResourceInfo is null, which is only the case when the method contract is not fulfilled by the caller.
I agree that having some synchronization would better, but that would really have an impact on performance. And actually the additional checks within the method are only for improved debugging, as they only provide a better exception in case the method contract is fulfilled by the caller. It‘s also true that the checks depend on timing, as the resource might still be concurrently removed after the write method has retrieved the ResourceInfo, but this, again, is a bug in the implementation of the caller.

With respect to the build failures: they seem to be unrelated test failures, but I will have another eye on them once we agree on the change.

For documentation, I have made the following changes in my recent update of this PR:

  • Delegate the existence check for the target within the write method to the resource itself, as it already has according functionality that is used at other places to handle non-existent resources (and throw a proper exception).
  • Move the first existence check to the beginning of the method to have a validation of its precondition. This should not introduce a (significant) overhead to the method execution as it simply accesses the resource's ResourceInfo, which is done during the method execution anyway. But it ensures that the missing resource is detected before executing the write operation, such that no write is performed to a non-existent resource.

It is still up to the caller of the method to ensure that the target is and remains existing when calling the method. I have taken a look at the existing callers and they all seem to properly wrap the call into a workspace operation and check the existence before calling the write method, so that no exception will occur. The FileSystemResourceManagerTest.testWriteFile() did only fail because it violated the contract and did not ensure that the target exists when calling the method. So it was actually rather a broken test than an error in the productive code.

@HeikoKlare
Copy link
Contributor Author

@jukzi May I ask you to have another look at this PR together with my explanations in #393 (comment)?
Would be great if we could finalize this issue at the beginning of 4.31 development to have much time for figuring out unexpected impacts (which I do not expect though).

@jukzi
Copy link
Contributor

jukzi commented Nov 16, 2023

I do not expect any performance degradation,

An assumption is not enough. Please verify that. this code is very often used and a bottle neck on windows during build.

@HeikoKlare
Copy link
Contributor Author

An assumption is not enough. Please verify that. this code is very often used and a bottle neck on windows during build.

Thanks for your reply! Actually my comment was not assumption but an assessment. But I see that this is a critical code part, so I will elaborate more based on a static analysis as well as on a test-based evaluation.

Code Analysis

  • The only change affecting performance in the commit is the getResourceInfo() call that is moved from within a conditional block to outside, thus now may being executed more often. But this only happens if the target is not local or the force flag is set.
  • getResourceInfo() is called again later in the method, so an additional call of that method will only create a "linear" overhead w.r.t. to the time consumed by getResourceInfo() calls.
  • getResourceInfo() does not require a significant amount of time, so the linear overhead w.r.t. to that method call is not relevant (see samples below).

Test-Based Evaluation

I did some tests based on different scenarios. I use worst case properties, precisely the force flag being set, so that in the original implementation the resource info is retrieved only once, where the new implementation does it twice. The most important question is how much overhead the additional execution of that method produces.
I wrote different amounts of files with different workspace structures and sampled the write() calls.

Scenario 1

A workspace with 100.000 files directly placed below a project to which a string of 8 characters is written.

Original Implementation:
image
image

New implementation:
image
image

Scenario 2

A workspace with 30.000 files placed in different folders of depth 20 within a project to which a string of 8 characters is written. This scenario is taken since getResourceInfo() needs to traverse the workspace tree, so deeper hierarchies may produce more execution time.

Original implementation:
image
image

New implementation:
image
image

Remarks

  • There is quite some variation in the execution times (sometime getResourceInfo() is not listed as its below threshold, sometimes it's even at 400ms).
  • Most important is the ratio of getResourceInfo() execution time compared to overall write() execution time and the overall write() execution time itself, which do significantly change with the modifications in this PR.

Results

In summary, both from a code analysis perspective as well as from a test-based evaluation perspective there is no noticable performance impact of the change. The only relevant thread to validity of these results I see is the selection of scenarios. In case there are other scenarios that dramatically increase the execution time of getResourceInfo() compared to everything else in the write() method, there would be a performance impact. I do not see a way how that could happen, but if I miss something, of course let me know.

@jukzi Do you see further necessities for verification or does this fit to your expectation?

@jukzi
Copy link
Contributor

jukzi commented Nov 17, 2023 via email

@HeikoKlare
Copy link
Contributor Author

Measurements were taken on Windows 11. Since only the file-system access is OS-dependent while getResourceInfo() is OS-independent, the total overhead on other OSes should be the same (while the relative one may vary depending on slower/faster file system access).

Copy link
Contributor

@jukzi jukzi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

then i am OK with this change in M1

@HeikoKlare
Copy link
Contributor Author

Thanks @jukzi for all the feedback and the thorough review! So I guess we can remove your request for changes?

I will of course respin and investigate the checks before merging for M1.

@jukzi jukzi dismissed their stale review November 20, 2023 11:41

OK for me

…rm#103

The FileSystemResourceManager.write() method fails when the target file
has been deleted before the write method has been called. The method
defines the existence of the file in the workspace as a precondition but
does never check that, which makes the method fail when first accessing
information that is not available after the deletion, in particular, its
ResourceInfo. This is why the method failed when accessing the
ResourceInfo.

The FileSystemResourceManagerTest.testWriteFile() randomly fails because
of this behavior. A concurrently running workspace refresh removes the
file right before the write() method is called.

- Make write() check its precondition, i.e., the existence of the target
file to deterministically fail with an IllegalStateException
- Make testWriteFile() wait for the refresh to be finished before
executing write() to have deterministic test behavior
- Add additional test case validating proper failure of write() when
target file has been deleted before

Fixes eclipse-platform#103
Document the current behavior of the FileSystemResourceManager.write()
method in terms of parameter meanings and conditions for exceptions to
be thrown.

Contributes to
eclipse-platform#103
@jukzi jukzi merged commit 2d9587c into eclipse-platform:master Nov 29, 2023
16 checks passed
@jukzi
Copy link
Contributor

jukzi commented Nov 29, 2023

lets try... please keep an eye on next i-builds to see if there are any regressions

@HeikoKlare HeikoKlare deleted the issue-103 branch November 29, 2023 07:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

random failing org.eclipse.core.tests.internal.localstore.FileSystemResourceManagerTest.testWriteFile
3 participants