Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Issue with the NotFound Error of the App web dir data #201

Closed
lurenpluto opened this issue Apr 12, 2023 · 3 comments
Closed

Issue with the NotFound Error of the App web dir data #201

lurenpluto opened this issue Apr 12, 2023 · 3 comments
Assignees
Labels
App-Manager The App Manager basic service bug Something isn't working CYFS Stack This is CYFS Stack

Comments

@lurenpluto
Copy link
Member

Currently, based on feedback from some users, there is a small probability that the web dir recorded during installation for a particular app may disappear.

Relate background knowledge

During the installation of an app, the current version's web dir will call the cyfs-stack's trans.publish_file to register with the stack. The stack will then perform several key operations:

1. Traverse the directory to generate dir (object-map) and file objects, as well as corresponding chunks.
2. Update the noc to save the dir and file objects.
3. Update the ndc to record the association between chunks and files/dirs.
4. Update the tracker to record the association between chunks and file locations.

When the stack receives requests for the corresponding dir/files/chunks' binary data, it will perform the following operations:

1. If it is a dir/file, find the corresponding object and obtain the internal chunk list.
2. Look up the file associated with the chunk in the **tracker** and read the relevant data.

The tracker plays a key role here, especially for directly registered external files.

A difficult situation

To handle the uncontrollable operations, such as external file modification and deletion, which may cause incorrect chunk data, a hash verification is added in step 2 above. If an inconsistency is found in the hash after reading, an error will be returned, and the tracker record will be deleted (considered invalid).

Adding Hash verification during reading, from a performance perspective, uses a mechanism that calculates the Hash while reading. Verification is done when the reading is complete. The corresponding code is as follows:

#[async_trait::async_trait]
pub trait ChunkHashErrorHandler: Send + Sync {
fn on_hash_error(&self, chunk_id: &ChunkId, path: &str);
}
pub struct ChunkReaderWithHash {
path: String,
chunk_id: ChunkId,
reader: Box<dyn AsyncReadWithSeek + Unpin + Send + Sync>,
hash: sha2::Sha256,
error_handler: Option<Box<dyn ChunkHashErrorHandler>>,
}

The Problem and possible cause analysis

Therefore, the issue described at the beginning may have two possible situations:

  1. App-manager's version control of the app's web dir has an error
    For example, Inconsistent web dir data, etc.

  2. The verification reading mechanism has bugs
    In other words, ChunkReaderWithHash may produce incorrect verification results under certain circumstances, causing records with correct data to be deleted from the tracker.

Currently, from the logs received from user feedback, it is unlikely that there is a problem with step 1. Therefore, the focus should be on step 2 to analyze and fix the issue.

@lurenpluto lurenpluto added bug Something isn't working CYFS Stack This is CYFS Stack App-Manager The App Manager basic service labels Apr 12, 2023
@lurenpluto lurenpluto moved this to 🐞Discovered Bugs in CYFS-Stack & Services Apr 12, 2023
@lurenpluto
Copy link
Member Author

Based on the design of ChunkReaderWithHash, a complete Hash can be generated for verification only when there is only one chunk that is sequentially read and completely read. Therefore, from a code perspective, it seems that there are several possibilities and improvements:

1. Seek occurred during the read process

ChunkReaderWithHash itself supports the Seek trait. If a seek call occurs during the read process, it will definitely result in an incorrect Hash result. However, in actual use, the reader based on ChunkReaderWithHash is only used in the BDT layer for a chunk task, and only the Read (AsyncRead) trait is used. The current BDT layer code is as follows:

async fn load(&self, cache: &dyn RawCache, storage: &dyn ChunkReader) -> BuckyResult<()> {
let reader = storage.get(self.chunk()).await?;
let writer = cache.async_writer().await?;
let written = async_std::io::copy(reader, writer).await? as usize;
if written != self.chunk().len() {
Err(BuckyError::new(BuckyErrorCode::InvalidInput, ""))
} else {
Ok(())
}
}

From the above code, it can be seen that it is a sequential read based on io::copy, and each read operation is a complete chunk without any seek operation. However, this is a factor to consider when calculating the hash.

2. The internal poll_read returned Poll::Ready(Ok(0)) prematurely

Size=0 indicates that this chunk was read in advance, but are there other possibilities that cause the chunk to end prematurely without being completely read?

3. Temporarily remove error_handler

In the case of verification errors, do not delete the corresponding tracker record for the time being, to prevent accidental deletion.

In addition, the following two aspects have been improved for this issue:

  • When the app-manager installs an app, different versions of the web dir will have separate directories and will no longer use the same directory.
  • When the tracker looks for the corresponding chunk record, it will search from the newest to the oldest according to the time the record was inserted. Since the latest record is more likely to be correct, this ensures that the correct record will be used first.

@lurenpluto
Copy link
Member Author

To better fix and improve this issue, here are some suggestions for reference: #204

@lurenpluto lurenpluto moved this from 🐞Discovered Bugs to 💬To Discuss in CYFS-Stack & Services Apr 12, 2023
@lurenpluto
Copy link
Member Author

Currently, some mechanisms have been added to ensure the correctness of the hash verification. See at
717eabd.

In addition, the tracker recorder fix has been temporarily removed from the chunk reader. See at
6f4439f.

@lurenpluto lurenpluto moved this from 💬To Discuss to ✅Done in CYFS-Stack & Services Apr 17, 2023
streetycat pushed a commit to streetycat/CYFS that referenced this issue May 12, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
App-Manager The App Manager basic service bug Something isn't working CYFS Stack This is CYFS Stack
Projects
Status: Done
Development

No branches or pull requests

1 participant