
Mounting with non-standard metadata URL does not work / Mounting with Blob objects from main JS thread does not work #486

Open
dipterix opened this issue Sep 22, 2024 · 5 comments
Labels
bug Something isn't working

Comments

@dipterix
Contributor

dipterix commented Sep 22, 2024

Dear webr devs,

I generated my dataset using rwasm::file_packager. However, when I try to load it directly from JS, the browser complains

Uncaught (in promise) TypeError: can't convert undefined to BigInt
    doStat https://webr.r-wasm.org/v0.4.2/R.bin.js:2061
    ___syscall_stat64 https://webr.r-wasm.org/v0.4.2/R.bin.js:2061

To replicate, you can download the data from here: https://zenodo.org/records/13825852

When loading from webr::mount locally, everything is fine.

Here is a screenshot of the results with webr::mount, showing that the files are read correctly:


However, when I deploy the website and try to mount from the URL, it fails.

(For testing the URL: I'm using the Zenodo API, which allows CORS. FYI, the URL itself is not the issue; the error also occurs when testing locally.)

const path = "/home/web_user/rave_data";
const data = "https://zenodo.org/api/records/13825852/files/project-demo-minimal.data/content";
const meta = "https://zenodo.org/api/records/13825852/files/project-demo-minimal.js.metadata/content";

await webR.FS.mkdir(path);

// Download image data
const _data = await fetch(data);
const _metadata = await fetch(meta);

// Mount image data
const options = {
  packages: [{
    blob: await _data.blob(),
    metadata: await _metadata.json(),
  }],
}

await webR.FS.mount("WORKERFS", options, path);

await webR.evalR("readLines('~/rave_data/raw_dir/DemoSubject/rave-imaging/fs/mri/transforms/talairach.xfm')")

Error:

Uncaught (in promise) TypeError: can't convert undefined to BigInt
    doStat https://webr.r-wasm.org/v0.4.2/R.bin.js:2061
    ___syscall_stat64 https://webr.r-wasm.org/v0.4.2/R.bin.js:2061
    safeEval blob:http://localhost:3530/40ef506b-683b-40b4-9dfc-512b84137000:2087
    captureR blob:http://localhost:3530/40ef506b-683b-40b4-9dfc-512b84137000:9289
    evalR blob:http://localhost:3530/40ef506b-683b-40b4-9dfc-512b84137000:9338
    dispatch blob:http://localhost:3530/40ef506b-683b-40b4-9dfc-512b84137000:8948
    PostMessageChannelWorker blob:http://localhost:3530/40ef506b-683b-40b4-9dfc-512b84137000:4320
[error.ts:11:4](https://webr.r-wasm.org/src/webR/error.ts)

Browser: Firefox 130.0.1 (64-bit)
OS: macOS Sequoia (Apple M2)
WebR: 0.4.2
rwasm: 0.2.0.9000

@dipterix
Contributor Author

Also, why is there no metadata path argument in webr::mount? It seems that webR derives the metadata URL from the data URL. If I could supply the metadata URL myself, then maybe I wouldn't have to use webR.FS.mount.

@georgestagg
Member

Hi,

Yes, webR derives the metadata URL from the data URL. This is why you cannot use your Zenodo URLs: they end in [...]/content, which webR is not expecting, so the automatic derivation fails in your case. I think the right solution here is to provide an optional argument as part of the webr::mount() API for users to give their own metadata URLs where appropriate. The argument does not currently exist, but should not be too difficult to add.
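A sketch of what such an API might look like (hypothetical: the `metadata` argument does not exist as of webR 0.4.2, and the argument name is only illustrative):

```r
# Hypothetical future API: pass the metadata URL explicitly instead of
# letting webR derive it from the data URL. The `metadata` argument is
# illustrative only and not part of the current webr::mount() signature.
webr::mount(
  mountpoint = "/home/web_user/rave_data",
  source = "https://zenodo.org/api/records/13825852/files/project-demo-minimal.data/content",
  metadata = "https://zenodo.org/api/records/13825852/files/project-demo-minimal.js.metadata/content"
)
```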


Even with the above, your other method should work - but it looks like there's currently a bug related to how Blob objects are transferred to the webR worker thread before passing them onto the Emscripten FS API. I'm sorry about that, we'll try to get it fixed for the next release. With a fix in place, your JS code should work as is.


If it is urgent, you can work around the bug by doing the work synchronously on the webR worker thread, avoiding the problem of transferring the Blob object. However, this will be extremely messy, so it is only worth implementing as a temporary measure:

dir.create("/home/web_user/rave_data")

webr::eval_js('
  const path = "/home/web_user/rave_data";
  const data = "https://zenodo.org/api/records/13825852/files/project-demo-minimal.data/content";
  const meta = "https://zenodo.org/api/records/13825852/files/project-demo-minimal.js.metadata/content";
  
  const data_req = new XMLHttpRequest();
  data_req.responseType = "arraybuffer"
  data_req.open("GET", data, false);
  data_req.send(null);
  const _data = data_req.response
  
  const meta_req = new XMLHttpRequest();
  meta_req.responseType = "json"
  meta_req.open("GET", meta, false);
  meta_req.send(null);
  const _meta = meta_req.response

  const options = {
    packages: [{
      blob: new Blob([_data]),
      metadata: _meta,
    }],
  }

  Module.FS.mount(Module.FS.filesystems.WORKERFS, options, path)
')

readLines('~/rave_data/raw_dir/DemoSubject/rave-imaging/fs/mri/transforms/talairach.xfm')


@georgestagg georgestagg added the bug Something isn't working label Sep 23, 2024
@dipterix
Contributor Author

Thanks for checking. Now I see what's going on: blobs are converted to typed arrays when transferred to workers. No wonder when I console.log the options, the blob appears as typed arrays.

Currently my workaround is to manually slice the data and use webR.FS.writeFile to generate the directory tree. It works great, haha. (See https://rave.wiki/posts/3dviewer/viewer201.html.)
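The slicing step can be sketched as follows (a sketch, assuming the Emscripten file_packager metadata layout of `{ files: [{ filename, start, end }, ...] }`; the helper name `sliceFiles` is my own):

```javascript
// Slice each file out of the packed data buffer using the file_packager
// metadata, returning a map of filename -> Uint8Array. The pieces can
// then be written out with webR.FS.writeFile().
function sliceFiles(buffer, metadata) {
  const view = new Uint8Array(buffer);
  const out = new Map();
  for (const f of metadata.files) {
    // slice() copies the [start, end) byte range for this file
    out.set(f.filename, view.slice(f.start, f.end));
  }
  return out;
}

// Usage sketch (inside an async context, with a running webR instance):
// const buf = await (await fetch(dataURL)).arrayBuffer();
// const meta = await (await fetch(metaURL)).json();
// for (const [name, bytes] of sliceFiles(buf, meta)) {
//   await webR.FS.writeFile(path + name, bytes);
// }
```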

I do have a wishlist item: converting WORKERFS to IDBFS. Currently, every time users open the website they need to download the data from the internet. This is fine if the internet is fast, but it wastes resources and time when viewing on a phone or metered Wi-Fi. If the data could be stored on the client side, this might speed things up significantly. Any suggestions on how I can implement this feature?

@georgestagg
Member

So blobs are converted to typed arrays when transferred to workers

Yes, we are forced to make the conversion for annoying but unrelated technical reasons. In the future when we're able to work around the issue we'll be able to return to transferring Blob objects with a more traditional JS postMessage() transfer instead.

I do have a wishlist item: converting WORKERFS to IDBFS.

This should work, but it has not been tested heavily. If you do investigate this method, don't forget to run syncfs() at the correct times with the correct arguments.
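Emscripten's FS.syncfs() is callback-based, which makes the "correct times" easy to get wrong; a small promise wrapper (a sketch of my own, not part of webR) helps: sync *from* IndexedDB (populate = true) right after mounting, and sync *to* IndexedDB (populate = false) after writing.

```javascript
// Promise wrapper around Emscripten's callback-style FS.syncfs(populate, cb).
// populate = true  : load persisted state from IndexedDB into the FS (after mount)
// populate = false : persist the in-memory FS state back to IndexedDB (after writes)
function syncFS(FS, populate) {
  return new Promise((resolve, reject) => {
    FS.syncfs(populate, (err) => (err ? reject(err) : resolve()));
  });
}

// Usage sketch on the worker-side Module.FS after an IDBFS mount:
// Module.FS.mount(Module.FS.filesystems.IDBFS, {}, path);
// await syncFS(Module.FS, true);   // restore previously persisted files
// ...write data...
// await syncFS(Module.FS, false);  // persist changes to IndexedDB
```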

If the data can be stored on the client side, then this might significantly speed up.

I did notice that your data was not being cached by my browser when I was testing. How much control do you have over the content's HTTP headers? If you can configure Zenodo to send the content with the HTTP header Cache-Control: max-age=86400 (or some other large number), the web browser should automatically cache the download and restore it from disk cache on subsequent reloads, rather than re-downloading the entire dataset each time.

@georgestagg georgestagg changed the title Unable to mount data from JS: "can't convert undefined to BigInt" Mounting with non-standard metadata URL does not work / Mounting with Blob objects from main JS thread does not work Sep 23, 2024
@dipterix
Contributor Author

In the future when we're able to work around the issue we'll be able to return to transferring Blob objects with a more traditional JS postMessage() transfer instead.

Or could webR.FS.mount accept typed arrays? It's easy to make blobs out of the arrays. Just a quick thought, since typed arrays transfer naturally to JS workers; you would just need to convert them back to blobs at the final Module.FS.mount step.
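The conversion back is a one-liner; a minimal sketch (the `bytes` variable stands in for the transferred data):

```javascript
// A transferred typed array (or its underlying ArrayBuffer) can be
// rewrapped as a Blob just before the Module.FS.mount() call.
const bytes = new Uint8Array([1, 2, 3, 4]);  // stand-in for transferred data
const blob = new Blob([bytes]);              // blob.size === bytes.length
```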

This should work, but it's not been tested heavily. If you do investigate this method don't forget to run syncfs() at the correct times with the correct arguments.

Thanks, I will try it.

I did notice that your data was not being cached by my browser when I was testing. How much control do you have over the content HTTP headers? If you can configure Zenodo to send the content with HTTP header Cache-Control: max-age=86400 (or some other larger number) the web browser should automatically cache this download and restore it from disk cache on subsequent reloads, rather than re-downloading the entire dataset each time.

I don't have control over Zenodo. I guess I'll implement the caching myself for now. Good to know that cache controls can be set; then this is an edge case that won't appear too often in the future :)
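One way to do the caching manually, when the server's headers cannot be changed, is the browser Cache API; a sketch (the cache name is illustrative, and this runs in browsers, where the `caches` global exists):

```javascript
// Check a client-side cache before fetching; store successful responses
// so subsequent page loads skip the network download entirely.
async function cachedFetch(url, cacheName = "rave-data-v1") {
  const cache = await caches.open(cacheName);
  const hit = await cache.match(url);
  if (hit) return hit;                              // serve from cache
  const res = await fetch(url);
  if (res.ok) await cache.put(url, res.clone());    // persist for next visit
  return res;
}
```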
