Skip to content

Commit

Permalink
feat: non-parallel piece hashing and CAR upload (#1305)
Browse files Browse the repository at this point in the history
I have a gut feeling that synchronous piece hashing is causing store
requests to timeout, specifically waiting for headers in node.js. I
think they are probably sent, but event loop is too busy to actually
receive them and a timeout gets triggered.

This PR removes parallelism around generating piece hashes, sending
`store/add` invocation and uploading a CAR.

refs #1290
  • Loading branch information
Alan Shaw authored Mar 19, 2024
1 parent fe2e3d5 commit 7a6385b
Show file tree
Hide file tree
Showing 3 changed files with 10 additions and 25 deletions.
1 change: 0 additions & 1 deletion packages/upload-client/package.json
Original file line number Diff line number Diff line change
Expand Up @@ -79,7 +79,6 @@
"ipfs-utils": "^9.0.14",
"multiformats": "^12.1.2",
"p-retry": "^5.1.2",
"parallel-transform-web": "^1.0.1",
"varint": "^6.0.0"
},
"devDependencies": {
Expand Down
27 changes: 10 additions & 17 deletions packages/upload-client/src/index.js
Original file line number Diff line number Diff line change
@@ -1,4 +1,3 @@
import { Parallel } from 'parallel-transform-web'
import * as PieceHasher from '@web3-storage/data-segment/multihash'
import * as Link from 'multiformats/link'
import * as raw from 'multiformats/codecs/raw'
Expand All @@ -11,8 +10,6 @@ import { ShardingStream, defaultFileComparator } from './sharding.js'
export { Store, Upload, UnixFS, CAR }
export * from './sharding.js'

const CONCURRENT_REQUESTS = 3

/**
* Uploads a file to the service and returns the root data CID for the
* generated DAG.
Expand Down Expand Up @@ -122,24 +119,20 @@ async function uploadBlockStream(conf, blocks, options = {}) {
const shards = []
/** @type {import('./types.js').AnyLink?} */
let root = null
const concurrency = options.concurrentRequests ?? CONCURRENT_REQUESTS
const hasher = options.pieceHasher ?? PieceHasher
await blocks
.pipeThrough(new ShardingStream(options))
.pipeThrough(
new Parallel(concurrency, async (car) => {
const bytes = new Uint8Array(await car.arrayBuffer())
const [cid, piece] = await Promise.all([
Store.add(conf, bytes, options),
(async () => {
const multihashDigest = await hasher.digest(bytes)
return /** @type {import('@web3-storage/capabilities/types').PieceLink} */ (
Link.create(raw.code, multihashDigest)
)
})(),
])
const { version, roots, size } = car
return { version, roots, size, cid, piece }
new TransformStream({
async transform(car, controller) {
const bytes = new Uint8Array(await car.arrayBuffer())
const multihashDigest = await hasher.digest(bytes)
/** @type {import('@web3-storage/capabilities/types').PieceLink} */
const piece = Link.create(raw.code, multihashDigest)
const cid = await Store.add(conf, bytes, options)
const { version, roots, size } = car
controller.enqueue({ version, roots, size, cid, piece })
},
})
)
.pipeTo(
Expand Down
7 changes: 0 additions & 7 deletions pnpm-lock.yaml

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

0 comments on commit 7a6385b

Please sign in to comment.