-
Notifications
You must be signed in to change notification settings - Fork 74
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use S3 to generate checksum #4873
Conversation
.build(), | ||
AsyncRequestBody.fromByteBuffer(bs) | ||
).map { response => | ||
response.eTag().filter(_ != '"') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why the filter, did you see it return that value?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, yes it returns the MD5 quoted 😱
@@ -28,7 +28,7 @@ trait S3StorageClient { | |||
|
|||
def getFileAttributes(bucket: String, key: String): IO[GetObjectAttributesResponse] | |||
|
|||
def underlyingClient: S3[IO] | |||
def underlyingClient: S3AsyncClientOp[IO] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this change was a step towards removing the client from the interface, maybe the uploadFile
could go behind it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could do, but I was a but reluctant to return a pipe, or do something which consumes the stream. I think for now this is fine but certainly something that we could revisit?
8561c91
to
f02eddf
Compare
f02eddf
to
e986f80
Compare
case "MD5" => response.eTag().stripPrefix("\"").stripSuffix("\"") | ||
case "SHA-256" => Hex.encodeHexString(Base64.getDecoder.decode(response.checksumSHA256())) | ||
case "SHA-1" => Hex.encodeHexString(Base64.getDecoder.decode(response.checksumSHA1())) | ||
case _ => throw new IllegalArgumentException(s"Unsupported algorithm for S3: ${algorithm.value}") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not ideal we have to do this, but the algorithm is validated upon config loading, so this code path should never be hit
implicit class PutObjectRequestOps(request: PutObjectRequest.Builder) { | ||
def deltaDigest(algorithm: DigestAlgorithm): PutObjectRequest.Builder = | ||
algorithm.value match { | ||
case "MD5" => request | ||
case "SHA-256" => request.checksumAlgorithm(ChecksumAlgorithm.SHA256) | ||
case "SHA-1" => request.checksumAlgorithm(ChecksumAlgorithm.SHA1) | ||
case _ => throw new IllegalArgumentException(s"Unsupported algorithm for S3: ${algorithm.value}") | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's unusual to use an implicit class here, but given that the request builder is mutable it makes sense we do this to avoid passing the builder between methods
source <- fileOps.fetch(bucket, attr.path) | ||
} yield consume(source) | ||
|
||
assertIO(result, content) | ||
} | ||
} | ||
|
||
test("Use MD5 to calculate a checksum") { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To be clear: the previous test checks SHA-256, this is just an additional test because the code path is different
algorithm: DigestAlgorithm | ||
): IO[(String, Long)] = { | ||
for { | ||
fileSizeAcc <- Ref.of[IO, Long](0L) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It should be possible to get the file size from S3 no ?
Fixes #4860