It should be possible to swap gzip for brotli, but we'll have to support a mix of the two for a long time (potentially ~forever) because it will be impossible to coordinate all sources.
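Supporting a mix of the two could look roughly like the sketch below: whatever serves the file picks a stored variant based on the request's Accept-Encoding header. The function names and the idea of an `available` set of stored variants are hypothetical and not taken from any Nextstrain code. The preference order (br, then gzip, then uncompressed) is an assumption, and q-values are only used to exclude codings the client refuses (`q=0`), not to rank them.

```python
from typing import Dict, Set


def parse_accept_encoding(header: str) -> Dict[str, float]:
    """Parse an Accept-Encoding header value into {coding: q-value}."""
    codings: Dict[str, float] = {}
    for part in header.split(","):
        token, _, params = part.strip().partition(";")
        token = token.strip().lower()
        if not token:
            continue
        q = 1.0  # a coding listed without a q parameter has weight 1
        params = params.strip()
        if params.lower().startswith("q="):
            try:
                q = float(params[2:])
            except ValueError:
                q = 0.0  # treat a malformed weight as "not acceptable"
        codings[token] = q
    return codings


def choose_encoding(accept_encoding: str, available: Set[str]) -> str:
    """Pick a stored variant the client accepts, preferring br over gzip.

    Falls back to "identity" (uncompressed) when neither is usable.
    """
    accepted = parse_accept_encoding(accept_encoding)
    for coding in ("br", "gzip"):  # assumed server-side preference order
        q = accepted.get(coding, accepted.get("*", 0.0))
        if coding in available and q > 0:
            return coding
    return "identity"
```

For example, `choose_encoding("gzip, deflate", {"gzip", "br"})` returns `"gzip"`, so clients that never advertise br keep working while newer ones get the smaller file.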
Looked at in isolation, the compression benchmarks show a clear benefit, but it's not clear to me that swapping is worth it once the full effort is considered: the time to engineer the swap (plan, write, test, etc.), the ongoing complexity of supporting both formats, and the opportunity cost of working on this instead of something else.
We don't use CloudFront's dynamic compression, as not all access goes through CloudFront: a lot goes directly to S3. So we pre-compress and store compressed objects on S3. IIRC, CloudFront's dynamic compression also has (or used to have at least?) fairly low upper limits on the uncompressed size it supports.
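As a rough illustration of pre-compressing before storing, the sketch below gzips a JSON document and builds the parameters for an upload that records the encoding as object metadata. This is not the actual Nextstrain upload code; the function name, bucket and key are made up. The dictionary keys are chosen to match the keyword arguments of boto3's S3 `put_object`, but nothing is uploaded here.

```python
import gzip
import json
from typing import Any, Dict


def gzip_upload_params(tree: Any, bucket: str, key: str) -> Dict[str, Any]:
    """Serialize `tree` to JSON, gzip it, and describe the S3 object.

    The object is stored already compressed; Content-Encoding tells
    clients how to decode it. A brotli variant would compress with a
    brotli library instead and set ContentEncoding to "br".
    """
    body = gzip.compress(json.dumps(tree).encode("utf-8"))
    return {
        "Bucket": bucket,  # hypothetical bucket name supplied by the caller
        "Key": key,        # hypothetical object key supplied by the caller
        "Body": body,
        "ContentType": "application/json",
        "ContentEncoding": "gzip",
    }
```

With boto3 this could be passed along as `s3.put_object(**gzip_upload_params(tree, "example-bucket", "example.json"))`, where `s3` is an S3 client.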
(This is probably not really the right repo to open this issue but I couldn't find a better one, please move if you know of a better place @tsibley)
Context
I noticed that brotli is much better at compressing auspice trees than gzip and checked if we were using brotli for downloading resources from AWS. Turns out we don't.
Description
It would be great if we supported brotli compression as the default compression for trees downloaded from AWS (ncov-data, etc.)
Examples
Compression using brotli is much better, see here for the Nextclade reference build with 4k tips:
Brotli compresses 4x better than gzip.
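To reproduce this kind of comparison on any tree JSON, something like the sketch below should work. It assumes the third-party `brotli` Python package is installed for the brotli number and only reports gzip otherwise. The function name is hypothetical, and the settings (gzip level 9, brotli quality 11) are my choice, not necessarily what was used for the numbers above.

```python
import gzip
from typing import Dict

try:
    import brotli  # third-party package; optional for this sketch
except ImportError:
    brotli = None


def compressed_sizes(data: bytes) -> Dict[str, int]:
    """Return the size in bytes of `data` raw and under each compressor."""
    sizes = {
        "raw": len(data),
        "gzip": len(gzip.compress(data, compresslevel=9)),
    }
    if brotli is not None:
        # quality=11 is brotli's highest (and slowest) setting
        sizes["br"] = len(brotli.compress(data, quality=11))
    return sizes
```

Usage would be along the lines of `compressed_sizes(open("tree.json", "rb").read())`, with `tree.json` standing in for whichever Auspice JSON you want to measure.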
Possible solution
Apparently it's not too hard to enable brotli compression on the AWS end: https://aws.amazon.com/about-aws/whats-new/2020/09/cloudfront-brotli-compression/
We may also need to change charon request headers, though. Not sure where these are set.
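For a non-browser client (a script, say), the header side of this might look like the sketch below: advertise br only when it can actually be decoded, then decode based on the response's Content-Encoding. This is not how charon requests are made today; the function names are hypothetical and the third-party `brotli` package is assumed for the br path. Browsers set Accept-Encoding themselves, so this only applies to clients that control their own headers.

```python
import gzip
import urllib.request

try:
    import brotli  # third-party package; optional for this sketch
except ImportError:
    brotli = None


def build_request(url: str) -> urllib.request.Request:
    """Build a request that advertises only the codings we can decode."""
    encodings = "br, gzip" if brotli is not None else "gzip"
    return urllib.request.Request(url, headers={"Accept-Encoding": encodings})


def decode_body(body: bytes, content_encoding: str) -> bytes:
    """Decode a response body according to its Content-Encoding header."""
    coding = (content_encoding or "identity").strip().lower()
    if coding == "gzip":
        return gzip.decompress(body)
    if coding == "br":
        if brotli is None:
            raise RuntimeError("received br but no brotli decoder is installed")
        return brotli.decompress(body)
    if coding == "identity":
        return body
    raise ValueError("unsupported Content-Encoding: " + coding)
```

After `resp = urllib.request.urlopen(build_request(url))`, the body would be decoded with `decode_body(resp.read(), resp.headers.get("Content-Encoding", ""))`.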
I think we should try to use brotli wherever possible, also for things like auspice jsons. It generally does better than gzip.
Finally, we could also consider using brotli compression for nextstrain remote download, though I think it's less needed there.