v0.14.0
RustedBones released this 01 Feb 12:26
What's Changed
Includes Beam 2.53.0 support.
Breaking Changes
- avro removed from scio-core. Scalafix rules that help: FixAvroCoder, FixAvroSchemasPackage, FixDynamicAvro
- some avro API changes. Scalafix rule that helps: FixGenericAvro
- fallback kryo coder requires explicit import
- use of official tensorflow metadata
- BigQuery error-info and result handling API change
- scio-smb module not pulling implementation dependencies
- scio-smb in JobTest expecting SmbIO test input/output
See the Migration Guide for more information.
🚀 Enhancements
- improve testing framework by @RustedBones in #4962
- fs.gs.inputstream.fadvise defaults to SEQUENTIAL by @farzad-sedghi in #5132
- (scio-smb) Support mixed FileOperations per BucketedInput by @clairemcginty in #5064
- Support Tap for SMB writes (addresses #5080) by @clairemcginty in #5144
- feat: add save dynamic csv by @klDen in #5130
- Make Sparkey testable by @kellen in #5128
- Integrate avro datum factory in scio-avro by @RustedBones in #5152
- Handle BQ write result as ClosedTap side output by @RustedBones in #5172
- Support projection in ParquetAvroSortedBucketIO by @clairemcginty in #5173
- Expose bigtable read maxBufferElementCount option by @RustedBones in #5026
- Integrate datum-factory in smb-avro by @RustedBones in #5181
- Require import for kryo implicit fallback coder by @RustedBones in #5199
- Add kryo serializer for GAX api exceptions by @RustedBones in #5198
- Initial Iceberg bucket support by @regadas in #5205
- Add scala enumeration implicit coder by @RustedBones in #5213
- Add SortedBucketTransform counter for records written by @clairemcginty in #5220
- Add PipelineTestUtils helper for Taps by @clairemcginty in #5216
- Add counter for SMB Predicate filtering by @clairemcginty in #5221
🐛 Bug Fixes
- Fix race in sparkey write by @kellen in #4937
- Support null CharSequence keys by @RustedBones in #5113
- Add location to BigQuery LoadOps by @f-loris in #5106
- Fix bigtable option conversion for bulk API by @RustedBones in #5167
- Fix multi-line DML statement detection by @RustedBones in #5169
- (fix #5147) Fix Materialize for elements that match compression encoding signature by @clairemcginty in #5148
- In SMB and ParquetAvroIOTap set GenericDataSupplier and read schemas by @shnapz in #5121
- (bugfix) Set metadata in AvroSortedBucketIO by @clairemcginty in #5184
- (fixes #5193) Serialize BucketMetadata#hashType as String by @clairemcginty in #5194
- Close client in async DoFn by @RustedBones in #5206
- Use non-deprecated version of murmur3_32 by @regadas in #5204
- Use fork-join common pool for async DoFn callbacks by @RustedBones in #5209
📜 Scalafix Migrations
- Add 0.14 scalafix migration for saveAsAvroFile and update avro coder one by @RustedBones in #5215
- Add Scalafix rule for LogicalTypeSupplier removal by @clairemcginty in #5178
📗 Documentation
- Add docs about SMB secondary keys by @kellen in #5095
- Remove references to Spotify FOSS Slack by @BalestraPatrick in #5149
- Convert SortMergeBucketExample to Parquet + update tests by @clairemcginty in #5191
- Fix list in Builtin.md by @kellen in #5214
- Scio 0.14 migration guide by @RustedBones in #5212
- Update scio and beam release table by @RustedBones in #5222
🧪 Test Improvements
- Fix flaky SCollectionTest by @kellen in #5098
- Drop deprecated sbt IntegrationTest configuration by @RustedBones in #4971
- Move populate test data to compile scope by @RustedBones in #5138
🏗️ Build Improvements
- Fix compiler warnings by @RustedBones in #4934
- Update sbt-assembly to 2.1.5 by @scala-steward in #5093
- Update sbt-bloop to 1.5.12 by @scala-steward in #5097
- Use sbt-typelevel for build by @RustedBones in #5107
- Ack or fix deprecation warnings by @RustedBones in #5124
- Ack expected non-exhaustive pattern match by @RustedBones in #5126
- Update sbt-jmh to 0.4.7 by @scala-steward in #5166
- Update sbt-typelevel to 0.6.5 by @scala-steward in #5164
- Update sbt-avro to 3.4.4 by @scala-steward in #5157
- Update sbt-mdoc to 2.5.2 by @scala-steward in #5163
- Update sbt, sbt-dependency-tree to 1.9.8 by @scala-steward in #5162
- Build integration test in PRs originating from repo by @RustedBones in #5143
- Remove duplicated scalac option by @RustedBones in #5171
- Handle unused warning as error by @RustedBones in #5180
- Skip dependency check if compile is skipped by @RustedBones in #5188
- Update sbt-protoc to 1.0.7 by @scala-steward in #5196
- Update sbt-paradox to 0.10.6 by @scala-steward in #5210
- Fix jar signing by @RustedBones in #5223
🔧 Refactorings
- Remove avro from scio-core, implement binary file source by @kellen in #4913
- Tensorflow metadata by @RustedBones in #4944
- Use avro builder API by @RustedBones in #5119
- Cleanup deprecated API by @RustedBones in #5134
- Move BQ typed from query to queryRaw by @RustedBones in #5137
- Rename scala SortedBucketIO class to SmbIO by @clairemcginty in #5140
- Rename SortedBucketIOTest to match SmbIO by @clairemcginty in #5146
- Relax versioning regex by @RustedBones in #5139
- Change scio-smb to depend on provided scio io modules by @RustedBones in #5004
- Drop collection compat shim in favor of official scala-collection-compat by @RustedBones in #5069
🌱 Dependency Updates
- Update algebra, cats-core, cats-kernel to 2.10.0 by @scala-steward in #4952
- Update jedis to 5.1.0 by @scala-steward in #5094
- Update metrics-core to 4.2.23 by @scala-steward in #5110
- Update hadoop to v3.2.4 by @RustedBones in #5135
- Update beam to v2.53 by @RustedBones in #5133
- Update scalac-compat-annotation, ... to 0.1.4 by @scala-steward in #5165
- Update neo4j-java-driver to 4.4.13 by @scala-steward in #5161
- Update jna to 5.14.0 by @scala-steward in #5160
- Update magnolify to 0.7.0 by @RustedBones in #5155
- Upgrade Parquet to 1.13.1 by @clairemcginty in #5175
- Update elasticsearch-java to 8.12.0 by @scala-steward in #5185
- Update mysql-connector-j to 8.3.0 by @scala-steward in #5174
- Update cloud-sql-connector-jdbc-sqlserver to 1.15.2 by @scala-steward in #5186
- Update mysql-socket-factory-connector-j-8 to 1.15.2 by @scala-steward in #5187
- Use new vendored Guava version by @clairemcginty in #5195
- Update metrics-core to 4.2.25 by @scala-steward in #5211
- Update testcontainers-scala-elasticsearch, ... to 0.41.2 by @scala-steward in #5217
New Contributors
- @BalestraPatrick made their first contribution in #5149
- @klDen made their first contribution in #5130
Full Changelog: v0.13.6...v0.14.0