Releases: datastax/cassandra-data-migrator

5.1.1

22 Nov 17:50

Key Highlights

  • Fixed a bug where the writetime filter did not work correctly when used with a custom writetimestamp
  • Removed the deprecated properties printStatsAfter and printStatsPerPart. Run metrics should now be tracked using the trackRun feature instead.
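
The replacement workflow is a one-line configuration change. A minimal sketch, assuming the property name is spark.cdm.trackRun as described in the CDM documentation (verify against the reference cdm.properties for your release):

```properties
# Hedged sketch: enable run tracking in place of the removed
# printStatsAfter / printStatsPerPart properties.
# The property name spark.cdm.trackRun is an assumption based on
# CDM documentation; check your release's cdm.properties reference.
spark.cdm.trackRun=true
```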

The jar file can also be found in the packages section.

5.1.0

18 Nov 13:50

Key Highlights

  • Improved metrics output by producing stats labels in an intuitive and consistent order
  • Refactored JobCounter by removing all references to thread and global scopes, as CDM operations are now isolated within partition ranges (parts). Each part is processed in parallel and aggregated by Spark.

The jar file can also be found in the packages section.

5.0.0

09 Nov 03:51

Key Highlights

  • CDM refactored to be fully Spark-native and more performant when deployed on a multi-node Spark cluster
  • The trackRun feature has been expanded to record run info for each part in the CDM_RUN_DETAILS table. Along with granular metrics, this information can be used to troubleshoot unbalanced or problematic partitions.
  • This release has feature parity with the 4.x releases and is backward compatible while adding the above-mentioned improvements. However, we are bumping the version to 5.x as it is a major rewrite of the code to make it Spark native.
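
When troubleshooting, the per-part run info recorded by trackRun can be inspected directly with CQL. A minimal sketch, where the keyspace name (cdm_ks) is a placeholder and the column set depends on the schema trackRun creates (check the actual table definition):

```sql
-- Hedged sketch: list per-part run details recorded by trackRun.
-- cdm_ks is a placeholder keyspace; CDM_RUN_DETAILS is the table
-- named in the release notes above.
SELECT * FROM cdm_ks.cdm_run_details;
```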

The jar file can also be found in the packages section.

4.7.0

25 Oct 13:56

Key Highlights

  • CDM refactored to work on a Spark cluster
  • More performant for large migration efforts (multi-terabyte clusters with several billion rows) using a Spark cluster (instead of individual VMs)
  • No functional changes and fully backward compatible

Note: The Spark cluster based deployment in this release currently has a bug: it reports '0' for all count metrics even though the underlying tasks (Migration, Validation, etc.) are performed correctly. We are working to address this in an upcoming release. Also note that this issue affects only the Spark cluster deployment, not the single-VM run (i.e. no impact to current users).

The jar file can also be found in the packages section.

4.6.1

22 Oct 03:25

Key Highlights

  • Made the trackRun feature work on all versions of Cassandra/DSE by replacing the IN clause on the cdm_run_details table.
  • Updated README docs.

The jar file can also be found in the packages section.

4.6.0

21 Oct 19:09

Key Highlights

  • Allow using collections and/or UDTs for TTL & writetime calculations. This is especially helpful in scenarios where the only non-key columns are collections and/or UDTs.

The jar file can also be found in the packages section.

4.5.1

14 Oct 16:43

Key Highlights

  • Made the CDM-generated SCB unique and much shorter-lived when using the TLS option to connect to Astra more securely.

The jar file can also be found in the packages section.

4.5.0

07 Oct 18:00

Key Highlights

  • Upgraded to Log4j 2.x and included a template properties file that helps separate general logs from CDM class-specific logs, including a separate log for rows flagged with DiffData (Validation) errors.
  • Upgraded to Spark 3.5.3
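
As an illustration of the kind of separation described above (not the shipped template), a Log4j 2.x properties fragment that routes a CDM package logger to its own file could look like this; all file, package, and logger names here are placeholders:

```properties
# Illustrative Log4j 2.x sketch, not the template shipped with CDM.
appender.cdm.type = File
appender.cdm.name = CdmFile
appender.cdm.fileName = logs/cdm.log
appender.cdm.layout.type = PatternLayout
appender.cdm.layout.pattern = %d %p %c - %m%n

# Route CDM class-specific logs to the file above, without duplicating
# them into the root logger's output (additivity = false).
logger.cdm.name = com.datastax.cdm
logger.cdm.level = info
logger.cdm.appenderRef.file.ref = CdmFile
logger.cdm.additivity = false
```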

The jar file can also be found in the packages section.

4.4.1

19 Sep 23:24

Key Highlights

  • Added two new codecs, STRING_BLOB and ASCII_BLOB, to allow migration from TEXT and ASCII fields to BLOB fields. These codecs can also be used to convert BLOB to TEXT or ASCII, but in such cases the BLOB value must be text-based in nature and fit within the applicable limits.
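
Conceptually, these codecs behave like a text-to-bytes round trip. A minimal sketch in Python (not CDM source code) of why the BLOB-to-TEXT direction only works for text-based values:

```python
# Illustrative sketch (not CDM source code) of what a TEXT <-> BLOB
# codec conceptually does: a BLOB can always hold encoded text, but
# decoding back to TEXT only succeeds when the bytes are text-based.
def text_to_blob(value: str) -> bytes:
    """Encode a TEXT value into the bytes a BLOB column would store."""
    return value.encode("utf-8")

def blob_to_text(value: bytes) -> str:
    """Decode BLOB bytes back to TEXT; raises for non-text binary data,
    which is why the BLOB value must be text-based in nature."""
    return value.decode("utf-8")
```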

The jar file can also be found in the packages section.

4.4.0

19 Sep 13:06

Key Highlights

  • Added the properties spark.cdm.connect.origin.tls.isAstra and spark.cdm.connect.target.tls.isAstra to allow connecting to Astra DB without using an SCB. This may be needed by enterprises that consider credentials packaged within an SCB a security risk (in practice this is not a real concern, as they are protected with access tokens; having access to just one of them won't grant access to the Astra DB cluster). TLS properties can now be passed as params, or wrapper scripts (not included) could be used to pull sensitive credentials from a vault service in real time and pass them to CDM.
  • Switched to the Apache Cassandra® 5.0 Docker image for testing
  • Introduced smoke testing of the vector CQL data type
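
Using the property names from the first bullet above, a minimal configuration sketch (the values and any accompanying TLS parameters, such as keystore paths and passwords, are deployment-specific assumptions):

```properties
# Sketch: connect to Astra via TLS without an SCB.
# Property names come from this release's notes; the remaining TLS
# settings (trust/key stores, passwords) are deployment-specific and
# would be passed alongside these, or injected by a wrapper script.
spark.cdm.connect.origin.tls.isAstra=true
spark.cdm.connect.target.tls.isAstra=true
```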

The jar file can also be found in the packages section.