Releases: datastax/cassandra-data-migrator
5.1.1
Key Highlights
- Writetime filter has been fixed to work correctly when used with custom writetimestamp (bug fixed)
- Removed deprecated properties printStatsAfter and printStatsPerPart. Run metrics should now be tracked using the trackRun feature instead.
Jar file can also be found in the packages section here.
5.1.0
Key Highlights
- Improves metrics output by producing stats labels in an intuitive and consistent order
- Refactored JobCounter by removing any references to thread or global, as CDM operations are now isolated within partition-ranges (parts). Each such part is then processed in parallel and aggregated by Spark.
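The part-based counting described above can be illustrated with a minimal sketch. This is not CDM's code: the function names (split_range, process_part, run) and the "write every even token" rule are hypothetical, and a local thread pool merely stands in for Spark executors. It only shows the idea of keeping counters local to each part and aggregating them at the end.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def split_range(start, end, num_parts):
    """Split the inclusive range [start, end] into contiguous parts."""
    total = end - start + 1
    step, extra = divmod(total, num_parts)
    parts = []
    lo = start
    for i in range(num_parts):
        # The first `extra` parts take one additional element each.
        size = step + (1 if i < extra else 0)
        if size == 0:
            continue
        parts.append((lo, lo + size - 1))
        lo += size
    return parts


def process_part(part):
    """Process one part and return counters that belong to that part only."""
    lo, hi = part
    counts = Counter()
    for token in range(lo, hi + 1):
        counts["read"] += 1
        # Hypothetical rule for illustration: only even tokens are written.
        if token % 2 == 0:
            counts["write"] += 1
    return counts


def run(start, end, num_parts):
    """Process all parts concurrently and aggregate the per-part counters."""
    totals = Counter()
    with ThreadPoolExecutor() as pool:
        for counts in pool.map(process_part, split_range(start, end, num_parts)):
            totals += counts
    return totals
```

Because each part returns its own counters, no shared or global counter state is needed while the parts are being processed.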
Jar file can also be found in the packages section here.
5.0.0
Key Highlights
- CDM refactored to be fully Spark Native and more performant when deployed on a multi-node Spark Cluster
- trackRun feature has been expanded to record run-info for each part in the CDM_RUN_DETAILS table. Along with granular metrics, this information can be used to troubleshoot any unbalanced problematic partitions.
- This release has feature parity with the 4.x release and is also backward compatible while adding the above-mentioned improvements. However, we are upgrading it to 5.x as it's a major rewrite of the code to make it Spark native.
Jar file can also be found in the packages section here.
4.7.0
Key Highlights
- CDM refactored to work on a Spark Cluster
- More performant for large migration efforts (multi-terabyte clusters with several billions of rows) using Spark Cluster (instead of individual VMs)
- No functional changes and fully backward compatible
Note: The Spark Cluster based deployment in this release currently has a bug. It reports '0' for all count metrics, while doing underlying tasks (Migration, Validation, etc.). We are working to address this in the upcoming releases. Also note that this issue is only with the Spark cluster deployment and not with the single VM run (i.e. no impact to current users).
Jar file can also be found in the packages section here.
4.6.1
Key Highlights
- Make trackRun feature work on all versions of Cassandra/DSE by replacing the IN clause on cdm_run_details table.
- Updated README docs.
Jar file can also be found in the packages section here.
4.6.0
Key Highlights
- Allow using Collections and/or UDTs for ttl & writetime calculations. This is specifically helpful in scenarios where the only non-key columns are Collections and/or UDTs.
Jar file can also be found in the packages section here.
4.5.1
Key Highlights
- Made the CDM-generated SCB unique & much shorter-lived when using the TLS option to connect to Astra more securely.
Jar file can also be found in the packages section here.
4.5.0
Key Highlights
- Upgraded to use log4j 2.x and included a template properties file that will help separate general logs from CDM class specific logs including a separate log for rows identified by DiffData (Validation) errors.
- Upgraded to use Spark 3.5.3
Jar file can also be found in the packages section here.
4.4.1
Key Highlights
- Added two new codecs STRING_BLOB and ASCII_BLOB to allow migration from TEXT and ASCII fields to BLOB fields. These codecs can also be used to convert BLOB to TEXT or ASCII, but in such cases the BLOB value must be TEXT based in nature & fit within the applicable limits.
Jar file can also be found in the packages section here.
4.4.0
Key Highlights
- Added properties spark.cdm.connect.origin.tls.isAstra and spark.cdm.connect.target.tls.isAstra to allow connecting to Astra DB without using SCB. This may be needed for enterprises that may find credentials packaged within SCB as a security risk [while actually it is not a real concern as they're protected with access tokens; having access to just one of them won't grant access to the Astra DB cluster]. TLS properties can now be passed as params OR wrapper scripts (not included) could be used to pull sensitive credentials from a vault service in real-time & pass them to CDM.
- Switched to using Apache Cassandra® 5.0 docker image for testing
- Introduces smoke testing of vector CQL data type
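As a rough illustration of the wrapper-script idea mentioned for the isAstra properties, the sketch below assembles spark-submit style --conf arguments at run time. It is not part of CDM: build_conf_args and collect_properties are hypothetical helpers, fetch_secrets is a placeholder for whatever vault client an organization uses, and the "true" values assume the isAstra properties are boolean flags. Only the two isAstra property names come from the release notes.

```python
def build_conf_args(properties):
    """Turn a {property: value} mapping into ["--conf", "key=value", ...]."""
    args = []
    for key, value in sorted(properties.items()):
        args += ["--conf", f"{key}={value}"]
    return args


def collect_properties(fetch_secrets):
    """Merge static flags with secrets fetched at run time.

    `fetch_secrets` is a hypothetical callable returning a dict of
    property names to secret values, e.g. backed by a vault service.
    """
    properties = {
        # Assumption: these flags take boolean values.
        "spark.cdm.connect.origin.tls.isAstra": "true",
        "spark.cdm.connect.target.tls.isAstra": "true",
    }
    properties.update(fetch_secrets())
    return properties
```

The resulting list could be appended to a spark-submit command line, so the sensitive values never need to be stored in a properties file or a packaged bundle.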
Jar file can also be found in the packages section here.