This repository has been archived by the owner on Jan 3, 2023. It is now read-only.

Releases: Intel-bigdata/SSM

SSM v1.5.0

28 Jun 07:01

Highlights:

  • Add HDFS 3.x support. HDFS 3.1.0 dependency is introduced to SSM.
  • Add SSM Erasure Coding (EC) feature.
    • Support converting a typical 3-replica file to one stored with a given EC policy.
    • Support customizing an EC task with an SSM rule.
  • Add SSM compression feature. SSM supports compressing an HDFS file with a given codec. Users can also customize a compression task with an SSM rule.
  • A few web UI optimizations.
    • Support action search.
    • Add go-to-page support on the action page.
    • Add statistic info for SSM nodes.
  • Refined all docs.
  • Add a few test scripts.
    • Add SSM EC functionality and performance test scripts.
    • Add SSM Compression functionality and performance test scripts.

Change log:

  • Ignore SSM work dir and trigger fetch operation when a file is moved to covered dir (#2067)
  • Add EC performance test scripts (#2059)
  • Enable SSM to cover only the info of user-specified HDFS dirs (#2045)
  • Refine storage utilization page & node info page (#2050)
  • Merge SSM Compression to trunk (#1873)
  • Support deploying master and standby server on a single node (#2027)
  • Support debug mode for standby server & agent (#2020)
  • Add CheckSumAction (#1974)
  • Add read end flag of SSM Small File (#1975)
  • Enable user password changing (#1964)
  • Solve #1939, Add goto certain page support in action page (#1945)
  • Solve #1941, Add node statistic info in node page (#1944)
  • Add a throttle on EC to avoid IO overload (#1947)
  • Add a series of actions for EC policy (#1937)
  • Solve #1931, Support EC in rule
  • Support Erasure Coding on HDFS-3.x (#1924)
  • Solve #1888, Add statistic info for ssm nodes
  • Support HDFS 3.x (#1913)
  • Show which Hadoop version SSM is compiled for via the ssm version cmd
  • Solve #1889, Assign a unique id to ssm agents (#1890)
  • Solve #1876, Add host info for actions (#1877)
  • Add install script to simplify the deployment of SSM (#1874)
  • Solve #1848, Add action to execute general command (#1867)
  • Solve #1838, Add search support for actions (#1842)
  • Support showing SSM detailed version (#1847)
  • Add data sync performance test scripts (#1831)

SSM v1.4.0

29 Jun 09:10

Highlights:

  • Small File Solution. SSM compacts a batch of small files into one big file stored in HDFS. This saves the memory used for managing block info, improves Namenode scalability, and optimizes small-file read performance.
  • Disaster Recovery Solution with S3 support. Files can now be synced to an S3 cluster. [Experimental]
  • Refined SSM high-availability.
  • Functional enhancements:
    • Add one-shot rule support
    • Add file relative-temperature support
    • Add new actions (sleep, truncate0, alldisk, onedisk, ramdisk, copy2s3)
    • Add new properties for defining rules (isDir, acTop, acTopSp, acBot, acBotSp, storage capacity/free/utilization)
    • Support dynamically adding new SSM Agents
    • Add throughput throttling for Disaster Recovery Solution
    • UI enhancements for monitoring Metastore utilization, submitting rule/cmdlet, displaying resource ...
    • Historical cmdlet/file access info/file diff info purging
    • Configurable JVM parameters for SSM server/agents
    • Defer cmdlet execution support
  • Performance optimizations:
    • Concurrent Namespace fetcher
    • Concurrent cmdlet dispatcher
    • Optimized meta store access.
    • Optimizations for handling large HDFS namespace
    • Optimize cmdlet status report
    • Fine-grained locks for many shared resources
    • Load balance for cmdlet execution
  • Refined SSM documents
  • Add a lot of scripts for SSM functional and performance tests
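
The Small File Solution above is described only at a high level. As a rough illustration of the general idea (packing many small files into one container and keeping an offset index so each file can still be read individually), here is a minimal in-memory sketch. The function names `compact` and `read_small_file` and the index layout are hypothetical and are not taken from SSM; the real feature works on HDFS files and stores its metadata in the SSM metastore.

```python
import io
from typing import Dict, Tuple

# Hypothetical index layout: original path -> (offset, length) in the container.
Index = Dict[str, Tuple[int, int]]


def compact(small_files: Dict[str, bytes]) -> Tuple[bytes, Index]:
    """Concatenate small files into one container and record where each one lives."""
    container = io.BytesIO()
    index: Index = {}
    for path, data in small_files.items():
        # tell() is the current write position, i.e. the offset of this file.
        index[path] = (container.tell(), len(data))
        container.write(data)
    return container.getvalue(), index


def read_small_file(container: bytes, index: Index, path: str) -> bytes:
    """Read one original file back out of the container using the index."""
    offset, length = index[path]
    return container[offset:offset + length]


files = {"/logs/a.txt": b"alpha", "/logs/b.txt": b"be", "/logs/c.txt": b""}
container, index = compact(files)
print(read_small_file(container, index, "/logs/b.txt"))  # b'be'
```

With this layout the storage system tracks one large file instead of three small ones, which is the source of the block-metadata savings the highlight mentions.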

Change log:

  • #1442, Fix and refactor CopyScheduler failover (#1826)
  • #1820, Fix bug where a file rename action was reported as successful but had actually failed (#1821)
  • #1791, Fix destination path bug in copy scheduler (#1792)
  • #1787, Fix timing bug for schedule-failed cmdlets (#1788)
  • Avoid NPE when tackling timeout action (#1780)
  • #1701, Add one-shot rule support
  • #1681, Add data throttling for file copy action
  • #1678, Fix delete failure during DFSIO (#1741)
  • #1728, Dispatch cmdlet to given node with free slot (#1729)
  • #1721, Fix namespace mismatch caused by unlink (#1717)
  • #1707, Refine handling of AT-trigger rule (#1708)
  • #1692, Defer cmdlet execution support (#1696)
  • #1688, Downgrade an error log to debug in CopyScheduler.baseSync (#1695)
  • #1653, Fix cmdlet generation issues when stopping rule (#1654)
  • Avoid NPE for inferCmdletStatus and batchsync actions (#1648)
  • #1605, Delete unfinished cmdlet and action when a rule is disabled (#1607)
  • #1636, Avoid null pointer exception for mapStorageCapacity (#1638)
  • #1624, Synchronization issue on mapStorageCapacity (#1625)
  • #1617, Fix namespace fetching heap memory usage issue (#1618)
  • #1615, Fix bug in RPC API getActionInfo (#1616)
  • #1608, Fix file mover OOM issue (#1610)
  • #1595, Use the correct RPC server address to create SmartDFSClient
  • #1176, Show action execution result under submission area. (#1592)
  • #1524 and #1563, Tune performance on large namespace (#1566)
  • #1568, Create SSM Id file /system/ssm.id in HDFS (#1569)
  • Fix file_state key length on mysql 5.6 or older (#1551)
  • #1546, Testing feature: Skip fetching the entire HDFS namespace and update based on iNotify only
  • Format the database only when all necessary tables don't exist
  • #1531, Adjust storage utilization UI page
  • #1519, Fix dest path issue in copy2s3 related rule (#1521)
  • Fix action args column type and catch launchCmdlet exception.
  • #1517, Fix fake data generation bug (#1518)
  • #1504, Fix long-run action state update issue (#1506)
  • #1480, Add statistic info for dispatcher
  • #1499, Fix memory exhaustion due to too many pending cmdlets (#1500)
  • #1478, Fix cmdlet dispatcher performance issue (#1479)
  • Fix the notebook bugs. (#1465)

SSM v1.3.2

09 Feb 08:38

Highlights:

  • Dynamically display the storage utilization info for cache, SSD, archive and disk in SSM's web UI.
  • The Smart Server code is refined to improve SSM's performance on a huge namespace with 10 million files.

Change log:

  • #1176, Show action execution result under submit area.
  • #1579, Add enable/disable smart client scripts.
  • #1524 & #1563, Tune SSM performance on huge namespace.
  • Add SetXattr feature to Copy2S3Action.
  • #1544, Fix cmdlet dispatch bug when local execution disabled.
  • Add storage and node info pages in web UI.
  • #1536, Define interface to query node info of SSM cluster.
  • #1508, Define interface to get historical data for resources.
  • #1499, Avoid memory exhaustion due to too many pending cmdlets.
  • Add Truncate0Action.
  • #1463, Record user's action history in smart notebook page.

SSM v1.3.1

01 Dec 15:40

Highlights:

  • Asynchronous data copy to S3. New action 'copy2s3' added to copy a file from the local Hadoop cluster to S3 buckets. Please refer to S3 Supporting for more info.
  • Action chain supported. An action chain contains 2 or more actions, and these actions are executed sequentially. Failure of one action causes the whole action chain to fail. Please refer to the section "SSM rule examples" in the SSM Deployment Guide.
  • Cmdlet history purging. Two policies are supported for cmdlet purging: number of cmdlets and time-to-live. Please refer to the section "Performance tuning" in the SSM Deployment Guide.
  • Dynamic enable/disable of SmartDFSClient supported. Please refer to the section "Performance tuning" in the SSM Deployment Guide.
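
The action chain semantics described above (actions run in order, and one failure fails the whole chain) can be sketched as follows. This is an illustrative model only: `run_chain` and the `(name, callable)` action shape are hypothetical and do not reflect SSM's actual classes or rule syntax, which are documented in the SSM Deployment Guide.

```python
from typing import Callable, List, Tuple

# Hypothetical action shape: a name plus a callable that raises on failure.
Action = Tuple[str, Callable[[], None]]


def run_chain(actions: List[Action]) -> Tuple[bool, List[str]]:
    """Run actions sequentially; stop at the first failure and fail the chain."""
    attempted: List[str] = []
    for name, run in actions:
        attempted.append(name)
        try:
            run()
        except Exception:
            # One failed action fails the whole chain; later actions never run.
            return False, attempted
    return True, attempted


def failing_copy() -> None:
    raise RuntimeError("copy failed")


ok, attempted = run_chain([
    ("read", lambda: None),
    ("copy", failing_copy),
    ("delete", lambda: None),
])
print(ok, attempted)  # False ['read', 'copy']
```

Stopping at the first failure matters for chains such as copy-then-delete, where running the later action after an earlier one failed could lose data.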

Change log:

  • #1459, Fix stop cmdlet error
  • Add endpoint and proxy to s3-support
  • #1419, Get hazelcast members from separate file
  • Solve S3 dependency of CDH-5.10.1 (hadoop-2.6)
  • #1438, Fix data sync rename and baseSync crash
  • Enable copy to S3 feature
  • Fix help bugs and delete websocket service from zeppelin-web
  • #1432, Set host info for active smart server
  • #1422, Fix mover test random failure
  • Fix smart-table bug and remove websocket from notebookCtr
  • #858, Add user info to file Access Event
  • #1416, Completely disable SmartClient on node
  • #1285, Mechanism to remove non-needed records from tables
  • #1409, Delete finished cmdlets to keep only N newest finished cmdlets
  • #1402, Delete finished cmdlets before given timestamp
  • #1386, Make tidb ports & db user's password configurable
  • #1256, Fix unnecessary sync actions caused by dirty cache
  • #1399, Fix TestMoverExecutor random failure
  • #1397, Fix no cmdlet executor service available
  • Refine enable-kerberos.md
  • #1394, Fix data node storage data incorrect issue
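
The two cmdlet purging policies in this release (keep only the N newest finished cmdlets, and drop finished cmdlets older than a time-to-live) can be modelled with a small sketch. The `Cmdlet` record and the `purge` function are hypothetical names used for illustration; SSM applies these policies to its metastore tables, and the actual configuration keys are described in the SSM Deployment Guide.

```python
from typing import List, NamedTuple, Optional


class Cmdlet(NamedTuple):
    """Hypothetical record for a finished cmdlet."""
    cid: int
    finished_at: float  # finish time in seconds


def purge(finished: List[Cmdlet], now: float,
          max_count: Optional[int] = None,
          ttl_seconds: Optional[float] = None) -> List[Cmdlet]:
    """Return the finished cmdlets to keep, newest first, under both policies."""
    kept = sorted(finished, key=lambda c: c.finished_at, reverse=True)
    if ttl_seconds is not None:
        # Time-to-live policy: drop cmdlets that finished too long ago.
        kept = [c for c in kept if now - c.finished_at <= ttl_seconds]
    if max_count is not None:
        # Count policy: keep only the N newest finished cmdlets.
        kept = kept[:max_count]
    return kept


history = [Cmdlet(1, 100.0), Cmdlet(2, 200.0), Cmdlet(3, 300.0)]
print([c.cid for c in purge(history, now=350.0, max_count=2)])      # [3, 2]
print([c.cid for c in purge(history, now=350.0, ttl_seconds=100)])  # [3]
```

Either policy can be used alone; when both are given in this sketch, a cmdlet must satisfy both to be kept.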

SSM v1.3.0

09 Nov 13:13

Highlights:

  • Metastore high-availability support. TiDB embedded into SSM to provide better scalability and high availability over MySQL for SSM. TiDB (https://github.com/pingcap/tidb) is a distributed MySQL-compatible database.
  • Security-related enhancements:
    • Login mechanism enabled for Web UI
    • Kerberos authentication supported
  • Web UI optimized for handling a large number of files/actions.
  • Cmdlet processing pipeline optimized for better efficiency and stability.

Change log:

  • #1384, Enable Kerberos for SSM
  • #1365, Refine execution time of move action
  • #1211, Refine cmdlet scheduler
  • #1162, Dispatch actions to given type of executor
  • #1123, Fix list actions slow for 10K+ actions
  • #1315, Add data sync stress test script
  • Enable code-style check for SSM
  • #1207, Fix reloading pending cmdlets and actions
  • Enable web UI login
  • #1268, Fix action running time display error
  • #1252, Refine baseSync and remote check
  • Fix issues in date display, action log format and cluster tab
  • #1202, Fix MySQL password being visible in log files
  • Update hdfs-ssm-design.md
  • #1189, Make scheduler and dispatcher work in parallel
  • #1183, Millisecond-level rule executor support
  • #1172, Add batch support for baseSync
  • #1170, Add option to disable local cmdlet execution
  • Refine scheduler.FAIL action status

SSM v1.2.0

24 Sep 11:47

Highlights:

  1. Asynchronous data sync between HDFS clusters. This is the first stage of the SSM Cluster Disaster Recovery solution. Sync operations can be triggered within a few seconds after changes happen in the source cluster. Both file content (incremental sync) and metadata are synced.
  2. Action scheduler and rule plugin mechanism support. Interfaces are provided for users to control the processing of actions and rules.
  3. Web UI refined. Improves the interactive experience and provides more specific metrics for scenarios like data disaster recovery and movement of data with different temperatures.

Change log:

  • #1155, Fix list cached files failure
  • Fix a deadlock caused by meta
  • #1126, Fix slowness when loading helper web page
  • Create support-new-action-guide.md
  • #1119, Refine Mover progress report
  • #1127, Refine 'Actions' and 'Cluster' web UI
  • #1128, Data throttling for file move action
  • #1109, Upgrade Kerby version to v1.0.1
  • #1107, Fix random failure of TestActionRestApi
  • #1105, Fix action progress data error issue
  • #1095, Fix namespace not synced after restarting SSM
  • Add list file actions for mover and copy
  • Remove duplicate actions
  • Add mover page to list movers by rule
  • List Move and Sync rule RESTful API
  • Add list actions by rid
  • #1030, Disable smartnotebook broadcast paragraph
  • Add listActionsByType Restful API support
  • Add metastore support for listing actions by type
  • #1022, Fix ActionInfo being modified unexpectedly
  • #1015, Add interface to impact the execution of rules
  • #920, Store datanodeinfo fetched into metastore
  • #1011, Add action scheduler interface
  • Avoid fetching the namespace if possible when starting SSM server

SSM v1.1.0

06 Sep 15:33

Highlights:

  1. File movement enhancement. Shifts load from the SSM server, decreases the impact on the Namenode when moving files, and makes a fine-grained move scheduler possible.
  2. HDFS HA support.
  3. SSM Agent support. An optional service: actions can be dispatched to agents for execution to shift load from the SSM server.
  4. New web user interface. Based on Apache Zeppelin, many improvements made for better user experience.
  5. Multi-hadoop version support. Project code refactored to provide multi-hadoop version support architecturally, currently supports Apache Hadoop 2.7 and Cloudera CDH-5.10.
  6. Multi-JDK version support. Supports JDK 1.7 and 1.8 architecturally.
  7. Verified in integration tests. More than 40 cases tested to verify its functionality and robustness.
  8. Performance improvement. Many performance issues fixed.

Change log:

  • #994, Adjust key length for tables to 1000 bytes
  • #986, Add index for tables in SSM
  • #984, Make number of rpc handlers configurable
  • #978, Add properties support in move plan to control mover behavior
  • Access HA Namenode support
  • Add Help pages for action and rule
  • #949, Add define for sys_info and cluster_info table
  • #947, Make name space fetcher run asynchronously
  • Improve error message when a malformed druid.xml is detected
  • #946, Fix the ace editor bug
  • #938, Changing script names to use "ssm" instead of "smart"
  • #934, Refactor SmartFileSystem
  • Update ssm-cdh5.10-deployment-guide.md
  • Set key limits to 512 bytes
  • Initially introduce in meta service
  • Add support for testing multiple profiles on travis
  • #930, Fix the googlefonts build bug
  • #927, Add web UI for mover and copy
  • #916, Add mover and copy view
  • #914, Add cmdlet based CopyScheduler
  • #905, Adjust metastore log
  • #901, Add BackUp Dao
  • #900, Fix caret issue in web UI
  • #891, Refactor hdfs related tests
  • #885, Fix code style of config Dao
  • #882, Add unit tests for move file actions
  • #881, Export interfaces of datanode related DAOs in MetaStore
  • #879, Add Batch insert with increment in globalconfigdao
  • #878, Add batch insert with increment primary key in clusterConfigDao
  • #869, Add methods for cluster config and global config in metastore
  • #864, Fix some bugs and add more methods in Dao
  • #850, Fix web context setup failure issue
  • #806, Add Service module into SmartAgent
  • #778, Add Rule format description on web UI
  • #773, Add Copy, Rename and delete FileAction
  • #761, Refine boot scripts
  • #757, Fix Tidb create table issue
  • #751, Fix rule one shot issue when no time specified
  • #741, Remove the usage of view
  • #719, Merge branch 'zeppelin' based web UI into trunk
  • #665, Merge start-agent.sh into start-smart.sh

SSM v1.0.0

06 Sep 14:26

Highlights:

  1. SSM core modules implemented
  2. Access based hot/cold data storage management feature implemented
  3. Web UI implemented for SSM management