Merge pull request #4 from murilocmiranda/master
Enhancements to node_collector.sh and README updates
anupshirolkar authored Mar 29, 2023
2 parents 1b1cee1 + e235583 commit 3b56e56
Showing 3 changed files with 182 additions and 50 deletions.
41 changes: 32 additions & 9 deletions README.md
@@ -2,24 +2,47 @@
This tool is used to collect information from a Cassandra cluster to aid in problem diagnosis or review.

# Design info:
There are two scripts used in instacollector tool. The `node_collector.sh` is supposed to be executed on each Cassandra node.
The `cluster_collector.sh` can be executed on a machine connected to Cassandra cluster e.g. user laptop or Jumpbox having connectivity
with Cassandra cluster.
There are two scripts used in Instacollector tool. The `node_collector.sh` is supposed to be executed on each Cassandra node.
The `cluster_collector.sh` can be executed on a machine connected to Cassandra cluster e.g. user laptop or Jumpbox having connectivity with Cassandra cluster.

The node_collector.sh executes Linux and nodetool commands and copies configuration and log files required for cluster health check.
The cluster_collector.sh executes node_collector.sh on each Cassndra node using ssh.
The `node_collector.sh` executes Linux and nodetool commands and copies configuration and log files required for cluster health check.
The `cluster_collector.sh` executes `node_collector.sh` on each Cassandra node using ssh.
It uses a file containing IP addresses or host names of Cassandra cluster nodes to establish ssh connections.



# Execution settings:
The cluster_collector.sh has setting of connecting to cluster nodes using key file or id file.
If the ssh key has passphrase enabled then please use ssh-agent and ssh-add commands to add the passphrase before running cluster_collector.sh.
The `cluster_collector.sh` script connects to cluster nodes using a key file or identity file.
If the ssh key is protected by a passphrase, use the `ssh-agent` and `ssh-add` commands to load the key before running `cluster_collector.sh`.
If there is another method required for `ssh`, user is requested to change the script as applicable.
Alternatively, the node_collector.sh can also be executed on individual nodes if cluster_collector.sh is not useful in any case.
Alternatively, `node_collector.sh` can be executed directly on individual nodes if `cluster_collector.sh` is not suitable.

The `cluster_collector.sh` script accepts optional arguments for the JMX username and password. When run, it also prompts for the OS username used to log in to the cluster nodes, the local path of the identity file, and a file containing the list of node IPs:

```
Usage: cluster_collector.sh [-u username -p password]
-u JMX agent username. [optional]
-p Password. [optional]
```

The `node_collector.sh` supports optional arguments to provide username and password, to work with JMX authentication:

```
Usage: node_collector.sh [-u username -p password]
-u JMX agent username. [optional]
-p Password. [optional]
```
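
As a rough illustration of how the optional `-u`/`-p` arguments could be parsed and turned into `nodetool` flags, here is a minimal sketch. The function names `parse_jmx_args` and `nodetool_auth_args` and the variables `jmx_username`/`jmx_password` are hypothetical and do not appear in the scripts. The rule that both options must be given together is an assumption of this sketch, not documented behaviour.

```shell
# Hypothetical helper (not part of the repository): parse the optional
# -u/-p JMX credentials described in the usage text.
parse_jmx_args() {
    jmx_username=""
    jmx_password=""
    OPTIND=1    # reset so the function can be called more than once
    # The leading ":" puts getopts in silent mode; errors are handled here.
    while getopts ":u:p:" opt; do
        case "$opt" in
            u) jmx_username="$OPTARG" ;;
            p) jmx_password="$OPTARG" ;;
            *) return 1 ;;   # unknown option or missing option argument
        esac
    done
    # Assumption: a username without a password (or vice versa) is an error.
    if [ -n "$jmx_username" ] && [ -z "$jmx_password" ]; then return 1; fi
    if [ -z "$jmx_username" ] && [ -n "$jmx_password" ]; then return 1; fi
    return 0
}

# Hypothetical helper: print the nodetool credential flags, or nothing
# when no credentials were supplied.
nodetool_auth_args() {
    if [ -n "$jmx_username" ]; then
        printf '%s\n' "-u $jmx_username -pw $jmx_password"
    fi
}
```

A script could call `parse_jmx_args "$@"` and print its usage text when the function returns a non-zero status.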

Below is an example of a file containing the list of node IPs to collect data from (one IP per line):

```
10.10.2.196
10.10.3.64
10.10.3.148
```
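
Before opening any ssh connections, it may help to sanity-check the node list. The sketch below is only an illustration: `read_peers` is a hypothetical function name, and skipping blank lines and `#` comment lines is an assumption, since the file format is only described as one IP per line.

```shell
# Hypothetical helper (not part of the repository): print the usable
# entries of a node list file, one per line.
read_peers() {
    local peers_file="$1" line
    # Same precondition as cluster_collector.sh: file must exist and be non-empty.
    if [ ! -f "$peers_file" ] || [ ! -s "$peers_file" ]; then
        return 1
    fi
    # "|| [ -n "$line" ]" keeps a final line that lacks a trailing newline.
    while IFS= read -r line || [ -n "$line" ]; do
        # Trim leading and trailing whitespace (this also drops a stray CR).
        line="${line#"${line%%[![:space:]]*}"}"
        line="${line%"${line##*[![:space:]]}"}"
        # Assumption: blank lines and lines starting with "#" are ignored.
        case "$line" in
            ''|'#'*) continue ;;
        esac
        printf '%s\n' "$line"
    done < "$peers_file"
}
```

For example, `read_peers nodes.txt` (with `nodes.txt` as a placeholder file name) would print each usable entry on its own line, ready to be fed into a `while read peer` loop.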

The Cassandra configuration file locations, data directory location and other settings are used as per Apache Cassandra default setup.
User is requested to change those in node_collector.sh if other values are required.
**User is requested to change those in `node_collector.sh` if other values are required.**

**Note:** The scripts should be executed on bash shell.

43 changes: 40 additions & 3 deletions cluster_collector.sh
@@ -1,5 +1,25 @@
#!/bin/bash

help_function()
{
echo ""
echo "Usage: $0"
echo "Usage: $0 -u username -p password"
echo -e "\t-u Remote JMX agent username."
echo -e "\t-p Password."
exit 1 # Exit script after printing help
}

# Handles parameters
while getopts "u:p:" opt
do
case "$opt" in
u ) parameter_username="$OPTARG" ;;
p ) parameter_password="$OPTARG" ;;
? ) help_function ;; # Print help_function in case parameter is non-existent
esac
done

#GLOBAL VARIABLES
SCRIPT_DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )"
INFO_DIR=/tmp/InstaCollection_$(date +%Y%m%d%H%M)
@@ -21,10 +41,27 @@ if [[ ! -f ${peers_file} || ! -s ${peers_file} ]]; then
fi

#Execute the node_collector on each node
while read peer
do
if ! [[ -z "${parameter_username}" && -z "${parameter_password}" ]];
then
while read peer
do
if [ -z "$(ssh-keygen -F $peer)" ]; then
ssh-keyscan -H $peer >> ~/.ssh/known_hosts
fi

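# "--" ends bash's own option parsing so that -u/-p reach node_collector.sh as positional parameters
ssh -i $id_file $user@$peer "bash -s --" -u $parameter_username -p $parameter_password < node_collector.sh &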
done < "$peers_file"
else
while read peer
do
if [ -z "$(ssh-keygen -F $peer)" ]; then
ssh-keyscan -H $peer >> ~/.ssh/known_hosts
fi

ssh -i $id_file $user@$peer "bash -s" < node_collector.sh &
done < "$peers_file"
done < "$peers_file"
fi


#waiting for all node_collectors to complete
wait
148 changes: 110 additions & 38 deletions node_collector.sh
@@ -20,65 +20,136 @@ io_stats_file=$data_dir/io_stat.info

copy_config_files()
{
echo "$ip : Copying files"
local config_files=("$CONFIG_PATH/cassandra.yaml" "$CONFIG_PATH/cassandra-env.sh" "$LOG_PATH/system.log" "$CONFIG_PATH/jvm.options" "$CONFIG_PATH/logback.xml")
echo "$ip : Copying files"
local config_files=("$CONFIG_PATH/cassandra.yaml" "$CONFIG_PATH/cassandra-env.sh" "$CONFIG_PATH/jvm.options" "$CONFIG_PATH/logback.xml")

if [ "$GC_LOGGING_ENABLED" == "yes" ]
then
config_files+=( "$GC_LOG_PATH/gc.log*" )
fi
for i in "${config_files[@]}"
do
cp $i $data_dir
done
}

for i in "${config_files[@]}"
do
cp $i $data_dir
done
copy_log_files()
{
echo "$ip : Copying log files"
local log_files=("$LOG_PATH/system.log" "$LOG_PATH/debug.log")

if [ "$GC_LOGGING_ENABLED" == "yes" ]
then
log_files+=( "$GC_LOG_PATH/gc.log*" )
fi

for i in "${log_files[@]}"
do
cp $i $data_dir
done
}

get_size_info()
{
echo "$ip : Executing linux commands"
local commands=("df -h" "du -h")
local paths=($(echo "$DATA_PATHS" | tr ',' '\n'))
echo "$ip : Executing linux commands"
local commands=("df -h" "du -h")
local paths=($(echo "$DATA_PATHS" | tr ',' '\n'))

for i in "${commands[@]}"
do
for j in "${paths[@]}"
for i in "${commands[@]}"
do
echo "" >> $data_file
k=$(echo $i $j)
echo "$k" >> $data_file
eval $k >> $data_file
for j in "${paths[@]}"
do
echo "" >> $data_file
k=$(echo $i $j)
echo "$k" >> $data_file
eval $k >> $data_file
done
done
done
}

get_io_stats()
{
echo "$ip : Executing iostat command"
#Collecting iostat for 60 sec. please change according to requirement
eval timeout -sHUP 60s iostat -x -m -t -y -z 30 < /dev/null > $io_stats_file
echo "$ip : Executing iostat command"
#Collecting iostat for 60 sec. please change according to requirement
eval timeout -sHUP 60s iostat -x -m -t -y -z 30 < /dev/null > $io_stats_file

}

get_node_tool_info()
get_nodetool() # Parameters: username, password
{
#The nodetool commands and their respective filenames are on the same index in the arrays
#the total number of entries in the arrays is used in the for loop.

local commands=("nodetool info" "nodetool version" "nodetool status" "nodetool tpstats" "nodetool compactionstats -H" "nodetool gossipinfo" "nodetool cfstats -H" "nodetool ring")
local filenames=("nodetool_info" "nodetool_version" "nodetool_status" "nodetool_tpstats" "nodetool_compactionstats" "nodetool_gossipinfo" "nodetool_cfstats" "nodetool_ring")
# Handles parameters
nodetool_args=""
if ! [[ -z "${parameter_username}" && -z "${parameter_password}" ]];
then
nodetool_args="-u $parameter_username -pw $parameter_password"
fi

#The nodetool commands and their respective filenames are on the same index in the arrays
#the total number of entries in the arrays is used in the for loop.

local commands=("nodetool ${nodetool_args} describecluster" "nodetool ${nodetool_args} info" "nodetool ${nodetool_args} version" "nodetool ${nodetool_args} status" "nodetool ${nodetool_args} tpstats" "nodetool ${nodetool_args} compactionstats -H" "nodetool ${nodetool_args} gossipinfo" "nodetool ${nodetool_args} cfstats -H" "nodetool ${nodetool_args} ring")
local filenames=("nodetool_describecluster" "nodetool_info" "nodetool_version" "nodetool_status" "nodetool_tpstats" "nodetool_compactionstats" "nodetool_gossipinfo" "nodetool_cfstats" "nodetool_ring")

echo "$ip : Executing nodetool commands "

for i in {0..8}
do
local cmd_file=$data_dir/${filenames[i]}.info
echo "" >> $cmd_file
eval ${commands[i]} >> $cmd_file
done

echo "$ip : Executing nodetool commands "
}

for i in {1..8}
do
local cmd_file=$data_dir/${filenames[i]}.info
get_nodetool_tablehistograms() # Parameters: username, password
{
# Handles parameters
nodetool_args=""
if ! [[ -z "${parameter_username}" && -z "${parameter_password}" ]];
then
cqlsh_args="-u $parameter_username -p $parameter_password"
nodetool_args="-u $parameter_username -pw $parameter_password"
fi

local cmd_file="${data_dir}/nodetool_tablehistograms.info"
echo "" >> $cmd_file
eval ${commands[i]} >> $cmd_file
done

# Fetch all the keyspaces
cqlsh_keyspace_arr=($(cqlsh $(hostname -i) ${cqlsh_args} -e "DESC KEYSPACES;"))
cqlsh_keyspace_arr=("${cqlsh_keyspace_arr[@]//$'\n'/}")
for i in "${cqlsh_keyspace_arr[@]}"
do
# Fetch all the tables
cqlsh_tables_arr=($(cqlsh $(hostname -i) ${cqlsh_args} -e "USE ${i}; DESC TABLES;"))
cqlsh_tables_arr=("${cqlsh_tables_arr[@]//$'\n'/}")
for j in "${cqlsh_tables_arr[@]}"
do
eval "nodetool ${nodetool_args} tablehistograms ${i} ${j}" >> "$cmd_file" 2> /dev/null
done

unset cqlsh_tables_arr

done
}


help_function()
{
echo ""
echo "Usage: $0"
echo "Usage: $0 -u username -p password"
echo -e "\t-u Remote JMX agent username."
echo -e "\t-p Password."
exit 1 # Exit script after printing help
}

# Handles parameters
while getopts "u:p:" opt
do
case "$opt" in
u ) parameter_username="$OPTARG" ;;
p ) parameter_password="$OPTARG" ;;
? ) help_function ;; # Print help_function in case parameter is non-existent
esac
done

# Starts actual script execution
echo "$ip : Creating local directory for data collection $data_dir"
#rename already existing directory
mv $data_dir $data_dir_`date +%Y%m%d%H%M` 2>/dev/null
@@ -87,9 +158,10 @@ mkdir $data_dir
#start execution
get_io_stats &
copy_config_files &
copy_log_files &
get_size_info &
get_node_tool_info &

get_nodetool $parameter_username $parameter_password &
get_nodetool_tablehistograms $parameter_username $parameter_password &

echo "$ip : Waiting for background functions to complete"
wait
