Dev #13

Merged
merged 56 commits into from
Aug 31, 2024
56 commits
bfe5db6
fix: change power_cycles to power_cycle
Jun 16, 2024
2f6836f
chore: Update readme
Jun 16, 2024
0dd2db2
fix: remove debug code
Jun 16, 2024
1db3e78
fix: use cycles instead of cycle
Jun 16, 2024
15da858
feat: use json output
Jul 20, 2024
906272d
wip:tests
Jul 23, 2024
388199a
wip:compact main
Jul 27, 2024
b4baef7
wip:compact main more
Jul 28, 2024
a36e66c
chore: fix sat metrics
Aug 9, 2024
11f2d7f
chore: remove test bak file
Aug 9, 2024
4b8dc0f
feat: add written_bytes
Aug 13, 2024
abbe3a5
get sector_size from smartctl info
Aug 18, 2024
58d76a3
add coverage to readme
Aug 21, 2024
9100d74
add workflow for coverage
Aug 21, 2024
72ab699
add workflow for coverage
Aug 21, 2024
b822b40
add workflow for coverage
Aug 21, 2024
9abd107
add workflow for coverage
Aug 21, 2024
aa7fe4a
add workflow for coverage
Aug 21, 2024
96bff81
add workflow for coverage
Aug 21, 2024
b0179c6
add workflow for coverage
Aug 21, 2024
6e6871e
add workflow for coverage
Aug 21, 2024
0302c2f
add workflow for coverage
Aug 21, 2024
a479533
add workflow for coverage
Aug 21, 2024
0b3042a
add workflow for coverage
Aug 21, 2024
a5a3d2d
add workflow for coverage
Aug 22, 2024
8ccf0ed
add workflow for coverage
Aug 22, 2024
d8c7339
add workflow for coverage
Aug 22, 2024
b513c67
add workflow for coverage
Aug 22, 2024
909bb2c
add workflow for coverage
Aug 22, 2024
a6b1f20
add workflow for coverage
Aug 22, 2024
bf9304a
add workflow for coverage
Aug 22, 2024
a884d62
add workflow for coverage
Aug 22, 2024
db5ddad
add workflow for coverage
Aug 22, 2024
edf0630
add workflow for coverage
Aug 22, 2024
203cd25
add workflow for coverage
Aug 22, 2024
22825c0
add workflow for coverage
Aug 22, 2024
9d0faec
add workflow for coverage
Aug 22, 2024
82e1462
add workflow for coverage
Aug 22, 2024
1dd652d
add workflow for coverage
Aug 22, 2024
f51e3e1
add workflow for coverage
Aug 22, 2024
7beae41
add workflow for coverage
Aug 22, 2024
4c5d412
add workflow for coverage
Aug 22, 2024
43cb188
add workflow for coverage
Aug 22, 2024
0ef13ce
add workflow for coverage
Aug 22, 2024
a0f3aba
add workflow for coverage
Aug 22, 2024
fcb3846
add workflow for coverage
Aug 22, 2024
4ce987e
add workflow for coverage
Aug 22, 2024
78c951f
add workflow for coverage
Aug 22, 2024
f7e8ffa
Update coverage badge
github-actions[bot] Aug 22, 2024
e8f216d
add workflow for coverage
Aug 22, 2024
74965b2
Merge branch 'dev' of github.com:micha37-martins/S.M.A.R.T-disk-monit…
Aug 22, 2024
966179e
Update coverage badge
github-actions[bot] Aug 22, 2024
8a0867c
add workflow for coverage
Aug 22, 2024
0461c8f
Merge branch 'dev' of github.com:micha37-martins/S.M.A.R.T-disk-monit…
Aug 22, 2024
2aa7207
Add install script
Aug 30, 2024
402ca1e
Add Grafana dashboard
Aug 31, 2024
70 changes: 70 additions & 0 deletions .github/workflows/github-workflow.yaml
@@ -0,0 +1,70 @@
# yamllint disable rule:line-length
---
name: Coverage

# yamllint disable-line rule:truthy
on:
  push:
    branches:
      - main
      - dev

jobs:
  coverage:
    runs-on: ubuntu-22.04
    container:
      image: kcov/kcov:v42
      options: --privileged
      # volumes:
      #   - ${{ github.workspace }}:/workspace

    steps:
      - name: Checkout code
        uses: actions/checkout@v3.6.0

      - name: Install necessary tools (Git and Curl)
        run: |
          apt-get update && apt-get install -y curl git jq smartmontools

      - name: Install specific version of bats
        run: |
          curl -L https://github.com/bats-core/bats-core/archive/v1.4.1.tar.gz | tar -xz
          cd bats-core-1.4.1
          ./install.sh /usr/local
          rm -rf bats-core-1.4.1

      - name: Checkout code with submodules
        uses: actions/checkout@v3
        with:
          submodules: true  # Fetch all submodules
          fetch-depth: 0  # Fetch full history, required for submodules
          submodule-token: ${{ secrets.GITHUB_TOKEN }}  # Ensure access to private submodules if needed

      - name: Run coverage script with Bash
        run: |
          chmod +x ./coverage.sh
          bash ./coverage.sh

      - name: Extract coverage percentage from index.js
        id: coverage
        shell: bash
        run: |
          coverage=$(grep -oP '(?<=covered":")[^"]+' ./coverage/test_smartmon.coverage/index.js | head -n 1)
          echo "Extracted coverage percentage: $coverage"
          echo "::set-output name=coverage::$coverage"

      - name: Print Current Working Directory
        run: pwd

      - name: Configure Git Safe Directory
        run: |
          git config --global --add safe.directory "$(pwd)"

      - name: Update README.md with coverage badge
        run: |
          sed -i 's|https:\/\/img\.shields\.io\/badge\/Coverage-[0-9]*\(\.[0-9]*\)\?%25-brightgreen|https:\/\/img\.shields\.io\/badge\/Coverage-65.2%25-brightgreen|g' README.md
          git config --global user.name 'github-actions[bot]'
          git config --global user.email 'github-actions[bot]@users.noreply.github.com'
          git add README.md  # Ensure README.md is staged
          git commit -m "Update coverage badge"
          git push origin ${GITHUB_REF#refs/heads/}
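The badge step above writes a fixed `65.2` into the README rather than the value extracted in the previous step. A minimal sketch of how the substitution could take the percentage as a parameter is shown below. The function name `update_badge` is hypothetical and not part of this PR; it assumes the badge URL has the same shape as the one in README.md. Separately, GitHub has deprecated `::set-output` in favour of appending `name=value` to the file named by `$GITHUB_OUTPUT`, so the extraction step could use `echo "coverage=$coverage" >> "$GITHUB_OUTPUT"`.

```shell
#!/usr/bin/env bash

# Hypothetical helper (not part of this PR): rewrite the shields.io coverage
# badge in a file so that it shows the given percentage.
#   $1 = file containing the badge URL (for example README.md)
#   $2 = coverage percentage without the percent sign (for example 70.1)
update_badge() {
  local file="$1" pct="$2" tmp
  tmp="$(mktemp)" || return 1
  # Group 1 is the URL prefix, group 3 is the "%25-brightgreen" suffix.
  # Only the number between them is replaced.
  if sed -E "s#(img\.shields\.io/badge/Coverage-)[0-9]+(\.[0-9]+)?(%25-brightgreen)#\1${pct}\3#g" \
    "$file" > "$tmp"; then
    mv "$tmp" "$file"
  else
    rm -f "$tmp"
    return 1
  fi
}
```

In a workflow step this might be called as `update_badge README.md "${{ steps.coverage.outputs.coverage }}"`, assuming the extraction step keeps the id `coverage`.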
1 change: 1 addition & 0 deletions .gitignore
@@ -1,3 +1,4 @@
*.swp
*.swo
*.tmp
coverage/
9 changes: 9 additions & 0 deletions .gitmodules
@@ -0,0 +1,9 @@
[submodule "test/bats"]
	path = test/bats
	url = https://github.com/bats-core/bats-core.git
[submodule "test/test_helper/bats-support"]
	path = test/test_helper/bats-support
	url = https://github.com/bats-core/bats-support.git
[submodule "test/test_helper/bats-assert"]
	path = test/test_helper/bats-assert
	url = https://github.com/bats-core/bats-assert.git
167 changes: 141 additions & 26 deletions README.md
@@ -1,28 +1,116 @@
# S.M.A.R.T.-disk-monitoring-for-Prometheus text_collector
![Tag](https://img.shields.io/github/v/tag/micha37-martins/S.M.A.R.T-disk-monitoring-for-Prometheus)
![Coverage](https://img.shields.io/badge/Coverage-65.2%25-brightgreen)
![Language](https://img.shields.io/github/languages/top/micha37-martins/S.M.A.R.T-disk-monitoring-for-Prometheus)
![License](https://img.shields.io/github/license/micha37-martins/S.M.A.R.T-disk-monitoring-for-Prometheus)
![Last Commit](https://img.shields.io/github/last-commit/micha37-martins/S.M.A.R.T-disk-monitoring-for-Prometheus)

Prometheus `node_exporter` `text_collector` for S.M.A.R.T disk values
# S.M.A.R.T. disk monitoring for Prometheus
_SMART Exporter for Prometheus node-exporter_

Following dashboards are designed for this exporter:
This script is a specialized tool designed to collect SMART data from various
types of disks (ATA, NVMe) and format it for Prometheus monitoring. To collect
SMART values it uses `smartctl`.

https://grafana.com/dashboards/10530
_Inspired by the great examples of the Prometheus community:_
[textfile-collector-scripts](https://github.com/prometheus-community/node-exporter-textfile-collector-scripts)
___
It has been specifically developed to work seamlessly with the
[SMART Disk Monitoring for Prometheus Dashboard](https://grafana.com/grafana/dashboards/10530-s-m-a-r-t-disk-monitoring-for-prometheus-dashboard/) on Grafana.

https://grafana.com/dashboards/10531
>
>[https://grafana.com/dashboards/10530](https://grafana.com/dashboards/10530)
>![screenshot1](media/grafana_dashboard_1.png)

## Purpose
This text_collector is a customized version of the S.M.A.R.T. `text_collector` example from `node_exporter` github repo:
https://github.com/prometheus/node_exporter/tree/master/text_collector_examples
If you're interested in alternative solutions, you might want to check out the [smartctl_exporter](https://github.com/prometheus-community/smartctl_exporter) project from the Prometheus community.

## Requirements
- Prometheus
___
>**Warning:** This script has been rewritten from version 0.1.0 and has breaking changes. Please be aware of the following:
>
>- A lot of renaming has been done
>- The script now uses the JSON output of `smartctl`
>- SCSI support has been dropped

## Prerequisites
Mandatory:
- Bash
- `jq` (https://jqlang.github.io/jq/download/)
- Root privileges (optional, but required to access SMART data for all disks)

Optional:
- node_exporter
- text_collector enabled for node_exporter
- Grafana = 6.2
- smartmontools = 7
- Grafana >= 10
- smartmontools >= 7

## Usage
1. Clone the repository:

git clone https://github.com/your-username/smart-disk-exporter.git

2. Make the script executable:

chmod +x smartmon.sh

3. Run the script (use `sudo`):

sudo ./smartmon.sh

The script will output the SMART data for all detected disks in Prometheus format.

## (WIP)Install / Uninstall
For convenience this repo contains helper scripts for installing and uninstalling.
Make them executable and run the desired script:

>Note that the script should be executed with root privileges, as the paths are
>usually not accessible to normal users.

Like this:
```sh
chmod +x install.sh
```
```sh
sudo ./install.sh
```

## Set up Prometheus Node Exporter
To enable `text_collector` set the following flag to `node_exporter`:
- `--collector.textfile.directory /var/lib/node_exporter/textfile_collector`

## Set up
To enable text_collector set the following flag for `node_exporter`:
- `--collector.textfile.directory`
run command with `/var/lib/node_exporter/textfile_collector`
example:
```sh
/usr/bin/prometheus-node-exporter --collector.textfile.directory /var/lib/node_exporter/textfile_collector/
```

Install [smartmontools](https://www.smartmontools.org/)

UBUNTU: `sudo apt-get install smartmontools`

To enable the text_collector on your system, add the following as a cron job or
create a systemd timer unit as shown in the install.sh script.

The cron job will execute the script every five minutes and save the result to
the `text_collector` directory.

Example for UBUNTU:

```sh
sudo crontab -e

*/5 * * * * /usr/local/bin/smartmon.sh > /var/lib/node_exporter/textfile_collector/smart_metrics.prom
```
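Redirecting straight into the `.prom` file means `node_exporter` could scrape a half-written file while the script is still running. A common workaround is to write to a temporary file in the same directory and rename it once the script has finished. The sketch below shows that pattern; the helper name `write_metrics_atomically` is hypothetical and not part of this repository.

```shell
#!/usr/bin/env bash

# Hypothetical helper (not part of this repository): run a command, capture
# its stdout in a temporary file next to the target, and rename the file into
# place only if the command succeeded.
#   $1    = target file, e.g. .../textfile_collector/smart_metrics.prom
#   $2... = command to run, e.g. /usr/local/bin/smartmon.sh
write_metrics_atomically() {
  local target="$1"
  shift
  local tmp
  # The temporary name does not end in ".prom", so the textfile collector
  # does not pick it up while it is being written.
  tmp="$(mktemp "${target}.XXXXXX")" || return 1
  if "$@" > "$tmp"; then
    chmod 644 "$tmp"     # mktemp creates the file readable by the owner only
    mv "$tmp" "$target"  # rename within one directory replaces the old file
  else
    rm -f "$tmp"
    return 1
  fi
}

# Possible use from a wrapper script called by cron:
#   write_metrics_atomically \
#     /var/lib/node_exporter/textfile_collector/smart_metrics.prom \
#     /usr/local/bin/smartmon.sh
```

If the script fails, the previous metrics file stays in place instead of being replaced by an empty one.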

## Running Locally
If you want to test the exporter locally, for example on a laptop, you can move
the exporter to the following directory and run it.
```sh
# execute collector
sudo sh -c 'smartmon.sh > /var/lib/node_exporter/textfile_collector/smart_metrics.prom'

# let node-exporter run
/usr/bin/prometheus-node-exporter --collector.textfile.directory /var/lib/node_exporter/textfile_collector/
```

## Troubleshooting
To get an up-to-date version of smartmontools, it may be necessary to compile it from source:
https://www.smartmontools.org/wiki/Download#Installfromthesourcetarball

@@ -32,17 +120,44 @@ https://www.smartmontools.org/wiki/Download#Installfromthesourcetarball

- save it under `/usr/local/bin/smartmon.sh`

To enable the text_collector on your system add the following as cronjob.
It will execute the script every five minutes and save the result to the `text_collector` directory.
- make sure `/var/lib/node_exporter/textfile_collector/` exists
- `mkdir -p /var/lib/node_exporter/textfile_collector/`


## Development and Testing
### Adding New Metrics
Feel free to adapt this script to your needs. The metrics provided are a subset
of all available, so feel free to add more. If you'd like to add a new metric,
here's a general guide:

1. Identify the new metric you want to add. This can be done by checking the
output of `smartctl -A -j /dev/diskX`, where `X` is the device's name. Look for
the relevant attribute in the JSON output.
2. Modify the appropriate parsing function (`parse_smartctl_attributes_json` for
ATA devices, `parse_smartctl_nvme_attributes_json` for NVMe devices) to add a new `if` statement in the `while` loop to check for the new metric's attribute name.
3. Calculate the new metric's value based on the attribute's value and any relevant conversion factors.
4. Print the new metric in Prometheus format using `printf`.
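
The four steps above can be sketched as follows. This is an illustration under assumptions, not the code in `smartmon.sh`: the helper names, the `smartmon_` prefix, and the `disk` and `type` labels are hypothetical, and the `jq` filter assumes the `ata_smart_attributes.table` layout of `smartctl -A -j` output for ATA devices. Whether `Total_LBAs_Written` counts logical sectors depends on the drive vendor.

```shell
#!/usr/bin/env bash

# Steps 1 and 2 (hypothetical helper): read `smartctl -A -j` JSON on stdin and
# print the raw value of one ATA attribute, selected by its name.
ata_raw_value() {
  jq -r --arg name "$1" \
    '.ata_smart_attributes.table[]? | select(.name == $name) | .raw.value'
}

# Step 3 (hypothetical helper): convert a count of written LBAs to bytes by
# multiplying with the logical sector size.
lbas_to_bytes() {
  local lbas="$1" sector_size="$2"
  printf '%s\n' "$(( lbas * sector_size ))"
}

# Step 4 (hypothetical helper): print one sample in Prometheus text format.
# The metric prefix and label names are assumptions.
format_metric() {
  local name="$1" disk="$2" type="$3" value="$4"
  printf 'smartmon_%s{disk="%s",type="%s"} %s\n' "$name" "$disk" "$type" "$value"
}

# Possible use (needs root, jq and smartmontools):
#   lbas="$(smartctl -A -j /dev/sda | ata_raw_value Total_LBAs_Written)"
#   format_metric written_bytes /dev/sda sat "$(lbas_to_bytes "$lbas" 512)"
```

Keeping the lookup, the conversion, and the formatting in separate functions makes each one testable with bats without a real disk.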

Example for UBUNTU `crontab -e`:
### Test using bats (bats-core)
How to install and use is best described here: [bats-tutorial](https://bats-core.readthedocs.io/en/stable/tutorial.html)

`*/5 * * * * /usr/local/bin/smartmon.sh > /var/lib/node_exporter/textfile_collector/smart_metrics.prom`
Run tests with:
```sh
bats test
```

## How to add specific S.M.A.R.T. attributes
If you are missing some attributes you can extend the text_collector.
Add the desired attributes to `smartmon_attrs` array in `smartmon.sh`.
### Generate Coverage
```sh
./coverage.sh
```
OR
```sh
kcov --bash-dont-parse-binary-dir \
--include-path=. \
/var/tmp/coverage \
bats -t test/test_smartmon.bats
```

You can list the attributes provided by your disk by executing:
`sudo smartctl -i -H /dev/<sdx>`
`sudo smartctl -A /dev/<sdx>`
## TODO
- create container
- Test install.sh script
49 changes: 49 additions & 0 deletions coverage.sh
@@ -0,0 +1,49 @@
#!/usr/bin/env bash

SRC_DIR="./src"
TEST_DIR="./test"
COVERAGE_DIR="./coverage"

# Function to create the coverage directory
create_coverage_dir() {
  mkdir -p "$COVERAGE_DIR"
}

# Function to run kcov with bats tests
run_kcov() {
  local test_file; test_file="$1"
  local src_file; src_file="$2"
  local coverage_file; coverage_file="$COVERAGE_DIR/$(basename "$test_file" .bats).coverage"

  if ! kcov --bash-dont-parse-binary-dir --include-path="$SRC_DIR" "$coverage_file" bats -t "$test_file"; then
    printf "Error: kcov failed for %s\n" "$test_file" >&2
    return 1
  fi
}

# Main function
main() {
  create_coverage_dir

  local test_file; test_file="$TEST_DIR/test_smartmon.bats"
  local src_file; src_file="$SRC_DIR/smartmon.sh"

  if [[ ! -f "$test_file" ]]; then
    printf "Error: Test file %s does not exist\n" "$test_file" >&2
    return 1
  fi

  if [[ ! -f "$src_file" ]]; then
    printf "Error: Source file %s does not exist\n" "$src_file" >&2
    return 1
  fi

  if ! run_kcov "$test_file" "$src_file"; then
    printf "Error: Failed to run kcov for %s\n" "$test_file" >&2
    return 1
  fi

  printf "Coverage report generated in %s\n" "$COVERAGE_DIR"
}

main "$@"