Commit

feat: Harvest should monitor wafl.dir.size.warning
Fixes: #3243
cgrinds committed Nov 18, 2024
1 parent d6443e0 commit edb92fe
Showing 3 changed files with 43 additions and 1 deletion.
7 changes: 7 additions & 0 deletions conf/ems/9.6.0/ems.yaml
@@ -944,6 +944,13 @@ events:
      - parameters.mirror_config_id => mirror_config_id
      - parameters.primary_config_id => primary_config_id

  - name: wafl.dir.size.warning
    exports:
      - parameters.fileid => directory_inum
      - parameters.vol => volume
      - parameters.app => app
      - parameters.volident => vol_ident

  - name: wafl.readdir.expired
    exports:
      - parameters.app => app
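
The `exports` entries above pair an EMS event field with the label name it is published under. As a rough illustration of that mapping only (this is not Harvest's collector code, and the nested-dict event shape and the sample values are assumptions), a minimal Python sketch could look like this:

```python
def parse_exports(lines):
    """Split 'source => label' export lines into (source, label) pairs."""
    pairs = []
    for line in lines:
        source, _, label = line.partition("=>")
        pairs.append((source.strip(), label.strip()))
    return pairs


def lookup(event, dotted_path):
    """Follow a dotted path such as 'parameters.vol' through nested dicts."""
    value = event
    for key in dotted_path.split("."):
        if not isinstance(value, dict) or key not in value:
            return None
        value = value[key]
    return value


def to_labels(event, exports):
    """Build a label dict, skipping fields that are missing from the event."""
    labels = {}
    for source, label in parse_exports(exports):
        value = lookup(event, source)
        if value is not None:
            labels[label] = str(value)
    return labels


# Export lines copied from the wafl.dir.size.warning entry in this commit.
EXPORTS = [
    "parameters.fileid => directory_inum",
    "parameters.vol => volume",
    "parameters.app => app",
    "parameters.volident => vol_ident",
]

# Hypothetical event payload; the shape and values are made up for illustration.
sample_event = {
    "parameters": {"fileid": 96, "vol": "vol1", "app": "nfs", "volident": "@vs1"}
}
labels = to_labels(sample_event, EXPORTS)
```

With the sample payload, `labels` carries `directory_inum` and `volume`, the two label names that the alert summary in `ems_alert_rules.yml` refers to.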
25 changes: 24 additions & 1 deletion container/prometheus/ems_alert_rules.yml
@@ -2008,4 +2008,27 @@ groups:
            {{- end -}}
        annotations:
          summary: "SnapMirror active sync planned failover operation completed for Destination path: \"{{ $labels.dstpath }}\"."
          impact: "Protection"
          impact: "Protection"

      - alert: Directory size is approaching the maximum directory size (maxdirsize) limit
        expr: last_over_time(ems_events{message="wafl.dir.size.warning"}[5m]) == 1
        labels:
          severity: >
            {{- if $labels.severity -}}
              {{- if eq $labels.severity "alert" -}}
                critical
              {{- else if eq $labels.severity "error" -}}
                warning
              {{- else if eq $labels.severity "emergency" -}}
                critical
              {{- else if eq $labels.severity "notice" -}}
                info
              {{- else if eq $labels.severity "informational" -}}
                info
              {{- else -}}
                {{ $labels.severity }}
              {{- end -}}
            {{- end -}}
        annotations:
          summary: "Directory size for file ID \"{{ $labels.directory_inum }}\" in volume \"{{ $labels.volume }}\" is approaching the maximum directory size (maxdirsize) limit."
          impact: "Availability"
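
The `severity` template in the rule above translates the EMS severity label into an alert severity. The same branching, restated as a small Python sketch purely for readability (the function and constant names are hypothetical and are not part of Harvest or Prometheus):

```python
# EMS severity -> alert severity, mirroring the if/else-if chain in the template.
SEVERITY_MAP = {
    "alert": "critical",
    "error": "warning",
    "emergency": "critical",
    "notice": "info",
    "informational": "info",
}


def map_severity(ems_severity):
    """Return the alert severity for an EMS severity label value."""
    if not ems_severity:
        # The outer `if $labels.severity` renders nothing for an empty label.
        return ""
    # Values not listed in the template fall through unchanged.
    return SEVERITY_MAP.get(ems_severity, ems_severity)
```

Any value the template does not list, for example `debug`, passes through as-is, matching the final `else` branch.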
12 changes: 12 additions & 0 deletions docs/resources/ems-alert-runbook.md
@@ -52,6 +52,18 @@ If you use Cloud Volumes ONTAP, perform the following corrective actions:
2. Ensure that the login and connectivity information is still valid.
Contact NetApp technical support if the issue persists.

### Directory size is approaching the maximum directory size (maxdirsize) limit

**Impact**: Availability

**EMS Event**: `wafl.dir.size.warning`

This message occurs when the size of a directory surpasses a configured percentage (default: 90%) of its current maximum directory size (maxdirsize) limit.
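
The threshold described above is a simple percentage comparison. A minimal sketch of the arithmetic, assuming both sizes are expressed in the same unit (the function name and arguments are hypothetical; ONTAP performs this check internally):

```python
def exceeds_warning(dir_size, maxdirsize, warn_percent=90):
    """Return True when dir_size surpasses warn_percent of maxdirsize.

    Both sizes must use the same unit. Integer cross-multiplication avoids
    floating-point rounding right at the boundary.
    """
    if maxdirsize <= 0:
        raise ValueError("maxdirsize must be positive")
    return dir_size * 100 > maxdirsize * warn_percent
```

The comparison is strict because the event text says the size "surpasses" the configured percentage, so a directory sitting exactly at 90% of the limit does not count.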

**Remediation**

Use the `volume file show-inode` command with the file ID and volume name to find the file path, then reduce the number of files in the directory. If that is not possible, use the advanced-privilege command `volume modify -volume vol_name -maxdir-size new_value` to increase the maximum number of files per directory. However, doing so could impact system performance. If you need to increase the maximum directory size, contact NetApp technical support.

### Disk Out of Service

**Impact**: Availability