Roadmap

This list documents the work under consideration:

make it easier to set the data retention policy in graphite from the ansible playbook. The caveat is that the finest granularity is 10s, since this must stay in sync with the collectd interval
add an alerting dashboard: create a notification channel, and send basic alerts on key metrics to this channel
examine whether dashUpdater should be used to set up the notification channel for alerting through Grafana's HTTP API
update the osd collector to report on recovery and backfill latency per OSD, then review how/where these metrics can be used (at-a-glance and osd-latency dashboard to begin with)
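
The retention item above could be supported by a small validation step before the playbook writes graphite's storage-schemas.conf. The following is a minimal sketch, not the project's implementation: the function names, the stanza name `collectd`, and the pattern `^collectd\.` are hypothetical, and only the unit-suffixed retention form (for example `10s:7d`) is handled. It rejects a first archive finer than the 10s collectd interval and requires each coarser archive to be a whole multiple of the previous one.

```python
import re

# Collectd reporting interval in seconds (10s, per the roadmap caveat).
COLLECTD_INTERVAL = 10

# Seconds per unit suffix used in graphite retention strings.
_UNITS = {"s": 1, "m": 60, "h": 3600, "d": 86400, "w": 604800, "y": 31536000}


def to_seconds(value):
    """Convert a string such as '10s' or '7d' to a number of seconds."""
    match = re.fullmatch(r"(\d+)([smhdwy])", value.strip())
    if not match:
        raise ValueError(f"unsupported time value: {value!r}")
    return int(match.group(1)) * _UNITS[match.group(2)]


def parse_retentions(spec):
    """Parse '10s:7d,1m:30d' into [(precision_seconds, duration_seconds), ...]."""
    archives = []
    for part in spec.split(","):
        precision, _, duration = part.partition(":")
        archives.append((to_seconds(precision), to_seconds(duration)))
    return archives


def validate_retentions(spec, interval=COLLECTD_INTERVAL):
    """Return the parsed archives, or raise ValueError if the policy is unusable."""
    archives = parse_retentions(spec)
    if archives[0][0] < interval:
        raise ValueError(
            f"finest precision {archives[0][0]}s is below the {interval}s collectd interval"
        )
    # Each coarser archive must be a whole multiple of the finer one before it.
    for (fine, _), (coarse, _) in zip(archives, archives[1:]):
        if coarse <= fine or coarse % fine:
            raise ValueError(f"precision {coarse}s is not a multiple of {fine}s")
    return archives


def render_schema(name, pattern, spec):
    """Render one storage-schemas.conf stanza after validating the policy."""
    validate_retentions(spec)
    return f"[{name}]\npattern = {pattern}\nretentions = {spec}\n"


if __name__ == "__main__":
    # Hypothetical stanza name and pattern, for illustration only.
    print(render_schema("collectd", r"^collectd\.", "10s:7d,1m:30d"))
```

A playbook could expose the retention string as a single variable and run a check like this before templating the file, so an invalid policy fails early rather than producing whisper files with an unexpected resolution.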

investigate tagging: can tags be used to mark NIC interfaces as public/cluster for frontend/backend charts?
review a migration from graphite to prometheus, while retaining grafana for visualisation
review the benefit of including haproxy based statistics within the dashboard(s)
review the integration options with cephmgr
