tracking important errors #1
Comments
Comment by woodsaj You can use the existing events in Grafana. https://github.com/raintank/grafana/blob/master/pkg/events/events.go If enabled, these events will be pushed to a rabbitmq Topic Exchange with the routingKey set to So an event defined as
will use the routingKey
The events package will need some updating, as it hard-codes event.Priority to 'INFO'.
Comment by Dieterbe so do we have consensus that this is the best approach? is it still the right approach when we consider regular grafana users who want to run grafana on their server and have a different way to track errors? they often use something like logstash, so in that case a different event listener would be needed that pushes to a logstash queue or something i suppose? or perhaps an event listener that writes events to a text file log? (not something for us to worry too much about now, but good to keep in the back of our mind)
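The "event listener that writes events to a text file log" idea can be sketched roughly as below. This is only an illustration: `Event`, `Listener`, `LineListener`, and `formatEvent` are hypothetical names, not the actual types in Grafana's events package, and the event type string is made up. It writes one JSON object per line, which a tool such as logstash could tail.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"os"
	"time"
)

// Event is a hypothetical stand-in for an event published on the bus.
// The real definitions in pkg/events may look different.
type Event struct {
	Type      string    `json:"type"`
	Priority  string    `json:"priority"`
	Timestamp time.Time `json:"timestamp"`
	Message   string    `json:"message"`
}

// Listener is a hypothetical interface for anything that consumes events.
type Listener interface {
	Handle(e Event) error
}

// formatEvent encodes an event as a single newline-terminated JSON line.
func formatEvent(e Event) (string, error) {
	b, err := json.Marshal(e)
	if err != nil {
		return "", fmt.Errorf("encode event: %w", err)
	}
	return string(b) + "\n", nil
}

// LineListener writes each event as one JSON line to the wrapped writer,
// which could be an opened log file, stdout, or a network connection.
type LineListener struct {
	w io.Writer
}

// Handle implements Listener.
func (l *LineListener) Handle(e Event) error {
	line, err := formatEvent(e)
	if err != nil {
		return err
	}
	_, err = io.WriteString(l.w, line)
	return err
}

func main() {
	var l Listener = &LineListener{w: os.Stdout}
	e := Event{
		Type:      "example.check-failed", // hypothetical event type
		Priority:  "ERROR",
		Timestamp: time.Now().UTC(),
		Message:   "could not evaluate check",
	}
	if err := l.Handle(e); err != nil {
		fmt.Fprintln(os.Stderr, err)
	}
}
```

Swapping `os.Stdout` for a file opened with `os.OpenFile` gives the text-file variant; a listener that pushes to a queue would implement the same `Handle` method.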
Comment by torkelo one solution is to have something external (like logstash), tail log files and push to ES
Comment by torkelo but I guess pushing directly to rabbit -> logstash -> ES has some advantages in that you can log more rich data (as json) and have that data indexed and searchable in ES, without going through logfile -> logstash parsing -> ES
Comment by Dieterbe not sure how i should go about the actual log calls. FWIW my current idea is (not tested yet)
Comment by torkelo bus Publish code does not use publishAfterCommit; publishAfterCommit is just a utility function in the sqlstore package.
Comment by torkelo not sure I understand your comment "my current idea is"? just looks like code paste from the events.go
Comment by Dieterbe no it's a diff that shows some additions, basically an error type and a way to override the priority
Comment by Dieterbe k i'll use bus.Publish
Comment by torkelo if you want to pipe log messages to rabbitmq you could write a rabbitmq log writer
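A "rabbitmq log writer" could look something like the sketch below. It is an assumption-heavy illustration: `Publisher` is a hypothetical abstraction over a broker channel (no real AMQP client is used here), `memoryPublisher` is an in-memory stand-in, the routing key `logs.error` is made up, and Go's standard `log` package is used rather than whatever logger Grafana actually ships.

```go
package main

import (
	"fmt"
	"log"
	"strings"
)

// Publisher is a hypothetical abstraction over a message broker channel.
// A real implementation would wrap an AMQP client of your choice.
type Publisher interface {
	Publish(routingKey string, body []byte) error
}

// QueueWriter adapts a Publisher to io.Writer so it can back a logger.
// Each Write call is published as one message.
type QueueWriter struct {
	pub        Publisher
	routingKey string
}

// Write publishes p (minus any trailing newline) and reports len(p) on success.
func (w *QueueWriter) Write(p []byte) (int, error) {
	body := []byte(strings.TrimRight(string(p), "\n"))
	if err := w.pub.Publish(w.routingKey, body); err != nil {
		return 0, err
	}
	return len(p), nil
}

// memoryPublisher records messages in memory; it stands in for a real broker.
type memoryPublisher struct {
	messages []string
}

func (m *memoryPublisher) Publish(routingKey string, body []byte) error {
	m.messages = append(m.messages, routingKey+" "+string(body))
	return nil
}

func main() {
	pub := &memoryPublisher{}
	// "logs.error" is a hypothetical routing key chosen for this example.
	logger := log.New(&QueueWriter{pub: pub, routingKey: "logs.error"}, "", 0)
	logger.Print("alerting: could not reach datasource")
	for _, m := range pub.messages {
		fmt.Println(m)
	}
}
```

Because the standard logger issues one `Write` per log call, each log line becomes one message. A production writer would also need to decide what happens when the broker is unavailable, for example falling back to a local file.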
Comment by Dieterbe hm not sure if i'll get to that before end of next week, but i guess that could fairly easily be done by one of you guys if needed. so i'll focus more on the specifics of alerting itself for now.
Comment by nopzor1200 Can the messages generated by Litmus go into the same storage backend as the messages that are generated from the Collectors (so they can be viewed in the events panel from elasticsearch)?
Comment by Dieterbe i don't see why they couldn't, but one thing to keep in mind is decoupling the monitoring system from the system being monitored. if prod ES goes down for whatever reason then we'll want to look at events in a monitoring system which probably should not be the same ES instance.
Comment by nopzor1200 @woodsaj to make a call on this: potentially loggly, potentially ELK... but all in agreement that centralized logging for * is required
Comment by woodsaj Given that we now have ELK set up, i think the best approach is to just log warnings and errors. We just need to ensure that the log messages contain all relevant data.
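One way to read "log warnings and errors" with "all relevant data" is sketched below: drop anything below WARN and attach context as sorted key=value pairs, which are easy to parse downstream. The `Level` type, `formatLine`, and the field names `org_id` and `monitor_id` are hypothetical and chosen only for illustration; they are not taken from the codebase.

```go
package main

import (
	"fmt"
	"os"
	"sort"
	"strings"
)

// Level is a minimal, hypothetical log level type.
type Level int

const (
	Info Level = iota
	Warn
	Error
)

// String makes Level print as its name in formatted output.
func (l Level) String() string {
	switch l {
	case Warn:
		return "WARN"
	case Error:
		return "ERROR"
	default:
		return "INFO"
	}
}

// formatLine builds a key=value log line. It returns false for levels
// below Warn, so only warnings and errors are emitted.
func formatLine(level Level, msg string, fields map[string]string) (string, bool) {
	if level < Warn {
		return "", false
	}
	// Sort keys so the output is stable from call to call.
	keys := make([]string, 0, len(fields))
	for k := range fields {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	var b strings.Builder
	fmt.Fprintf(&b, "level=%s msg=%q", level, msg)
	for _, k := range keys {
		fmt.Fprintf(&b, " %s=%q", k, fields[k])
	}
	return b.String(), true
}

func main() {
	// Field names here are hypothetical examples of "relevant data".
	fields := map[string]string{"org_id": "1", "monitor_id": "42"}
	if line, ok := formatLine(Error, "check failed", fields); ok {
		fmt.Fprintln(os.Stderr, line)
	}
}
```

Keeping the context in explicit fields rather than interpolating it into the message means the log shipper does not have to parse free text to recover it.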
Comment by Dieterbe yep. do we need to do a big code review to make sure we have the right log calls in all places or are we pretty confident we're in good shape? I'll review the alerting pkg to be sure of that, at least.
Comment by ctdk We can do some processing and mutating of the logs as they come into logstash too. I'll have that set up soon.
Issue by Dieterbe
Tuesday May 12, 2015 at 22:17 GMT
Originally opened as raintank/grafana#91
do we have any convention on how we will keep tabs on critical/important error events in grafana?
maybe log to a file and then use heka or logstash to shove them into ES?