Problem: we don't know in time when something goes wrong #62

Open
plexus opened this issue Sep 13, 2019 · 3 comments

Comments

@plexus
Member

plexus commented Sep 13, 2019

There are two things we should have alarms on:

  • the slack logging has stopped
  • the site is down
@oxalorg
Member

oxalorg commented Nov 16, 2020

I have two simple solutions for this which will get the job done without too much ops work.

To track whether the site is down, we can use https://uptimerobot.com/, which pings the site every 5 minutes and sends an email alert if the site is down.

To track whether the Slack import is working as expected, we can use something like https://healthchecks.io/ (or Librato if we're already using it).
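
A rough sketch of how the healthchecks.io idea could look, assuming the import runs as a periodic job that we can wrap. The job pings a check URL on success and the same URL with `/fail` appended on error, so a missing ping (or an explicit failure) triggers the alert. The UUID in `PING_URL` is a placeholder, and `send_ping` and `run_with_heartbeat` are hypothetical names, not anything that exists in this repo.

```python
import urllib.request
from typing import Callable

# Placeholder: the real check UUID would come from the healthchecks.io dashboard.
PING_URL = "https://hc-ping.com/your-check-uuid"


def send_ping(url: str) -> None:
    """Send one HTTP GET to the ping URL, ignoring network errors."""
    try:
        urllib.request.urlopen(url, timeout=10).close()
    except OSError:
        # If the monitoring service is unreachable, the import should still run.
        pass


def run_with_heartbeat(job: Callable[[], None],
                       ping: Callable[[str], None] = send_ping,
                       base_url: str = PING_URL) -> bool:
    """Run job; ping base_url on success, base_url + '/fail' on any exception.

    Returns True if the job finished without raising, False otherwise.
    """
    try:
        job()
    except Exception:
        ping(base_url + "/fail")
        return False
    ping(base_url)
    return True
```

The `ping` argument is injectable so the wrapper can be exercised without touching the network, e.g. `run_with_heartbeat(import_job, ping=print)` where `import_job` stands in for whatever function runs the import.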

@plexus
Member Author

plexus commented Nov 20, 2020

I actually use UptimeRobot for Lambda Island, so yes, it would be great if we could set that up here as well. The bigger issue is the Slack imports; they have broken a couple of times in the past. Having a check, for instance, that the latest message on the #clojure channel is less than a day old would be great. Or having email notifications when certain jobs error.
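
The "less than a day old" check could be sketched roughly like this, assuming we can already fetch the timestamp of the newest #clojure message from the database (that query is not shown here). `is_stale` and its parameter names are made up for illustration.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def is_stale(latest_message_at: Optional[datetime],
             now: datetime,
             max_age: timedelta = timedelta(days=1)) -> bool:
    """Return True if there is no message at all, or the newest one is older than max_age.

    Both datetimes should be timezone-aware (or both naive), otherwise the
    subtraction raises TypeError.
    """
    if latest_message_at is None:
        return True
    return now - latest_message_at > max_age


# Example: a message from 30 hours ago counts as stale with the default one-day limit.
now = datetime(2020, 11, 20, 12, 0, tzinfo=timezone.utc)
stale = is_stale(now - timedelta(hours=30), now)
```

A scheduled job could run this and send the alert (or skip the heartbeat ping) whenever it returns True.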

@oxalorg
Member

oxalorg commented Apr 12, 2021

As a temporary measure, I have set up UptimeRobot to check https://clojurians-log.clojureverse.org/ every 15 minutes and alert me if it goes down.
