Status page launched last week, and 50 users are trying it out so far. If you're one of them, please let me know if there is anything you'd like to see improved. Below I'll run through what I've been working on during the first week of the service.
You can see the broader changes that have been made since launch on the public roadmap, but here is a summary of the smaller points as well:
- Send alerts on outages - this is now in testing for paid accounts
- Exponential backoff on status checks if a service is down
- Added a status to services so that they can be suspended
- Moved the API behind a load balancer and added another instance
- Added flags for paid accounts, and allowed multiple services for those accounts
- Fixed permissions on adding a new service and uploading logos
Project Page API downtime
The uptime of the project page API has dropped to 99.95% this month, due to some over-zealous checking from the status checker service and maintenance for server updates. It's not significant downtime, but I'd prefer this service to be at 100%.
This was caused by some services that were permanently down or had certificate errors. Those failures triggered more frequent checks from the status checker, which made more calls to the API service than it could handle. To mitigate this, I've added exponential backoff on status checks and put another API server behind a load balancer.
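As a rough illustration of the backoff idea, here is a minimal sketch in Python. It is not the checker's actual code: the function name `next_check_delay`, the 60-second base interval, the doubling factor and the one-hour cap are all assumptions chosen for the example.

```python
def next_check_delay(consecutive_failures, base_seconds=60, factor=2, max_seconds=3600):
    """Return how many seconds to wait before checking a service again.

    All parameter values here are hypothetical defaults for illustration.
    A healthy service (0 failures) is checked at the base interval; each
    consecutive failure multiplies the wait by `factor`, up to `max_seconds`.
    """
    if consecutive_failures <= 0:
        return base_seconds
    delay = base_seconds * factor ** consecutive_failures
    return min(delay, max_seconds)


# A service that stays down is checked less and less often,
# instead of generating a growing number of calls to the API.
for failures in range(8):
    print(failures, next_check_delay(failures))
```

The cap matters for services that are permanently down: without it the delay would grow without bound, and a service that eventually recovers might not be noticed for a very long time.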
I'm currently testing editing of the alert text and adding comments, so that you can easily report incidents to users from the status page screen and, for example, give them the reasons for an outage.
Show only significant events
I'd also like to consider filtering events so that only significant ones show up (or perhaps are recorded). We currently run a second check, so I may expose the interval for that on the service.
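One possible reading of this is that an event only counts as significant if a second check, made after a configurable interval, also fails. The sketch below shows that idea in Python under that assumption. The names `confirmed_down`, `check` and `recheck_interval` are hypothetical and do not come from the service itself.

```python
import time
from typing import Callable


def confirmed_down(
    check: Callable[[], bool],
    recheck_interval: float,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Return True only if the service fails two checks in a row.

    `check` is a hypothetical callable returning True when the service is up.
    `recheck_interval` is the wait, in seconds, before the second check.
    `sleep` can be swapped out so the function is easy to test.
    """
    if check():
        return False  # first check passed, nothing to report
    sleep(recheck_interval)
    return not check()  # significant only if the second check also fails
```

With this shape, a brief blip that clears before the second check never becomes an event, and a longer `recheck_interval` filters out longer blips at the cost of reporting real outages later.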
The API will now alert you by email or SMS if your service has been down. This is currently in testing with some users and is working well. Please contact me if you'd like to try it out yourself (either email or SMS alerts).