Opsgrid

September 20 2020

I’ve built up a number of side projects over the years. The ones that require a backend are hosted on individual vms from BuyVM, which means I need to monitor about a half-dozen servers.

While there’s plenty of free Pingdom-style uptime monitoring options out there, cheap options to monitor metrics like cpu, ram, and disk are more rare. The big names usually have a free tier allowing one or two hosts with minimal data retention, then a big jump to a $10+/month per host paid tier. I can’t justify spending that when I only pay $2/month for the vms themselves! There’s also some smaller independent options out there, but they tend to require custom server agents and aren’t all that much cheaper. So, I decided to build my own.

I didn’t need anything fancy, just support for the usual metrics, user-defined alerting, and, ideally, reasonable retention of past data so I could identify trends. Like my other projects, it also needed to be operationally simple.

Opsgrid is the result. It allows free server/application alerting with about 1 year of data retention per host. The trick is that Opsgrid doesn’t actually store historical metrics: instead, users connect a Google account, and it pushes data as rows to a Google Sheet per host. It also makes use of Telegraf, an open-source monitoring agent, to avoid dealing with metric collection code.

Check it out if you also run your own servers! I plan to keep it free unless it attracts a significant number of users, but even then I should be able to price it about 5-10x lower than offerings like New Relic or Datadog.


Subscribe to future posts via email or rss, or view all posts.