New generation status page

pierreozoux · February 14, 2019, 7:45pm

We have discussed several times hosting a Cachet instance for each other.

I have a better idea. It is a mix of the following:

https://github.com/prometheus/blackbox_exporter
Grafana dashboard
Hugo (or any other static site generator)
Netlify CMS

Netlify CMS is for maintenance announcements (with RSS).

And Grafana is for the real-time status of services (with Prometheus and blackbox_exporter).

Now, let’s say we have Alice and Bob. Alice will host the status page for Bob.

Alice needs to discover the list of services to monitor.

The configuration looks like this.

Bob could set up an HTTP endpoint serving this file, and Alice has to scrape it regularly and restart her Prometheus when it changes.
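To make the "scrape it regularly and restart when it changes" part concrete, here is a minimal sketch of the change detection on Alice's side. The function names and the idea of comparing SHA-256 fingerprints are my assumptions, not something agreed in this thread; `fetch` is any callable that returns the raw bytes of Bob's file.

```python
import hashlib
from typing import Callable, Optional, Tuple


def fingerprint(body: bytes) -> str:
    """Return a stable fingerprint of the file Bob publishes."""
    return hashlib.sha256(body).hexdigest()


def check_for_change(
    fetch: Callable[[], bytes], last_fingerprint: Optional[str]
) -> Tuple[bool, str]:
    """Fetch Bob's file and report whether it differs from the last seen copy.

    `fetch` is a hypothetical callable, e.g. a small wrapper around
    urllib.request.urlopen(url).read(). Returns (changed, new_fingerprint).
    """
    current = fingerprint(fetch())
    return current != last_fingerprint, current
```

Alice would call `check_for_change` from a cron job or a loop, keep the returned fingerprint, and only rewrite her configuration when `changed` is true. As far as I know, Prometheus can also reload its configuration on SIGHUP, so a full restart may not be needed.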

This is just the basic idea, but if you are interested, we should do a hackathon (even a remote one).

pierreozoux · February 15, 2019, 5:54am

Alice and Bob met on this forum, or in the Matrix channel.

Actually, this could also be a service offered by the network.

Both ways are fine for me.

Regarding alerts, Alice can also configure Prometheus alerts so that if a service is down, it sends an email or a webhook to Bob's endpoint.
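On Bob's side, receiving such a webhook could be sketched like this. I am assuming the alerts are routed through Alertmanager and arrive in its webhook JSON format (a top-level `alerts` list where each alert carries a `status` and `labels`); the function name and the use of the `instance` label to identify the service are illustrative choices.

```python
import json
from typing import List


def firing_instances(payload: str) -> List[str]:
    """Return the sorted `instance` labels of all alerts that are still firing.

    `payload` is the JSON body of a webhook call, assumed to follow the
    Alertmanager webhook format. Resolved alerts are ignored.
    """
    data = json.loads(payload)
    down = []
    for alert in data.get("alerts", []):
        if alert.get("status") == "firing":
            # Fall back to "unknown" if the alert has no instance label.
            down.append(alert.get("labels", {}).get("instance", "unknown"))
    return sorted(down)
```

Bob could then feed that list into whatever he uses for notifications (email, Matrix, and so on).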

pierreozoux · February 15, 2019, 10:19am

This can be helpful too

how · February 20, 2019, 10:10am

Of course this is containerized, so we just need to know the Docker commands and the cost of hosting it.

Maybe we should make an inventory of “our” resources (I think a lot of people here use Hetzner, so it might be interesting to see how much “cloud” could be turned into a colocation for example.)

gandhiano · February 20, 2019, 11:35am

If we are to automate, we need to establish some “protocol” beyond the technology itself.

If we converge on Docker, we already have a common base for distributing the containers. But I see the challenge in a few other questions:

Which addresses should these containers probe?
Do they scrape some entries on the librehosters.json?
Do they scrape all addresses published there, or a few of them (meaning there is some orchestration in saying e.g. an address is to be probed by 3 other containers, think of replicas)?
Can all the information to be scraped be public, or do we need an internal chain of trust?
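For the replica question, one possible approach (my suggestion, not something decided here) is to avoid central orchestration and let every host compute the same assignment locally with rendezvous hashing. The sketch below assumes each prober has a unique name and that everybody sees the same list of addresses; `assign_probers` and its parameters are hypothetical.

```python
import hashlib
from typing import Dict, List


def _score(prober: str, address: str) -> int:
    """Deterministic pseudo-random score for a (prober, address) pair."""
    digest = hashlib.sha256(f"{prober}|{address}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def assign_probers(
    addresses: List[str], probers: List[str], replicas: int = 3
) -> Dict[str, List[str]]:
    """Pick `replicas` probers per address using rendezvous hashing.

    Every participant that runs this with the same inputs gets the same
    plan, so no coordinator is needed.
    """
    if replicas > len(probers):
        raise ValueError("not enough probers for the requested replica count")
    plan = {}
    for address in addresses:
        # Highest scores win; the ranking is independent of input order.
        ranked = sorted(probers, key=lambda p: _score(p, address), reverse=True)
        plan[address] = ranked[:replicas]
    return plan
```

A nice property of this scheme is that when one prober joins or leaves, only the addresses it was (or becomes) responsible for move, and the rest of the plan stays put.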

pierreozoux · March 5, 2019, 8:23pm

My answer here is always discovery/registration.

So Bob wants Alice to host his status page.

I think that Bob has to register his list of endpoints, probably as an array in a JSON file.
And Alice would have to watch that endpoint to re-render a Prometheus configuration and restart Prometheus.
Or, when Bob changes his endpoints, he could send a webhook to Alice (which is better for the environment).
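As a rough sketch of the re-render step, assuming Bob publishes a plain JSON array of URL strings (the exact schema is not settled in this thread), Alice could turn it into a target file for Prometheus file-based service discovery. That format is a JSON list of groups with `targets` and `labels`. The function name and the `owner` label are made up for illustration.

```python
import json


def render_file_sd(endpoints_json: str, owner: str) -> str:
    """Convert Bob's JSON array of endpoints into a file_sd target file.

    The input schema (a flat array of strings) is an assumption. The output
    is one target group, labelled with a hypothetical `owner` label so
    dashboards can filter by whose services these are.
    """
    endpoints = json.loads(endpoints_json)
    if not isinstance(endpoints, list) or not all(
        isinstance(e, str) for e in endpoints
    ):
        raise ValueError("expected a JSON array of endpoint strings")
    # Deduplicate and sort so the output only changes when the content does.
    targets = sorted(set(endpoints))
    groups = [{"targets": targets, "labels": {"owner": owner}}]
    return json.dumps(groups, indent=2, sort_keys=True)
```

If Alice's scrape job reads this file through `file_sd_configs`, Prometheus should pick up changes to it on its own, which may remove the need to restart at all. The blackbox relabeling in the scrape job stays the same as in the blackbox_exporter README.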

In a k8s context, we could also imagine that Alice hosts prom in a k8s cluster. The list of services to monitor is just a CRD, and Bob can modify this CRD himself.

how · June 4, 2019, 4:50pm

I tried installing Cachet with the Docker setup and got this (really):

Oops. I’m interested in a review of various solutions for status pages…