So recently been spending time configuring my selfhosted services with notifications usint ntfy. I’ve added ntfy to report status on containers and my system using Beszel. However, only 12 out of my 44 containers seem to have healthcheck “enabled” or built in as a feature. So im now wondering what is considered best practice for monitoring the uptime/health of my containers. I am already using uptimekuma, with the “docker container” option for each of my containers i deem necessary to monitor, i do not monitor all 44 of them 😅

So I’m left with these questions;

  1. How do you notify yourself about the status of a container?
  2. Is there a “quick” way to know if a container has healthcheck as a feature.
  3. Does healthcheck feature simply depend on the developer of each app, or the person building the container?
  4. Is it better to simply monitor the http(s) request to each service? (I believe this in my case would make Caddy a single point of failure for this kind of monitor).

Thanks for any input!

  • folekaule@lemmy.world
    link
    fedilink
    English
    arrow-up
    4
    ·
    20 hours ago
    1. Some kind of monitoring software, like the Grafana stack. I like email and Discord notifications.
    2. The Dockerfile will have a HEALTHCHECK statement, but in my experience this is pretty rare. Most of the time I set up a health check in the docker compose file or I extended the Dockerfile and add my own. You sometimes need to add a tool (like curl) to do the health check anyway.
    3. It’s a feature of the container, but the app needs to support some way of signaling “health”, such as through a web API.
    4. It depends on your needs. You can do all of the above. You can do so-called black box monitoring where you’re just monitoring whether your webapp is up or down. Easy. However, for a business you may want to know about problems before they happen, so you add white box monitoring for sub-components (database, services), timing, error counts, etc.

    To add to that: health checks in Docker containers are mostly for self-healing purposes. Think about a system where you have a web app running in many separate containers across some number of nodes. You want to know if one container has become too slow or non-responsive so you can restart it before the rest of the containers are overwhelmed, causing more serious downtime. So, a health check allows Docker to restart the container without manual intervention. You can configure it to give up if it restarts too many times, and then you would have other systems (like a load balancer) to direct traffic away from the failed subsystems.

    It’s useful to remember that containers are “cattle not pets”, so a restart or shutdown of a container is a “business as usual” event and things should continue to run in a distributed system.