diff --git a/docs/metric/README.md b/docs/metric/README.md
deleted file mode 100644
index 1c066da0d..000000000
--- a/docs/metric/README.md
+++ /dev/null
@@ -1,36 +0,0 @@
-## Minio Prometheus Metric
-
-Minio server exposes an endpoint for Promethueus to scrape server data at `/minio/prometheus/metrics`.
-
-### Prometheus probe
-Prometheus is used to monitor Minio server information like http request, disk storage, network stats etc.. It uses a config file named `prometheus.yaml` to scrape data from server. The value for `metrics_path` and `targets` need to be configured in the config yaml to specify the endpoint and url as shown:
-```
-scrape_configs:
-  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
-  - job_name: minio
-    metrics_path: /minio/prometheus/metrics
-
-    # metrics_path defaults to '/metrics'
-    # scheme defaults to 'http'.
-
-    static_configs:
-    - targets: ['localhost:9000']
-```
- Prometheus can be run by executing :
-```
-./prometheus --config.file=prometheus.yml
-```
-
-### List of Minio metric exposed
-Minio exposes the following list of metric to Prometheus
-- `minio_disk_storage_bytes` : Total byte count of disk storage available to current Minio server instance
-- `minio_disk_storage_free_bytes` : Total byte count of free disk storage available to current Minio server instance
-- `minio_http_requests_duration_seconds_bucket` : The bucket into which observations are counted for creating Histogram
-- `minio_http_requests_duration_seconds_count` : The count of current number of observations i.e. total HTTP requests (HEAD/GET/PUT/POST/DELETE).
-- `minio_http_requests_duration_seconds_sum` : The current aggregate time spent servicing all HTTP requests (HEAD/GET/PUT/POST/DELETE) in seconds
-- `minio_http_requests_total` : Total number of requests served by current Minio server instance
-- `minio_network_received_bytes_total` : Total number of bytes received by current Minio server instance
-- `minio_network_sent_bytes_total` : Total number of bytes sent by current Minio server instance
-- `minio_offline_disks` : Total number of offline disks for current Minio server instance
-- `minio_total_disks` : Total number of disks for current Minio server instance
-- `minio_server_start_time_seconds` : Time Unix time in seconds when current Minio server instance started
diff --git a/docs/metrics/README.md b/docs/metrics/README.md
new file mode 100644
index 000000000..5663dc1d4
--- /dev/null
+++ b/docs/metrics/README.md
@@ -0,0 +1,20 @@
+## Minio Monitoring Guide
+
+Minio server exposes monitoring data over un-authenticated endpoints so monitoring tools can pick the data without you having to share Minio server credentials. This document lists the monitoring endpoints and relevant documentation.
+
+### Healthcheck Probe
+
+Minio server has two healthcheck related endpoints, a liveness probe to indicate if server is working fine and a readiness probe to indicate if server is not accepting connections due to heavy load.
+
+- Liveness probe available at `/minio/health/live`
+- Readiness probe available at `/minio/health/ready`
+
+Read more on how to use these endpoints in [Minio healthcheck guide](./healthcheck/README.md).
+
+### Prometheus Probe
+
+Minio server exposes Prometheus compatible data on a single endpoint.
+
+- Prometheus data available at `/minio/prometheus/metrics`
+
+To use this endpoint, setup Prometheus to scrape data from this endpoint. Read more on how to use Prometheues to monitor Minio server in [How to monitor Minio server with Prometheus](https://github.com/minio/cookbook/blob/master/docs/how-to-monitor-minio-with-prometheus.md).
diff --git a/docs/healthcheck/README.md b/docs/metrics/healthcheck/README.md
similarity index 94%
rename from docs/healthcheck/README.md
rename to docs/metrics/healthcheck/README.md
index 313e09573..4ecce1e50 100644
--- a/docs/healthcheck/README.md
+++ b/docs/metrics/healthcheck/README.md
@@ -4,11 +4,11 @@ Minio server exposes two un-authenticated, healthcheck endpoints - liveness prob
 
 ### Liveness probe
 
-This probe is used to identify situations where the server is running but may not behave optimally, i.e. sluggish response or corrupt backend. Such problems can be *only* fixed by a restart.
+This probe is used to identify situations where the server is running but may not behave optimally, i.e. sluggish response or corrupt back-end. Such problems can be *only* fixed by a restart.
 
 Internally, Minio liveness probe handler does a ListBuckets call. If successful, the server returns 200 OK, otherwise 503 Service Unavailable.
 
-When liveness probe fails, Kubernetes like platforms restart the container.
+When liveness probe fails, Kubernetes like platforms restart the container. 
 
 ### Readiness probe