Ingesting Prometheus metrics into Azure Monitor

If you're working with AKS, you may want to do without the Prometheus server and use Azure Monitor instead: the configuration your applications need is exactly the same, and you save yourself maintaining that server.

In this article I'll show you how to set it up.

Enabling Azure Monitor for your AKS cluster

When you enable Azure Monitor for a cluster, a set of pods called oms-agent is installed on it as a DaemonSet, that is, one per node. If you create an AKS cluster from the portal, container monitoring is enabled by default.
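
If you're not sure whether the add-on is already enabled, you can check it from the Azure CLI; the monitoring add-on appears under the omsagent profile (the profile name may differ on newer agent versions):

az aks show -n AKS_NAME -g RESOURCE_GROUP_NAME --query "addonProfiles.omsagent.enabled"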

If you don't have it configured, you can do it with this command:

az aks enable-addons -a monitoring -n AKS_NAME -g RESOURCE_GROUP_NAME
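
Once enabled, you can verify that there is one agent per node. Depending on the agent version the pods may be called omsagent or ama-logs, so adjust the filter accordingly:

kubectl get pods -n kube-system | grep omsagent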

ConfigMap for the Azure Monitor agents

The next step is to update the ConfigMap that the Azure Monitor agents use, which you can download from here. I've configured it as follows:

kind: ConfigMap
apiVersion: v1
data:
  schema-version:
    #string.used by agent to parse config. supported versions are {v1}. Configs with other schema versions will be rejected by the agent.
    v1
  config-version:
    #string.used by customer to keep track of this config file's version in their source control/repository (max allowed 10 chars, other chars will be truncated)
    ver1
  log-data-collection-settings: |-
    # Log data collection settings
    # Any errors related to config map settings can be found in the KubeMonAgentEvents table in the Log Analytics workspace that the cluster is sending data to.

    [log_collection_settings]
       [log_collection_settings.stdout]
          # In the absence of this configmap, default value for enabled is true
          enabled = true
          # exclude_namespaces setting holds good only if enabled is set to true
          # kube-system log collection is disabled by default in the absence of 'log_collection_settings.stdout' setting. If you want to enable kube-system, remove it from the following setting.
          # If you want to continue to disable kube-system log collection keep this namespace in the following setting and add any other namespace you want to disable log collection to the array.
          # In the absence of this configmap, default value for exclude_namespaces = ["kube-system"]
          exclude_namespaces = ["kube-system"]

       [log_collection_settings.stderr]
          # Default value for enabled is true
          enabled = true
          # exclude_namespaces setting holds good only if enabled is set to true
          # kube-system log collection is disabled by default in the absence of 'log_collection_settings.stderr' setting. If you want to enable kube-system, remove it from the following setting.
          # If you want to continue to disable kube-system log collection keep this namespace in the following setting and add any other namespace you want to disable log collection to the array.
          # In the absence of this configmap, default value for exclude_namespaces = ["kube-system"]
          exclude_namespaces = ["kube-system"]

       [log_collection_settings.env_var]
          # In the absence of this configmap, default value for enabled is true
          enabled = true
       [log_collection_settings.enrich_container_logs]
          # In the absence of this configmap, default value for enrich_container_logs is false
          enabled = false
          # When this is enabled (enabled = true), every container log entry (both stdout & stderr) will be enriched with container Name & container Image
       [log_collection_settings.collect_all_kube_events]
          # In the absence of this configmap, default value for collect_all_kube_events is false
          # When the setting is set to false, only the kube events with !normal event type will be collected
          enabled = false
          # When this is enabled (enabled = true), all kube events including normal events will be collected
  prometheus-data-collection-settings: |-
    # Custom Prometheus metrics data collection settings
    [prometheus_data_collection_settings.cluster]
        # Cluster level scrape endpoint(s). These metrics will be scraped from agent's Replicaset (singleton)
        # Any errors related to prometheus scraping can be found in the KubeMonAgentEvents table in the Log Analytics workspace that the cluster is sending data to.

        #Interval specifying how often to scrape for metrics. This is duration of time and can be specified for supporting settings by combining an integer value and time unit as a string value. Valid time units are ns, us (or µs), ms, s, m, h.
        interval = "15s"

        ## Uncomment the following settings with valid string arrays for prometheus scraping
        #fieldpass = ["metric_to_pass1", "metric_to_pass12"]

        #fielddrop = ["metric_to_drop"]

        # An array of urls to scrape metrics from.
        # urls = ["http://myurl:9101/metrics"]

        # An array of Kubernetes services to scrape metrics from.
        # kubernetes_services = ["http://my-service-dns.my-namespace:9102/metrics"]

        # When monitor_kubernetes_pods = true, replicaset will scrape Kubernetes pods for the following prometheus annotations:
        # - prometheus.io/scrape: Enable scraping for this pod
        # - prometheus.io/scheme: If the metrics endpoint is secured then you will need to
        #     set this to `https` & most likely set the tls config.
        # - prometheus.io/path: If the metrics path is not /metrics, define it with this annotation.
        # - prometheus.io/port: If port is not 9102 use this annotation
        monitor_kubernetes_pods = true

        ## Restricts Kubernetes monitoring to namespaces for pods that have annotations set and are scraped using the monitor_kubernetes_pods setting.
        ## This will take effect when monitor_kubernetes_pods is set to true
        ##   ex: monitor_kubernetes_pods_namespaces = ["default1", "default2", "default3"]
        # monitor_kubernetes_pods_namespaces = ["default1"]

    [prometheus_data_collection_settings.node]
        # Node level scrape endpoint(s). These metrics will be scraped from agent's DaemonSet running in every node in the cluster
        # Any errors related to prometheus scraping can be found in the KubeMonAgentEvents table in the Log Analytics workspace that the cluster is sending data to.

        #Interval specifying how often to scrape for metrics. This is duration of time and can be specified for supporting settings by combining an integer value and time unit as a string value. Valid time units are ns, us (or µs), ms, s, m, h.
        interval = "1m"

        ## Uncomment the following settings with valid string arrays for prometheus scraping

        # An array of urls to scrape metrics from. $NODE_IP (all upper case) will be substituted with the running node's IP address
        # urls = ["http://$NODE_IP:9103/metrics"]

        #fieldpass = ["metric_to_pass1", "metric_to_pass12"]

        #fielddrop = ["metric_to_drop"]
  alertable-metrics-configuration-settings: |-
    # Alertable metrics configuration settings for container resource utilization
    [alertable_metrics_configuration_settings.container_resource_utilization_thresholds]
        # The threshold(Type Float) will be rounded off to 2 decimal points
        # Threshold for container cpu, metric will be sent only when cpu utilization exceeds or becomes equal to the following percentage
        container_cpu_threshold_percentage = 95.0
        # Threshold for container memoryRss, metric will be sent only when memory rss exceeds or becomes equal to the following percentage
        container_memory_rss_threshold_percentage = 95.0
        # Threshold for container memoryWorkingSet, metric will be sent only when memory working set exceeds or becomes equal to the following percentage
        container_memory_working_set_threshold_percentage = 95.0
metadata:
  name: container-azm-ms-agentconfig
  namespace: kube-system

As you can see, the only thing you need in order to collect Prometheus metrics in Azure Monitor is to set monitor_kubernetes_pods to true. You can also change the scrape interval to something longer or shorter; for this demo I've lowered it to 15 seconds.

For the agents to take this ConfigMap into account, you need to apply it again with the following command:

kubectl apply -f azure-monitor-configmap.yml

The process will take a few seconds to be picked up by the agents.
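
If anything in the ConfigMap is wrong, the agents report it to the KubeMonAgentEvents table mentioned in the comments above. A quick way to look for configuration errors is a query along these lines (a sketch assuming that table's standard Level and Message columns):

KubeMonAgentEvents
| where Level != "Info"
| project TimeGenerated, Level, Message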

Prometheus annotations

Following the guidance in the ConfigMap itself, for the agents to be able to scrape your applications' metrics you need to keep the following in mind:

        # When monitor_kubernetes_pods = true, replicaset will scrape Kubernetes pods for the following prometheus annotations:
        # - prometheus.io/scrape: Enable scraping for this pod
        # - prometheus.io/scheme: If the metrics endpoint is secured then you will need to
        #     set this to `https` & most likely set the tls config.
        # - prometheus.io/path: If the metrics path is not /metrics, define it with this annotation.
        # - prometheus.io/port: If port is not 9102 use this annotation
        monitor_kubernetes_pods = true

If you were using the default values in your application, setting prometheus.io/scrape to true would be enough. For the example I used when I first talked to you about Prometheus, I only need to specify the port:

apiVersion: apps/v1
kind: Deployment
metadata:
    namespace: bookstore
    name: booksapi
    labels:
        app: bookstore
spec:
    replicas: 2
    selector:
        matchLabels:
            app: bookstore
    template:
        metadata:
            annotations:
                prometheus.io/scrape: "true"               
                prometheus.io/port: "80"
            labels:
                app: bookstore
        spec:
            containers:
            - name: books-api
              image: 0gis0/books-api
              ports:
               - containerPort: 80

In other words, the configuration is exactly the same as the one I would use for the Prometheus server.
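
If you save the manifest, you can deploy it and confirm that the pods carry the annotations. The file name here is just an example:

kubectl apply -f booksapi-deployment.yml
kubectl describe pods -n bookstore -l app=bookstore | grep prometheus.io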

Note: this example uses a MongoDB, explained in the article about StatefulSets, and an Ingress Controller to make access easier.

Querying Prometheus metrics in Azure Monitor

After a few moments you should already have metrics ingested in Azure Monitor. To check it, go to the Monitoring > Logs section of your AKS cluster and search InsightsMetrics for the name of any of the metrics generated by your application.

InsightsMetrics
| where Name == "http_requests_received_total"
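
From there you can work with the data like with any other Log Analytics table. For example, this sketch charts that counter over time, assuming the standard InsightsMetrics columns, where Val holds the sample value and Prometheus metrics arrive with Namespace set to prometheus:

InsightsMetrics
| where Namespace == "prometheus" and Name == "http_requests_received_total"
| summarize avg(Val) by bin(TimeGenerated, 5m)
| render timechart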

Cheers!