Storage

By default, the operator configures Pods to store data on emptyDir volumes, which aren't persisted when the Pods are redeployed. To maintain data across deployments and version upgrades, you can configure persistent storage for the Prometheus, Alertmanager, and ThanosRuler resources.

Kubernetes supports several kinds of storage volumes. The Prometheus Operator works with PersistentVolumeClaims, which allow the underlying PersistentVolume to be provisioned automatically when requested.

This document assumes a basic understanding of PersistentVolumes, PersistentVolumeClaims, and their provisioning.

Storage Provisioning on AWS

Automatic provisioning of storage requires a StorageClass.

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: ssd
provisioner: kubernetes.io/aws-ebs
parameters:
  type: gp2

Note: Make sure that the AWS cloud provider is properly configured with your cluster, or storage provisioning will not work.

For best results, use volumes that have high I/O throughput. These examples use SSD EBS volumes. Read the Kubernetes Persistent Volumes documentation to adapt this StorageClass to your needs.
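
You can apply the manifest and confirm that the StorageClass exists (a quick sanity check; storageclass.yaml is a hypothetical filename for the manifest above):

kubectl apply -f storageclass.yaml
kubectl get storageclass ssd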

The StorageClass that was created can be specified in the storage section of the Prometheus resource. (Note that if you're using kube-prometheus, then instead of making the following change to your Prometheus resource, see the prometheus-pvc.jsonnet example.)

apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: persisted
spec:
  storage:
    volumeClaimTemplate:
      spec:
        storageClassName: ssd
        resources:
          requests:
            storage: 40Gi

The full documentation of the storage field can be found in the API reference.

When the Prometheus object is created, a PersistentVolumeClaim is created for each Pod in the StatefulSet, and the storage should automatically be provisioned, mounted, and used.
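
Once the resource is applied, you can verify that the claims were provisioned and bound, using the operator.prometheus.io/name label that the operator sets on the claims (the same label is used in the resizing steps below):

kubectl get pvc -l operator.prometheus.io/name=persisted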

The same approach should work with other cloud providers (GCP, Azure, …) and any Kubernetes storage provider supporting dynamic provisioning.

Manual storage provisioning

The Prometheus CRD specification allows you to support arbitrary storage through a PersistentVolumeClaim.

The easiest way to use a volume that cannot be automatically provisioned (for whatever reason) is to use a label selector alongside a manually created PersistentVolume.

For example, using an NFS volume might be accomplished with the following manifests:

apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: my-example-prometheus-name
  labels:
    prometheus: example
spec:
  replicas: 1
  storage:
    volumeClaimTemplate:
      spec:
        selector:
          matchLabels:
            app.kubernetes.io/name: my-example-prometheus
        resources:
          requests:
            storage: 50Gi
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: my-pv-name
  labels:
    app.kubernetes.io/name: my-example-prometheus
spec:
  capacity:
    storage: 50Gi
  accessModes:
  - ReadWriteOnce # required
  nfs:
    server: myServer
    path: "/path/to/prom/db"

Disabling Default StorageClasses

To manually provision volumes (as of Kubernetes 1.6.0), you may need to disable the default StorageClass that is automatically created for certain cloud providers. Default StorageClasses are pre-installed on Azure, AWS, GCE, OpenStack, and vSphere.

The default StorageClass behavior will override manual storage provisioning, preventing PersistentVolumeClaims from automatically binding to manually created PersistentVolumes.

To override this behavior, you must explicitly create the same resource, but set it to not be the default (see the changelog for more information).

For example, to disable default StorageClasses on a Google Kubernetes Engine cluster, create the following StorageClass:

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: standard
  annotations:
    # disable this default storage class by setting this annotation to false.
    storageclass.kubernetes.io/is-default-class: "false"
provisioner: kubernetes.io/gce-pd
parameters:
  type: pd-ssd
  zone: us-east1-d
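
You can verify which StorageClass (if any) is currently marked as default; kubectl flags it with "(default)" next to its name:

kubectl get storageclass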

Resizing volumes

Even if the StorageClass supports resizing, Kubernetes doesn't yet support volume expansion through StatefulSets. This means that when you update the storage requests in the spec.storage field of a custom resource such as Prometheus, the operator has to delete and recreate the underlying StatefulSet, and the associated PVCs aren't expanded (more details in the KEP issue).

It is still possible to fix the situation manually.

First, check that the StorageClass allows volume expansion:

$ kubectl get storageclass -o custom-columns=NAME:.metadata.name,ALLOWVOLUMEEXPANSION:.allowVolumeExpansion
NAME      ALLOWVOLUMEEXPANSION
gp2-csi   true
gp3-csi   true

Next, update the spec.paused field to true (to prevent the operator from recreating the StatefulSet) and update the storage request in the spec.storage field of the custom resource. Assuming a Prometheus resource named example for which you want to increase the storage size to 10Gi:

kubectl patch prometheus/example --patch '{"spec": {"paused": true, "storage": {"volumeClaimTemplate": {"spec": {"resources": {"requests": {"storage":"10Gi"}}}}}}}' --type merge

Next, patch every PVC with the updated storage request (10Gi in this example):

for p in $(kubectl get pvc -l operator.prometheus.io/name=example -o jsonpath='{range .items[*]}{.metadata.name} {end}'); do \
  kubectl patch pvc/${p} --patch '{"spec": {"resources": {"requests": {"storage":"10Gi"}}}}'; \
done
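
Depending on the storage driver, the expansion may not complete until the Pods are restarted. You can follow the progress through the PVC conditions (such as Resizing or FileSystemResizePending), for example:

kubectl get pvc -l operator.prometheus.io/name=example -o jsonpath='{range .items[*]}{.metadata.name}{": "}{.status.conditions}{"\n"}{end}'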

Next, delete the underlying StatefulSet using the orphan deletion strategy:

kubectl delete statefulset -l operator.prometheus.io/name=example --cascade=orphan

Last, change the spec.paused field of the custom resource back to false:

kubectl patch prometheus/example --patch '{"spec": {"paused": false}}' --type merge

The operator should recreate the StatefulSet immediately. Thanks to the orphan deletion strategy there will be no service disruption, and the volumes mounted in the Pods should have the updated size.
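
To confirm the result, you can check the capacity reported in the status of the PVCs:

kubectl get pvc -l operator.prometheus.io/name=example -o custom-columns=NAME:.metadata.name,CAPACITY:.status.capacity.storage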