Status subresource for Prometheus operator Workload CRDs
- Owners:
- Status: Implemented
- Related Tickets:
- Other docs: N/A
This proposal describes how we will extend the Prometheus operator workload Custom Resource Definitions (CRDs) with a Status subresource field.
Why
Core Kubernetes resources differentiate between the desired state of an object
(the spec field) and the current status of the object (the status field); see
the Kubernetes API conventions for details.
Before this proposal, the Prometheus operator never reflected the current
status of the objects, which made it harder for external actors to know whether
the underlying resource was available.
Pitfalls of the current solution
Though some of the custom resource definitions already expose a Status
subresource, the Prometheus operator never updates it, and the information can
only be retrieved via the custom API exposed by the operator’s web service.
This makes it harder than necessary for users to know whether the declared
resources are ready and, if they aren’t, to understand why.
Goals
- Define the structure of the status subresource for the custom resource
  definitions that materialize as Pod objects:
  - Alertmanager
  - Prometheus
  - PrometheusAgent
  - ThanosRuler
- Define how the operator would reconcile the status subresource.
Non-goals
- Implement the status subresource for configuration objects like
  ServiceMonitor, PodMonitor, PrometheusRule, Probe and ScrapeConfig.
  - The main difficulty is that a ServiceMonitor object, for instance, can be
    reconciled by different objects. It brings more complexity in terms of API
    definition as well as implementation.
  - This will be addressed in a separate proposal.
- Extend the status subresource beyond what the operator can infer from the core Kubernetes API.
- Emit events on resource updates.
  - Once the operator implements status subresources, generating events on
    status changes seems a natural evolution, but this is out of scope for
    this proposal.
Background
The status subresource is a well-defined concept in Kubernetes:
- Kubernetes documentation about custom resource definitions.
- Kubernetes API conventions.
- kubebuilder documentation.
- OperatorSDK documentation.
As much as possible, the proposal follows the recommendations established by the Kubernetes community.
API
Prometheus
The Prometheus CRD has a Status subresource that exposes the following fields:
- Paused
- Replicas
- UpdatedReplicas
- AvailableReplicas
- UnavailableReplicas
We propose to add the following new fields:
- Conditions, as recommended by the document describing the Kubernetes API conventions.
- ShardStatuses, which is a drilled-down status for each Prometheus shard.
type PrometheusStatus struct {
	// Represents whether any actions on the underlying managed objects are
	// being performed. Only delete actions will be performed.
	Paused bool `json:"paused"`
	// Total number of non-terminated pods targeted by this Prometheus deployment
	// (their labels match the selector).
	Replicas int32 `json:"replicas"`
	// Total number of non-terminated pods targeted by this Prometheus deployment
	// that have the desired version spec.
	UpdatedReplicas int32 `json:"updatedReplicas"`
	// Total number of available pods (ready for at least minReadySeconds)
	// targeted by this Prometheus deployment.
	AvailableReplicas int32 `json:"availableReplicas"`
	// Total number of unavailable pods targeted by this Prometheus deployment.
	UnavailableReplicas int32 `json:"unavailableReplicas"`
	// The current state of the Prometheus deployment.
	// +listType=map
	// +listMapKey=type
	// +optional
	Conditions []Condition `json:"conditions,omitempty"`
	// The list has one entry per shard. Each entry provides a summary of the shard status.
	// +listType=map
	// +listMapKey=shardID
	// +optional
	ShardStatuses []ShardStatus `json:"shardStatuses,omitempty"`
	// Shards is the most recently observed number of shards.
	Shards int32 `json:"shards,omitempty"`
	// The selector used to match the pods targeted by this Prometheus resource.
	Selector string `json:"selector,omitempty"`
}
// Condition represents the state of the resources associated with the
// Prometheus, Alertmanager or ThanosRuler resource.
// +k8s:deepcopy-gen=true
type Condition struct {
	// Type of the condition being reported.
	// +required
	Type ConditionType `json:"type"`
	// Status of the condition.
	// +required
	Status ConditionStatus `json:"status"`
	// lastTransitionTime is the time of the last update to the current status property.
	// +required
	LastTransitionTime metav1.Time `json:"lastTransitionTime"`
	// Reason for the condition's last transition.
	// +optional
	Reason string `json:"reason,omitempty"`
	// Human-readable message indicating details for the condition's last transition.
	// +optional
	Message string `json:"message,omitempty"`
	// ObservedGeneration represents the .metadata.generation that the
	// condition was set based upon. For instance, if `.metadata.generation` is
	// currently 12, but the `.status.conditions[].observedGeneration` is 9, the
	// condition is out of date with respect to the current state of the
	// instance.
	ObservedGeneration int64 `json:"observedGeneration,omitempty"`
}
type ConditionType string

const (
	// Available indicates whether enough pods are ready to provide the
	// service.
	// The possible status values for this condition type are:
	// - True: all pods are running and ready, the service is fully available.
	// - Degraded: some pods aren't ready, the service is partially available.
	// - False: no pods are running, the service is totally unavailable.
	// - Unknown: the operator couldn't determine the condition status.
	Available ConditionType = "Available"
	// Reconciled indicates whether the operator has reconciled the state of
	// the underlying resources with the object's spec.
	// The possible status values for this condition type are:
	// - True: the reconciliation was successful.
	// - False: the reconciliation failed.
	// - Unknown: the operator couldn't determine the condition status.
	Reconciled ConditionType = "Reconciled"
)

type ConditionStatus string

const (
	ConditionTrue     ConditionStatus = "True"
	ConditionDegraded ConditionStatus = "Degraded"
	ConditionFalse    ConditionStatus = "False"
	ConditionUnknown  ConditionStatus = "Unknown"
)
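As an illustration of how an external actor might consume these conditions, the sketch below checks that both condition types report True for the generation the caller expects. The types are trimmed local copies of the definitions above; findCondition and isFullyAvailable are hypothetical helpers, not part of the operator's API.

```go
package main

import "fmt"

// Trimmed local copies of the types defined in the proposal.
type ConditionType string
type ConditionStatus string

const (
	Available  ConditionType = "Available"
	Reconciled ConditionType = "Reconciled"

	ConditionTrue     ConditionStatus = "True"
	ConditionDegraded ConditionStatus = "Degraded"
)

type Condition struct {
	Type               ConditionType
	Status             ConditionStatus
	ObservedGeneration int64
}

// findCondition (hypothetical helper) returns the condition of the given
// type, or nil when the list has no such entry.
func findCondition(conds []Condition, t ConditionType) *Condition {
	for i := range conds {
		if conds[i].Type == t {
			return &conds[i]
		}
	}
	return nil
}

// isFullyAvailable (hypothetical helper) reports whether both Available and
// Reconciled are True and were computed for the given metadata.generation.
// A condition with an older observedGeneration is treated as stale.
func isFullyAvailable(conds []Condition, generation int64) bool {
	for _, t := range []ConditionType{Available, Reconciled} {
		c := findCondition(conds, t)
		if c == nil || c.Status != ConditionTrue || c.ObservedGeneration != generation {
			return false
		}
	}
	return true
}

func main() {
	conds := []Condition{
		{Type: Available, Status: ConditionTrue, ObservedGeneration: 4},
		{Type: Reconciled, Status: ConditionTrue, ObservedGeneration: 4},
	}
	fmt.Println(isFullyAvailable(conds, 4)) // true: conditions match generation 4
	fmt.Println(isFullyAvailable(conds, 5)) // false: conditions are stale
}
```

Comparing observedGeneration against metadata.generation guards against reading a status that predates the latest spec change.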
Example of a Prometheus resource’s status for which all pods are up and running:
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: prometheus
  namespace: default
spec:
  replicas: 2
  shards: 2
status:
  availableReplicas: 4
  conditions:
  - lastTransitionTime: "2024-09-10T14:24:26Z"
    message: ""
    observedGeneration: 4
    reason: ""
    status: "True"
    type: Available
  - lastTransitionTime: "2024-09-10T14:24:26Z"
    message: ""
    observedGeneration: 4
    reason: ""
    status: "True"
    type: Reconciled
  paused: false
  replicas: 4
  selector: app.kubernetes.io/instance=prometheus,app.kubernetes.io/managed-by=prometheus-operator,app.kubernetes.io/name=prometheus,operator.prometheus.io/name=prometheus,prometheus=prometheus
  shardStatuses:
  - availableReplicas: 2
    replicas: 2
    shardID: "0"
    unavailableReplicas: 0
    updatedReplicas: 2
  - availableReplicas: 2
    replicas: 2
    shardID: "1"
    unavailableReplicas: 0
    updatedReplicas: 2
  shards: 2
  unavailableReplicas: 0
  updatedReplicas: 4
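In the example above, each top-level counter equals the sum of the matching per-shard counter. The sketch below decodes a JSON rendering of that status and checks this relationship. It assumes the relationship is intended to hold in general, and the shardStatus field set is inferred from the example rather than taken from a type definition in this proposal; parseStatus and countsMatchShards are hypothetical helpers.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// shardStatus mirrors one shardStatuses entry of the example.
// Assumption: field names and types are inferred from the example output.
type shardStatus struct {
	ShardID             string `json:"shardID"`
	Replicas            int32  `json:"replicas"`
	UpdatedReplicas     int32  `json:"updatedReplicas"`
	AvailableReplicas   int32  `json:"availableReplicas"`
	UnavailableReplicas int32  `json:"unavailableReplicas"`
}

// prometheusStatus is a trimmed copy of PrometheusStatus (counters only).
type prometheusStatus struct {
	Replicas            int32         `json:"replicas"`
	UpdatedReplicas     int32         `json:"updatedReplicas"`
	AvailableReplicas   int32         `json:"availableReplicas"`
	UnavailableReplicas int32         `json:"unavailableReplicas"`
	ShardStatuses       []shardStatus `json:"shardStatuses"`
}

// parseStatus decodes the JSON form of a status object.
func parseStatus(raw string) (prometheusStatus, error) {
	var s prometheusStatus
	err := json.Unmarshal([]byte(raw), &s)
	return s, err
}

// countsMatchShards reports whether every top-level counter equals the sum
// of the corresponding per-shard counters.
func countsMatchShards(s prometheusStatus) bool {
	var sum shardStatus
	for _, sh := range s.ShardStatuses {
		sum.Replicas += sh.Replicas
		sum.UpdatedReplicas += sh.UpdatedReplicas
		sum.AvailableReplicas += sh.AvailableReplicas
		sum.UnavailableReplicas += sh.UnavailableReplicas
	}
	return s.Replicas == sum.Replicas &&
		s.UpdatedReplicas == sum.UpdatedReplicas &&
		s.AvailableReplicas == sum.AvailableReplicas &&
		s.UnavailableReplicas == sum.UnavailableReplicas
}

// exampleStatus is the status of the example above, trimmed to its counters.
const exampleStatus = `{
  "replicas": 4, "updatedReplicas": 4, "availableReplicas": 4, "unavailableReplicas": 0,
  "shardStatuses": [
    {"shardID": "0", "replicas": 2, "updatedReplicas": 2, "availableReplicas": 2, "unavailableReplicas": 0},
    {"shardID": "1", "replicas": 2, "updatedReplicas": 2, "availableReplicas": 2, "unavailableReplicas": 0}
  ]
}`

func main() {
	s, err := parseStatus(exampleStatus)
	if err != nil {
		panic(err)
	}
	fmt.Println(countsMatchShards(s)) // true
}
```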
Example of a Prometheus resource’s status for which some pods are missing due to scheduling issues:
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: monitoring
spec:
  replicas: 2
  shards: 2
status:
  availableReplicas: 2
  conditions:
  - lastTransitionTime: "2024-09-10T14:31:29Z"
    message: |-
      shard 0: pod prometheus-prometheus-1: 0/1 nodes are available: 1 node(s) didn't match pod anti-affinity rules. preemption: 0/1 nodes are available: 1 No preemption victims found for incoming pod.
      shard 1: pod prometheus-prometheus-shard-1-1: 0/1 nodes are available: 1 node(s) didn't match pod anti-affinity rules. preemption: 0/1 nodes are available: 1 No preemption victims found for incoming pod.
    observedGeneration: 5
    reason: SomePodsNotReady
    status: Degraded
    type: Available
  - lastTransitionTime: "2024-09-10T14:31:29Z"
    message: ""
    observedGeneration: 5
    reason: ""
    status: "True"
    type: Reconciled
  paused: false
  replicas: 4
  selector: app.kubernetes.io/instance=prometheus,app.kubernetes.io/managed-by=prometheus-operator,app.kubernetes.io/name=prometheus,operator.prometheus.io/name=prometheus,prometheus=prometheus
  shardStatuses:
  - availableReplicas: 1
    replicas: 2
    shardID: "0"
    unavailableReplicas: 1
    updatedReplicas: 1
  - availableReplicas: 1
    replicas: 2
    shardID: "1"
    unavailableReplicas: 1
    updatedReplicas: 1
  shards: 2
  unavailableReplicas: 2
  updatedReplicas: 2
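The two examples suggest how the status of the Available condition could follow from the replica counters. The sketch below is one possible mapping based on the comments of the Available condition type. availableStatus is a hypothetical function, and taking the desired pod count as spec.replicas multiplied by spec.shards is an assumption drawn from the examples, not the operator's confirmed logic.

```go
package main

import "fmt"

type ConditionStatus string

const (
	ConditionTrue     ConditionStatus = "True"
	ConditionDegraded ConditionStatus = "Degraded"
	ConditionFalse    ConditionStatus = "False"
)

// availableStatus (hypothetical) maps the number of available pods and the
// number of desired pods to a status for the Available condition:
//   - True when every desired pod is available,
//   - False when no pod is available,
//   - Degraded otherwise.
//
// Assumption: a resource that desires zero pods is reported as True.
func availableStatus(available, desired int32) ConditionStatus {
	switch {
	case available >= desired:
		return ConditionTrue
	case available == 0:
		return ConditionFalse
	default:
		return ConditionDegraded
	}
}

func main() {
	const replicas, shards = 2, 2 // spec of both examples

	fmt.Println(availableStatus(4, replicas*shards)) // True (first example)
	fmt.Println(availableStatus(2, replicas*shards)) // Degraded (second example)
	fmt.Println(availableStatus(0, replicas*shards)) // False
}
```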
Alertmanager
The status subresource for the Alertmanager custom resource definition should
be very similar to the API defined for the Prometheus CRD. The main difference
is that we don’t need the shardStatuses field.
type AlertmanagerStatus struct {
	// Represents whether any actions on the underlying managed objects are
	// being performed. Only delete actions will be performed.
	Paused bool `json:"paused"`
	// Total number of non-terminated pods targeted by this Alertmanager
	// object (their labels match the selector).
	Replicas int32 `json:"replicas"`
	// Total number of non-terminated pods targeted by this Alertmanager
	// object that have the desired version spec.
	UpdatedReplicas int32 `json:"updatedReplicas"`
	// Total number of available pods (ready for at least minReadySeconds)
	// targeted by this Alertmanager cluster.
	AvailableReplicas int32 `json:"availableReplicas"`
	// Total number of unavailable pods targeted by this Alertmanager object.
	UnavailableReplicas int32 `json:"unavailableReplicas"`
	// The selector used to match the pods targeted by this Alertmanager object.
	Selector string `json:"selector,omitempty"`
	// The current state of the Alertmanager object.
	// +listType=map
	// +listMapKey=type
	// +optional
	Conditions []Condition `json:"conditions,omitempty"`
}
Thanos Ruler
The status subresource for the Thanos Ruler custom resource definition follows the one defined for the Alertmanager CRD; the only difference in the definition below is that it has no Selector field.
type ThanosRulerStatus struct {
	// Represents whether any actions on the underlying managed objects are
	// being performed. Only delete actions will be performed.
	Paused bool `json:"paused"`
	// Total number of non-terminated pods targeted by this ThanosRuler deployment
	// (their labels match the selector).
	Replicas int32 `json:"replicas"`
	// Total number of non-terminated pods targeted by this ThanosRuler deployment
	// that have the desired version spec.
	UpdatedReplicas int32 `json:"updatedReplicas"`
	// Total number of available pods (ready for at least minReadySeconds)
	// targeted by this ThanosRuler deployment.
	AvailableReplicas int32 `json:"availableReplicas"`
	// Total number of unavailable pods targeted by this ThanosRuler deployment.
	UnavailableReplicas int32 `json:"unavailableReplicas"`
	// The current state of the ThanosRuler object.
	// +listType=map
	// +listMapKey=type
	// +optional
	Conditions []Condition `json:"conditions,omitempty"`
}
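All three status types share the same Conditions list, keyed by type (listMapKey=type), so a single helper could maintain it. The sketch below shows one way to insert or update a condition while changing lastTransitionTime only when the status value changes, which is how the field comment describes it. setCondition is a hypothetical helper, the Condition type is a simplified local copy that uses time.Time as a stand-in for metav1.Time, and none of this is confirmed to match the operator's implementation.

```go
package main

import (
	"fmt"
	"time"
)

// Condition is a simplified local copy of the proposal's Condition type.
// Assumption: time.Time stands in for metav1.Time to keep the sketch
// free of external dependencies.
type Condition struct {
	Type               string
	Status             string
	LastTransitionTime time.Time
	Reason             string
	Message            string
	ObservedGeneration int64
}

// setCondition (hypothetical helper) stores c in conds, keeping at most one
// entry per Type. LastTransitionTime is set to now for a new entry or when
// Status changes; otherwise the previous transition time is preserved.
func setCondition(conds []Condition, c Condition, now time.Time) []Condition {
	c.LastTransitionTime = now
	for i := range conds {
		if conds[i].Type != c.Type {
			continue
		}
		if conds[i].Status == c.Status {
			c.LastTransitionTime = conds[i].LastTransitionTime
		}
		conds[i] = c
		return conds
	}
	return append(conds, c)
}

// Timestamps borrowed from the Prometheus examples, for illustration only.
var (
	t0 = time.Date(2024, 9, 10, 14, 24, 26, 0, time.UTC)
	t1 = time.Date(2024, 9, 10, 14, 31, 29, 0, time.UTC)
)

func main() {
	conds := setCondition(nil, Condition{Type: "Available", Status: "True", ObservedGeneration: 4}, t0)

	// Same status, newer generation: the transition time is preserved.
	conds = setCondition(conds, Condition{Type: "Available", Status: "True", ObservedGeneration: 5}, t1)
	fmt.Println(conds[0].LastTransitionTime.Equal(t0), conds[0].ObservedGeneration) // true 5

	// Status change: the transition time moves to the time of the update.
	conds = setCondition(conds, Condition{Type: "Available", Status: "Degraded", Reason: "SomePodsNotReady", ObservedGeneration: 5}, t1)
	fmt.Println(conds[0].LastTransitionTime.Equal(t1), len(conds)) // true 1
}
```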
Alternatives
N/A
Action Plan
N/A