
NodeHealthCheck

remediation.medik8s.io / v1alpha1

apiVersion: remediation.medik8s.io/v1alpha1
kind: NodeHealthCheck
metadata:
  name: example
apiVersion string
APIVersion defines the versioned schema of this representation of an object. Servers should convert recognized schemas to the latest internal value, and may reject unrecognized values. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources
kind string
Kind is a string value representing the REST resource this object represents. Servers may infer this from the endpoint the client submits requests to. Cannot be updated. In CamelCase. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds
metadata object
spec object
NodeHealthCheckSpec defines the desired state of NodeHealthCheck
escalatingRemediations []object
EscalatingRemediations contains an ordered list of remediation templates, each with a timeout. The templates are used one after another until the unhealthy node becomes healthy within the timeout of the remediation currently being processed. The order of remediation is defined by the "order" field of each "escalatingRemediation". Mutually exclusive with RemediationTemplate.
order integer required
Order defines the order for this remediation. Remediations with lower order will be used before remediations with higher order. Remediations must not have the same order.
remediationTemplate object required
RemediationTemplate is a reference to a remediation template provided by a remediation provider. If a node needs remediation the controller will create an object from this template and then it should be picked up by a remediation provider.
apiVersion string
API version of the referent.
fieldPath string
If referring to a piece of an object instead of an entire object, this string should contain a valid JSON/Go field access statement, such as desiredState.manifest.containers[2]. For example, if the object reference is to a container within a pod, this would take on a value like: "spec.containers{name}" (where "name" refers to the name of the container that triggered the event) or if no container name is specified "spec.containers[2]" (container with index 2 in this pod). This syntax is chosen only to have some well-defined way of referencing a part of an object.
kind string
Kind of the referent. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds
name string
Name of the referent. More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names
namespace string
Namespace of the referent. More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/namespaces/
resourceVersion string
Specific resourceVersion to which this reference is made, if any. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#concurrency-control-and-consistency
uid string
UID of the referent. More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#uids
timeout string required
Timeout defines how long NHC will wait for the node to become healthy before the next remediation (if any) is used. When the last remediation times out, the overall remediation is considered failed. As a safeguard against parallel remediations, a minimum of 60s is enforced. Expects a string of decimal numbers each with optional fraction and a unit suffix, e.g. "300ms", "1.5h" or "2h45m". Valid time units are "ns", "us" (or "µs"), "ms", "s", "m", "h".
pattern: ^([0-9]+(\.[0-9]+)?(ns|us|µs|ms|s|m|h))+$
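A minimal sketch of an escalation chain built from the fields above, assuming two remediation providers are installed in the cluster. The template kinds, API groups, names and namespace are illustrative assumptions and are not defined by this reference.

```yaml
spec:
  escalatingRemediations:
    # Lowest order value, so this template is tried first.
    - order: 1
      timeout: 5m            # must be at least 60s
      remediationTemplate:
        apiVersion: self-node-remediation.medik8s.io/v1alpha1   # assumed provider
        kind: SelfNodeRemediationTemplate                       # assumed kind
        name: example-snr-template                              # hypothetical name
        namespace: example-namespace                            # hypothetical namespace
    # Used only if the node is still unhealthy after the first timeout.
    - order: 2
      timeout: 10m
      remediationTemplate:
        apiVersion: fence-agents-remediation.medik8s.io/v1alpha1   # assumed provider
        kind: FenceAgentsRemediationTemplate                       # assumed kind
        name: example-far-template                                 # hypothetical name
        namespace: example-namespace                               # hypothetical namespace
```

Each entry needs a distinct order value, and the spec sets no top-level remediationTemplate because the two fields are mutually exclusive.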
healthyDelay string
HealthyDelay is the time NHC waits before it allows a node to be considered healthy again. A negative value means that NHC will never consider the node healthy, and manual intervention is expected.
pattern: ^-?([0-9]+(\.[0-9]+)?(ns|us|µs|ms|s|m|h))+$
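As an illustration of the two modes described above, the fragment below sets a positive delay and shows the negative form as a commented-out alternative. The concrete values are arbitrary examples.

```yaml
spec:
  # Keep treating the node as unhealthy for a further 10 minutes.
  healthyDelay: 10m
  # Alternative: a negative value disables automatic recovery entirely,
  # so the node stays unhealthy until someone intervenes manually.
  # healthyDelay: "-1s"
```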
maxUnhealthy string | integer
Remediation is allowed if no more than "MaxUnhealthy" nodes selected by "selector" are not healthy. Expects either a non-negative integer value or a percentage value. Percentage values must be non-negative whole numbers and are capped at 100%. 0% is valid and will block all remediation. MaxUnhealthy should not be used with remediators that delete nodes (e.g. MachineDeletionRemediation), as this breaks the logic for counting healthy and unhealthy nodes. MinHealthy and MaxUnhealthy configure the same aspect and cannot be used at the same time.
string pattern: ^((100|[0-9]{1,2})%|[0-9]+)$
minHealthy string | integer
Remediation is allowed if at least "MinHealthy" nodes selected by "selector" are healthy. Expects either a non-negative integer value or a percentage value. Percentage values must be non-negative whole numbers and are capped at 100%. 100% is valid and will block all remediation. MinHealthy and MaxUnhealthy configure the same aspect and cannot be used at the same time.
string pattern: ^((100|[0-9]{1,2})%|[0-9]+)$
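A short sketch of the two threshold styles accepted by minHealthy and maxUnhealthy. The numbers are arbitrary examples; percentages are quoted so that YAML keeps them as strings.

```yaml
spec:
  # Remediate only while at least 51% of the selected nodes are healthy.
  minHealthy: "51%"
  # Alternative, expressed as an absolute count of unhealthy nodes.
  # Set one field or the other, never both.
  # maxUnhealthy: 2
```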
pauseRequests []string
PauseRequests prevents any new remediation from starting, while in-flight remediations keep running. Each entry is free form and ideally names the requesting party or the reason for the pause, e.g. "imaginary-cluster-upgrade-manager-operator".
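For illustration, a paused check might look like the fragment below. The first entry reuses the example string from the description; the second is a hypothetical free-form reason.

```yaml
spec:
  pauseRequests:
    - imaginary-cluster-upgrade-manager-operator
    - "manual maintenance window"   # hypothetical entry
```

Removing all entries from the list lifts the pause.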
remediationTemplate object
RemediationTemplate is a reference to a remediation template provided by an infrastructure provider. If a node needs remediation the controller will create an object from this template and then it should be picked up by a remediation provider. Mutually exclusive with EscalatingRemediations
apiVersion string
API version of the referent.
fieldPath string
If referring to a piece of an object instead of an entire object, this string should contain a valid JSON/Go field access statement, such as desiredState.manifest.containers[2]. For example, if the object reference is to a container within a pod, this would take on a value like: "spec.containers{name}" (where "name" refers to the name of the container that triggered the event) or if no container name is specified "spec.containers[2]" (container with index 2 in this pod). This syntax is chosen only to have some well-defined way of referencing a part of an object.
kind string
Kind of the referent. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds
name string
Name of the referent. More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names
namespace string
Namespace of the referent. More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/namespaces/
resourceVersion string
Specific resourceVersion to which this reference is made, if any. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#concurrency-control-and-consistency
uid string
UID of the referent. More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#uids
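A minimal sketch of the single-template form, as an alternative to escalatingRemediations. The kind, API group, name and namespace are illustrative assumptions; in practice they must point at a template object that exists in the cluster.

```yaml
spec:
  remediationTemplate:
    apiVersion: self-node-remediation.medik8s.io/v1alpha1   # assumed provider
    kind: SelfNodeRemediationTemplate                       # assumed kind
    name: example-snr-template                              # hypothetical name
    namespace: example-namespace                            # hypothetical namespace
```

Optional reference fields such as fieldPath, resourceVersion and uid are left out here.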
selector object
Label selector to match the nodes whose health will be checked. Selecting both control-plane and worker nodes in one NHC CR is highly discouraged and can result in undesired behaviour. Note: the selector is now mandatory for that reason, but for backwards compatibility existing CRs will continue to work with an empty selector, which matches all nodes.
matchExpressions []object
matchExpressions is a list of label selector requirements. The requirements are ANDed.
key string required
key is the label key that the selector applies to.
operator string required
operator represents a key's relationship to a set of values. Valid operators are In, NotIn, Exists and DoesNotExist.
values []string
values is an array of string values. If the operator is In or NotIn, the values array must be non-empty. If the operator is Exists or DoesNotExist, the values array must be empty. This array is replaced during a strategic merge patch.
matchLabels object
matchLabels is a map of {key,value} pairs. A single {key,value} in the matchLabels map is equivalent to an element of matchExpressions, whose key field is "key", the operator is "In", and the values array contains only "value". The requirements are ANDed.
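A sketch of a selector that targets worker nodes only, assuming the cluster uses the conventional node-role.kubernetes.io labels. The label keys are an assumption about the cluster, not something this schema mandates.

```yaml
spec:
  selector:
    matchExpressions:
      # Both requirements must hold, because they are ANDed.
      - key: node-role.kubernetes.io/worker
        operator: Exists          # values must stay empty for Exists
      - key: node-role.kubernetes.io/control-plane
        operator: DoesNotExist    # keeps control-plane nodes out of this check
```

A separate NodeHealthCheck with the inverse selector can then cover control-plane nodes, which keeps the two node roles out of the same CR.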
stormCooldownDuration string
StormCooldownDuration defines the duration of an optional cooldown phase after a storm. A "storm" happens when the number of (un)healthy nodes exceeds the threshold defined by minHealthy or maxUnhealthy. Sometimes this is triggered by a single root cause. When that cause is fixed, there is a risk of remediating healthy nodes: because node status updates are asynchronous, NHC may detect only some nodes as healthy in a first round of updates, which satisfies the minHealthy or maxUnhealthy threshold (the storm ends) and triggers unneeded new remediation. The storm cooldown phase prevents creation of new remediations for the given duration, giving NHC time to get the latest node statuses. Expects a string of decimal numbers each with optional fraction and a unit suffix, e.g. "300ms", "1.5h" or "2h45m". Valid time units are "ns", "us" (or "µs"), "ms", "s", "m", "h".
pattern: ^([0-9]+(\.[0-9]+)?(ns|us|µs|ms|s|m|h))+$
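The cooldown only has an effect alongside a threshold, so a sketch pairs the two fields. Both values are arbitrary examples.

```yaml
spec:
  minHealthy: "51%"
  # After a storm ends, create no new remediations for 5 minutes
  # so that node statuses can settle first.
  stormCooldownDuration: 5m
```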
unhealthyConditions []object
UnhealthyConditions contains a list of the conditions that determine whether a node is considered unhealthy. The conditions are combined in a logical OR, i.e. if any of the conditions is met, the node is unhealthy.
duration string required
Duration for which the specified condition must hold before a node is considered unhealthy. Expects a string of decimal numbers each with optional fraction and a unit suffix, e.g. "300ms", "1.5h" or "2h45m". Valid time units are "ns", "us" (or "µs"), "ms", "s", "m", "h".
pattern: ^([0-9]+(\.[0-9]+)?(ns|us|µs|ms|s|m|h))+$
status string required
The condition status in the node's status to watch for. Typically False, True or Unknown.
minLength: 1
type string required
The condition type in the node's status to watch for.
minLength: 1
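A minimal sketch of two conditions combined with OR, assuming the standard Ready node condition is the signal of interest. The durations are arbitrary examples.

```yaml
spec:
  unhealthyConditions:
    # Node reports that it is not ready.
    - type: Ready
      status: "False"     # quoted so YAML does not parse it as a boolean
      duration: 300s
    # Node has stopped reporting its status.
    - type: Ready
      status: Unknown
      duration: 300s
```

A node matching either entry for the stated duration is treated as unhealthy.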
status object
NodeHealthCheckStatus defines the observed state of NodeHealthCheck
conditions []object
Represents the observations of a NodeHealthCheck's current state. Known .status.conditions.type are: "Disabled"
lastTransitionTime string required
lastTransitionTime is the last time the condition transitioned from one status to another. This should be when the underlying condition changed. If that is not known, then using the time when the API field changed is acceptable.
format: date-time
message string required
message is a human readable message indicating details about the transition. This may be an empty string.
maxLength: 32768
observedGeneration integer
observedGeneration represents the .metadata.generation that the condition was set based upon. For instance, if .metadata.generation is currently 12, but the .status.conditions[x].observedGeneration is 9, the condition is out of date with respect to the current state of the instance.
format: int64
minimum: 0
reason string required
reason contains a programmatic identifier indicating the reason for the condition's last transition. Producers of specific condition types may define expected values and meanings for this field, and whether the values are considered a guaranteed API. The value should be a CamelCase string. This field may not be empty.
pattern: ^[A-Za-z]([A-Za-z0-9_,:]*[A-Za-z0-9_])?$
minLength: 1
maxLength: 1024
status string required
status of the condition, one of True, False, Unknown.
enum: True, False, Unknown
type string required
type of condition in CamelCase or in foo.example.com/CamelCase.
pattern: ^([a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*/)?(([A-Za-z0-9][-A-Za-z0-9_.]*)?[A-Za-z0-9])$
maxLength: 316
healthyNodes integer
HealthyNodes specifies the number of healthy nodes observed.
inFlightRemediations object
InFlightRemediations records the timestamp when remediation triggered per node. Deprecated in favour of UnhealthyNodes.
lastUpdateTime string
LastUpdateTime is the last time the status was updated.
format: date-time
observedNodes integer
ObservedNodes specifies the number of nodes observed by using the NHC spec.selector.
phase string
Phase represents the current phase of this Config. Known phases are Disabled, Paused, Remediating and Enabled, based on:
 - the status of the Disabled condition
 - the value of PauseRequests
 - the value of InFlightRemediations
reason string
Reason explains the current phase in more detail.
unhealthyNodes []object
UnhealthyNodes tracks currently unhealthy nodes and their remediations.
conditionsHealthyTimestamp string
ConditionsHealthyTimestamp is the RFC 3339 date and time at which the unhealthy conditions no longer matched. The remediation CR will be deleted at that time, but the node will still be tracked as unhealthy until all remediation CRs are actually deleted, that is, once remediators have finished cleanup and removed their finalizers.
format: date-time
healthyDelayed boolean
HealthyDelayed notes whether a node should be considered healthy, but isn't due to NodeHealthCheckSpec.HealthyDelay configuration.
name string required
Name is the name of the unhealthy node
remediations []object
Remediations tracks the remediations created for this node
resource object required
Resource is the reference to the remediation CR which was created
apiVersion string
API version of the referent.
fieldPath string
If referring to a piece of an object instead of an entire object, this string should contain a valid JSON/Go field access statement, such as desiredState.manifest.containers[2]. For example, if the object reference is to a container within a pod, this would take on a value like: "spec.containers{name}" (where "name" refers to the name of the container that triggered the event) or if no container name is specified "spec.containers[2]" (container with index 2 in this pod). This syntax is chosen only to have some well-defined way of referencing a part of an object.
kind string
Kind of the referent. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds
name string
Name of the referent. More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names
namespace string
Namespace of the referent. More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/namespaces/
resourceVersion string
Specific resourceVersion to which this reference is made, if any. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#concurrency-control-and-consistency
uid string
UID of the referent. More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#uids
started string required
Started is the creation time of the remediation CR
format: date-time
templateName string
TemplateName is required when using several templates of the same kind
timedOut string
TimedOut is the time when the remediation timed out. Applicable for escalating remediations only.
format: date-time
