Automatic horizontal scaling of cluster node instances


1. The principle of automatic horizontal pod scaling in an OKD cluster

The HPA (Horizontal Pod Autoscaler) is an object that automatically creates or deletes pod instances when the observed usage of resources such as CPU and/or memory crosses the specified thresholds.

The HPA object specifies the minimum and maximum number of pod replicas, which bound how far the number of running pod instances can grow or shrink. It also specifies the resource threshold values that, when reached, cause pod instances to be added or removed.

Horizontal Autoscaling Definition

apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
    name: image-registry
    namespace: default
spec:
    maxReplicas: 7
    minReplicas: 3
    scaleTargetRef:
        apiVersion: apps.openshift.io/v1
        kind: DeploymentConfig
        name: image-registry
    targetCPUUtilizationPercentage: 75
status:
    currentReplicas: 5
    desiredReplicas: 0
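
The scaling decision described above can be sketched in a few lines of Python. This is a minimal illustration, not the controller's actual code: it uses the replica formula from the Kubernetes HPA documentation (current replicas multiplied by the ratio of observed to target utilization, rounded up) and clamps the result to the minReplicas and maxReplicas bounds. The function name and the sample utilization values are assumptions made for this example, and the real controller also applies a tolerance band and stabilization windows that are omitted here.

```python
import math


def desired_replicas(current_replicas, current_cpu_percent,
                     target_cpu_percent, min_replicas, max_replicas):
    """Hypothetical helper: simplified HPA replica calculation."""
    # Core formula: ceil(currentReplicas * currentMetric / targetMetric).
    raw = math.ceil(current_replicas * current_cpu_percent / target_cpu_percent)
    # The result never leaves the [minReplicas, maxReplicas] range.
    return max(min_replicas, min(max_replicas, raw))


# Values from the definition above: target 75 %, 3..7 replicas, 5 running.
print(desired_replicas(5, 90, 75, 3, 7))   # load above target: scale up
print(desired_replicas(5, 150, 75, 3, 7))  # capped at maxReplicas
print(desired_replicas(5, 10, 75, 3, 7))   # floored at minReplicas
```

With the definition above, an observed utilization of 90% on 5 replicas yields 6 replicas, while extreme values are held at 7 and 3 respectively.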

If a new pod instance needs to be added, the kube-scheduler decides which node instance to run it on. The decision is based on each node's free resources, such as CPU and memory, and on metadata that constrains where the pod can start (taints, affinity, and anti-affinity). If no existing node instance in the cluster meets the requirements for launching the pod instance, an Event is created with a message stating that no node instances are available for the pod.

Event example

0/2 nodes are available: 1 Insufficient memory, 1 node(s) had taints that the pod didn't tolerate.
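
To show how a message like this can arise, the following Python sketch filters nodes by taints and free memory and counts the rejection reasons. It is a deliberately simplified model: the real kube-scheduler runs many more filters (CPU, affinity, anti-affinity, and others), and the Node and Pod classes, their fields, and the sample values are hypothetical names introduced only for this illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    free_memory_mib: int
    taints: set = field(default_factory=set)


@dataclass
class Pod:
    memory_request_mib: int
    tolerations: set = field(default_factory=set)


def unschedulable_reasons(pod, nodes):
    """Return {reason: node count}; an empty dict means some node fits."""
    reasons = {}
    for node in nodes:
        if not node.taints <= pod.tolerations:
            # At least one taint on the node is not tolerated by the pod.
            reason = "node(s) had taints that the pod didn't tolerate"
        elif node.free_memory_mib < pod.memory_request_mib:
            reason = "Insufficient memory"
        else:
            return {}  # this node can run the pod, nothing to report
        reasons[reason] = reasons.get(reason, 0) + 1
    return reasons


nodes = [Node("node-a", 512), Node("node-b", 4096, {"dedicated"})]
reasons = unschedulable_reasons(Pod(memory_request_mib=1024), nodes)
summary = ", ".join(f"{count} {reason}" for reason, count in reasons.items())
print(f"0/{len(nodes)} nodes are available: {summary}.")
```

In this sketch, one node is rejected for lack of memory and the other for an untolerated taint, which mirrors the two reasons in the event example.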

2. The principle of automatic horizontal scaling of OKD-cluster nodes

Automatic horizontal scaling of nodes (HNA) is triggered by cluster Events reporting that a new pod instance cannot be created because no node instance in the cluster meets its requirements. HNA works with three Kubernetes objects: ClusterAutoscaler, MachineAutoscaler, and MachineSet.

ClusterAutoscaler - a Kubernetes object that adjusts the size of the OKD cluster to meet its current deployment requirements. CPU and memory are the resources the cluster autoscaler considers when scaling cluster node instances. The cluster autoscaler increases the size of the cluster when there are pod instances that cannot be started on any existing node instance, either because resources are insufficient or because the node instances do not meet the deployment requirements. The cluster autoscaler does not expand cluster resources beyond the limits specified in the ClusterAutoscaler definition.

ClusterAutoscaler Definition

apiVersion: "autoscaling.openshift.io/v1"
kind: "ClusterAutoscaler"
metadata:
  name: "default"
spec:
  podPriorityThreshold: -10
  resourceLimits:
    maxNodesTotal: 24
    cores:
      min: 8
      max: 128
    memory:
      min: 4
      max: 256
  scaleDown:
    enabled: true
    delayAfterAdd: 10m
    delayAfterDelete: 5m
    delayAfterFailure: 30s
    unneededTime: 5m
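
The effect of the resourceLimits section can be sketched as a simple pre-check before a node is added. This Python example is an illustration under stated assumptions, not the autoscaler's implementation: the function and dictionary key names are hypothetical, memory is treated as GiB, and the node size of 2 cores and 8 GiB corresponds to an m4.large instance.

```python
# Upper limits copied from the ClusterAutoscaler definition above.
LIMITS = {"max_nodes_total": 24, "max_cores": 128, "max_memory_gib": 256}


def can_add_node(nodes, cores, memory_gib,
                 node_cores, node_memory_gib, limits=LIMITS):
    """Hypothetical check: would one more node keep the cluster in limits?"""
    return (nodes + 1 <= limits["max_nodes_total"]
            and cores + node_cores <= limits["max_cores"]
            and memory_gib + node_memory_gib <= limits["max_memory_gib"])


# A 10-node cluster with 20 cores and 80 GiB has room for another node.
print(can_add_node(10, 20, 80, node_cores=2, node_memory_gib=8))
# A cluster already at maxNodesTotal cannot grow any further.
print(can_add_node(24, 48, 192, node_cores=2, node_memory_gib=8))
```

Any single limit is enough to block a scale-up: a cluster may be below maxNodesTotal and still be refused a new node because the memory or core ceiling would be exceeded.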

MachineAutoscaler - a Kubernetes object that sets the minimum and maximum number of cluster node instances for scaling and references the MachineSet object to which those node instances belong.

MachineAutoscaler Definition

apiVersion: "autoscaling.openshift.io/v1beta1"
kind: "MachineAutoscaler"
metadata:
  name: "worker-us-east-1a"
  namespace: "openshift-machine-api"
spec:
  minReplicas: 1
  maxReplicas: 12
  scaleTargetRef:
    apiVersion: machine.openshift.io/v1beta1
    kind: MachineSet
    name: worker-us-east-1a
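
As a rough illustration of how the minReplicas and maxReplicas bounds constrain a MachineSet, the Python sketch below applies a requested change in node count and keeps the result within the range from the definition above. The function name and the delta parameter are assumptions made for this example and do not correspond to any OKD API.

```python
def next_machineset_replicas(current, delta, min_replicas=1, max_replicas=12):
    """Hypothetical helper: apply a scaling step within the autoscaler bounds."""
    # delta > 0 models a scale-up request, delta < 0 a scale-down request.
    return max(min_replicas, min(max_replicas, current + delta))


print(next_machineset_replicas(4, 1))    # ordinary scale-up by one node
print(next_machineset_replicas(11, 3))   # request exceeds maxReplicas
print(next_machineset_replicas(2, -5))   # request falls below minReplicas
```

A request that would exceed the range is truncated rather than rejected, so the MachineSet in this example settles at 12 or 1 replicas at the extremes.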

MachineSet - a Kubernetes object that groups instances of cluster nodes according to the specified parameters.

MachineSet Definition for AWS Cloud Provider

apiVersion: machine.openshift.io/v1beta1
kind: MachineSet
metadata:
  labels:
    machine.openshift.io/cluster-api-cluster: <infrastructure_id>
  name: <infrastructure_id>-<role>-<zone>
  namespace: openshift-machine-api
spec:
  replicas: 1
  selector:
    matchLabels:
      machine.openshift.io/cluster-api-cluster: <infrastructure_id>
      machine.openshift.io/cluster-api-machineset: <infrastructure_id>-<role>-<zone>
  template:
    metadata:
      labels:
        machine.openshift.io/cluster-api-cluster: <infrastructure_id>
        machine.openshift.io/cluster-api-machine-role: <role>
        machine.openshift.io/cluster-api-machine-type: <role>
        machine.openshift.io/cluster-api-machineset: <infrastructure_id>-<role>-<zone>
    spec:
      metadata:
        labels:
          node-role.kubernetes.io/<role>: ""
      providerSpec:
        value:
          ami:
            id: ami-046fe691f52a953f9
          apiVersion: awsproviderconfig.openshift.io/v1beta1
          blockDevices:
            - ebs:
                iops: 0
                volumeSize: 120
                volumeType: gp2
          credentialsSecret:
            name: aws-cloud-credentials
          deviceIndex: 0
          iamInstanceProfile:
            id: <infrastructure_id>-worker-profile
          instanceType: m4.large
          kind: AWSMachineProviderConfig
          placement:
            availabilityZone: us-east-1a
            region: us-east-1
          securityGroups:
            - filters:
                - name: tag:Name
                  values:
                    - <infrastructure_id>-worker-sg
          subnet:
            filters:
              - name: tag:Name
                values:
                  - <infrastructure_id>-private-us-east-1a
          tags:
            - name: kubernetes.io/cluster/<infrastructure_id>
              value: owned
          userDataSecret:
            name: worker-user-data

Horizontal Node Autoscaling


HPA - (Horizontal Pod Autoscaler) - a Kubernetes object that automatically updates a workload resource (such as a Deployment or StatefulSet) to scale it to match demand.

RC - (Replication Controller) - a Kubernetes object that ensures that the specified number of pod replicas is running at any given time.
DC - (Deployment Configuration) - a Kubernetes object that includes one or more Replication Controllers, each containing a point-in-time state of the deployment as a template for the pod object.

HNA - (Horizontal Node Autoscaler) - a process involving Kubernetes objects that provide automatic horizontal scaling of cluster nodes.

ClusterAutoscaler - the cluster autoscaler adjusts the size of the OKD cluster to meet its current deployment requirements.
MachineAutoscaler - the machine autoscaler adjusts the number of node instances in the MachineSets deployed in the OKD cluster.