Autoscaler

Scale Deployments and StatefulSets horizontally and vertically.

In Kubernetes, Deployments and StatefulSets can be scaled automatically based on the resource usage, such as CPU and memory, of their running pods.

There are two types of scaling: horizontal scaling, which starts more pods, and vertical scaling, which restarts pods with more resources. Horizontal scaling is preferable, although it can be challenging, especially with StatefulSets.

For both types of scaling, resources are available in the cluster: HorizontalPodAutoscaler, which is part of the built-in autoscaling API, and VerticalPodAutoscaler, which is provided as a Custom Resource Definition (CRD).

Here is an example that scales a Deployment between 3 and 10 replicas based on average CPU and memory utilization:

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: example
  namespace: example
spec:
  maxReplicas: 10
  metrics:
    - resource:
        name: memory
        target:
          averageUtilization: 80
          type: Utilization
      type: Resource
    - resource:
        name: cpu
        target:
          averageUtilization: 80
          type: Utilization
      type: Resource
  minReplicas: 3
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: example
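
Utilization targets such as the ones above are measured relative to the resource requests of the containers, so the scaled pods need CPU and memory requests for the autoscaler to work. The following is a minimal sketch of a matching Deployment, not a definitive manifest: the image, the labels, and the request values are placeholders chosen for illustration, and metadata.name is assumed to equal scaleTargetRef.name in the autoscaler.

```yaml
# Hypothetical Deployment targeted by the HorizontalPodAutoscaler.
# spec.replicas is omitted on purpose so that the autoscaler owns the replica count.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example          # assumed to match scaleTargetRef.name of the autoscaler
  namespace: example
spec:
  selector:
    matchLabels:
      app: example       # placeholder label
  template:
    metadata:
      labels:
        app: example
    spec:
      containers:
        - name: example
          image: registry.example.com/example:1.0   # placeholder image
          resources:
            requests:
              cpu: 500m      # an 80% target means about 400m average usage per pod
              memory: 512Mi  # an 80% target means about 410Mi average usage per pod
```

For each metric, the autoscaler computes the desired replica count as ceil(currentReplicas × currentUtilization / targetUtilization), uses the largest result across all metrics, and keeps it between minReplicas and maxReplicas. For example, 4 replicas at 120% average CPU utilization against a target of 80% give ceil(4 × 120 / 80) = 6 replicas.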

And here is an example of vertical scaling that keeps the memory of the example container in a StatefulSet between 1Gi and 4Gi:

apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: example
  namespace: example
spec:
  resourcePolicy:
    containerPolicies:
      - containerName: example
        controlledResources:
          - memory
        maxAllowed:
          memory: 4Gi
        minAllowed:
          memory: 1Gi
      - containerName: "*"
        mode: "Off"
  targetRef:
    apiVersion: apps/v1
    kind: StatefulSet
    name: example
  updatePolicy:
    updateMode: Auto
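
Because vertical scaling restarts pods, it can be useful to look at the recommendations before letting the autoscaler apply them. The following sketch is a recommendation-only variant intended to be used in place of the manifest above, not alongside it. It assumes the same StatefulSet named example; the only real change is updateMode "Off".

```yaml
# Sketch: recommendation-only VerticalPodAutoscaler for the same StatefulSet.
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: example
  namespace: example
spec:
  targetRef:
    apiVersion: apps/v1
    kind: StatefulSet
    name: example
  updatePolicy:
    updateMode: "Off"   # compute recommendations only, never evict or restart pods
```

The computed values are written to status.recommendation.containerRecommendations of the object and can be inspected, for example, with kubectl describe vpa example -n example.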