Skip to content

Latest commit

 

History

History
371 lines (287 loc) · 13.7 KB

compute_resources.adoc

File metadata and controls

371 lines (287 loc) · 13.7 KB

Quotas and Limit Ranges

Overview

Using quotas and limit ranges, cluster administrators can set constraints to limit the number of objects or amount of compute resources that are used in your project. This helps cluster administrators better manage and allocate resources across all projects, and ensure that no projects are using more than is appropriate for the cluster size.

The following sections help you understand how to check on your quota and limit range settings, what sorts of things they can constrain, and how you can request or limit compute resources in your own pods and containers.

Quotas

Note

Quotas are set by cluster administrators and are scoped to a given project.

Viewing Quotas

Full quota definitions can be viewed by running oc export on the object. The following show some sample quota definitions:

Resources Managed by Quota

Quota Scopes

Quota Enforcement

admin_guide/quota.adoc If project modifications exceed a quota usage limit, the server denies the action. An appropriate error message is returned explaining the quota constraint violated, and what your currently observed usage stats are in the system.

Requests Versus Limits

See Compute Resources for more on setting requests and limits in pods and containers.

Limit Ranges

Note

Limit ranges are set by cluster administrators and are scoped to a given project.

Viewing Limit Ranges

Full limit range definitions can be viewed by running oc export on the object. The following shows an example limit range definition:

Container Limits

Compute Resources

Each container running on a node consumes compute resources, which are measurable quantities that can be requested, allocated, and consumed.

When authoring a pod configuration file, you can optionally specify how much CPU, memory (RAM), and local ephemeral storage (if your administrator has enabled the ephemeral-storage tech preview in {product-title} 3.10) each container needs in order to better schedule pods in the cluster and ensure satisfactory performance.

CPU is measured in units called millicores. Each node in a cluster inspects the operating system to determine the amount of CPU cores on the node, then multiplies that value by 1000 to express its total capacity. For example, if a node has 2 cores, the node’s CPU capacity would be represented as 2000m. If you wanted to use 1/10 of a single core, it would be represented as 100m.

Memory and ephemeral storage are measured in bytes. In addition, it may be used with SI suffices (E, P, T, G, M, K) or their power-of-two-equivalents (Ei, Pi, Ti, Gi, Mi, Ki).

apiVersion: v1
kind: Pod
spec:
  containers:
  - image: nginx
    name: nginx
    resources:
      requests:
        cpu: 100m (1)
        memory: 200Mi (2)
        ephemeral-storage: 1Gi (3)
      limits:
        cpu: 200m (4)
        memory: 400Mi (5)
        ephemeral-storage: 2Gi (6)
  1. The container requests 100m CPU.

  2. The container requests 200Mi memory.

  3. The container requests 1Gi ephemeral storage (if your administrator has enabled the ephemeral-storage tech preview in {product-title} 3.10).

  4. The container limits 200m CPU.

  5. The container limits 400Mi memory.

  6. The container limits 2Gi ephemeral storage (if your administrator has enabled the ephemeral-storage tech preview in {product-title} 3.10).

CPU Requests

Each container in a pod can specify the amount of CPU it requests on a node. The scheduler uses CPU requests to find a node with an appropriate fit for a container.

The CPU request represents a minimum amount of CPU that your container may consume, but if there is no contention for CPU, it can use all available CPU on the node. If there is CPU contention on the node, CPU requests provide a relative weight across all containers on the system for how much CPU time the container may use.

On the node, CPU requests map to Kernel CFS shares to enforce this behavior.

Viewing Compute Resources

To view compute resources for a pod:

$ oc describe pod nginx-tfjxt
Name:       nginx-tfjxt
Namespace:      default
Image(s):     nginx
Node:       /
Labels:       run=nginx
Status:       Pending
Reason:
Message:
IP:
Replication Controllers:  nginx (1/1 replicas created)
Containers:
  nginx:
    Container ID:
    Image:    nginx
    Image ID:
    QoS Tier:
      cpu:  Burstable
      memory: Burstable
    Limits:
      cpu:  200m
      memory: 400Mi
      ephemeral-storage: 1Gi
    Requests:
      cpu:    100m
      memory:   200Mi
      ephemeral-storage: 2Gi
    State:    Waiting
    Ready:    False
    Restart Count:  0
    Environment Variables:

CPU Limits

Each container in a pod can specify the amount of CPU it is limited to use on a node. CPU limits control the maximum amount of CPU that your container may use independent of contention on the node. If a container attempts to exceed the specified limit, the system will throttle the container. This allows the container to have a consistent level of service independent of the number of pods scheduled to the node.

Memory Requests

By default, a container is able to consume as much memory on the node as possible. In order to improve placement of pods in the cluster, specify the amount of memory required for a container to run. The scheduler will then take available node memory capacity into account prior to binding your pod to a node. A container is still able to consume as much memory on the node as possible even when specifying a request.

Ephemeral Storage Requests

NOTE: This only applies if your administrator has enabled the ephemeral-storage tech preview in {product-title} 3.10.

By default, a container is able to consume as much local ephemeral storage on the node as is available. In order to improve placement of pods in the cluster, specify the amount of required local ephemeral storage for a container to run. The scheduler will then take available node local storage capacity into account prior to binding your pod to a node. A container is still able to consume as much local ephemeral storage on the node as possible even when specifying a request.

Memory Limits

If you specify a memory limit, you can constrain the amount of memory the container can use. For example, if you specify a limit of 200Mi, a container will be limited to using that amount of memory on the node. If the container exceeds the specified memory limit, it will be terminated and potentially restarted dependent upon the container restart policy.

Ephemeral Storage Limits

NOTE: This only applies if your administrator has enabled the ephemeral-storage tech preview in {product-title} 3.10.

If you specify an ephemeral storage limit, you can constrain the amount of ephemeral storage the container can use. For example, if you specify a limit of 2Gi, a container will be limited to using that amount of ephemeral storage on the node. If the container exceeds the specified memory limit, it will be terminated and potentially restarted dependent upon the container restart policy.

Quality of Service Tiers

When created, a compute resource is classified with a quality of service (QoS). There are three tiers, and each is based on the request and limit value specified for each resource:

Quality of Service Description

BestEffort

Provided when a request and limit are not specified.

Burstable

Provided when a request is specified that is less than an optionally specified limit.

Guaranteed

Provided when a limit is specified that is equal to an optionally specified request.

A container may have a different quality of service for each compute resource. For example, a container can have Burstable CPU and Guaranteed memory qualities of service.

The quality of service has different impacts on different resources, depending on whether the resource is compressible or not. CPU is a compressible resource, whereas memory is an incompressible resource.

With CPU Resources:
  • A BestEffort CPU container is able to consume as much CPU as is available on a node but runs with the lowest priority.

  • A Burstable CPU container is guaranteed to get the minimum amount of CPU requested, but it may or may not get additional CPU time. Excess CPU resources are distributed based on the amount requested across all containers on the node.

  • A Guaranteed CPU container is guaranteed to get the amount requested and no more, even if there are additional CPU cycles available. This provides a consistent level of performance independent of other activity on the node.

With Memory Resources:
  • A BestEffort memory container is able to consume as much memory as is available on the node, but there are no guarantees that the scheduler will place that container on a node with enough memory to meet its needs. In addition, a BestEffort container has the greatest chance of being killed if there is an out of memory event on the node.

  • A Burstable memory container is scheduled on the node to get the amount of memory requested, but it may consume more. If there is an out of memory event on the node, Burstable containers are killed after BestEffort containers when attempting to recover memory.

  • A Guaranteed memory container gets the amount of memory requested, but no more. In the event of an out of memory event, it will only be killed if there are no more BestEffort or Burstable containers on the system.

Specifying Compute Resources via CLI

To specify compute resources via the CLI: