Maximizing Resource Utilization with cgroup2

Today's data centers are serving more users and providing more high-bandwidth services than ever. With this incredible growth in cloud-based computing comes increased demand for computing resources.

As more individuals, businesses, and governments come to rely on cloud computing, resource utilization (exploiting the full capacity of existing resources) has become a high-priority focus for production engineers, who must ensure that the services they run and maintain continue to work well for their users.

This guide introduces cgroup2, a Linux kernel component that provides a mechanism to isolate, measure, and control the distribution of resources for a collection of processes on a server. It also describes how cgroup2 is being deployed in production today, along with tools and strategies built around cgroups, to drive substantial resource control gains in large data centers.

Introducing cgroup2

cgroup enables the grouping and structuring of workloads, to control and limit the amount of system resources assigned to each.

One of the key use cases for cgroup is isolating a core workload from background system resource needs. For example, you may want to isolate a web server from background system processes like metric collection. Or you may want to separate time-critical or latency-sensitive work from long-term asynchronous jobs.

Within a comprehensive resource control framework, it’s even possible to stack different workloads on the same system without resource conflicts.

Overview

The cgroup architecture consists of two main components: the cgroup core and a set of subsystem controllers for memory, CPU, IO, PID, and RDMA.

[Figure: high-level hierarchy]

Core

The core is where you group processes into logical units and define hierarchical relationships. All cgroups on a system form a single hierarchy, or tree, made up of the root cgroup with child cgroups and subtrees for controlling the resource use of partitions, containers, and processes.
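As a rough illustration of how grouping works in practice, the sketch below creates a child cgroup and moves a process into it. It relies on two standard cgroup2 conventions: a child cgroup is created by making a directory under its parent, and a process is moved by writing its PID to the cgroup's cgroup.procs file. The function names are hypothetical helpers, and on a real system these operations normally need root privileges.

```python
from pathlib import Path
from typing import List


def create_child_cgroup(parent: Path, name: str) -> Path:
    """Create a child cgroup by making a directory under its parent."""
    child = parent / name
    child.mkdir(exist_ok=True)
    return child


def move_process(cgroup: Path, pid: int) -> None:
    """Move a process into a cgroup by writing its PID to cgroup.procs."""
    (cgroup / "cgroup.procs").write_text(f"{pid}\n")


def list_processes(cgroup: Path) -> List[int]:
    """Return the PIDs listed in a cgroup's cgroup.procs file."""
    text = (cgroup / "cgroup.procs").read_text()
    return [int(token) for token in text.split()]
```

For example, calling create_child_cgroup on the cgroup2 root with the hypothetical name "web" and then passing the result to move_process with os.getpid() would place the calling process in that new group.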

The core contains a pseudo-filesystem, cgroupfs, where you organize processes through a set of interface files at:

/sys/fs/cgroup
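To make the filesystem view concrete, here is a minimal sketch that inspects a cgroup directory. It assumes a cgroup2 hierarchy at the path above and uses the standard cgroup.controllers file, which holds a space-separated list of the controllers available to that cgroup. The helper names are illustrative only.

```python
from pathlib import Path
from typing import List

# Location of cgroupfs as given in the text; pass another path to inspect a subtree.
CGROUP_ROOT = Path("/sys/fs/cgroup")


def available_controllers(cgroup: Path = CGROUP_ROOT) -> List[str]:
    """Read cgroup.controllers, a space-separated list of controller names."""
    return (cgroup / "cgroup.controllers").read_text().split()


def child_cgroups(cgroup: Path = CGROUP_ROOT) -> List[str]:
    """Child cgroups appear as subdirectories of a cgroup directory."""
    return sorted(p.name for p in cgroup.iterdir() if p.is_dir())
```

Both functions only read from the filesystem, so they can usually be run without elevated privileges.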

Controllers

The controllers manage distribution of their respective resources along the hierarchy according to their configuration.

For each type of resource controller enabled, a corresponding set of interface files is automatically created. These files are where you specify resource thresholds for memory, CPU, IO, etc. We'll look at these files in detail in the sections for each controller.
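The following sketch shows one way this could look from a script. It assumes the standard cgroup2 interface: writing "+name" entries to a parent's cgroup.subtree_control enables those controllers for its children, and memory.max holds a hard memory limit in bytes. The helper names are hypothetical, and writing to these files on a real system normally requires root.

```python
from pathlib import Path
from typing import Dict, List


def enable_controllers(parent: Path, controllers: List[str]) -> None:
    """Enable controllers for a cgroup's children via cgroup.subtree_control."""
    request = " ".join(f"+{name}" for name in controllers)
    (parent / "cgroup.subtree_control").write_text(request + "\n")


def set_memory_max(cgroup: Path, limit_bytes: int) -> None:
    """Set the hard memory limit, in bytes, through memory.max."""
    (cgroup / "memory.max").write_text(f"{limit_bytes}\n")


def interface_files_by_prefix(cgroup: Path) -> Dict[str, List[str]]:
    """Group interface files by prefix, e.g. 'memory' for memory.max."""
    groups: Dict[str, List[str]] = {}
    for entry in sorted(cgroup.iterdir()):
        if entry.is_file() and "." in entry.name:
            prefix = entry.name.split(".", 1)[0]
            groups.setdefault(prefix, []).append(entry.name)
    return groups
```

Once a controller such as memory is enabled on the parent, files with the matching prefix should appear in each child directory, which is what interface_files_by_prefix makes visible.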

Additional info

You’ll want to refer to the following docs for additional details and info:

Case study: The fbtax2 project

Throughout this guide, we'll look at a real-world case study at Facebook that used the concepts, strategies, and tools described here to chalk up significant resource control wins: the fbtax2 project.

The goal of the project was to use cgroup2 to isolate and protect a system's main workload from widely distributed system binaries and other system services that run on many Facebook hosts. The memory reserved for these system binaries was nicknamed the fbtax, which later became the name of the project.

In the following sections, we'll show how fbtax2 used the new cgroup2 features and related tools to score high-impact resource utilization gains:

Measure
