Namespaces and cgroups are the building blocks on which containers are built. In this post, we will explore what namespaces and cgroups are, along with brief descriptions of their various types.
What is a namespace?
You may think of a namespace as a wrapper around a global system resource. It takes a global resource, such as the set of mount points, and wraps it so that a process living in that namespace appears to have its own isolated instance of the resource. Namespaces partition kernel resources so that each set of processes sees only the resources allocated to it. Namespaces and cgroups are sometimes referred to interchangeably, but this is not accurate. Simply put, namespaces limit what resources a process or a set of processes can see, whereas cgroups limit what resources a process or a set of processes can use.
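To make the idea concrete, here is a minimal sketch that lists the namespaces a process belongs to. It assumes a Linux system with /proc mounted; the function name list_namespaces is our own and not part of any library.

```python
import os


def list_namespaces(pid="self"):
    """Return {namespace type: identifier} for a process, read from /proc/<pid>/ns."""
    ns_dir = f"/proc/{pid}/ns"
    namespaces = {}
    try:
        entries = sorted(os.listdir(ns_dir))
    except OSError:
        return namespaces  # not Linux, no such process, or /proc is unavailable
    for name in entries:
        try:
            # Each entry is a symlink whose target looks like "mnt:[4026531840]".
            namespaces[name] = os.readlink(os.path.join(ns_dir, name))
        except OSError:
            continue  # skip entries we are not permitted to read
    return namespaces


if __name__ == "__main__":
    for name, ident in list_namespaces().items():
        print(f"{name:20} {ident}")
```

Two processes that print the same identifier for a given type share that namespace; a containerized process will typically show different identifiers from a process on the host.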
The six types of namespaces described below are the ones containers most commonly rely on (newer kernels also provide cgroup and time namespaces):
User namespace:
This is a key security feature, as each namespace can be given its own distinct set of user IDs and group IDs. It allows a process to have a UID of 0 (root) inside a container while being mapped to an unprivileged UID, such as 2000, on the host system. User namespaces can also be nested, up to 32 levels deep. With user namespaces in place, an attacker who gains root access inside a container holds those privileges only within that container and remains an unprivileged user on the host operating system, which greatly limits the damage they can do.
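The mapping between IDs inside and outside a user namespace is exposed in /proc/<pid>/uid_map, where each line holds three numbers: the first ID inside the namespace, the first ID outside, and the length of the range. The sketch below translates an inside ID to its outside counterpart. The helper names and the sample mapping are hypothetical, chosen only for illustration.

```python
def parse_id_map(text):
    """Parse uid_map/gid_map text into (inside_start, outside_start, length) tuples."""
    ranges = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 3:
            ranges.append(tuple(int(f) for f in fields))
    return ranges


def to_host_id(inside_id, ranges):
    """Translate an ID inside the namespace to the ID seen outside, or None if unmapped."""
    for inside_start, outside_start, length in ranges:
        if inside_start <= inside_id < inside_start + length:
            return outside_start + (inside_id - inside_start)
    return None


# Hypothetical mapping: container UIDs 0-65535 map to host UIDs 100000-165535.
sample_map = "         0     100000      65536\n"
ranges = parse_id_map(sample_map)
print(to_host_id(0, ranges))      # root in the container -> 100000
print(to_host_id(1000, ranges))   # -> 101000
print(to_host_id(70000, ranges))  # outside the mapped range -> None
```

With a mapping like this, root inside the container is just host UID 100000, an ordinary unprivileged account.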
Interprocess communication (IPC) namespace:
This isolates interprocess communication resources, such as System V IPC objects and POSIX message queues. Processes created in the same IPC namespace are visible to each other and can use these resources to exchange data, while processes in other IPC namespaces cannot see them.
UNIX Time-Sharing (UTS) namespace:
The UTS namespace allows a single system to present different host and domain names to different processes. It was implemented so that each container can have its own hostname. This lets applications use container hostnames as default identifiers when communicating with containers, and lets users get information about containers using traditional commands like uname or hostname.
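As a small illustration, the sketch below reads the same identity that uname and hostname report. Run inside a container with its own UTS namespace, it would show the container's hostname rather than the host's. The function name uts_identity is our own.

```python
import platform
import socket


def uts_identity():
    """Return the host identity visible to this process."""
    info = platform.uname()
    return {
        "hostname": socket.gethostname(),  # what the `hostname` command prints
        "nodename": info.node,             # what `uname -n` prints
        "sysname": info.system,            # what `uname -s` prints
        "release": info.release,           # what `uname -r` prints
    }


if __name__ == "__main__":
    for key, value in uts_identity().items():
        print(f"{key:10} {value}")
```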
Mount namespace:
The name of the mount namespace is self-explanatory: it controls the mount points visible to containers. Traditionally, mounting or unmounting a file system would change the global system environment. With mount namespaces, we can provide an isolated list of mount points for each container. The process of creating a mount namespace is similar to that of creating a chrooted environment.
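A process can inspect the mount points visible in its own mount namespace through /proc/self/mounts. The following is a minimal sketch of a parser for that file; the helper names and sample text are our own, and mount points containing spaces (which the kernel writes as \040) are left undecoded to keep it short.

```python
def parse_mounts(text):
    """Return (mount_point, fs_type, device) tuples from /proc/<pid>/mounts text."""
    mounts = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 3:
            device, mount_point, fs_type = fields[:3]
            mounts.append((mount_point, fs_type, device))
    return mounts


def visible_mounts(path="/proc/self/mounts"):
    """Mount points visible to the current process, or [] if the file is unavailable."""
    try:
        with open(path) as handle:
            return parse_mounts(handle.read())
    except OSError:
        return []


# Hypothetical sample of what a small container might see.
sample = (
    "overlay / overlay rw,relatime 0 0\n"
    "proc /proc proc rw,nosuid,nodev,noexec 0 0\n"
)
for mount_point, fs_type, device in parse_mounts(sample):
    print(f"{mount_point:10} {fs_type:10} {device}")
```

Calling visible_mounts() on the host and inside a container would usually return quite different lists, which is the isolation described above.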
PID namespace:
The PID namespace allows for the isolation of process ID numbers. Every time you boot a Linux system, it starts with just one process with a PID of 1, and that process is the root of the process tree. The PID namespace allows us to create a new process tree for each container, which satisfies the init system’s need to be PID 1 in that container. It also makes it possible to suspend the processes in a container and move the container to a brand new host, so that when we bring the container back up we don’t have to worry about PID number conflicts.
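On reasonably recent kernels, the NSpid field in /proc/<pid>/status lists a process's PID in each nested PID namespace, outermost first. The sketch below parses that field; the function name and the sample status text are hypothetical.

```python
def parse_nspid(status_text):
    """Return the PIDs from the NSpid line, outermost namespace first."""
    for line in status_text.splitlines():
        if line.startswith("NSpid:"):
            return [int(pid) for pid in line.split()[1:]]
    return []  # field missing (older kernel) or not a status file


# Hypothetical status excerpt for a container's init process, seen from the host.
sample = "Name:\tinit\nPid:\t2000\nNSpid:\t2000\t1\n"
print(parse_nspid(sample))  # [2000, 1]
```

Here the same process is PID 2000 on the host and PID 1 inside its container.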
Network namespace:
This allows containers to have their own copy of the network stack, so every container can have its own routing table, its own firewall rules, and its own network devices.
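One way to observe this is to list the network devices a process can see. The sketch below parses the text of /proc/net/dev, which has two header lines followed by one line per interface; the function name and the sample text are our own.

```python
def parse_interfaces(net_dev_text):
    """Return interface names from /proc/net/dev text."""
    names = []
    # The first two lines are column headers.
    for line in net_dev_text.splitlines()[2:]:
        name, separator, _stats = line.partition(":")
        if separator:
            names.append(name.strip())
    return names


# Hypothetical sample: a container that only sees loopback and one veth-backed device.
sample = (
    "Inter-|   Receive                  |  Transmit\n"
    " face |bytes    packets errs drop  |bytes    packets errs drop\n"
    "    lo:    1024      8    0    0      1024      8    0    0\n"
    "  eth0:  204800    150    0    0     98304    120    0    0\n"
)
print(parse_interfaces(sample))  # ['lo', 'eth0']
```

The host would typically list many more devices than a container does, because each network namespace has its own set.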
What are cgroups?
Cgroups, or control groups, were originally released by Google as process containers. However, to avoid confusion with the word containers, the name was changed. Cgroups limit how much of a system resource a process can use. Unlike the single traditional process tree, each cgroup subsystem may have its own process hierarchy, independent of the others, which means that one process can live in several trees. The Linux kernel supports several different subsystems, and we will now describe some of the more important ones.
The blkio subsystem allows you to limit and measure the amount of I/O for each group of processes. It lets you set throttling limits for each of the groups.
The cpu subsystem allows you to monitor CPU usage for a group of processes, set weights for each group, and keep track of usage per CPU.
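Weights are relative: when every group is busy, each group's share of CPU time is roughly its weight divided by the sum of all weights. The sketch below works through that arithmetic. The group names are hypothetical, and the numbers follow the cgroups v1 cpu.shares convention, where 1024 is the default.

```python
def cpu_share_fractions(weights):
    """Return each group's fraction of CPU time when all groups are competing."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    return {name: weight / total for name, weight in weights.items()}


# Hypothetical groups: "web" has twice the weight of each of the other two.
print(cpu_share_fractions({"web": 1024, "batch": 512, "cron": 512}))
# {'web': 0.5, 'batch': 0.25, 'cron': 0.25}
```

These fractions only matter under contention; if the other groups are idle, a group can use more than its share.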
The cpuacct subsystem generates automatic reports on CPU resources used by tasks in a cgroup.
The cpuset subsystem allows you to pin groups of processes to one CPU or to a group of CPUs, letting you dedicate CPUs to a particular task.
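The set of CPUs assigned to a cpuset is written in a compact list format such as "0-3,8". The sketch below expands that format into individual CPU numbers; the function name is our own.

```python
def parse_cpu_list(text):
    """Expand a CPU list such as "0-3,8" into a set of CPU numbers."""
    cpus = set()
    for part in text.strip().split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            first, last = part.split("-", 1)
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(part))
    return cpus


print(sorted(parse_cpu_list("0-3,8")))  # [0, 1, 2, 3, 8]
```

On Linux, os.sched_getaffinity(0) returns the CPUs the current process is actually allowed to run on, which reflects any cpuset restriction in effect.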
The devices subsystem allows or denies access to devices by tasks in a cgroup. It lets you set permissions governing which processes or containers can read from or write to a device.
The freezer subsystem suspends or resumes tasks in a cgroup. Its effect is comparable to sending a SIGSTOP signal to a whole container.
The memory subsystem sets limits on the amount of memory that can be used by tasks in a cgroup and generates automatic reports on memory resource usage. It allows you to keep track of memory usage down to the memory page level.
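As a rough illustration of limits and reporting, the sketch below interprets the values a cgroup exposes for its limit and current usage. It assumes the cgroups v2 convention, where the memory.max file holds either a byte count or the word "max" (no limit) and memory.current holds the usage in bytes; cgroups v1 uses different file names. The helper names and sample values are hypothetical.

```python
def parse_memory_limit(text):
    """Return the limit in bytes, or None when the cgroup is unlimited ("max")."""
    value = text.strip()
    if value == "max":
        return None
    return int(value)


def usage_percent(current_bytes, limit_bytes):
    """Return usage as a percentage of the limit, or None when there is no limit."""
    if limit_bytes is None:
        return None
    return 100.0 * current_bytes / limit_bytes


# Hypothetical readings: 256 MiB used out of a 512 MiB limit.
limit = parse_memory_limit("536870912\n")
print(usage_percent(268435456, limit))                   # 50.0
print(usage_percent(268435456, parse_memory_limit("max")))  # None
```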
The net_cls subsystem enables us to tag network packets with a classid that allows the identification of packets originating from a particular cgroup task.
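The classid is a single 32-bit number of the form 0xAAAABBBB, where AAAA is the major handle and BBBB is the minor handle, both conventionally written in hexadecimal. The sketch below converts between the two representations; the function names are our own.

```python
def encode_classid(major, minor):
    """Pack major and minor handles into a 32-bit classid (0xAAAABBBB)."""
    if not (0 <= major <= 0xFFFF and 0 <= minor <= 0xFFFF):
        raise ValueError("major and minor must each fit in 16 bits")
    return (major << 16) | minor


def decode_classid(classid):
    """Return the classid as a "major:minor" handle in hexadecimal."""
    major = (classid >> 16) & 0xFFFF
    minor = classid & 0xFFFF
    return f"{major:x}:{minor:x}"


print(decode_classid(0x100001))           # 10:1
print(hex(encode_classid(0x10, 0x1)))     # 0x100001
```

Traffic control tools can then match on the handle (10:1 in this example) to shape or filter traffic from that cgroup.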
The net_prio subsystem provides a way to set the priority of network packets dynamically.
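To tie the subsystems above together, a process's cgroup membership can be read from /proc/<pid>/cgroup, where each line has the form hierarchy-ID:controller-list:path. The sketch below parses that format. The function name and the sample paths are hypothetical, and the first sample assumes a cgroups v1 layout in which each subsystem can sit in its own hierarchy.

```python
def parse_cgroup_membership(text):
    """Return (hierarchy_id, [controllers], path) tuples from /proc/<pid>/cgroup text."""
    membership = []
    for line in text.splitlines():
        parts = line.split(":", 2)
        if len(parts) != 3:
            continue
        hierarchy_id, controllers, path = parts
        names = controllers.split(",") if controllers else []
        membership.append((int(hierarchy_id), names, path))
    return membership


# Hypothetical v1-style listing: one process, several independent hierarchies.
sample_v1 = (
    "5:cpu,cpuacct:/docker/abc123\n"
    "4:memory:/docker/abc123\n"
    "3:net_cls,net_prio:/\n"
)
for hierarchy_id, names, path in parse_cgroup_membership(sample_v1):
    print(hierarchy_id, ",".join(names), path)

# On cgroups v2 there is a single unified hierarchy with an empty controller list.
print(parse_cgroup_membership("0::/user.slice/session-1.scope\n"))
# [(0, [], '/user.slice/session-1.scope')]
```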
This concludes our explanation of namespaces and cgroups. The understanding of these concepts is an important step towards understanding the working of different container engines and subsequent container orchestration mechanisms. We hope you found this article to be useful and we look forward to your suggestions and feedback.