pid numbers

robbnl · September 2020

I have just started the course and am currently studying processes.
I have a question about this quote from the explanation of what a process is:
"For historical reasons, the largest PID has been limited to a 16-bit number, or 32768. It is possible to alter this value by changing /proc/sys/kernel/pid_max, since it may be inadequate for larger servers. As processes are created, eventually they will reach pid_max, at which point they will start again at PID = 300."

My question is: where does the '300' come from when the count is restarted? Is this an arbitrary number for processes started in user space? Why not 200 or 3? Are there any reserved PIDs?

coop · September 2020

There is nothing special about 300; it was just a choice someone made long ago in pre-history. The reason they don't go back to 1 or 2 is that there are long-running processes with low numbers, so it is an efficiency thing. Choosing a new pid is non-trivial because you have to check whether someone already has it. So even with 300 you have to check 300 and, if someone has it, go to 301, then 302, and so on. Checking whether a given pid is already in use used to be done very inefficiently (by scanning all processes on the system; the kernel really doesn't work with pids, they are a user-space thing), and it made creating a new process very slow. It is now done in a very efficient way (using a bitmap with one bit for each pid, so only the bitmap needs to be checked). Try setting pid_max to a number lower than the current pids, then do something like "cat &" and observe the pid printed on the command line.
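To make the "check 300, then 301, then 302" search concrete, here is a minimal user-space sketch in C. It is not the kernel's code: the names pid_bitmap, pid_in_use, pid_mark, pid_alloc, PID_LIMIT and RESTART_PID are all made up for illustration, and the limit is fixed at the default of 32768 rather than read from pid_max.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

#define PID_LIMIT     32768  /* assumed default pid_max from the quote */
#define RESTART_PID   300    /* where the search resumes after wrapping */
#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical bitmap: one bit per pid, set while the pid is in use. */
struct pid_bitmap {
    unsigned long bits[PID_LIMIT / BITS_PER_WORD];
    int last_pid;            /* most recently handed-out pid */
};

/* Checking a pid is a single bit test, not a scan of all processes. */
static bool pid_in_use(const struct pid_bitmap *map, int pid)
{
    assert(pid >= 0 && pid < PID_LIMIT);
    return (map->bits[pid / BITS_PER_WORD] >> (pid % BITS_PER_WORD)) & 1UL;
}

/* Set the bit when a process is created, clear it when it goes away. */
static void pid_mark(struct pid_bitmap *map, int pid, bool used)
{
    unsigned long mask = 1UL << (pid % BITS_PER_WORD);

    assert(pid >= 0 && pid < PID_LIMIT);
    if (used)
        map->bits[pid / BITS_PER_WORD] |= mask;
    else
        map->bits[pid / BITS_PER_WORD] &= ~mask;
}

/* Return the next free pid after last_pid, or -1 if none was found. */
static int pid_alloc(struct pid_bitmap *map)
{
    int pid = map->last_pid + 1;

    for (int tries = 0; tries < PID_LIMIT; tries++, pid++) {
        if (pid >= PID_LIMIT)
            pid = RESTART_PID;   /* wrap around, skipping the low pids */
        if (!pid_in_use(map, pid)) {
            pid_mark(map, pid, true);
            map->last_pid = pid;
            return pid;
        }
    }
    return -1;
}
```

Starting from a zeroed struct pid_bitmap, repeated calls to pid_alloc hand out 1, 2, 3 and so on; once last_pid reaches PID_LIMIT - 1 the next call resumes the search at RESTART_PID and steps past any pid whose bit is still set.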

robbnl · September 2020

Well, I did some digging into it and found it is declared in the kernel.
I found code from kernel 3.7 where there is a file called pid.c. I expect all subsequent kernel versions have this pid.c file too.
On line 48 the declaration says: #define RESERVED_PIDS 300
https://docs.huihoo.com/doxygen/linux/kernel/3.7/pid_8c_source.html

Just to be clear, where can I find the current (bitmap) code?

coop · September 2020

That's a very old kernel. Please don't look at it, as no one is using that kind of ancient stuff. The actual pid allocation is done from "fork.c", which creates a new process with:

    pid = alloc_pid(p->nsproxy->pid_ns_for_children, args->set_tid,
                    args->set_tid_size);

This is from the 5.9 kernel (but that code has been around a long time, I believe).

The function itself is in pid.c. However, with the addition of namespaces this has all gotten rather complicated if you want to trace the code, so good luck if you do. In words, the way it works is this: to begin with there is one page of memory, generally 4 KB, which therefore holds 32K bits. Each bit represents a pid. When a process is created the bit is set, when the process is removed the bit is cleared, and to check whether a pid exists you only need to check a bit (not scan a list of processes and pids). If you increase pid_max beyond 32K, another page of memory is allocated with another 32K bits, and so on. Pretty straightforward in principle, but complicated by the use of namespaces. Thus checking a candidate pid is O(1), and allocating a new process ID is essentially independent of the load on the machine.
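The page arithmetic in that description can be checked with a few lines of C. This is only a sketch under the stated assumption of 4 KB pages; PAGE_SIZE_BYTES, BITS_PER_PAGE and the three helper functions are illustrative names, not kernel symbols.

```c
#include <assert.h>
#include <limits.h>

#define PAGE_SIZE_BYTES 4096UL                        /* assumed 4 KB page */
#define BITS_PER_PAGE   (PAGE_SIZE_BYTES * CHAR_BIT)  /* 32768 bits */

/* Number of bitmap pages needed to hold one bit per pid below pid_max. */
static unsigned long pidmap_pages_needed(unsigned long pid_max)
{
    assert(pid_max > 0);
    return (pid_max + BITS_PER_PAGE - 1) / BITS_PER_PAGE;  /* round up */
}

/* Which page holds the bit for a given pid. */
static unsigned long pidmap_page_index(unsigned long pid)
{
    return pid / BITS_PER_PAGE;
}

/* Position of that pid's bit within its page. */
static unsigned long pidmap_bit_offset(unsigned long pid)
{
    return pid % BITS_PER_PAGE;
}
```

With these definitions a pid_max of 32768 fits in one page, 32769 needs a second page, and a pid_max of 4194304 needs 128 pages (512 KB of bitmap). A pid of 40000 would live on page 1 at bit offset 7232.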

As an aside, the kernel does not really use pids for anything. It uses pointers to a task_struct, one of which exists for each process or thread on the system. The correspondence between these structures and pids is only established for system calls made from user space, and a lot of code has been written to make sure this is not a bottleneck.
