Re: pid_t range question

From: Eric W. Biederman
Date: Thu Feb 09 2006 - 13:10:46 EST


Jan Engelhardt <jengelh@xxxxxxxxxxxxxxx> writes:

>>> On Linux, type pid_t is defined as an int if you look
>>> through all the intermediate definitions such as S32_T,
>>> etc. However, it wraps at 32767, the next value being 300.
>
> There is also an aesthetical reason. If pids were allowed to exceed, say,
> ten million, you would need a quite wide field in `ps` for the process
> number which is on "normal desktop user" systems just require 5 or 6
> decimal places. Well, what I mean, just look at this sample ps output:
>
> 17:59 shanghai:../fs/proc # ps
> PID TTY TIME CMD
> 1 - 00:00:00 init [3]
> 4215914607 tty2 00:00:00 bash
> 4215914653 tty2 00:00:00 ps
>
> mingw/msys and cygwin already have this "cosmetic problem" since windows
> "pids" are usually above one million.

Yes. Although this I'm not I'm not certain how bad the cosmetic problem
is. Certainly significant enough that we don't want to change a good
thing when we got it. But if there were real problems a big pid
would solve I don't expect large pid numbers to stop us.

>>> I know the
>>> code "reserves" the first 300 pids.
>
> I cannot confirm that. When I start in "-b" mode and 'use' up all pids by
> repeatedly executing /bin/noop, I someday get pids as low as 10
> again, defined by how many kernel threads there are active before /bin/bash
> started.

Odd. When the search wraps it starts searching at 300.
Still there are no locks around last_pid.

>>I know for certain that proc assumes it can fit pid in
>>the upper bits of an ino_t taking the low 16bits for itself
>>so that may the entire reason for the limit.
>>
> inode number in /proc/XXX/fd creation currently is, IIRC
> ino = (pid << 16) | fd
> which limits both pid to 16 bits and the fdtable to 16 bits. See
> fs/proc/inode-alloc.txt. At best, procfs should start using 64bit inode
> numbers.

Well it does use 64bit inode numbers but only on 64bit systems.
Internally /proc doesn't care about the inode it is only for keep find
and friends from getting confused.

Figuring out how to use find_inode_number would likely be interesting,
and a random inode allocation scheme would be interesting.

Eric

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/