<div dir="ltr"><div dir="ltr">On Mon, Jun 10, 2024 at 8:33 AM Will Senn <<a href="mailto:will.senn@gmail.com" target="_blank">will.senn@gmail.com</a>> wrote:<br></div><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><u></u>
<div><font face="Helvetica, Arial, sans-serif">There's an interesting dive into PID 0 linked to from osnews:<br>
<a href="https://blog.dave.tf/post/linux-pid0/" target="_blank">https://blog.dave.tf/post/linux-pid0/</a><br>
In the article, the author delves into the history of the
scheduler a bit - going back to Unix v4 (his assembly language
skills don't go to PDP variants).<br>
I like the article for two reasons - 1) its clarity 2) it points
out the self-reinforcing nature of our search ecosystem.<br>
I'm left with the question - how did scheduling work in v0-v4? and
the observation that search really sucks these days.</font></div></blockquote><div><br></div><div>It's an interesting and well-written article, but I think it's not quite correct.</div><div><br></div><div>It links to sched in the V4 code [1] but there's nothing there about pid 0.</div><div>The right place to link would be the code in main that installs the</div><div>"system process" into the process table [2].</div><div><br></div><div>So yes, in V4, the scheduler is a process in a meaningful sense,</div><div>but I don't think pid 0 is a meaningful process identifier for it.</div><div>Nothing actually *identifies* the scheduler by using the number 0.</div><div>After a process has exited and its parent has called wait,</div><div>its process table entry has p_pid set to 0 [3]. Surely pid 0 does</div><div>not also identify those processes at the same time that it identifies</div><div>the system process. If there are many processes in the table with</div><div>pid 0, it's difficult to see pid 0 as any kind of identifier at all!</div><div><br></div><div>Instead it seems pretty clear that pid 0 represents the concept "no pid".</div><div>This makes sense since the kernel memory started out zeroed,</div><div>so using the zero pid for "nothing here" avoided a separate initialization step.</div><div>The same is true for process status 0 meaning "unused".</div><div>Similarly, inode 0 is "no inode" (useful to mark the end of a directory</div><div>entry list), and disk block number 0 is "no block" (useful to mark</div><div>an unallocated block in a file).</div><div>(Go's emphasis on meaningful zero values is in the same spirit.)<br></div><div><div><div><br></div>Reading the V1 sources seems to confirm this hypothesis:</div><div>V1 does not have a process table entry for any kernel process,</div><div>and yet it still uses pid 1 for the first process [4].</div></div><div>In V1 the user struct has a u.uno field holding the process number</div><div>as an index into the process table. 
That field too is 1-indexed,</div><div>because it is convenient for u.uno==0 to mean "no process".</div><div>In particular, swap (analogous here to V4 swtch) understands that if</div><div>it is called when u.uno==0, the process is exiting and need not be</div><div>saved for reactivation [5]. The kernel goes out of its way to use</div><div>u.uno==0 instead of u.uno==-1: all the code that indexes an array</div><div>by u.uno has to subtract 1 (or 2 for words) from the address being</div><div>indexed to account for the first entry being 1 and not 0.</div><div>Presumably this is because of wanting to use the zero value as "no uno".</div><div>(And it's not any less efficient, since the -1 or -2 can be applied to</div><div>the base address during linking.)</div><div><div><br class="gmail-Apple-interchange-newline">The obvious question to ask then is not why pids start at 1</div><div>but why, in contrast to all these examples, uids start at 0.<br></div><div>My guess is that there was simply no need for "no uid" and</div><div>in contrast having the zero value mean "root" worked out nicely.</div><div><br></div></div><div>Perhaps Ken will correct me if I'm reading this all wrong.</div><div><br></div><div>As to the question of how scheduling worked in V1, the swap code</div><div>walked over runq looking for the highest priority runnable process [6].</div><div>Every process image except the one running was saved on disk,</div><div>so the only decision was which one to read back in.</div><div>This is in contrast to the V4 scheduler, which juggled multiple</div><div>in-memory process images at once and split out the decisions</div><div>about what to run from the code that moved processes to and</div><div>from the disk.</div><div><br></div><div>Best,</div><div>Russ</div><div><br></div><div><div>[1] <a href="https://github.com/dspinellis/unix-history-repo/blob/Research-V4/sys/ken/slp.c#L89" 
target="_blank">https://github.com/dspinellis/unix-history-repo/blob/Research-V4/sys/ken/slp.c#L89</a></div><div>[2] <a href="https://github.com/dspinellis/unix-history-repo/blob/Research-V4/sys/ken/main.c#L55" target="_blank">https://github.com/dspinellis/unix-history-repo/blob/Research-V4/sys/ken/main.c#L55</a></div><div>[3] <a href="https://github.com/dspinellis/unix-history-repo/blob/Research-V4/sys/ken/sys1.c#L247" target="_blank">https://github.com/dspinellis/unix-history-repo/blob/Research-V4/sys/ken/sys1.c#L247</a></div><div>[4] <a href="https://github.com/dspinellis/unix-history-repo/blob/Research-V1/u0.s#L200" target="_blank">https://github.com/dspinellis/unix-history-repo/blob/Research-V1/u0.s#L200</a></div></div><div>[5] <a href="https://github.com/dspinellis/unix-history-repo/blob/Research-V1/u3.s#L40" target="_blank">https://github.com/dspinellis/unix-history-repo/blob/Research-V1/u3.s#L40</a></div><div>[6] <a href="https://github.com/dspinellis/unix-history-repo/blob/Research-V1/u3.s#L9" target="_blank">https://github.com/dspinellis/unix-history-repo/blob/Research-V1/u3.s#L9</a></div><div></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div>
</div>
</blockquote></div></div>