-
Notifications
You must be signed in to change notification settings - Fork 0
/
processes.html
621 lines (517 loc) · 22.1 KB
/
processes.html
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
---
layout: default
title: Processes
last-updated: April, 2021
---
<!--
Directory Tree...
From the user's perspective, the directory tree is where you store your
stuff. Using commands, utilities, or other programs, you modify its
contents and structure to your liking.
From the programmer's perspective, the directory tree is a globally
shared, persistent data structure.
<div class="content" id="executable-files">
<h2>Executable Files</h2>
<p>
The Linux kernel supports four distinct kinds of executable files.
https://www.kernel.org/doc/html/latest/admin-guide/binfmt-misc.html
</p>
</div>
Actors: See Also: Examples?
mirror.linuux.org.au/pub/linux.conf.au/2014/Friday/104-D-Bus_in_the_kernel_-_Lennart_Poettering.mp4
-->
<div id="nav_bar_2" class="nav">
<ul>
<li><a href="#introduction"> Introduction </a></li>
<li><a href="#actors"> Actors </a></li>
<li><a href="#process-creation"> Process Creation and System Startup</a></li>
<li><a href="#exec">Executing a Different Program</a></li>
<li><a href="#file-descriptors"> File Descriptors </a></li>
<li><a href="#termination">Process Termination</a></li>
<li><a href="#what-is">Ya But What Is It?</a></li>
<li><a href="#dr-fraser"> Dr. Brian Fraser </a></li>
<li><a href="#references"> References </a></li>
</ul>
</div>
<div id="introduction" class="content">
<h1>Processes</h1>
<div class="quote-text">
"Understanding is the key to success with Linux."
</div>
<div class="quote-ref">
<a href="https://www.tldp.org/LDP/sag/html/intro.html">Linux System Administrator's Guide</a>
</div>
<p>
A <em>program</em> is a file containing a sequence of
instructions to be carried out by the computer,
and a description of initial memory conditions.
A <em>process</em> is an instance of an executing
program.<sup><a href="#references">[1]</a></sup><sup><a href="#references">[2]</a></sup>
This essay will chisel out the fundamentals of
processes, as they exist on Unix-like platforms.
</p>
</div>
<div class="content" id="actors">
<h2> Actors </h2>
<p>
In Unix and its derivatives, processes are the actors of the system:
Left to its own devices, the kernel does nothing.
All Unix-like operating systems depend upon processes to direct
the kernel, and through the kernel, processes change the machine.
</p>
<p>
The kernel is responsible for:
</p>
<ol>
<li>
Allocating hardware resources to processes
</li>
<li>
Creating new processes
</li>
<li>
Maintaining information about each
</li>
</ol>
<p>
Each process has a narrowly defined role, which it is
expected to do well. It might copy a file,
display a window, provide audio services, or display the time.
These many small programs are used together to accomplish
a given task; thereby, the utility of each becomes compounded.
</p>
<h3> Process Attributes </h3>
<div class="aside-right">
<h4>Aside:</h4>
<p>
There is a close relationship between a <em>program</em> and a
<em>process</em>— Namely, a program is a file containing
the instructions that the process carries out.
</p>
</div>
<p>
Associated with each process are the following attributes:
</p>
<ul>
<li>Pathname of executable file</li>
<li>Files currently open</li>
<ul>
<li>Their access modes (read, write)</li>
<li>Location wihin file</li>
</ul>
<li>Current working directory</li>
<li>User ID, Group Id</li>
<li>Environment variables</li>
</ul>
<p>
You can think of each process as acting on behalf of some user,
and that it is executing with that user's <em>permissions.</em>
For example, a process executing as user <em>josh</em> is
executing with <em>josh's permission.</em>
A process is by-default allowed to manipulate files owned by the user
it is acting as:
It can remove owned files,
change owned files' permission bits, etc..
Files created by a process inherit its associated user.
</p>
<div class="aside-right">
<h4>Aside:</h4>
<p>
File ownership restrictions are enforced by the kernel.
</p>
</div>
<p>
Processes acting as user
<em><a href="http://www.linfo.org/root.html">root</a></em>
are special:
They can change any file on the system, without qualification.
That is, they by default transcend the file ownership scheme;
we therefore use <em>superuser</em> synonymously with root user,
and say that a process executing as root is executing with
<em>superuser permissions.</em>
</p>
<p>
In practice, <em>root</em> is the adminstrative user.
It can, for example, format a physical device and make it
available for system-wide use.
Superuser priveledges are typically acquired
through <code>sudo</code>.
</p>
<h3> See Also: </h3>
<ul>
<li><a href="http://linfo.org/root.html">What is root?</a> -- definition by the Linux Information Project (LINFO)</li>
<li><a href="https://www.sudo.ws/man/1.8.3/sudo.man.html"><code>sudo(8)</code></a> - Execute a command as another user</li>
<li><a href="https://man7.org/linux/man-pages/man7/credentials.7.html"><code>credentials(7)</code></a> - Process identifiers</li>
</ul>
</div>
<div class="content" id="process-creation">
<h2 style="margin-bottom: 0;"> Process Creation</h2>
<h4>and System Startup</h4>
<p>
At the end of its boot sequence, the kernel launches a single program,
<code>/sbin/init</code>,<sup><a href="https://en.wikipedia.org/wiki/Init">[3]</a></sup>
and in so doing creates the system's first process.
While init schemes
<a href="https://arxiv.org/abs/0706.2748">have evolved</a>
over the years, its purpose has not changed:
It is a process with superroot permissions that acts as a
manager of the system and trustee among processes.
</p>
<div class="aside-left">
<h4>Aside:</h4>
<p>
You can check your <em>init</em> program with
</p>
<div class="code">
$ ps --pid 1
</div>
<p>
<a href="https://www.freedesktop.org/wiki/Software/systemd/">systemd</a> is very common today.
</p>
</div>
<p>
<code>init</code>'s first responsibility is to bring up userspace.
It does this by launching a sequence of programs, most of which
are daemons, some of which the user interacts with directly.
After this, it operates in the background, managing services,
restarting them if they falter, and logging events for administrative
purposes.
Its final responsibility is to shut the system down cleanly.
</p>
<p>
After executing <code>/sbin/init</code>, process creation
is delegated to processes themselves, and is carried
out through the
<em>fork()</em> system call.
</p>
<p>
A call to <em>fork()</em> instructs the kernel to duplicate
the current process,
and the two resulting processes are very nearly identical.
Of note, they both continue to execute the same program at the
same spot, have the same open files, and have duplicate memory spaces.
The newly created process is referred to as the <em>child</em> process,
the original is called the <em>parent,</em> and we say
that the child process <em>inherits</em> its parent's attributes.
</p>
<p>
Parent and child differ in a few ways.
First, upon birth, each process is given its own
identification number by the kernel.
This is guaranteed to be unique to the process, and
remains constant throughout the process's life;
it serves as a system-wide handle to the process.
Second, they each recieve a different return value
from <em>fork()</em>.
This allows them to act differently within the same
program:
</p>
<pre>
pid_t rv = fork(); // One process executes this line
if(rv == 0) { // Two processes execute this line
// I am child
} else if (rv > 0) {
// I am parent
}
</pre>
<p>
Their executions thereby diverge, and each is free to act
independently.
</p>
<h3>See Also:</h3>
<ul>
<li><code><a href="https://www.man7.org/linux/man-pages/man2/fork.2.html">fork(2)</code></a> - Create a child process</li>
</ul>
</div>
<div class="content" id="exec">
<h2>Executing a Different Program</h2>
<p>
The <em>exec()</em> family of functions change the program
currently being executed.
Properly speaking, there is only one <em>exec()</em> system
call, and that is <a href="https://man7.org/linux/man-pages/man2/execve.2.html"><em>execve()</em></a>.
POSIX-conforming C libraries offer six variants of
it, each with slightly different conveniences.<sup><a href="https://pubs.opengroup.org/onlinepubs/009695399/functions/exec.html">[4]</a></sup>
Under the hood, they each map to
a call to <em>execve()</em>.<sup><a href="https://man7.org/linux/man-pages/man3/exec.3.html">[5]</a></sup>
They are collectively referred to as <em>exec()</em>
because, used properly, they have the same meaning.
</p>
<p>
A simple invocation is:
</p>
<pre>
char *args[] = {"ls", "-l", NULL};
execv("/usr/bin/ls", args);
// Execution of *this* program stops here,
// and the process continues executing at /usr/bin/ls
</pre>
<div class="aside-right">
<h4>Aside:</h4>
<p>
<em>execvp()</em>, for example,
searches the PATH environment variable for
a particular file.
</p>
</div>
<p>
"The <em>execve()</em> system call loads a new program into
a process's memory.
During this operation, the old program
is discarded, and the process's"
</p>
<p>
<em>exec()</em> usually follows <em>fork()</em>:
</p>
<pre>
pid_t rv = fork();
if( rv == 0 ) {
// I am child
char *args[] = {"ls", "-l", NULL};
execv("/usr/bin/ls", args);
} else if ( rv > 0 ) {
// Execution of parent process continues within this program
}
</pre>
<p>
This sequence results in one process launching another program.
Note that there is no need to adjust other
process attributes, and they are preserved across
<em>exec()</em>.
In particular, the PID of the process is left unchanged.
</p>
<h3>See Also:</h3>
<ul>
<li><a href="https://man7.org/linux/man-pages/man1/pstree.1.html"><code>pstree(1)</code></a> - Display a tree of processes</li>
</ul>
</div>
<div id="file-descriptors" class="content">
<h2> File Descriptors</h2>
<p>
Within a process, each file that the process has open is associated
with a number— specifically, a nonnegative integer.
This number is called a <em>file descriptor</em>,
and is returned by the kernel in response to a request to open a file.
Thereafter, the process uses the number to refer to the
file whenever it wishes to act on it.
</p>
<p>
In the following line, a process is instructing the kernel to
open the file <em>rubber_ducky</em> for reading, and
is storing the kernel's response in the variable <em>fd</em>:
</p>
<pre>
int fd = open("rubber_ducky", O_RDONLY);
</pre>
<p>
If this call failed (which it would if <em>./rubber_ducky</em> did not
exist, for example), then <em>fd</em> would be assigned <em>-1,</em> an invalid file
descriptor.
We can therefore check with:
</p>
<pre>
if (fd < 0) {
perror("open"); // Print error message
exit(1); // Terminate process
}
</pre>
<p>
and then continue as though <em>open()</em> had succeeded.
</p>
<p>
In the following line, the kernel is instructed to read at-most
<em>buff_size</em> bytes from <em>fd</em>,
and place the data into the memory denoted by <em>buff:</em>
</p>
<pre>
int num_bytes_read = read(fd, buff, buff_size);
</pre>
<p>
If all goes well, the system call will return with a nonnegative value,
the number of bytes read;
if the kernel encountered an error, it will return with a
negative value.
In either case, this value will be assigned to <em>num_bytes_read</em>,
which may be checked similarly to <em>open().</em>
</p>
<p>
<em>write()</em> is handled analogously, as is closing a file.
Every system call has an entry in Section 2 of the manual.
</p>
<h3>See Also:</h3>
<ul>
<li><code><a href="https://man7.org/linux/man-pages/man2/read.2.html">read(2)</a></code> - Read from a file descriptor</li>
<li><a href="http://www.linfo.org/unix_philosophy.html">UNIX Philosophy Description</a> by The Linux Information Project (LINFO)</li>
</ul>
</div>
<div class="content" id="termination">
<h2> Process Termination </h2>
<p>
A process may terminate for a few reasons:
</p>
<ul class="big-list">
<li>
Explicitly terminates its own execution
<p>
This is the usual (and preferred) route of process
termination:
The program finishes its job or identifies an error,
and explicitly terminates itself.
This may be done by either calling <code>exit</code>, or by
returning from <em>main.</em>
</p>
</li>
<li>
It is told to terminate
<p>
A process may be told to terminate by another process through
a <em>signal,</em> a primitive form of interprocess communication.
This is abnormal process termination.
</p>
<p>
Signals are brutal business.
For the purposes of process termination,
<code>SIGINT</code> is preferred because it
affords the program the oportunity to exit
cleanly.<sup><a href="#references" title="Kerrisk, p389">[1]</a></sup>
At the command line, it is created and sent to
the foreground
process with CTRL+C.
</p>
<p>
Command-line tools for sending signals to arbitrary
processes are <code><a href="https://www.man7.org/linux/man-pages/man1/kill.1.html">kill</a></code>,
which refers to processes by PID, and
<code><a href="https://www.man7.org/linux/man-pages/man1/killall.1.html">killall</a></code>,
which refers to processes by program name.
</p>
</li>
<li>
It does something wrong
<p>
Almost always, this happens when the process tries
to read or write a memory location not assigned to it,
a condition known in Linux as <em>segmentation fault</em>.
The kernel steps in, halts program execution and immediately
terminates the process.
Again, abnormal program termination.
</p>
</li>
</ul>
<h3>See Also:</h3>
<ul>
<li><a href="https://www.man7.org/linux/man-pages/man1/top.1.html"><code>top(1)</code></a> - Display Linux processes</li>
<li><a href="./assets/check.c">check.c</a> - Double-check of the above snippets</li>
</ul>
</div>
<div class="content" id="what-is">
<h2>Ya But What Is It?</h2>
<p>
The short answer is, a section of memory distinct from the
rest of the machine, and some restricted access to the CPU, both
given by the kernel.
The long answer is the primary abstraction presented to programmers,
and the environment within which all programs execute.
The process is one of the most important ideas
of computer science, and we can only scratch the surface of
what it is here.
</p>
<h3> Memory </h3>
<p>
Modern operating systems
provide each process with its own memory in the
form of a distinct <em>address space.</em>
On all modern computers, memory is byte-addressable, so that each
address refers to exactly one byte.
Addresses themselves are simply numbers, and
an address space is a collection of memory addresses.
</p>
<div class="img-right" style="background: #f1f1f1;">
<a style="outline: solid black 2px; " title="Traced by User:Stannered, original by en:User:Dysprosia / BSD (http://opensource.org/licenses/bsd-license.php)" href="https://commons.wikimedia.org/wiki/File:Virtual_address_space_and_physical_address_space_relationship.svg"><img width="256" alt="Virtual address space and physical address space relationship" src="https://upload.wikimedia.org/wikipedia/commons/thumb/3/32/Virtual_address_space_and_physical_address_space_relationship.svg/512px-Virtual_address_space_and_physical_address_space_relationship.svg.png"></a>
</div>
<p>
For each machine, there exists a unique physical address space.
An address in this space refers to an actual byte of memory
(RAM) on the machine;
only the kernel has access to this address space.
The kernel, working in conjunction with hardware,
imposes a layer of indirection between each process's
memory references and the physical address space.
This layer is known as <em>virtual memory,</em>
and consists of a collection of virtual addresses
mapped to physical ones.
</p>
<p>
Virtual memory requires support from both the kernel and CPU:
The kernel to assign address translations
(mappings from virtual addresses to physical addresses),
and the CPU to carry them out.
Each time a process references a virtual address, the
CPU translates it to a physical address, so that for all
intents and purposes, virtual addresses behave exactly
as physical ones.
</p>
<p>
Since each process operates within its own virtual address space,
each process is presented with the illusion
that it has the entire physical address space to itself.
By excluding physical addresses from translation,
the rest of the machine may be implicitly protected:
Other process' memory simply does not exist within the context of the
currently running process, and therefore cannot be read from,
nor written to.
</p>
<h3> CPU </h3>
<p>
Modern CPU's support two distinct modes of operation.
<em>Kernel mode,</em> sometimes called "supervisor mode," is
the unrestricted mode.
Here, all CPU instructions are available, and address translation
is not performed, so that code running in this mode has
unmediated access to the machine.
</p>
<p>
<em>User mode</em> is a proper subset of kernel mode:
A program running
in user mode does not have access to all instructions, and
its memory references are redirected through a translation table.
Executing programs in this mode
is known as <em>limited direct execution.</em><sup><a href="http://pages.cs.wisc.edu/~remzi/OSTEP/cpu-mechanisms.pdf">[3]</a></sup>
It is <em>limited</em> in the sense that each process has access to
only a restricted subset of the CPU's instructions, and
it is <em>direct</em> in the sense that the program runs
directly on hardware.
Like virtual memory, it is intended to make safe and efficient use
of available hardware.
</p>
<p>
Virtual memory and limited direct execution together
form a sandboxed environment within which a program
may safely be executed.
</p>
<h3>See Also:</h3>
<ul>
<li><a href="https://pages.cs.wisc.edu/~remzi/OSTEP/">Operating Systems: Three Easy Pieces</a> by Remzi H. Arpaci-Dusseau and Andrea C. Arpaci-Dusseau </li>
</ul>
</div>
<div class="content" id="dr-fraser">
<h2 style="margin-bottom: 2pt;"> Dr. Brian Fraser: </h2>
<h4 style="margin-bottom: 18pt;"> Fork and Exec Linux Programming </h4>
<iframe class="video" src="https://www.youtube-nocookie.com/embed/l64ySYHmMmY" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>
</div>
<div id="references" class="content">
<h2> References </h2>
<ol>
<li>
Kerrisk, Michael. <em>The Linux Programming Interface</em>. San Francisco, CA, No Starch Press, 2010.
</li>
<li>
Bryant, R. E., & O'Hallaron, D. R. (2016). Processes. In <em>Computer Systems: A Programmer's Perspective</em> (3rd ed., p. 732). Upper Saddle River, New Jersey: Pearson.
</li>
<li>
init - Wikipedia (3 March 2021). Retrieved March 30, 2021, from <a href="https://en.wikipedia.org/wiki/Init">https://en.wikipedia.org/wiki/Init</a>
</li>
<li>exec. (2008). <a href="https://pubs.opengroup.org/onlinepubs/009695399/functions/exec.html">https://pubs.opengroup.org/onlinepubs/009695399/functions/exec.html</a>.</li>
<li>exec(3) - Linux manual page. (2021, March 22). <a href="https://man7.org/linux/man-pages/man3/exec.3.html">https://man7.org/linux/man-pages/man3/exec.3.html</a>.</li>
</ol>
</div>