7 System calls
Processes and Threads
In UNIX, a process consists of an address space with shared program text, private data, and a private stack, a current program counter, and various system resources such as file connections.
A thread consists only of a program counter and a private stack. Several threads share the same address space and the same resources, and the operating system normally sees to it that threads and processes are scheduled more or less in parallel for more or less equal time slices.
Threads, therefore, should be less expensive than processes. If threads in the same address space take turns, the operating system only needs to replace the program counter and stack pointer but not the address space map. However, threads can interfere with each other when accessing global variables in the same address space. If threads do not access resources that are known to several threads at once, there is hardly a difference between threads and processes.
Limbo knows only threads -- called processes in Inferno. A thread is created with a spawn statement, and the procedure call given in the statement is used as the new program counter.
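As a minimal sketch of the spawn statement, the following module starts two threads one after the other and lets both update a global variable. The module name SpawnDemo, the procedure worker, and the use of a channel to wait for completion are illustrative assumptions, not part of the original text.

```limbo
implement SpawnDemo;	# hypothetical module name

include "sys.m";
	sys: Sys;
include "draw.m";

SpawnDemo: module
{
	init: fn(ctxt: ref Draw->Context, argv: list of string);
};

total: int;	# global: shared by all threads running in this module

init(nil: ref Draw->Context, nil: list of string)
{
	sys = load Sys Sys->PATH;
	total = 0;
	done := chan of int;	# assumption: a channel is used to wait for each thread

	# each spawn starts a new thread at the given procedure call
	spawn worker(1, done);
	<-done;			# block until worker 1 has reported
	spawn worker(2, done);
	<-done;			# block until worker 2 has reported

	# both workers changed the same variable
	sys->print("total = %d\n", total);
}

worker(n: int, done: chan of int)
{
	total += n;
	sys->print("worker %d has process id %d\n", n, sys->pctl(0, nil));
	done <-= n;		# blocks until init receives
}
```

Because the second worker is spawned only after the first has reported, the two updates cannot interleave. Spawning both before receiving would make the threads compete for the global.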
Limbo threads are a more primitive concept than UNIX processes -- and because of modules and pctl(), processes can be fabricated from threads: modules segment the address space, are managed dynamically, and cannot be discovered; pctl() manages the system resources:
| FORKFD, NEWFD | shares (neither flag), splits (FORKFD), or clears (NEWFD) the file descriptor space; an explicit list of descriptors can be passed |
| FORKNS, NEWNS | shares (neither flag), splits (FORKNS), or clears (NEWNS) the name space |
| NEWPGRP | starts a new process group, which can be eliminated as a whole with killgrp |
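The flags in the table might be combined as in the following sketch, in which a spawned thread separates itself from its creator. The module name PctlDemo, the procedure child, and the choice to keep descriptors 0, 1, and 2 are assumptions made for illustration.

```limbo
implement PctlDemo;	# hypothetical module name

include "sys.m";
	sys: Sys;
include "draw.m";

PctlDemo: module
{
	init: fn(ctxt: ref Draw->Context, argv: list of string);
};

init(nil: ref Draw->Context, nil: list of string)
{
	sys = load Sys Sys->PATH;
	done := chan of int;
	spawn child(done);
	<-done;		# wait until the child has finished its pctl calls
	sys->print("creator %d keeps its own resources\n", sys->pctl(0, nil));
}

child(done: chan of int)
{
	# new process group, split name space, split file descriptor space;
	# pctl returns the caller's process id
	pid := sys->pctl(Sys->NEWPGRP|Sys->FORKNS|Sys->FORKFD, nil);
	sys->print("thread %d: own group, name space, and descriptors\n", pid);

	# clear the descriptor space, retaining only the listed integers
	sys->pctl(Sys->NEWFD, 0::1::2::nil);
	sys->print("thread %d: only descriptors 0, 1, 2 remain\n", pid);
	done <-= 1;
}
```

The calls are made inside the spawned thread because pctl() always acts on the thread that calls it.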
shell takes its first argument as a command name, executes it with the remaining arguments as a separate thread, and waits for completion.
{07/shell.b}
implement Command;	# naive command execution

include "sys.m"; include "draw.m"; include "sh.m";

init (ctxt: ref Draw->Context, argv: list of string) {
	if (argv != nil && tl argv != nil) {
		sys := load Sys Sys->PATH; if (sys == nil) exit;
		stderr := sys->fildes(2);

		# first argument is the command name, the rest are its arguments
		cmd := hd tl argv; args := tl tl argv;
		cmdMod := load Command cmd+".dis";
		if (cmdMod == nil) {
			sys->fprint(stderr, "%s.dis: not found\n", cmd);
			exit;
		}

		# open our own wait file before spawning, so that the
		# termination of the new thread is reported there
		wait := "/prog/"+string sys->pctl(0, nil)+"/wait";
		waitFd := sys->open(wait, sys->OREAD);
		if (waitFd == nil) {
			sys->fprint(stderr, "%s: not found [%r]\n", wait);
			exit;
		}

		# run the command as a separate thread and block until it terminates
		spawn cmdMod->init(ctxt, cmd::args);
		buf := array [128] of byte;
		n := sys->read(waitFd, buf, len buf);
		if (n < 0) n = 0;	# a failed read must not produce a negative slice bound
		sys->fprint(stderr, "done: %s\n", string buf[0:n]);
	}
}
{}
sh.m declares Command, i.e., init(). All commands called by the Inferno shell must implement this module.
spawn creates the thread and executes the specified procedure call in it. The thread terminates when the procedure call returns or when it executes exit.
The thread has a process description in #p, and the owner of a thread can write kill into its ctl file to terminate the thread. This cannot be intercepted.
pctl(0, nil) produces a thread's own process id. If wait is opened before threads are created, reading it blocks until a thread created after the open terminates. read() then reports the process id, module name, and error message, if any.
perky$ bind '#p' /prog
perky$ shell /dis/echo hello, world
args: hello, world
done: 15 "Echo":
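Terminating a thread through its ctl file, as described above, could look like the following sketch. The module name KillDemo and the procedures spinner and kill are hypothetical. The path uses #p directly so that it does not depend on a bind to /prog.

```limbo
implement KillDemo;	# hypothetical module name

include "sys.m";
	sys: Sys;
include "draw.m";

KillDemo: module
{
	init: fn(ctxt: ref Draw->Context, argv: list of string);
};

init(nil: ref Draw->Context, nil: list of string)
{
	sys = load Sys Sys->PATH;
	pidc := chan of int;
	spawn spinner(pidc);
	pid := <-pidc;		# learn the process id of the new thread
	if (kill(pid) < 0)
		sys->print("cannot kill %d: %r\n", pid);
	else
		sys->print("killed %d\n", pid);
}

# announce our process id, then sleep forever
spinner(pidc: chan of int)
{
	pidc <-= sys->pctl(0, nil);
	for (;;)
		sys->sleep(1000);
}

# write kill into the ctl file of the given process
kill(pid: int): int
{
	fd := sys->open("#p/" + string pid + "/ctl", Sys->OWRITE);
	if (fd == nil)
		return -1;
	if (sys->fprint(fd, "kill") < 0)
		return -1;
	return 0;
}
```

The spawned thread has no way to notice or refuse the request; it simply stops.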
lab is a simple interactive language to experiment with system calls and security in Inferno. A thread number must be specified before each statement, and each thread owns one file descriptor that is passed implicitly between system calls (statements) such as open and read executed by the same thread.
bind source target [after|before|replace] [create]
chdir path
create path [mode chdir permission]
dirread n-entries
dup old new
exit
export [wait]
fildes num
mount target [after|before|replace] [create]
open path [read|write|rdwr rclose trunc]
pctl forkfd|newns|forkns|newpgrp
pctl newfd [num ...]
pipe
read n-bytes
remove path
seek offset [start|rela|end]
spawn new-task
stat path
unmount [name] target
write [text ...]
wstat path name|permission
Initially there is thread 0. spawn creates a new thread for an unused number, exit terminates the calling thread:
perky$ lab
[5] started
0 spawn 1
[5] spawned [6]
1 exit
[6] exit
0 exit
dup() and pctl() refer to small integers to represent file connections. create() and open() create files and directories and produce connections as ref FD which encapsulate the integers and must be specified for the transfer operations dirread(), read(), and write(), and positioning through seek(). An integer can be converted to a ref FD using fildes().
Each thread in lab hides one ref FD which receives the result of operations like fildes and is used implicitly for operations like read or seek.
cindy$ lab
[7] started
0 spawn 1
[7] spawned [8]
0 open /dev/cons
[7] fd 7: /dev/cons
1 fildes 7
[8] fildes 7
1 seek 10
[8] at 10
0 seek 0 rela
[7] at 10
Threads share their file descriptor numbers and consequently the positions.
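The shared position can also be observed from Limbo directly. The sketch below opens a file, obtains a second ref FD from the integer with fildes(), seeks through one, and asks for the position through the other. The module name FdDemo and the file argument are assumptions, and the code assumes the seek() signature with big offsets found in current Inferno releases.

```limbo
implement FdDemo;	# hypothetical module name

include "sys.m";
	sys: Sys;
include "draw.m";

FdDemo: module
{
	init: fn(ctxt: ref Draw->Context, argv: list of string);
};

init(nil: ref Draw->Context, argv: list of string)
{
	sys = load Sys Sys->PATH;
	if (argv == nil || tl argv == nil) {
		sys->print("usage: fddemo file\n");
		return;
	}

	a := sys->open(hd tl argv, Sys->OREAD);
	if (a == nil) {
		sys->print("open %s: %r\n", hd tl argv);
		return;
	}

	# convert the integer inside the ref FD back into another ref FD
	b := sys->fildes(a.fd);
	if (b == nil) {
		sys->print("fildes %d: %r\n", a.fd);
		return;
	}

	# position through b, then ask for the position through a;
	# a relative seek by 0 only reports where the connection stands
	sys->seek(b, big 10, Sys->SEEKSTART);
	pos := sys->seek(a, big 0, Sys->SEEKRELA);
	sys->print("position seen through the first ref FD: %bd\n", pos);
}
```

If both values refer to the same file connection, the reported position should be the one set through the other ref FD, as in the lab session.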
0 dup 7 8
[7] dup to fd 8
1 pctl forkfd
[8] pctl forkfd
0 open /dev/null
[7] fd 7: /dev/null
1 fildes 7
[8] fildes 7
1 read 10
hello
[8] hello
forkfd splits the file descriptor spaces. Thread 1 has 7 and 8 connected to the console, whereas thread 0 now has 7 connected to /dev/null and 8 still connected to the console.
1 seek 20
[8] at 20
1 fildes 8
[8] fildes 8
1 seek 0 rela
[8] at 20
1 pctl newfd 1
[8] pctl newfd 1
1 fildes 0
[8] fildes 0: not open
1 fildes 8
[8] fildes 8: not open
newfd clears the file descriptor space; however, a list of individual integers such as 1 for standard output can be retained.
0 fildes 8
[7] fildes 8: not open
This looks like a bug -- thread 0 used dup but file descriptor 8 got lost in the shuffle. There is no close operation; a file connection is released when it is no longer referenced -- but by what: a ref FD or just an integer? How do you not reference an integer? This is tricky for networking connections.
System calls like chdir, create, or open use paths that refer to a hierarchical namespace. [Nonessential replies have been removed below.]
perky$ lab
0 create /tmp/a chdir
0 create /tmp/b chdir
0 create /tmp/a/aa
0 create /tmp/b/bb
This creates two directories and a file in each.
The namespace belongs to a thread and can be shared with others.
0 bind /tmp/b /tmp/b create
0 bind /tmp/a /tmp/b before
0 create /tmp/b/cc
0 open /tmp/b
0 dirread 10
[7] aa ...
0 dirread 10
[7] bb ...
[7] cc ...
0 dirread 10
[7] EOF
The first bind turns /tmp/b into a union directory in which files can be created. The second one joins /tmp/a in front of it, but without permitting creation there. A new file is therefore created in the original /tmp/b.
0 unmount /tmp/b
0 open /tmp/b
0 dirread 10
[7] bb ...
[7] cc ...
0 dirread 10
[7] EOF
unmount removes one or all of the union elements that bind and mount have attached to a target.
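The same sequence of calls can be sketched in Limbo. The module name BindDemo is hypothetical, /tmp/a and /tmp/b are assumed to exist already, and it is an assumption that lab's create option corresponds to MREPL|MCREATE. The sketch first splits off its own name space so that the caller's view is not changed.

```limbo
implement BindDemo;	# hypothetical module name

include "sys.m";
	sys: Sys;
include "draw.m";

BindDemo: module
{
	init: fn(ctxt: ref Draw->Context, argv: list of string);
};

init(nil: ref Draw->Context, nil: list of string)
{
	sys = load Sys Sys->PATH;

	# work in a private copy of the name space
	sys->pctl(Sys->FORKNS, nil);

	# make /tmp/b a union directory that permits creation
	if (sys->bind("/tmp/b", "/tmp/b", Sys->MREPL|Sys->MCREATE) < 0) {
		sys->print("bind /tmp/b: %r\n");
		return;
	}

	# join /tmp/a in front of it, without permitting creation there
	if (sys->bind("/tmp/a", "/tmp/b", Sys->MBEFORE) < 0) {
		sys->print("bind /tmp/a: %r\n");
		return;
	}

	# the new file should end up in the original /tmp/b
	fd := sys->create("/tmp/b/cc", Sys->OWRITE, 8r666);
	if (fd == nil)
		sys->print("create /tmp/b/cc: %r\n");
	fd = nil;	# drop the reference to release the connection

	# a nil name removes everything joined at /tmp/b
	if (sys->unmount(nil, "/tmp/b") < 0)
		sys->print("unmount /tmp/b: %r\n");
}
```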
perky$ pwd; lab
/usr/ats/07
[9] started
0 chdir /
[9] chdir to /
0 exit
perky$ pwd
/
forkns splits the thread's own namespace off its creator's, newns clears it -- theoretically this offers complete security, but there still are the devices:
perky$ lab
0 spawn 1
1 pctl forkns
1 chdir /tmp
[16] chdir to /tmp
0 chdir .
[15] chdir to /usr/axel
1 stat /dev/cons
[16] cons ...
1 pctl newns
1 stat /dev/cons
[16] stat /dev/cons: does not exist
1 stat #c/cons
[16] cons ...
chdir after forkns no longer affects the creator. Using #c the console can still be reached after newns. File descriptors are not affected by newns -- the output is still produced.
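The effect of newns on path lookup can be reproduced with a few calls, as in this sketch. The module name NsDemo and the procedures child and report are hypothetical. Whether /dev/cons is still found after newns depends on what the new name space contains, so the output is not predicted here.

```limbo
implement NsDemo;	# hypothetical module name

include "sys.m";
	sys: Sys;
include "draw.m";

NsDemo: module
{
	init: fn(ctxt: ref Draw->Context, argv: list of string);
};

init(nil: ref Draw->Context, nil: list of string)
{
	sys = load Sys Sys->PATH;
	done := chan of int;
	spawn child(done);
	<-done;
	# the creator's name space was not touched
	report("/dev/cons");
}

child(done: chan of int)
{
	# clear the name space of this thread only
	sys->pctl(Sys->NEWNS, nil);
	report("/dev/cons");	# looked up through the cleared name space
	report("#c/cons");	# the device itself is named directly
	done <-= 1;
}

# print whether a path can be reached with stat
report(path: string)
{
	(ok, nil) := sys->stat(path);
	if (ok < 0)
		sys->print("stat %s: %r\n", path);
	else
		sys->print("stat %s: ok\n", path);
}
```

The print calls in child still work after newns because the file descriptors opened earlier are not affected.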
Initially, each thread belongs to the same process group as its creator. If killgrp is written into #p/<pid>/ctl, all threads in the group are terminated.
perky$ lab
0 spawn 1
1 bind #p /prog
1 open /prog/4/ctl write
1 write killgrp
[4] write:
Thread 1 is in the same group as the boot shell. This operation eliminates practically everything except thread 1: killgrp does not eliminate the originator.
perky$ lab
0 pctl newpgrp
0 spawn 1
1 bind #p /prog
1 open /prog/4/ctl write
1 write killgrp
3 "Lab":killed
$ [4] write:
permission denied
newpgrp starts a new process group, so the boot shell is saved.
Typically, several threads cooperate, and they are put into a process group so that an application can be terminated without precise knowledge of its constituent threads.
Unlike /dis/sh.dis, a shell should probably put each command into a process group and split namespace and file descriptor space. (Consequences?)