source: branches/libc-0.6/doc/Fork.os2@ 3057

Last change on this file since 3057 was 1281, checked in by bird, 22 years ago

$Id: Fork.os2 1281 2004-03-06 21:48:26Z bird $

Fork Design Draft
--------------------

1.0 Intro
----------

blah.


1.1 The SuS fork() Description
------------------------------

NAME

    fork - create a new process

SYNOPSIS

    #include <unistd.h>

    pid_t fork(void);

DESCRIPTION

    The fork() function shall create a new process. The new process (child
    process) shall be an exact copy of the calling process (parent process)
    except as detailed below:

      * The child process shall have a unique process ID.
      * The child process ID also shall not match any active process
        group ID.
      * The child process shall have a different parent process ID,
        which shall be the process ID of the calling process.
      * The child process shall have its own copy of the parent's file
        descriptors. Each of the child's file descriptors shall refer
        to the same open file description with the corresponding file
        descriptor of the parent.
      * The child process shall have its own copy of the parent's open
        directory streams. Each open directory stream in the child process
        may share directory stream positioning with the corresponding
        directory stream of the parent.
      * [XSI] The child process shall have its own copy of the parent's
        message catalog descriptors.
      * The child process' values of tms_utime, tms_stime, tms_cutime, and
        tms_cstime shall be set to 0.
      * The time left until an alarm clock signal shall be reset to zero,
        and the alarm, if any, shall be canceled; see alarm().
      * [XSI] All semadj values shall be cleared.
      * File locks set by the parent process shall not be inherited by
        the child process.
      * The set of signals pending for the child process shall be
        initialized to the empty set.
      * [XSI] Interval timers shall be reset in the child process.
      * [SEM] Any semaphores that are open in the parent process shall
        also be open in the child process.
      * [ML] The child process shall not inherit any address space memory
        locks established by the parent process via calls to mlockall()
        or mlock().
      * [MF|SHM] Memory mappings created in the parent shall be retained
        in the child process. MAP_PRIVATE mappings inherited from the
        parent shall also be MAP_PRIVATE mappings in the child, and any
        modifications to the data in these mappings made by the parent
        prior to calling fork() shall be visible to the child. Any
        modifications to the data in MAP_PRIVATE mappings made by the
        parent after fork() returns shall be visible only to the parent.
        Modifications to the data in MAP_PRIVATE mappings made by the
        child shall be visible only to the child.
      * [PS] For the SCHED_FIFO and SCHED_RR scheduling policies, the
        child process shall inherit the policy and priority settings
        of the parent process during a fork() function. For other
        scheduling policies, the policy and priority settings on fork()
        are implementation-defined.
      * [TMR] Per-process timers created by the parent shall not be
        inherited by the child process.
      * [MSG] The child process shall have its own copy of the message
        queue descriptors of the parent. Each of the message descriptors
        of the child shall refer to the same open message queue
        description as the corresponding message descriptor of the parent.
      * [AIO] No asynchronous input or asynchronous output operations
        shall be inherited by the child process.
      * A process shall be created with a single thread. If a
        multi-threaded process calls fork(), the new process shall contain
        a replica of the calling thread and its entire address space,
        possibly including the states of mutexes and other resources.
        Consequently, to avoid errors, the child process may only execute
        async-signal-safe operations until such time as one of the exec
        functions is called. [THR] Fork handlers may be established by
        means of the pthread_atfork() function in order to maintain
        application invariants across fork() calls.

        When the application calls fork() from a signal handler and any of
        the fork handlers registered by pthread_atfork() calls a function
        that is not asynch-signal-safe, the behavior is undefined.
      * [TRC TRI] If the Trace option and the Trace Inherit option are
        both supported:
          If the calling process was being traced in a trace stream that
          had its inheritance policy set to POSIX_TRACE_INHERITED, the
          child process shall be traced into that trace stream, and the
          child process shall inherit the parent's mapping of trace event
          names to trace event type identifiers. If the trace stream in
          which the calling process was being traced had its inheritance
          policy set to POSIX_TRACE_CLOSE_FOR_CHILD, the child process
          shall not be traced into that trace stream. The inheritance
          policy is set by a call to the posix_trace_attr_setinherited()
          function.
      * [TRC] If the Trace option is supported, but the Trace Inherit
        option is not supported:
          The child process shall not be traced into any of the trace
          streams of its parent process.
      * [TRC] If the Trace option is supported, the child process of
        a trace controller process shall not control the trace streams
        controlled by its parent process.
      * [CPT] The initial value of the CPU-time clock of the child
        process shall be set to zero.
      * [TCT] The initial value of the CPU-time clock of the single
        thread of the child process shall be set to zero.

    All other process characteristics defined by IEEE Std 1003.1-2001 shall
    be the same in the parent and child processes. The inheritance of
    process characteristics not defined by IEEE Std 1003.1-2001 is
    unspecified by IEEE Std 1003.1-2001.

    After fork(), both the parent and the child processes shall be capable
    of executing independently before either one terminates.

RETURN VALUE

    Upon successful completion, fork() shall return 0 to the child process
    and shall return the process ID of the child process to the parent
    process. Both processes shall continue to execute from the fork()
    function. Otherwise, -1 shall be returned to the parent process, no
    child process shall be created, and errno shall be set to indicate
    the error.

ERRORS

    The fork() function shall fail if:

    [EAGAIN]
        The system lacked the necessary resources to create another
        process, or the system-imposed limit on the total number of
        processes under execution system-wide or by a single user
        {CHILD_MAX} would be exceeded.

    The fork() function may fail if:

    [ENOMEM]
        Insufficient storage space is available.

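Illustration (not part of the SuS text): a minimal sketch of how a caller
typically tells the three fork() outcomes apart, written against a POSIX
system. The helper name fork_and_reap() is made up for this example.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Forks a child that exits immediately with `code` and waits for it.
 * Returns the child's exit status, or -1 if fork() or waitpid() failed
 * (errno is then set by the failing call).
 */
static int fork_and_reap(int code)
{
    assert(code >= 0 && code <= 255);   /* only the low 8 bits reach the parent */

    pid_t pid = fork();
    if (pid == -1)
        return -1;                      /* failure: no child was created */

    if (pid == 0)
        _exit(code);                    /* child: fork() returned 0 */

    /* parent: fork() returned the process ID of the child */
    int status = 0;
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The child only calls _exit(), which keeps it within the async-signal-safe
operations the description above allows in a child of a multi-threaded parent.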



2.0 Requirements and Assumptions Of The Implementation
------------------------------------------------------

The Innotek LIBC fork() implementation will require the following features
in LIBC to work:
    1. A shared process management facility internal to LIBC for
       communicating to the child that a fork() is in progress.
    2. A very generalized and varied set of fork helper functions to achieve
       maximum flexibility of the implementation.
    3. Extended versions of some memory related OS/2 APIs must be implemented.

The implementation will further make the following assumptions about the
operation of OS/2:
    1. DosExecPgm will not return until all DLLs are initialized successfully.
    2. DosQueryMemState() is broken if more than one page is specified.
       (no idea why/how/where it's broken, but testcase shows it is :/ )

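Illustration for assumption 2: if only single-page queries can be trusted,
a range has to be walked one page at a time. The sketch below shows just
that walk, with the start rounded down and the end rounded up to page
boundaries. The callback type PFNPAGEISDIRTY, the helper names and the
FORK_PAGE_SIZE constant are made up here; a real implementation would
presumably wrap a one-page DosQueryMemState() call in such a callback.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FORK_PAGE_SIZE 0x1000u   /* 4 KB pages, as on x86 */

/* Hypothetical stand-in for a single-page memory state query:
 * returns non-zero if the page starting at uPage is dirty. */
typedef int (*PFNPAGEISDIRTY)(uintptr_t uPage, void *pvUser);

/*
 * Walks [uStart, uEnd) one page at a time, rounding uStart down and
 * uEnd up to page boundaries, and returns the number of dirty pages.
 */
static size_t CountDirtyPages(uintptr_t uStart, uintptr_t uEnd,
                              PFNPAGEISDIRTY pfnIsDirty, void *pvUser)
{
    assert(uStart <= uEnd);
    uintptr_t uPage = uStart & ~(uintptr_t)(FORK_PAGE_SIZE - 1);
    uintptr_t uLast = (uEnd + FORK_PAGE_SIZE - 1) & ~(uintptr_t)(FORK_PAGE_SIZE - 1);
    size_t    cDirty = 0;

    for (; uPage < uLast; uPage += FORK_PAGE_SIZE)
        if (pfnIsDirty(uPage, pvUser))      /* exactly one page per query */
            cDirty++;
    return cDirty;
}

/* Demo predicate for the sketch: pages with an even page number are "dirty". */
static int EvenPagesAreDirty(uintptr_t uPage, void *pvUser)
{
    (void)pvUser;
    return (uPage / FORK_PAGE_SIZE) % 2 == 0;
}
```

A copying variant would do the page copy where the sketch increments cDirty.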

3.0 The Shared Process Management
---------------------------------

The fork() implementation requires a method for telling the child process
that it's being forked and must take a very different startup route. Some
other LIBC APIs also need parent -> child and child -> parent information
exchange. More specifically: the inheritance of sockets, signals, the
different scheduler actions of a posix_spawn[p]() call, and possibly some
process group stuff related to posix_spawn too, if we get it figured out
eventually. All of this is parent -> child during spawn/fork. A need also
exists for child -> parent notification, and possibly information exchange,
at process termination. It might be necessary to reimplement the different
wait APIs and implement SIGCHLD; it's likely that those tasks will make
such demands.

The choice is now whether to make this shared process management specific
to each LIBC version as a shared segment, or to try to make it survive
normal LIBC updates. Making it specific has advantages in code size and
memory footprint (no reserved fields), however it has certain
disadvantages when LIBC is updated. The other option is to use a named
shared memory object, defining the content with reserved space for later
extensions so that several versions of LIBC with more or fewer features
implemented can co-use the memory space.

The latter option is preferred since it allows more applications to
interoperate, it causes less shared memory waste, the shared memory
can be located in high memory, and it would be possible to fork
processes using multiple versions of LIBC.

The shared memory shall be named \SHAREMEM\INNOTEKLIBC.V01, the version
number being that of the shared memory layout and contents; it will
only be increased when incompatible changes are made.

The shared memory shall be protected by a standard OS/2 mutex semaphore.
It shall not use any fast R3 semaphore since the usage frequency is
low and the result of a mess-up may be disastrous. Care must be taken to
avoid creation races and owner-died scenarios.

The memory shall have a fixed size, since adding segments is very hard.
Thus the size must be large enough to cope with a great deal of
processes, while bearing in mind that OS/2 normally doesn't support more
than 1000 processes, with a theoretical max of some 4000 (being the
max thread count). A very simplistic allocation scheme will be
implemented. Practically speaking, a fixed block size pool would do fine
for the process structure, while for the misc structures like socket
lists a linked list based heap would do fine.

The process blocks shall be rounded up in size, adding a reasonable
amount of space reserved for future extensions. Reserved space must be
all zeroed.

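Illustration: a minimal sketch of the kind of fixed block size pool
mentioned above. Everything in it (POOL, PoolInit, PoolAlloc, PoolFree and
the two size constants) is made up for the example, and the locking by the
mutex semaphore is left out. Free blocks are chained by index rather than
by pointer, and PoolAlloc() hands out zeroed blocks so that reserved space
starts out all zero.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define POOL_BLOCK_SIZE  256u   /* hypothetical: process block incl. reserved space */
#define POOL_BLOCK_COUNT 8u     /* hypothetical: kept tiny for the sketch */

/* A free block stores the index of the next free block in its first bytes. */
typedef struct POOL
{
    unsigned      iFirstFree;   /* POOL_BLOCK_COUNT means "no free block" */
    unsigned char abBlocks[POOL_BLOCK_COUNT][POOL_BLOCK_SIZE];
} POOL;

static void PoolInit(POOL *pPool)
{
    for (unsigned i = 0; i < POOL_BLOCK_COUNT; i++)
    {
        unsigned iNext = i + 1;             /* the last block links to "none" */
        memcpy(pPool->abBlocks[i], &iNext, sizeof(iNext));
    }
    pPool->iFirstFree = 0;
}

/* Returns a zeroed block, or NULL when the fixed-size pool is exhausted. */
static void *PoolAlloc(POOL *pPool)
{
    unsigned i = pPool->iFirstFree;
    if (i >= POOL_BLOCK_COUNT)
        return NULL;
    memcpy(&pPool->iFirstFree, pPool->abBlocks[i], sizeof(unsigned));
    memset(pPool->abBlocks[i], 0, POOL_BLOCK_SIZE);
    return pPool->abBlocks[i];
}

/* Puts a block obtained from PoolAlloc() back at the head of the free chain. */
static void PoolFree(POOL *pPool, void *pv)
{
    size_t off = (size_t)((unsigned char *)pv - &pPool->abBlocks[0][0]);
    assert(off % POOL_BLOCK_SIZE == 0 && off / POOL_BLOCK_SIZE < POOL_BLOCK_COUNT);
    unsigned i = (unsigned)(off / POOL_BLOCK_SIZE);
    memcpy(pPool->abBlocks[i], &pPool->iFirstFree, sizeof(unsigned));
    pPool->iFirstFree = i;
}
```
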
The fork() specific members of the process block shall be a pointer to
the shared memory object for the fork operation (the fork handle) and a
list of forkable modules. The fork handle will itself contain
information indicating whether or not another LIBC version has already
started fork() handling in the child. The presence of the fork handle
means that the child is being forked and that normal DLL init and startup
will not be executed; instead a registered callback will be called to do
the forking of each module. (More details in section 4.0.)

Before spawn, fork and exec (essentially before DosExecPgm or
DosStartSession) the parent shall create a process block for the child to
be born and link it into an embryo list in the shared memory block. The
child shall find its process block by searching the embryo list using the
parent pid as key. All DosExecPgm and DosStartSession calls shall be
serialized within one LIBC version. (If some empty headed programmer
manages to link together a program which may end up using two or more LIBC
versions and having two or more threads doing DosExecPgm at the very same
time, well then he really deserves whatever trouble he gets! At least
don't blame me!)

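Illustration: a minimal sketch of the embryo list handshake described
above. The PROCBLOCK layout and the EmbryoLink()/EmbryoClaim() names are
made up, the block is trimmed to the members the example needs, and the
plain pointers assume the shared memory sits at the same address in every
process.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily trimmed process block. */
typedef struct PROCBLOCK
{
    struct PROCBLOCK *pNext;        /* next embryo in the list */
    int               pidParent;    /* pid of the process giving birth */
    int               pid;          /* 0 until the child claims the block */
    void             *pvForkHandle; /* non-NULL means "you are being forked" */
} PROCBLOCK;

/* Parent side: link a new embryo in before DosExecPgm/DosStartSession. */
static void EmbryoLink(PROCBLOCK **ppHead, PROCBLOCK *pBlock, int pidParent)
{
    assert(ppHead != NULL && pBlock != NULL);
    pBlock->pidParent = pidParent;
    pBlock->pid       = 0;
    pBlock->pNext     = *ppHead;
    *ppHead           = pBlock;
}

/* Child side: find the embryo keyed by the parent pid, unlink it from the
 * embryo list and claim it. Returns NULL if the parent left no embryo. */
static PROCBLOCK *EmbryoClaim(PROCBLOCK **ppHead, int pidParent, int pidSelf)
{
    for (PROCBLOCK **pp = ppHead; *pp; pp = &(*pp)->pNext)
    {
        PROCBLOCK *pCur = *pp;
        if (pCur->pidParent == pidParent)
        {
            *pp         = pCur->pNext;
            pCur->pNext = NULL;
            pCur->pid   = pidSelf;
            return pCur;
        }
    }
    return NULL;
}
```

With the parent pid as the only key, two embryos from the same parent
cannot be told apart, which is presumably why the DosExecPgm and
DosStartSession calls have to be serialized.
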
Process blocks shall have to stay around after the process has terminated
(for child -> parent termination exchange); a cleanup mechanism will be
invoked whenever a free memory threshold is reached. All processes will
register exit list handlers to mark the process block as zombie (and later
perhaps set error codes and notify waiters/child-listeners).



4.0 The fork() Implementation
-----------------------------


The implementation is based on a fork handle and a set of primitives.
The fork handle is a pointer to a shared memory object allocated for the
occasion and which will be freed before fork() returns. The primitives
all operate on this handle and will be provided using a callback table
in order to fully support multiple LIBC versions.


4.1 Forkable Executable and DLLs
--------------------------------

The support for fork() is an optional feature of LIBC. The default
executable produced with LIBC and GCC is not forkable. The fork
support will be based on registration of the DLLs and EXEs in their
LIBC supplied startup code (crt0/dll0). A set of fork versions of these
modules exists with the suffix 'fork.o'.

The big difference between the ordinary crt0/dll0 and the forkable
crt0/dll0 is a per module structure, a call to register this, and the
handling of the return code of that call.

The fork module structure:
    typedef struct __libc_ForkModule
    {
        /** Structure version. (Initially 'FMO1' as viewed in hex editor.) */
        unsigned int    iMagic;
        /** Fork callback function */
        int           (*pfnAtFork)(struct __libc_ForkModule *pModule,
                                   __LIBC_FORKHANDLE *pForkHandle,
                                   enum __LIBC_CALLBACKOPERATION enmOperation);
        /** Pointer to the _CRT_FORK_PARENT1 set vector.
         * It's formatted as {priority,callback}. */
        void           *pvParentVector1;
        /** Pointer to the _CRT_FORK_CHILD1 set vector.
         * It's formatted as {priority,callback}. */
        void           *pvChildVector1;
        /** Data segment base address. */
        void           *pvDataSegBase;
        /** Data segment end address (exclusive). */
        void           *pvDataSegEnd;
        /** Reserved - must be zero. */
        int             iReserved1;
    } __LIBC_FORKMODULE, *__LIBC_PFORKMODULE; /* urg! conventions */


The fork callback function which crt0/dll0 references when initializing
the fork module structure is called _atfork_callback. It takes the module
structure, the fork handle, and an operation enum as arguments. LIBC will
contain a default implementation of _atfork_callback() which simply
duplicates the data segment and processes the two set vectors
(_CRT_FORK_*1).

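Illustration: a minimal sketch of what "processing a set vector" could look
like. The draft only says the vectors are formatted as {priority,callback},
so the entry layout, the callback signature, the explicit entry count and
the ascending priority order are all assumptions of this example, as are
the names FORKVECENTRY and ProcessForkVector().

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical layout of one {priority,callback} set vector entry. */
typedef struct FORKVECENTRY
{
    int   iPriority;
    int (*pfnCallback)(void *pvForkHandle);   /* returns 0 on success */
} FORKVECENTRY;

static int CompareByPriority(const void *pv1, const void *pv2)
{
    int i1 = ((const FORKVECENTRY *)pv1)->iPriority;
    int i2 = ((const FORKVECENTRY *)pv2)->iPriority;
    return (i1 > i2) - (i1 < i2);
}

/*
 * Sorts the entries by ascending priority (an assumed ordering) and calls
 * them in turn, stopping at the first callback that reports failure.
 * Returns 0 on success or the failing callback's return code.
 */
static int ProcessForkVector(FORKVECENTRY *paEntries, size_t cEntries,
                             void *pvForkHandle)
{
    assert(paEntries != NULL || cEntries == 0);
    if (cEntries)
        qsort(paEntries, cEntries, sizeof(paEntries[0]), CompareByPriority);
    for (size_t i = 0; i < cEntries; i++)
    {
        int rc = paEntries[i].pfnCallback(pvForkHandle);
        if (rc)
            return rc;
    }
    return 0;
}

/* Demo callbacks: each appends its digit to the int pvForkHandle points to. */
static int DemoFirst(void *pv)  { *(int *)pv = *(int *)pv * 10 + 1; return 0; }
static int DemoSecond(void *pv) { *(int *)pv = *(int *)pv * 10 + 2; return 0; }
static int DemoFail(void *pv)   { (void)pv; return -1; }
```

Note that qsort() is not stable, so entries sharing a priority may run in
any order in this sketch.
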
crt0/dll0 will register the fork module structure and detect a forked
child by calling __libc_ForkRegisterModule().

Prototypes:
    /**
     * Register a forkable module. Called by crt0 and dll0.
     *
     * The call links pModule into the list of forkable modules
     * which is maintained in the process block.
     *
     * @returns 0 on normal process startup.
     * @returns 1 on forked child process startup.
     *          The caller should respond by not calling any _DLL_InitTerm
     *          or similar constructs. _atfork_callback() will take
     *          care of necessary module initialization.
     * @returns negative on failure.
     *          The caller should return from the dll init returning FALSE
     *          or DosExit in case of crt0.
     * @param   pModule     Pointer to the fork module structure for the
     *                      module which is to be registered.
     */
    int __libc_ForkRegisterModule(__LIBC_FORKMODULE *pModule);





4.2 Fork Primitives
-------------------

These primitives are provided by the fork implementation in the fork
handle structure. We define a set of these primitives now; if new ones
are added later, the users of these must check that they are
actually present.

Example:
    rc = pForkHandle->pOps->pfnDuplicatePages(pForkHandle,
                                              pModule->pvDataSegBase,
                                              pModule->pvDataSegEnd,
                                              __LIBC_FORK_ONLY_DIRTY);
    if (rc)
        return rc; /* failure */

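Illustration: one possible way for a caller to check that a later-added
primitive is actually present. The draft does not say how presence is
detected, so the cbSize member, the FORKOPS_HAS() macro and the trimmed
FORKOPS/FORKHANDLE types are assumptions of this sketch; pfnFlush is used
as the "added later" primitive and its signature is guessed.

```c
#include <assert.h>
#include <stddef.h>

typedef struct FORKHANDLE FORKHANDLE;

/* Hypothetical callback table. cbSize is an assumed convention that lets a
 * newer caller see how much of the table an older provider filled in. */
typedef struct FORKOPS
{
    size_t cbSize;
    int  (*pfnDuplicatePages)(FORKHANDLE *pForkHandle, void *pvStart,
                              void *pvEnd, unsigned fFlags);
    int  (*pfnFlush)(FORKHANDLE *pForkHandle);   /* imagine: added later */
} FORKOPS;

struct FORKHANDLE
{
    const FORKOPS *pOps;
};

/* True if Member lies within the part of the table the provider filled in
 * and the function pointer stored there is non-NULL. The size check comes
 * first so that a shorter table is never read past its end. */
#define FORKOPS_HAS(pOps, Member) \
    (   (pOps)->cbSize >= offsetof(FORKOPS, Member) + sizeof((pOps)->Member) \
     && (pOps)->Member != NULL)

/* Flushes if the provider supports it; returns -1 if the primitive is absent. */
static int FlushIfPresent(FORKHANDLE *pForkHandle)
{
    assert(pForkHandle != NULL && pForkHandle->pOps != NULL);
    if (!FORKOPS_HAS(pForkHandle->pOps, pfnFlush))
        return -1;
    return pForkHandle->pOps->pfnFlush(pForkHandle);
}

/* Demo provider implementation that always succeeds. */
static int DemoFlush(FORKHANDLE *pForkHandle) { (void)pForkHandle; return 0; }
```
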
Prototypes:
    /**
     * Duplicates a number of pages from pvStart to pvEnd.
     * @returns 0 on success.
     * @returns appropriate non-zero error code on failure.
     * @param   pForkHandle Handle of the current fork operation.
     * @param   pvStart     Pointer to start of the pages. Rounded down.
     * @param   pvEnd       Pointer to end of the pages. Rounded up.
     * @param   fFlags      __LIBC_FORK_ONLY_DIRTY means checking whether the
     *                      pages are actually dirty before bothering touching
     *                      and copying them. (Using the partially broken
     *                      DosQueryMemState() API.)
     *                      __LIBC_FORK_ALL means not to bother checking, but
     *                      just go ahead copying all the pages.
     */
    int pfnDuplicatePages(__LIBC_FORKHANDLE *pForkHandle, void *pvStart, void *pvEnd, unsigned fFlags);

    /**
     * Invokes a function in the child process, giving it a chunk of input.
     * The function is invoked the next time the fork buffer is flushed;
     * call pfnFlush() if the return code is desired.
     *
     * @returns 0 on success.
     * @returns appropriate non-zero error code on failure.
     * @param   pForkHandle Handle of the current fork operation.
     * @param   pfn         Pointer to the function to invoke in the child.
     *                      The function gets the fork handle, a pointer to
     *                      the argument memory chunk and the size of that.
     *                      The function must return 0 on success, and non-zero
     *                      on failure.
     * @param   pvArg       Pointer to a block of memory of size cbArg containing
     *                      input to be copied to the child and given to pfn upon
     *                      invocation.
     * @param   cbArg       Size of the memory block pvArg points to, in bytes.
     */
    int pfnInvoke(__LIBC_FORKHANDLE *pForkHandle,
                  int (*pfn)(__LIBC_FORKHANDLE *pForkHandle, void *pvArg, size_t cbArg),
                  void *pvArg, size_t cbArg);

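Illustration: a single-process sketch of the deferred invocation idea
behind pfnInvoke(). Calls are queued together with a private copy of their
argument block and only run when the buffer is flushed, which is why the
return code of the invoked function is not available until then. FORKBUF,
BufInvoke(), BufFlush() and the two limits are made up for the example; in
the real thing the queued data would travel to the child rather than being
run locally.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct FORKBUF FORKBUF;
typedef int FNINVOKE(FORKBUF *pBuf, void *pvArg, size_t cbArg);

#define FORKBUF_MAX_ENTRIES 4u    /* hypothetical limits, tiny on purpose */
#define FORKBUF_MAX_ARG     32u

/* Stand-in for the fork buffer: queued calls with copies of their arguments. */
struct FORKBUF
{
    size_t cEntries;
    struct
    {
        FNINVOKE     *pfn;
        size_t        cbArg;
        unsigned char abArg[FORKBUF_MAX_ARG];
    } aEntries[FORKBUF_MAX_ENTRIES];
};

/* Runs the queued calls in order, stopping at the first failure. Returns
 * that call's return code, or 0. The buffer is empty afterwards. */
static int BufFlush(FORKBUF *pBuf)
{
    int rc = 0;
    for (size_t i = 0; i < pBuf->cEntries && rc == 0; i++)
        rc = pBuf->aEntries[i].pfn(pBuf, pBuf->aEntries[i].abArg,
                                   pBuf->aEntries[i].cbArg);
    pBuf->cEntries = 0;
    return rc;
}

/* Queues pfn with a copy of pvArg; flushes first when the buffer is full. */
static int BufInvoke(FORKBUF *pBuf, FNINVOKE *pfn, const void *pvArg, size_t cbArg)
{
    assert(pBuf != NULL && pfn != NULL);
    if (cbArg > FORKBUF_MAX_ARG)
        return -1;
    if (pBuf->cEntries == FORKBUF_MAX_ENTRIES)
    {
        int rc = BufFlush(pBuf);
        if (rc)
            return rc;
    }
    pBuf->aEntries[pBuf->cEntries].pfn   = pfn;
    pBuf->aEntries[pBuf->cEntries].cbArg = cbArg;
    if (cbArg)
        memcpy(pBuf->aEntries[pBuf->cEntries].abArg, pvArg, cbArg);
    pBuf->cEntries++;
    return 0;
}

/* Demo target: adds the int argument to a global counter. */
static int g_iDemoSum;
static int DemoAdd(FORKBUF *pBuf, void *pvArg, size_t cbArg)
{
    int i;
    (void)pBuf;
    if (cbArg != sizeof(i))
        return -1;
    memcpy(&i, pvArg, sizeof(i));
    g_iDemoSum += i;
    return 0;
}
```

Because the argument block is copied when the call is queued, the caller
may reuse or change its own copy before the flush happens.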