Last change on this file since 5 was 2, checked in by Dmitry A. Kuminov, 16 years ago

Initially imported qt-all-opensource-src-4.5.1 from Trolltech.

ptmalloc3 - a multi-thread malloc implementation
================================================

Wolfram Gloger ([email protected])

Jan 2006


Thanks
======

This release was partly funded by Pixar Animation Studios. I would
like to thank David Baraff of Pixar for his support and Doug Lea
([email protected]) for the great original malloc implementation.


Introduction
============

This package is a modified version of Doug Lea's malloc-2.8.3
implementation (available separately from ftp://g.oswego.edu/pub/misc)
that I adapted for multiple threads, while trying to avoid lock
contention as much as possible.

As part of the GNU C library, the source files may be available under
the GNU Library General Public License (see the comments in the
files). But as part of this stand-alone package, the code is also
available under the (probably less restrictive) conditions described
in the file 'COPYRIGHT'. In any case, there is no warranty whatsoever
for this package.

The current distribution should be available from:

http://www.malloc.de/malloc/ptmalloc3.tar.gz


Compilation
===========

It should be possible to build ptmalloc3 on any UN*X-like system that
implements the sbrk(), mmap(), munmap() and mprotect() calls. Since
there are now several source files, a library (libptmalloc3.a) is
generated. See the Makefile for examples of the compile-time options.

Note that support for non-ANSI compilers is no longer there.

Several example targets are provided in the Makefile:

 o Posix threads (pthreads), compile with "make posix"

 o Posix threads with explicit initialization, compile with
   "make posix-explicit" (known to be required on HPUX)

 o Posix threads without "tsd data hack" (see below), compile with
   "make posix-with-tsd"

 o Solaris threads, compile with "make solaris"

 o SGI sproc() threads, compile with "make sproc"

 o no threads, compile with "make nothreads" (currently out of order?)

For Linux:

 o make "linux-pthread" (almost the same as "make posix") or
   make "linux-shared"

Note that some compilers need special flags for multi-threaded code,
e.g. with Solaris cc with Posix threads, one should use:

% make posix SYS_FLAGS='-mt'

Some additional targets, ending in `-libc', are also provided in the
Makefile, to compare performance of the test programs to the case when
linking with the standard malloc implementation in libc.

A potential problem remains: If any of the system-specific functions
for getting/setting thread-specific data or for locking a mutex call
one of the malloc-related functions internally, the implementation
cannot work at all due to infinite recursion. One example seems to be
Solaris 2.4. I would like to hear if this problem occurs on other
systems, and whether similar workarounds could be applied.

For Posix threads, too, an optional hack like that has been integrated
(activated when defining USE_TSD_DATA_HACK) which depends on
`pthread_t' being convertible to an integral type (which is of course
not generally guaranteed). USE_TSD_DATA_HACK is now the default
because I haven't yet found a non-glibc pthreads system where this
hack is _not_ needed.
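
As a rough illustration of what such a hack relies on, the sketch
below keeps per-thread data in a small global table indexed by the
thread id cast to an integer, presumably so that the system's own
thread-specific data functions need not be called. The names
TSD_SLOTS, tsd_table, tsd_slot(), tsd_set() and tsd_get() are
hypothetical and not taken from the ptmalloc3 sources. The cast of
`pthread_t' to an integer is exactly the non-portable assumption
mentioned above, and two threads whose ids collide would share a slot.

```c
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

#define TSD_SLOTS 256   /* hypothetical table size */

/* One data pointer per slot; zero-initialized, so no malloc is needed. */
static void *tsd_table[TSD_SLOTS];

/* Map the calling thread to a slot.
   Assumption: pthread_t can be converted to an integer type. */
static size_t tsd_slot(void)
{
    return (size_t)((uintptr_t)pthread_self() % TSD_SLOTS);
}

/* Store a pointer for the calling thread. */
static void tsd_set(void *data)
{
    tsd_table[tsd_slot()] = data;
}

/* Fetch the pointer previously stored by the calling thread. */
static void *tsd_get(void)
{
    return tsd_table[tsd_slot()];
}
```

On a platform where `pthread_t' is a structure, the cast does not
compile, which is why the hack cannot be enabled everywhere.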

*NEW* and _important_: In (currently) one place in the ptmalloc3
source, a write memory barrier is needed, named
atomic_write_barrier(). This macro needs to be defined at the end of
malloc-machine.h. For gcc, a fallback in the form of a full memory
barrier is already defined, but you may need to add another definition
if you don't use gcc.
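
For a compiler other than gcc, one possible definition is sketched
below using the C11 <stdatomic.h> release fence. This is an assumption
about a suitable replacement, not the definition shipped in
malloc-machine.h, and a release fence is somewhat stronger than a pure
write barrier. The publish() function and the payload/ready variables
are hypothetical and only show where such a barrier sits.

```c
#include <stdatomic.h>

/* Assumed fallback: only takes effect if nothing defined the macro yet. */
#ifndef atomic_write_barrier
#define atomic_write_barrier() atomic_thread_fence(memory_order_release)
#endif

static int payload;        /* data written before the barrier */
static atomic_int ready;   /* flag written after the barrier */

/* Store the data first, then the flag. The barrier in between keeps
   the flag store from becoming visible before the data store. */
static void publish(int value)
{
    payload = value;
    atomic_write_barrier();
    atomic_store_explicit(&ready, 1, memory_order_relaxed);
}
```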

Usage
=====

Just link libptmalloc3 into your application.

Some wicked systems (e.g. HPUX apparently) won't let malloc call _any_
thread-related functions before main(). On these systems,
USE_STARTER=2 must be defined during compilation (see "make
posix-explicit" above) and the global initialization function
ptmalloc_init() must be called explicitly, preferably at the start of
main().

Otherwise, when using ptmalloc3, no special precautions are necessary.

Link order is important
=======================

On some systems, when overriding malloc and linking against shared
libraries, the link order becomes very important. E.g., when linking
C++ programs on Solaris with Solaris threads [this is probably now
obsolete], don't rely on libC being included by default, but instead
put `-lthread' behind `-lC' on the command line:

  CC ... libptmalloc3.a -lC -lthread

This is because there are global constructors in libC that need
malloc/ptmalloc, which in turn needs the thread library to be
already initialized.

Debugging hooks
===============

All calls to malloc(), realloc(), free() and memalign() are routed
through the global function pointers __malloc_hook, __realloc_hook,
__free_hook and __memalign_hook if they are not NULL (see the malloc.h
header file for declarations of these pointers). Therefore the malloc
implementation can be changed at runtime, if care is taken not to call
free() or realloc() on pointers obtained with a different
implementation than the one currently in effect. (The easiest way to
guarantee this is to set up the hooks before any malloc call, e.g.
with a function pointed to by the global variable
__malloc_initialize_hook).
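
The routing described above can be sketched as follows. To keep the
example self-contained, it uses hypothetical stand-ins
(demo_malloc_hook, demo_malloc() and counting_hook()) instead of the
real __malloc_hook pointer and malloc(), whose exact declarations are
in malloc.h. It illustrates the dispatch pattern only and is not the
ptmalloc3 code.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical hook pointer; NULL means "no hook installed". */
static void *(*demo_malloc_hook)(size_t size, const void *caller) = NULL;

static unsigned long hook_calls;   /* number of requests seen by the hook */

/* Hypothetical entry point: defer to the hook if one is set,
   otherwise use the normal allocator. */
static void *demo_malloc(size_t size)
{
    void *(*hook)(size_t, const void *) = demo_malloc_hook;

    if (hook != NULL)
        return hook(size, NULL);   /* glibc-style hooks also get a caller
                                      address; NULL is passed here */
    return malloc(size);
}

/* Example hook: count the request, then allocate as usual. */
static void *counting_hook(size_t size, const void *caller)
{
    (void)caller;
    hook_calls++;
    return malloc(size);
}
```

Installing the hook is a plain assignment, demo_malloc_hook =
counting_hook, which is why it is safest to do it before the first
allocation.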

You can now also tune other malloc parameters (normally adjusted via
mallopt() calls from the application) with environment variables:

    MALLOC_TRIM_THRESHOLD_    for deciding to shrink the heap (in bytes)

    MALLOC_GRANULARITY_       The unit for allocating and deallocating
    MALLOC_TOP_PAD_           memory from the system. The default
                              is 64k and this parameter _must_ be a
                              power of 2.

    MALLOC_MMAP_THRESHOLD_    min. size for chunks allocated via
                              mmap() (in bytes)
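
A minimal sketch of how such a variable might be read and validated is
shown below. The helpers is_power_of_two(), parse_granularity() and
granularity_from_env() are hypothetical; they fall back to a default
when the value is missing, malformed, or not a power of 2. This is not
the parsing code used by ptmalloc3.

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* A power of 2 has exactly one bit set. */
static int is_power_of_two(unsigned long v)
{
    return v != 0 && (v & (v - 1)) == 0;
}

/* Parse a size given as decimal text. Returns `fallback' if the text
   is NULL, empty, not a plain number, or not a power of 2. */
static unsigned long parse_granularity(const char *text,
                                       unsigned long fallback)
{
    char *end;
    unsigned long v;

    if (text == NULL || *text == '\0')
        return fallback;
    errno = 0;
    v = strtoul(text, &end, 10);
    if (errno != 0 || *end != '\0' || !is_power_of_two(v))
        return fallback;
    return v;
}

/* Read MALLOC_GRANULARITY_ from the environment, defaulting to 64k. */
static unsigned long granularity_from_env(void)
{
    return parse_granularity(getenv("MALLOC_GRANULARITY_"), 64UL * 1024UL);
}
```

With this sketch, a setting such as MALLOC_GRANULARITY_=131072 is
accepted, while 100000 or 64k is rejected in favour of the default.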

Tests
=====

Two testing applications, t-test1 and t-test2, are included in this
source distribution. Both perform pseudo-random sequences of
allocations/frees, and can be given numeric arguments (all arguments
are optional):

% t-test[12] <n-total> <n-parallel> <n-allocs> <size-max> <bins>

    n-total    = total number of threads executed (default 10)
    n-parallel = number of threads running in parallel (2)
    n-allocs   = number of malloc()'s / free()'s per thread (10000)
    size-max   = max. size requested with malloc() in bytes (10000)
    bins       = number of bins to maintain

The first test `t-test1' maintains a completely separate pool of
allocated bins for each thread, and should therefore show full
parallelism. On the other hand, `t-test2' creates only a single pool
of bins, and each thread randomly allocates/frees any bin. Some lock
contention is to be expected in this case, as the threads frequently
cross into each other's arenas.

Performance results from t-test1 should be quite repeatable, while the
behaviour of t-test2 depends on scheduling variations.
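
To make the role of the bins concrete, here is a single-threaded
sketch of the kind of loop such a test runs: pick a random bin, free
whatever it holds, and refill it with a block of random size. The
function exercise_bins() and its helper next_rand() are hypothetical
and much simpler than the real t-test programs, which run this sort of
loop from several threads.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

struct bin {
    void *ptr;     /* current block, or NULL if the bin is empty */
    size_t size;   /* size of that block in bytes */
};

/* Small linear congruential generator, so runs are repeatable. */
static unsigned next_rand(unsigned *state)
{
    *state = *state * 1103515245u + 12345u;
    return (*state >> 16) & 0x7fffu;
}

/* Perform n_allocs free/malloc steps on n_bins bins with block sizes
   in 1..size_max. Returns the number of successful allocations, or -1
   if the arguments are invalid or the bin table cannot be allocated. */
static long exercise_bins(long n_allocs, size_t size_max, size_t n_bins,
                          unsigned seed)
{
    struct bin *bins;
    long done = 0;
    long i;
    size_t j;

    if (n_bins == 0 || size_max == 0)
        return -1;
    bins = calloc(n_bins, sizeof *bins);
    if (bins == NULL)
        return -1;

    for (i = 0; i < n_allocs; i++) {
        struct bin *b = &bins[next_rand(&seed) % n_bins];

        free(b->ptr);                      /* free(NULL) is a no-op */
        b->size = 1 + next_rand(&seed) % size_max;
        b->ptr = malloc(b->size);
        if (b->ptr == NULL) {
            b->size = 0;
            continue;
        }
        memset(b->ptr, 0xA5, b->size);     /* touch the memory */
        done++;
    }

    for (j = 0; j < n_bins; j++)
        free(bins[j].ptr);
    free(bins);
    return done;
}
```

In terms of the arguments listed above, n_allocs, size_max and n_bins
correspond to <n-allocs>, <size-max> and <bins>.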

Conclusion
==========

I'm always interested in performance data and feedback; just send mail
to [email protected].

Good luck!