This documentation is from Perl 5.36.1.

NAME

perlthrtut - Tutorial on threads in Perl

DESCRIPTION

This tutorial describes the use of Perl interpreter threads (sometimes referred to as ithreads). In this model, each thread runs in its own Perl interpreter, and any data sharing between threads must be explicit. The user-level interface for ithreads uses the threads class.

NOTE: There was an older Perl threading flavor, called the 5.005 model, that used the Thread class. This old model was known to have problems, was deprecated, and was removed in release 5.10. You are strongly encouraged to migrate any existing 5.005 threads code to the new model as soon as possible.

You can see which threading flavour you have, if any, by running perl -V and looking at the Platform section. If you have useithreads=define, you have ithreads; if you have use5005threads=define, you have 5.005 threads. If you have neither, you don't have any thread support built in. If you have both, you are in trouble.

The threads and threads::shared modules are included in the core Perl distribution. Additionally, they are maintained as separate modules on CPAN, so you can check there for any updates.

What Is A Thread Anyway?

A thread is a flow of control through a program with a single execution point.

Sounds an awful lot like a process, doesn't it? Well, it should. Threads are one of the pieces of a process. Every process has at least one thread and, up until now, every process running Perl had only one thread. With 5.8, though, you can create extra threads. We're going to show you how, when, and why.

Threaded Program Models

There are three basic ways that you can structure a threaded program. Which model you choose depends on what you need your program to do. For many non-trivial threaded programs, you'll need to choose different models for different pieces of your program.

Boss/Worker

The boss/worker model usually has one boss thread and one or more worker threads. The boss thread gathers or generates tasks that need to be done, then parcels those tasks out to the appropriate worker thread.

This model is common in GUI and server programs, where a main thread waits for some event and then passes that event to the appropriate worker threads for processing. Once the event has been passed on, the boss thread goes back to waiting for another event.

The boss thread does relatively little work. While tasks aren't necessarily performed faster than with any other method, this model tends to have the best user-response times.
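As a rough sketch of the boss/worker model, the following uses the core Thread::Queue module to hand tasks from a boss to three workers. The handle_task() subroutine, the queue names, and the worker count are made up for illustration; a real program would substitute its own work.

```perl
use strict;
use warnings;
use threads;
use Thread::Queue;

# Hypothetical task: square a number. Real per-task work would go here.
sub handle_task {
    my ($n) = @_;
    return $n * $n;
}

my $tasks   = Thread::Queue->new();   # boss    -> workers
my $results = Thread::Queue->new();   # workers -> boss

# Each worker blocks in dequeue() until a task arrives, and leaves
# its loop once the queue has been ended and drained.
sub worker {
    while (defined(my $n = $tasks->dequeue())) {
        $results->enqueue(handle_task($n));
    }
    return;
}

my @workers = map { threads->create(\&worker) } 1 .. 3;

# The boss hands out the tasks, then signals that no more are coming.
$tasks->enqueue($_) for 1 .. 10;
$tasks->end();

$_->join() for @workers;

# Collect whatever the workers produced. Completion order is not
# fixed, so sort before printing.
my @squares;
while (defined(my $r = $results->dequeue_nb())) {
    push @squares, $r;
}
@squares = sort { $a <=> $b } @squares;
print "@squares\n";
```

The boss never does the squaring itself; it only feeds the queue and collects results, which is what keeps it free to respond to new events.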

Work Crew

In the work crew model, several threads are created that do essentially the same thing to different pieces of data. It closely mirrors classical parallel processing and vector processors, where a large array of processors do the exact same thing to many pieces of data.

This model is particularly useful if the system running the program will distribute multiple threads across different processors. It can also be useful in ray tracing or rendering engines, where the individual threads can pass on interim results to give the user visual feedback.
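A minimal sketch of a work crew might look like the following, where every thread runs the same subroutine on its own slice of the data. The sum_slice() subroutine, the crew size, and the round-robin split are illustrative assumptions, not a prescribed layout.

```perl
use strict;
use warnings;
use threads;

my @data      = (1 .. 12);
my $crew_size = 3;          # hypothetical number of crew members

# Every crew member runs the same code on its own slice of the data.
sub sum_slice {
    my @slice = @_;
    my $sum = 0;
    $sum += $_ for @slice;
    return $sum;
}

my @crew;
for my $i (0 .. $crew_size - 1) {
    # Deal the elements out round-robin: member $i gets the items
    # whose index leaves remainder $i when divided by the crew size.
    my @slice = @data[ grep { $_ % $crew_size == $i } 0 .. $#data ];
    my $thr = threads->create(\&sum_slice, @slice);   # scalar context
    push @crew, $thr;
}

# Combine the partial sums once each member has finished.
my $total = 0;
for my $thr (@crew) {
    my $partial = $thr->join();
    $total += $partial;
}
print "Total: $total\n";
```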

Pipeline

The pipeline model divides up a task into a series of steps, and passes the results of one step on to the thread processing the next. Each thread does one thing to each piece of data and passes the results to the next thread in line.

This model makes the most sense if you have multiple processors so two or more threads will be executing in parallel, though it can often make sense in other contexts as well. It tends to keep the individual tasks small and simple, as well as allowing some parts of the pipeline to block (on I/O or system calls, for example) while other parts keep going. If you're running different parts of the pipeline on different processors you may also take advantage of the caches on each processor.

This model is also handy for a form of recursive programming where, rather than having a subroutine call itself, it instead creates another thread. Prime and Fibonacci generators both map well to this form of the pipeline model. (A version of a prime number generator is presented later on.)
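The basic pipeline shape can be sketched with two stages connected by Thread::Queue objects. The stage bodies (doubling, then adding one) and the queue names are placeholders chosen for illustration only.

```perl
use strict;
use warnings;
use threads;
use Thread::Queue;

my $raw     = Thread::Queue->new();   # input   -> stage 1
my $doubled = Thread::Queue->new();   # stage 1 -> stage 2
my $done    = Thread::Queue->new();   # stage 2 -> output

# Stage 1: double each item and pass it down the line.
my $stage1 = threads->create(sub {
    while (defined(my $item = $raw->dequeue())) {
        $doubled->enqueue($item * 2);
    }
    $doubled->end();   # tell the next stage that nothing more is coming
    return;
});

# Stage 2: add one to each item it receives.
my $stage2 = threads->create(sub {
    while (defined(my $item = $doubled->dequeue())) {
        $done->enqueue($item + 1);
    }
    $done->end();
    return;
});

$raw->enqueue(1 .. 5);
$raw->end();

$_->join() for $stage1, $stage2;

# With one thread per stage, items leave in the order they entered.
my @out;
while (defined(my $item = $done->dequeue())) {
    push @out, $item;
}
print "@out\n";
```

Because each stage only reads from one queue and writes to the next, either stage can block without stalling the other.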

What kind of threads are Perl threads?

If you have experience with other thread implementations, you might find that things aren't quite what you expect. It's very important to remember when dealing with Perl threads that Perl Threads Are Not X Threads for all values of X. They aren't POSIX threads, or DecThreads, or Java's Green threads, or Win32 threads. There are similarities, and the broad concepts are the same, but if you start looking for implementation details you're going to be either disappointed or confused. Possibly both.

This is not to say that Perl threads are completely different from everything that's ever come before. They're not. Perl's threading model owes a lot to other thread models, especially POSIX. Just as Perl is not C, though, Perl threads are not POSIX threads. So if you find yourself looking for mutexes, or thread priorities, it's time to step back a bit and think about what you want to do and how Perl can do it.

However, it is important to remember that Perl threads cannot magically do things unless your operating system's threads allow it. So if your system blocks the entire process on sleep(), Perl usually will, as well.

Perl Threads Are Different.

Thread-Safe Modules

The addition of threads has changed Perl's internals substantially. There are implications for people who write modules with XS code or external libraries. However, since Perl data is not shared among threads by default, Perl modules stand a high chance of being thread-safe or can be made thread-safe easily. Modules that are not tagged as thread-safe should be tested or code reviewed before being used in production code.

Not all modules that you might use are thread-safe, and you should always assume a module is unsafe unless the documentation says otherwise. This includes modules that are distributed as part of the core. Threads are a relatively new feature, and even some of the standard modules aren't thread-safe.

Even if a module is thread-safe, it doesn't mean that the module is optimized to work well with threads. A module could possibly be rewritten to utilize the new features in threaded Perl to increase performance in a threaded environment.

If you're using a module that's not thread-safe for some reason, you can protect yourself by using it from one, and only one, thread. If you need multiple threads to access such a module, you can use semaphores and lots of programming discipline to control access to it. Semaphores are covered in "Basic semaphores".
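As a hedged illustration of guarding access with a semaphore, the sketch below uses the core Thread::Semaphore module with a count of 1. The unsafe_call() subroutine is a hypothetical stand-in for a call into a module that is not thread-safe.

```perl
use strict;
use warnings;
use threads;
use Thread::Semaphore;

# A semaphore with a count of 1 lets only one thread through at a time.
my $gate = Thread::Semaphore->new(1);

# Hypothetical stand-in for a call into a non-thread-safe module.
sub unsafe_call {
    my ($id) = @_;
    return "result from thread $id";
}

sub guarded_call {
    my ($id) = @_;
    $gate->down();                   # wait for exclusive access
    my $result = unsafe_call($id);
    $gate->up();                     # let the next thread in
    return $result;
}

my @threads;
for my $id (1 .. 3) {
    my $thr = threads->create(\&guarded_call, $id);   # scalar context
    push @threads, $thr;
}

my @results = map { scalar $_->join() } @threads;
print "$_\n" for @results;
```

The discipline part is that every call into the unsafe module must go through guarded_call(); a single unguarded call elsewhere defeats the protection.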

See also "Thread-Safety of System Libraries".

Thread Basics

The threads module provides the basic functions you need to write threaded programs. In the following sections, we'll cover the basics, showing you what you need to do to create a threaded program. After that, we'll go over some of the features of the threads module that make threaded programming easier.

Basic Thread Support

Thread support is a Perl compile-time option. It's something that's turned on or off when Perl is built at your site, rather than when your programs are compiled. If your Perl wasn't compiled with thread support enabled, then any attempt to use threads will fail.

Your programs can use the Config module to check whether threads are enabled. If your program can't run without them, you can say something like:

use Config;
$Config{useithreads} or
    die('Recompile Perl with threads to run this program.');

A possibly-threaded program using a possibly-threaded module might have code like this:

use Config;
use MyMod;

BEGIN {
    if ($Config{useithreads}) {
        # We have threads
        require MyMod_threaded;
        MyMod_threaded->import();
    } else {
        require MyMod_unthreaded;
        MyMod_unthreaded->import();
    }
}

Since code that runs both with and without threads is usually pretty messy, it's best to isolate the thread-specific code in its own module. In our example above, that's what MyMod_threaded is, and it's only imported if we're running on a threaded Perl.

A Note about the Examples

In a real situation, care should be taken that all threads are finished executing before the program exits. That care has not been taken in these examples in the interest of simplicity. Running these examples as is will produce error messages, usually caused by the fact that there are still threads running when the program exits. You should not be alarmed by this.

Creating Threads

The threads module provides the tools you need to create new threads. As with any other module, you need to tell Perl that you want to use it; the statement use threads; imports all the pieces you need to create basic threads.

The simplest, most straightforward way to create a thread is with create():

use threads;

my $thr = threads->create(\&sub1);

sub sub1 {
    print("In the thread\n");
}

The create() method takes a reference to a subroutine and creates a new thread that starts executing in the referenced subroutine. Control then passes both to the subroutine and to the caller.

If you need to, your program can pass parameters to the subroutine as part of the thread startup. Just include the list of parameters as part of the threads->create() call, like this:

use threads;

my $Param3 = 'foo';
my $thr1 = threads->create(\&sub1, 'Param 1', 'Param 2', $Param3);
my @ParamList = (42, 'Hello', 3.14);
my $thr2 = threads->create(\&sub1, @ParamList);
my $thr3 = threads->create(\&sub1, qw(Param1 Param2 Param3));

sub sub1 {
    my @InboundParameters = @_;
    print("In the thread\n");
    print('Got parameters >', join('<>',@InboundParameters), "<\n");
}

The last example illustrates another feature of threads. You can spawn off several threads using the same subroutine. Each thread executes the same subroutine, but in a separate thread with a separate environment and potentially separate arguments.

new() is a synonym for create().

Waiting For A Thread To Exit

Since threads are also subroutines, they can return values. To wait for a thread to exit and extract any values it might return, you can use the join() method:

use threads;

my ($thr) = threads->create(\&sub1);

my @ReturnData = $thr->join();
print('Thread returned ', join(', ', @ReturnData), "\n");

sub sub1 { return ('Fifty-six', 'foo', 2); }

In the example above, the join() method returns as soon as the thread ends. In addition to waiting for a thread to finish and gathering up any values that the thread might have returned, join() also performs any OS cleanup necessary for the thread. That cleanup might be important, especially for long-running programs that spawn lots of threads. If you don't want the return values and don't want to wait for the thread to finish, you should call the detach() method instead, as described next.

NOTE: In the example above, the thread returns a list, thus necessitating that the thread creation call be made in list context (i.e., my ($thr)). See "$thr->join()" in threads and "THREAD CONTEXT" in threads for more details on thread context and return values.
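To make the point about thread context concrete, here is a small sketch in which the thread's entry subroutine reports the context it was started in, using the built-in wantarray(). The subroutine and variable names are illustrative.

```perl
use strict;
use warnings;
use threads;

# Report the context the thread's entry subroutine was called in.
sub report_context {
    return wantarray          ? 'list'
         : defined(wantarray) ? 'scalar'
         :                      'void';
}

my ($list_thr) = threads->create(\&report_context);   # list context
my $scalar_thr = threads->create(\&report_context);   # scalar context

my ($from_list) = $list_thr->join();
my $from_scalar = $scalar_thr->join();

print "list-context thread saw:   $from_list\n";
print "scalar-context thread saw: $from_scalar\n";
```

The context is fixed when the thread is created, not when join() is called, which is why the creation call has to be written in the context you want.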

Ignoring A Thread

join() does three things: it waits for a thread to exit, cleans up after it, and returns any data the thread may have produced. But what if you're not interested in the thread's return values, and you don't really care when the thread finishes? All you want is for the thread to be cleaned up when it's done.

In this case, you use the detach() method. Once a thread is detached, it'll run until it's finished; then Perl will clean up after it automatically.

use threads;

my $thr = threads->create(\&sub1);   # Spawn the thread

$thr->detach();   # Now we officially don't care any more

sleep(15);        # Let thread run for a while

sub sub1 {
    my $count = 0;
    while (1) {
        $count++;
        print("\$count is $count\n");
        sleep(1);
    }
}

Once a thread is detached, it may not be joined, and any return data that it might have produced (if it was done and waiting for a join) is lost.

detach() can also be called as a class method to allow a thread to detach itself:

use threads;

my $thr = threads->create(\&sub1);

sub sub1 {
    threads->detach();
    # Do more work
}

Process and Thread Termination

With threads one must be careful to make sure they all have a chance to run to completion, assuming that is what you want.

An action that terminates a process will terminate all running threads. die() and exit() have this property, and perl does an exit when the main thread exits, perhaps implicitly by falling off the end of your code, even if that's not what you want.

As an example of this case, this code prints the message "Perl exited with active threads: 2 running and unjoined":

use threads;
my $thr1 = threads->new(\&thrsub, "test1");
my $thr2 = threads->new(\&thrsub, "test2");
sub thrsub {
   my ($message) = @_;
   sleep 1;
   print "thread $message\n";
}

But when the following lines are added at the end:

$thr1->join();
$thr2->join();

it prints two lines of output, a perhaps more useful outcome.
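When the number of threads is not known in advance, one possible approach is to join everything that threads->list() reports, since it returns the threads that have been neither joined nor detached. The following is a sketch under that assumption; the thread count of three is arbitrary.

```perl
use strict;
use warnings;
use threads;

sub thrsub {
    my ($message) = @_;
    return "thread $message";
}

for my $n (1 .. 3) {
    # Scalar context, so each thread's return value is kept for join().
    my $thr = threads->create(\&thrsub, "test$n");
}

# Join every thread that is still unjoined and undetached, so that
# nothing is left running when the main thread exits.
my @messages;
for my $thr (threads->list()) {
    push @messages, scalar $thr->join();
}
print "$_\n" for sort @messages;
```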

Threads And Data

Now that we've covered the basics of threads, it's time for our next topic: Data. Threading introduces a couple of complications to data access that non-threaded programs never need to worry about.

Shared And Unshared Data

The biggest difference between Perl ithreads and the old 5.005-style threading, or for that matter most other threading systems out there, is that by default, no data is shared. When a new Perl thread is created, all the data associated with the current thread is copied to the new thread, and is subsequently private to that new thread! This is similar in feel to what happens when a Unix process forks, except that in this case, the data is just copied to a different part of memory within the same process rather than a real fork taking place.
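The copy-on-create behaviour can be seen in a short sketch: the thread changes its own copy of a variable, and the parent's copy is left untouched. The variable names are illustrative.

```perl
use strict;
use warnings;
use threads;

my $counter = 10;

# The new thread gets its own copy of $counter at creation time.
my $thr = threads->create(sub {
    $counter += 5;       # changes only the thread's private copy
    return $counter;
});

my $in_thread = $thr->join();

print "In the thread: $in_thread\n";
print "In the parent: $counter\n";
```

The thread sees 15, while the parent still sees 10, because nothing was marked as shared.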

To make use of threading, however, one usually wants the threads to share at least some data between themselves. This is done with the