perlipc.pod@ 3368

Last change on this file since 3368 was 3181, checked in by bird, 19 years ago
perl 5.8.8
File size: 61.0 KB

1	=head1 NAME
2
3	perlipc - Perl interprocess communication (signals, fifos, pipes, safe subprocesses, sockets, and semaphores)
4
5	=head1 DESCRIPTION
6
7	The basic IPC facilities of Perl are built out of the good old Unix
8	signals, named pipes, pipe opens, the Berkeley socket routines, and SysV
9	IPC calls. Each is used in slightly different situations.
10
11	=head1 Signals
12
13	Perl uses a simple signal handling model: the %SIG hash contains names
14	or references of user-installed signal handlers. These handlers will
15	be called with an argument which is the name of the signal that
16	triggered it. A signal may be generated intentionally from a
17	particular keyboard sequence like control-C or control-Z, sent to you
18	from another process, or triggered automatically by the kernel when
19	special events transpire, like a child process exiting, your process
20	running out of stack space, or hitting a file size limit.
21
22	For example, to trap an interrupt signal, set up a handler like this:
23
24	sub catch_zap {
25	my $signame = shift;
26	$shucks++;
27	die "Somebody sent me a SIG$signame";
28	}
29	$SIG{INT} = 'catch_zap'; # could fail in modules
30	$SIG{INT} = \&catch_zap; # best strategy
31
32	Prior to Perl 5.7.3 it was necessary to do as little as you possibly
33	could in your handler; notice how all we do is set a global variable
34	and then raise an exception. That's because on most systems,
35	libraries are not re-entrant; particularly, memory allocation and I/O
36	routines are not. That meant that doing nearly I<anything> in your
37	handler could in theory trigger a memory fault and subsequent core
38	dump - see L</Deferred Signals (Safe Signals)> below.
39
40	The names of the signals are the ones listed out by C<kill -l> on your
41	system, or you can retrieve them from the Config module. Set up an
42	@signame list indexed by number to get the name and a %signo table
43	indexed by name to get the number:
44
45	use Config;
46	defined $Config{sig_name} || die "No sigs?";
47	foreach $name (split(' ', $Config{sig_name})) {
48	$signo{$name} = $i;
49	$signame[$i] = $name;
50	$i++;
51	}
52
53	So to check whether signal 17 and SIGALRM were the same, do just this:
54
55	print "signal #17 = $signame[17]\n";
56	if ($signo{ALRM}) {
57	print "SIGALRM is $signo{ALRM}\n";
58	}
59
60	You may also choose to assign the strings C<'IGNORE'> or C<'DEFAULT'> as
61	the handler, in which case Perl will try to discard the signal or do the
62	default thing.
63
64	On most Unix platforms, the C<CHLD> (sometimes also known as C<CLD>) signal
65	has special behavior with respect to a value of C<'IGNORE'>.
66	Setting C<$SIG{CHLD}> to C<'IGNORE'> on such a platform has the effect of
67	not creating zombie processes when the parent process fails to C<wait()>
68	on its child processes (i.e. child processes are automatically reaped).
69	Calling C<wait()> with C<$SIG{CHLD}> set to C<'IGNORE'> usually returns
70	C<-1> on such platforms.
71
72	Some signals can be neither trapped nor ignored, such as
73	the KILL and STOP (but not the TSTP) signals. One strategy for
74	temporarily ignoring signals is to use a local() statement, which will be
75	automatically restored once your block is exited. (Remember that local()
76	values are "inherited" by functions called from within that block.)
77
78	sub precious {
79	local $SIG{INT} = 'IGNORE';
80	&more_functions;
81	}
82	sub more_functions {
83	# interrupts still ignored, for now...
84	}
85
86	Sending a signal to a negative process ID means that you send the signal
87	to the entire Unix process-group. This code sends a hang-up signal to all
88	processes in the current process group (and sets $SIG{HUP} to IGNORE so
89	it doesn't kill itself):
90
91	{
92	local $SIG{HUP} = 'IGNORE';
93	kill HUP => -$$;
94	# snazzy writing of: kill('HUP', -$$)
95	}
96
97	Another interesting signal to send is signal number zero. This doesn't
98	actually affect a child process, but instead checks whether it's alive
99	or has changed its UID.
100
101	unless (kill 0 => $kid_pid) {
102	warn "something wicked happened to $kid_pid";
103	}
104
105	When directed at a process whose UID is not identical to that
106	of the sending process, signal number zero may fail because
107	you lack permission to send the signal, even though the process is alive.
108	You may be able to determine the cause of failure using C<%!>.
109
110	unless (kill 0 => $pid or $!{EPERM}) {
111	warn "$pid looks dead";
112	}
113
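As a minimal sketch of the check described above, the helper below folds the signal-zero test and the C<%!> lookup into one function. The name C<process_status> and the strings it returns are illustrative assumptions, not part of any Perl API.

```perl
use strict;
use warnings;
use Errno ();    # %! is provided by Errno (Perl also loads it on demand)

# Hypothetical helper: classify a PID using signal zero and %!.
sub process_status {
    my ($pid) = @_;
    return 'alive'                 if kill 0 => $pid;   # we may signal it
    return 'alive (not permitted)' if $!{EPERM};        # exists, other UID
    return 'gone'                  if $!{ESRCH};        # no such process
    return "unknown: $!";
}

print "this process is ", process_status($$), "\n";
```

Keeping the three outcomes separate avoids reporting a process as dead when the only problem is a lack of permission to signal it.
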
114	You might also want to employ anonymous functions for simple signal
115	handlers:
116
117	$SIG{INT} = sub { die "\nOutta here!\n" };
118
119	But that will be problematic for the more complicated handlers that need
120	to reinstall themselves. Because Perl's signal mechanism is currently
121	based on the signal(3) function from the C library, you may sometimes be so
122	misfortunate as to run on systems where that function is "broken", that
123	is, it behaves in the old unreliable SysV way rather than the newer, more
124	reasonable BSD and POSIX fashion. So you'll see defensive people writing
125	signal handlers like this:
126
127	sub REAPER {
128	$waitedpid = wait;
129	# loathe sysV: it makes us not only reinstate
130	# the handler, but place it after the wait
131	$SIG{CHLD} = \&REAPER;
132	}
133	$SIG{CHLD} = \&REAPER;
134	# now do something that forks...
135
136	or better still:
137
138	use POSIX ":sys_wait_h";
139	sub REAPER {
140	my $child;
141	# If a second child dies while in the signal handler caused by the
142	# first death, we won't get another signal. So must loop here else
143	# we will leave the unreaped child as a zombie. And the next time
144	# two children die we get another zombie. And so on.
145	while (($child = waitpid(-1,WNOHANG)) > 0) {
146	$Kid_Status{$child} = $?;
147	}
148	$SIG{CHLD} = \&REAPER; # still loathe sysV
149	}
150	$SIG{CHLD} = \&REAPER;
151	# do something that forks...
152
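The looping reaper can be exercised end to end with a short script. This is only a sketch: the three children, their exit codes, and the polling loop in the parent are assumptions made for the demonstration, and the handler is not reinstalled because reliable (BSD or POSIX style) signal semantics are assumed.

```perl
use strict;
use warnings;
use POSIX ":sys_wait_h";

my %Kid_Status;    # pid => raw wait status, filled in by the handler

sub REAPER {
    local ($!, $?);    # do not disturb the interrupted code's $! and $?
    # Several children may have died for one signal, so loop until none remain.
    while ((my $child = waitpid(-1, WNOHANG)) > 0) {
        $Kid_Status{$child} = $?;
    }
}
$SIG{CHLD} = \&REAPER;

my @kids;
for my $code (1 .. 3) {
    defined(my $pid = fork) or die "cannot fork: $!";
    POSIX::_exit($code) unless $pid;    # child: exit at once with its code
    push @kids, $pid;
}

# Parent: nap briefly until the handler has recorded every child.
select(undef, undef, undef, 0.1) while keys(%Kid_Status) < @kids;

printf "child %d exited with status %d\n", $_, $Kid_Status{$_} >> 8 for @kids;
```
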
153	Signal handling is also used for timeouts in Unix. While safely
154	protected within an C<eval{}> block, you set a signal handler to trap
155	alarm signals and then schedule to have one delivered to you in some
156	number of seconds. Then try your blocking operation, clearing the alarm
157	when it's done but not before you've exited your C<eval{}> block. If it
158	goes off, you'll use die() to jump out of the block, much as you might
159	using longjmp() or throw() in other languages.
160
161	Here's an example:
162
163	eval {
164	local $SIG{ALRM} = sub { die "alarm clock restart" };
165	alarm 10;
166	flock(FH, 2); # blocking write lock
167	alarm 0;
168	};
169	if ($@ and $@ !~ /alarm clock restart/) { die }
170
171	If the operation being timed out is system() or qx(), this technique
172	is liable to generate zombies. If this matters to you, you'll
173	need to do your own fork() and exec(), and kill the errant child process.
174
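The eval/alarm pattern can be wrapped in a reusable function. The sketch below is one possible shape, not an established interface: C<with_timeout> is a hypothetical name, it returns the callback's value on success and nothing on timeout, and it rethrows any other exception.

```perl
use strict;
use warnings;

# Hypothetical helper: run $code, giving up after $seconds.
sub with_timeout {
    my ($seconds, $code) = @_;
    my $result;
    my $ok = eval {
        local $SIG{ALRM} = sub { die "timeout\n" };
        alarm $seconds;
        $result = $code->();
        alarm 0;                 # finished in time: cancel the alarm
        1;
    };
    alarm 0;                     # never leave an alarm pending
    return $result if $ok;
    die $@ unless $@ eq "timeout\n";    # propagate unrelated failures
    return;                      # timed out
}

my $answer = with_timeout(5, sub { 6 * 7 });
print "answer: $answer\n" if defined $answer;
```

The trailing newline in the C<die> message keeps Perl from appending file and line information, so the string comparison against C<$@> is exact.
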
175	For more complex signal handling, you might see the standard POSIX
176	module. Lamentably, this is almost entirely undocumented, but
177	the F<t/lib/posix.t> file from the Perl source distribution has some
178	examples in it.
179
180	=head2 Handling the SIGHUP Signal in Daemons
181
182	A process that usually starts when the system boots and shuts down
183	when the system is shut down is called a daemon (Disk And Execution
184	MONitor). If a daemon process has a configuration file which is
185	modified after the process has been started, there should be a way to
186	tell that process to re-read its configuration file, without stopping
187	the process. Many daemons provide this mechanism using the C<SIGHUP>
188	signal handler. When you want to tell the daemon to re-read the file
189	you simply send it the C<SIGHUP> signal.
190
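A lighter alternative to restarting the whole process is to record the request in the handler and re-read the configuration from the main loop. The sketch below assumes a hypothetical C<load_config()> that stands in for real file parsing, uses a plain C<%SIG> handler, and sends the signal to itself only to demonstrate the flow.

```perl
use strict;
use warnings;

my $reload_requested = 0;
$SIG{HUP} = sub { $reload_requested = 1 };    # just note the request

# Hypothetical loader: a real daemon would parse its configuration file here.
my $generation = 0;
sub load_config { return (generation => ++$generation) }

my %config = load_config();

kill HUP => $$;    # stand-in for an administrator running "kill -HUP <pid>"

# In a real daemon this test would sit at the top of the main loop.
if ($reload_requested) {
    $reload_requested = 0;
    %config = load_config();
}
print "configuration generation: $config{generation}\n";
```
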
191	Not all platforms automatically reinstall their (native) signal
192	handlers after a signal delivery. This means that the handler works
193	only the first time the signal is sent. The solution to this problem
194	is to use C<POSIX> signal handlers if available; their behaviour
195	is well-defined.
196
197	The following example implements a simple daemon, which restarts
198	itself every time the C<SIGHUP> signal is received. The actual code is
199	located in the subroutine C<code()>, which simply prints some debug
200	info to show that it works and should be replaced with the real code.
201
202	#!/usr/bin/perl -w
203
204	use POSIX ();
205	use FindBin ();
206	use File::Basename ();
207	use File::Spec::Functions;
208
209	$|=1;
210
211	# make the daemon cross-platform, so exec always calls the script
212	# itself with the right path, no matter how the script was invoked.
213	my $script = File::Basename::basename($0);
214	my $SELF = catfile $FindBin::Bin, $script;
215
216	# POSIX unmasks the sigprocmask properly
217	my $sigset = POSIX::SigSet->new();
218	my $action = POSIX::SigAction->new('sigHUP_handler',
219	$sigset,
220	&POSIX::SA_NODEFER);
221	POSIX::sigaction(&POSIX::SIGHUP, $action);
222
223	sub sigHUP_handler {
224	print "got SIGHUP\n";
225	exec($SELF, @ARGV) or die "Couldn't restart: $!\n";
226	}
227
228	code();
229
230	sub code {
231	print "PID: $$\n";
232	print "ARGV: @ARGV\n";
233	my $c = 0;
234	while (++$c) {
235	sleep 2;
236	print "$c\n";
237	}
238	}
239	__END__
240
241
242	=head1 Named Pipes
243
244	A named pipe (often referred to as a FIFO) is an old Unix IPC
245	mechanism for processes communicating on the same machine. It works
246	just like regular, connected anonymous pipes, except that the
247	processes rendezvous using a filename and don't have to be related.
248
249	To create a named pipe, use the C<POSIX::mkfifo()> function.
250
251	use POSIX qw(mkfifo);
252	mkfifo($path, 0700) or die "mkfifo $path failed: $!";
253
254	You can also use the Unix command mknod(1) or on some
255	systems, mkfifo(1). These may not be in your normal path.
256
257	# system return val is backwards, so && not ||
258	#
259	$ENV{PATH} .= ":/etc:/usr/etc";
260	if ( system('mknod', $path, 'p')
261	&& system('mkfifo', $path) )
262	{
263	die "mk{nod,fifo} $path failed";
264	}
265
266
267	A fifo is convenient when you want to connect a process to an unrelated
268	one. When you open a fifo, the program will block until there's something
269	on the other end.
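The blocking rendezvous can be seen in a self-contained sketch. A fork is used here only so that both ends live in one script; the temporary directory and the file name F<demo.fifo> are assumptions made for the example.

```perl
use strict;
use warnings;
use POSIX qw(mkfifo);
use File::Temp qw(tempdir);

my $dir  = tempdir(CLEANUP => 1);
my $path = "$dir/demo.fifo";            # hypothetical name for the demo
mkfifo($path, 0700) or die "mkfifo $path failed: $!";

defined(my $pid = fork) or die "can't fork: $!";
if ($pid == 0) {                        # child: the writing end
    # This open blocks until the parent opens the fifo for reading.
    open(my $out, '>', $path) or die "can't write $path: $!";
    print {$out} "hello through the fifo\n";
    close($out);
    POSIX::_exit(0);                    # skip the parent's cleanup handlers
}

# Parent: this open blocks until the child opens the fifo for writing.
open(my $in, '<', $path) or die "can't read $path: $!";
my $line = <$in>;
close($in);
waitpid($pid, 0);
print "parent read: $line";
```
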
270
271	For example, let's say you'd like to have your F<.signature> file be a
272	named pipe that has a Perl program on the other end. Now every time any
273	program (like a mailer, news reader, finger program, etc.) tries to read
274	from that file, the reading program will block and your program will
275	supply the new signature. We'll use the pipe-checking file test B<-p>
276	to find out whether anyone (or anything) has accidentally removed our fifo.
277
278	chdir; # go home
279	$FIFO = '.signature';
280
281	while (1) {
282	unless (-p $FIFO) {
283	unlink $FIFO;
284	require POSIX;
285	POSIX::mkfifo($FIFO, 0700)
286	or die "can't mkfifo $FIFO: $!";
287	}
288
289	# next line blocks until there's a reader
290	open (FIFO, "> $FIFO") || die "can't write $FIFO: $!";
291	print FIFO "John Smith (smith\@host.org)\n", `fortune -s`;
292	close FIFO;
293	sleep 2; # to avoid dup signals
294	}
295
296	=head2 Deferred Signals (Safe Signals)
297
298	In Perls before Perl 5.7.3, by installing Perl code to deal with
299	signals, you were exposing yourself to danger from two things. First,
300	few system library functions are re-entrant. If the signal interrupts
301	while Perl is executing one function (like malloc(3) or printf(3)),
302	and your signal handler then calls the same function again, you could
303	get unpredictable behavior--often, a core dump. Second, Perl isn't
304	itself re-entrant at the lowest levels. If the signal interrupts Perl
305	while Perl is changing its own internal data structures, similarly
306	unpredictable behaviour may result.
307
308	There were two things you could do, knowing this: be paranoid or be
309	pragmatic. The paranoid approach was to do as little as possible in your
310	signal handler. Set an existing integer variable that already has a
311	value, and return. This doesn't help you if you're in a slow system call,
312	which will just restart. That means you have to C<die> to longjmp(3) out
313	of the handler. Even this is a little cavalier for the true paranoiac,
314	who avoids C<die> in a handler because the system I<is> out to get you.
315	The pragmatic approach was to say "I know the risks, but prefer the
316	convenience", and to do anything you wanted in your signal handler,
317	and be prepared to clean up core dumps now and again.
318
319	In Perl 5.7.3 and later, to avoid these problems, signals are
320	"deferred": when the signal is delivered to the process by
321	the system (to the C code that implements Perl), a flag is set and the
322	handler returns immediately. Then, at strategic "safe" points in the
323	Perl interpreter (e.g. when it is about to execute a new opcode), the
324	flags are checked and the Perl-level handler from %SIG is
325	executed. The "deferred" scheme allows much more flexibility in the
326	coding of signal handlers, as we know the Perl interpreter is in a safe
327	state and that we are not in a system library function when the
328	handler is called. However, the implementation does differ from
329	previous Perls in the following ways:
330
331	=over 4
332
333	=item Long running opcodes
334
335	As the Perl interpreter only looks at the signal flags when it is about
336	to execute a new opcode, a signal that arrives during a long-running
337	opcode (e.g. a regular expression operation on a very large string)
338	will not be seen until that operation completes.
339
340	=item Interrupting IO
341
342	When a signal is delivered (e.g. INT from control-C) the operating system
343	breaks into IO operations like C<read> (used to implement Perl's
344	E<lt>E<gt> operator). On older Perls the handler was called
345	immediately (and as C<read> is not "unsafe" this worked well). With
346	the "deferred" scheme the handler is not called immediately, and if
347	Perl is using the system's C<stdio> library, that library may restart the
348	C<read> without returning to Perl and giving it a chance to call the
349	%SIG handler. If this happens on your system, the solution is to use the
350	C<:perlio> layer to do IO, at least on those handles which you want
351	to be able to break into with signals. (The C<:perlio> layer checks
352	the signal flags and calls %SIG handlers before resuming the IO operation.)
353
354	Note that the default in Perl 5.7.3 and later is to automatically use
355	the C<:perlio> layer.
356
357	Note that some networking library functions like gethostbyname() are
358	known to have their own implementations of timeouts which may conflict
359	with your timeouts. If you are having problems with such functions,
360	you can try using the POSIX sigaction() function, which bypasses the
361	Perl safe signals (note that this means subjecting yourself to
362	possible memory corruption, as described above). Instead of setting
363	C<$SIG{ALRM}>:
364
365	local $SIG{ALRM} = sub { die "alarm" };
366
367	try something like the following:
368
369	use POSIX qw(SIGALRM);
370	POSIX::sigaction(SIGALRM,
371	POSIX::SigAction->new(sub { die "alarm" }))
372	or die "Error setting SIGALRM handler: $!\n";
373
374	=item Restartable system calls
375
376	On systems that supported it, older versions of Perl used the
377	SA_RESTART flag when installing %SIG handlers. This meant that
378	restartable system calls would continue rather than returning when
379	a signal arrived. In order to deliver deferred signals promptly,
380	Perl 5.7.3 and later do I<not> use SA_RESTART. Consequently,
381	restartable system calls can fail (with $! set to C<EINTR>) in places
382	where they previously would have succeeded.
383
384	Note that the default C<:perlio> layer will retry C<read>, C<write>
385	and C<close> as described above and that interrupted C<wait> and
386	C<waitpid> calls will always be retried.
387
388	=item Signals as "faults"
389
390	Certain signals, e.g. SEGV, ILL, and BUS, are generated as a result of
391	virtual memory or other "faults". These are normally fatal and there
392	is little a Perl-level handler can do with them. (In particular, the
393	old signal scheme was especially unsafe in such cases.) However, if
394	a %SIG handler is set, the new scheme simply sets a flag and returns as
395	described above. This may cause the operating system to try the
396	offending machine instruction again and, as nothing has changed, it
397	will generate the signal again. The result of this is a rather odd
398	"loop". In future, Perl's signal mechanism may be changed to avoid this,
399	perhaps by simply disallowing %SIG handlers on signals of that
400	type. Until then, the workaround is not to set a %SIG handler on those
401	signals. (Which signals they are is operating system dependent.)
402
403	=item Signals triggered by operating system state
404
405	On some operating systems certain signal handlers are supposed to "do
406	something" before returning. One example can be CHLD or CLD which
407	indicates a child process has completed. On some operating systems the
408	signal handler is expected to C<wait> for the completed child
409	process. On such systems the deferred signal scheme will not work for
410	those signals (it does not do the C<wait>). Again the failure will
411	look like a loop as the operating system will re-issue the signal as
412	there are un-waited-for completed child processes.
413
414	=back
415
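Because restartable system calls can now fail with C<EINTR>, code that calls them directly may want to retry. The following is a minimal sketch for sysread(); the name C<sysread_retry> is hypothetical, and the pipe at the end exists only to give the function something to read.

```perl
use strict;
use warnings;
use Errno qw(EINTR);

# Hypothetical helper: read up to $len bytes, retrying when a signal
# interrupts the call. Returns '' at end of file.
sub sysread_retry {
    my ($fh, $len) = @_;
    my $buf = '';
    while (1) {
        my $n = sysread($fh, $buf, $len);
        return $buf if defined $n;    # got data, or 0 bytes at end of file
        next if $! == EINTR;          # interrupted by a signal: try again
        die "read failed: $!";
    }
}

pipe(my $reader, my $writer) or die "pipe: $!";
syswrite($writer, "abc") or die "write failed: $!";
my $data = sysread_retry($reader, 3);
print "read: $data\n";
```
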
416	If you want the old signal behaviour back regardless of possible
417	memory corruption, set the environment variable C<PERL_SIGNALS> to
418	C<"unsafe"> (a new feature since Perl 5.8.1).
419
420	=head1 Using open() for IPC
421
422	Perl's basic open() statement can also be used for unidirectional
423	interprocess communication by either appending or prepending a pipe
424	symbol to the second argument to open(). Here's how to start
425	something up in a child process you intend to write to:
426
427	open(SPOOLER, "| cat -v | lpr -h 2>/dev/null")
428	|| die "can't fork: $!";
429	local $SIG{PIPE} = sub { die "spooler pipe broke" };
430	print SPOOLER "stuff\n";
431	close SPOOLER || die "bad spool: $! $?";
432
433	And here's how to start up a child process you intend to read from:
434
435	open(STATUS, "netstat -an 2>&1 |")
436	|| die "can't fork: $!";
437	while (<STATUS>) {
438	next if /^(tcp|udp)/;
439	print;
440	}
441	close STATUS || die "bad netstat: $! $?";
442
443	If one can be sure that a particular program is a Perl script that is
444	expecting filenames in @ARGV, the clever programmer can write something
445	like this:
446
447	% program f1 "cmd1|" - f2 "cmd2|" f3 < tmpfile
448
449	and irrespective of which shell it's called from, the Perl program will
450	read from the file F<f1>, the process F<cmd1>, standard input (F<tmpfile>
451	in this case), the F<f2> file, the F<cmd2> command, and finally the F<f3>
452	file. Pretty nifty, eh?
453
454	You might notice that you could use backticks for much the
455	same effect as opening a pipe for reading:
456
457	print grep { !/^(tcp|udp)/ } `netstat -an 2>&1`;
458	die "bad netstat" if $?;
459
460	While this is true on the surface, it's much more efficient to process the
461	file one line or record at a time because then you don't have to read the
462	whole thing into memory at once. It also gives you finer control of the
463	whole process, letting you kill off the child process early if you'd
464	like.
465
466	Be careful to check both the open() and the close() return values. If
467	you're I<writing> to a pipe, you should also trap SIGPIPE. Otherwise,
468	think of what happens when you start up a pipe to a command that doesn't
469	exist: the open() will in all likelihood succeed (it only reflects the
470	fork()'s success), but then your output will fail--spectacularly. Perl
471	can't know whether the command worked because your command is actually
472	running in a separate process whose exec() might have failed. Therefore,
473	while readers of bogus commands return just a quick end of file, writers
474	to bogus commands will trigger a signal they'd better be prepared to
475	handle. Consider:
476
477	open(FH, "|bogus") or die "can't fork: $!";
478	print FH "bang\n" or die "can't write: $!";
479	close FH or die "can't close: $!";
480
481	That won't blow up until the close, and it will blow up with a SIGPIPE.
482	To catch it, you could use this:
483
484	$SIG{PIPE} = 'IGNORE';
485	open(FH, "|bogus") or die "can't fork: $!";
486	print FH "bang\n" or die "can't write: $!";
487	close FH or die "can't close: status=$?";
488
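With SIGPIPE ignored, a write to a pipe that has no reader fails with C<EPIPE> instead of killing the process. The sketch below shows this without depending on a missing command: it uses an anonymous pipe whose read end is closed on purpose, which is an assumption made to keep the example deterministic.

```perl
use strict;
use warnings;
use Errno ();    # backs the %! hash

local $SIG{PIPE} = 'IGNORE';      # turn the fatal signal into an error code

pipe(my $reader, my $writer) or die "pipe: $!";
close($reader);                   # nobody is left to read

# syswrite bypasses buffering, so the failure is reported immediately.
my $written    = syswrite($writer, "bang\n");
my $saw_epipe  = !defined($written) && $!{EPIPE};

print "write failed with EPIPE\n" if $saw_epipe;
```

With buffered C<print>, the same error would typically surface only when the buffer is flushed, which is why the C<close> return value matters in the examples above.
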
489	=head2 Filehandles
490
491	Both the main process and any child processes it forks share the same
492	STDIN, STDOUT, and STDERR filehandles. If both processes try to access
493	them at once, strange things can happen. You may also want to close
494	or reopen the filehandles for the child. You can get around this by
495	opening your pipe with open(), but on some systems this means that the
496	child process cannot outlive the parent.
497
498	=head2 Background Processes
499
500	You can run a command in the background with:
501
502	system("cmd &");
503
504	The command's STDOUT and STDERR (and possibly STDIN, depending on your
505	shell) will be the same as the parent's. You won't need to catch
506	SIGCHLD because of the double-fork taking place (see below for more
507	details).
508
509	=head2 Complete Dissociation of Child from Parent
510
511	In some cases (starting server processes, for instance) you'll want to
512	completely dissociate the child process from the parent. This is
513	often called daemonization. A well behaved daemon will also chdir()
514	to the root directory (so it doesn't prevent unmounting the filesystem
515	containing the directory from which it was launched) and redirect its
516	standard file descriptors from and to F</dev/null> (so that random
517	output doesn't wind up on the user's terminal).
518
519	use POSIX 'setsid';
520
521	sub daemonize {
522	chdir '/' or die "Can't chdir to /: $!";
523	open STDIN, '/dev/null' or die "Can't read /dev/null: $!";
524	open STDOUT, '>/dev/null'
525	or die "Can't write to /dev/null: $!";
526	defined(my $pid = fork) or die "Can't fork: $!";
527	exit if $pid;
528	setsid or die "Can't start a new session: $!";
529	open STDERR, '>&STDOUT' or die "Can't dup stdout: $!";
530	}
531
532	The fork() has to come before the setsid() to ensure that you aren't a
533	process group leader (the setsid() will fail if you are). If your
534	system doesn't have the setsid() function, open F</dev/tty> and use the
535	C<TIOCNOTTY> ioctl() on it instead. See L<tty(4)> for details.
536
537	Non-Unix users should check their Your_OS::Process module for other
538	solutions.
539
540	=head2 Safe Pipe Opens
541
542	Another interesting approach to IPC is making your single program go
543	multiprocess and communicate between (or even amongst) yourselves. The
544	open() function will accept a file argument of either C<"-|"> or C<"|-">
545	to do a very interesting thing: it forks a child connected to the
546	filehandle you've opened. The child is running the same program as the
547	parent. This is useful for safely opening a file when running under an
548	assumed UID or GID, for example. If you open a pipe I<to> minus, you can
549	write to the filehandle you opened and your kid will find it in his
550	STDIN. If you open a pipe I<from> minus, you can read from the filehandle
551	you opened whatever your kid writes to his STDOUT.
552
553	use English '-no_match_vars';
554	my $sleep_count = 0;
555
556	do {
557	$pid = open(KID_TO_WRITE, "|-");
558	unless (defined $pid) {
559	warn "cannot fork: $!";
560	die "bailing out" if $sleep_count++ > 6;
561	sleep 10;
562	}
563	} until defined $pid;
564
565	if ($pid) { # parent
566	print KID_TO_WRITE @some_data;
567	close(KID_TO_WRITE) || warn "kid exited $?";
568	} else { # child
569	($EUID, $EGID) = ($UID, $GID); # suid progs only
570	open (FILE, "> /safe/file")
571	|| die "can't open /safe/file: $!";
572	while (<STDIN>) {
573	print FILE; # child's STDIN is parent's KID
574	}
575	exit; # don't forget this
576	}
577
578	Another common use for this construct is when you need to execute
579	something without the shell's interference. With system(), it's
580	straightforward, but you can't use a pipe open or backticks safely.
581	That's because there's no way to stop the shell from getting its hands on
582	your arguments. Instead, use lower-level control to call exec() directly.
583
584	Here's a safe backtick or pipe open for read:
585
586	# add error processing as above
587	$pid = open(KID_TO_READ, "-|");
588
589	if ($pid) { # parent
590	while (<KID_TO_READ>) {
591	# do something interesting
592	}
593	close(KID_TO_READ) || warn "kid exited $?";
594
595	} else { # child
596	($EUID, $EGID) = ($UID, $GID); # suid only
597	exec($program, @options, @args)
598	|| die "can't exec program: $!";
599	# NOTREACHED
600	}
601
602
603	And here's a safe pipe open for writing:
604
605	# add error processing as above
606	$pid = open(KID_TO_WRITE, "|-");
607	$SIG{PIPE} = sub { die "whoops, $program pipe broke" };
608
609	if ($pid) { # parent
610	for (@data) {
611	print KID_TO_WRITE;
612	}
613	close(KID_TO_WRITE) || warn "kid exited $?";
614
615	} else { # child
616	($EUID, $EGID) = ($UID, $GID);
617	exec($program, @options, @args)
618	|| die "can't exec program: $!";
619	# NOTREACHED
620	}
621
622	Since Perl 5.8.0, you can also use the list form of C<open> for pipes:
623	the syntax
624
625	open KID_PS, "-|", "ps", "aux" or die $!;
626
627	forks the ps(1) command (without spawning a shell, as there are more than
628	three arguments to open()), and reads its standard output via the
629	C<KID_PS> filehandle. The corresponding syntax to write to command
630	pipes (with C<"|-"> in place of C<"-|">) is also implemented.
631
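To see that the list form really bypasses the shell, the sketch below passes arguments full of shell metacharacters to a child and reads them back unchanged. Using C<$^X> (the running perl) as the child program is an assumption chosen so the example does not depend on any particular external command.

```perl
use strict;
use warnings;

# Arguments a shell would mangle: a command separator, a variable, a glob.
my @args = ('a; echo b', '$HOME', '*');

# List form: each element reaches the child as one argument, untouched.
open(my $kid, '-|', $^X, '-le', 'print for @ARGV', @args)
    or die "can't fork: $!";
chomp(my @seen = <$kid>);
close($kid) or warn "kid exited $?";

print "child saw: $_\n" for @seen;
```
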
632	Note that these operations are full Unix forks, which means they may not be
633	correctly implemented on alien systems. Additionally, these are not true
634	multithreading. If you'd like to learn more about threading, see the
635	F<modules> file mentioned below in the SEE ALSO section.
636
637	=head2 Bidirectional Communication with Another Process
638
639	While this works reasonably well for unidirectional communication, what
640	about bidirectional communication? The obvious thing you'd like to do
641	doesn't actually work:
642
643	open(PROG_FOR_READING_AND_WRITING, "| some program |")
644
645	and if you forget to use the C<use warnings> pragma or the B<-w> flag,
646	then you'll miss out entirely on the diagnostic message:
647
648	Can't do bidirectional pipe at -e line 1.
649
650	If you really want to, you can use the standard open2() library function
651	to catch both ends. There's also an open3() for tridirectional I/O so you
652	can also catch your child's STDERR, but doing so would then require an
653	awkward select() loop and wouldn't allow you to use normal Perl input
654	operations.
655
656	If you look at its source, you'll see that open2() uses low-level
657	primitives like Unix pipe() and exec() calls to create all the connections.
658	While it might have been slightly more efficient by using socketpair(), it
659	would have then been even less portable than it already is. The open2()
660	and open3() functions are unlikely to work anywhere except on a Unix
661	system or some other one purporting to be POSIX compliant.
662
663	Here's an example of using open2():
664
665	use FileHandle;
666	use IPC::Open2;
667	$pid = open2(Reader, Writer, "cat -u -n" );
668	print Writer "stuff\n";
669	$got = <Reader>;
670
671	The problem with this is that Unix buffering is really going to
672	ruin your day. Even though your C<Writer> filehandle is auto-flushed,
673	and the process on the other end will get your data in a timely manner,
674	you can't usually do anything to force it to give it back to you
675	in a similarly quick fashion. In this case, we could, because we
676	gave I<cat> a B<-u> flag to make it unbuffered. But very few Unix
677	commands are designed to operate over pipes, so this seldom works
678	unless you yourself wrote the program on the other end of the
679	double-ended pipe.
680
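When the other program is a filter that reads all of its input before answering, the buffering problem can be sidestepped by writing everything, closing the write side, and only then reading. This sketch assumes a sort(1) command is on the PATH; the handle names are illustrative.

```perl
use strict;
use warnings;
use IPC::Open2;

my ($from_kid, $to_kid);
my $pid = open2($from_kid, $to_kid, 'sort');

print {$to_kid} "pear\napple\nfig\n";
close($to_kid);              # end of input: sort can now emit its result

my @sorted = <$from_kid>;    # safe to read only after the writer is closed
close($from_kid);
waitpid($pid, 0);            # open2 does not reap the child for us

print @sorted;
```

This approach does not help with interactive, line-by-line exchanges, and it can still deadlock if the child starts producing large amounts of output before the parent has finished writing.
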
681	A solution to this is the nonstandard F<Comm.pl> library. It uses
682	pseudo-ttys to make your program behave more reasonably:
683
684	require 'Comm.pl';
685	$ph = open_proc('cat -n');
686	for (1..10) {
687	print $ph "a line\n";
688	print "got back ", scalar <$ph>;
689	}
690
691	This way you don't have to have control over the source code of the
692	program you're using. The F<Comm> library also has expect()
693	and interact() functions. Find the library (and we hope its
694	successor F<IPC::Chat>) at your nearest CPAN archive as detailed
695	in the SEE ALSO section below.
696
697	The newer Expect.pm module from CPAN also addresses this kind of thing.
698	This module requires two other modules from CPAN: IO::Pty and IO::Stty.
699	It sets up a pseudo-terminal to interact with programs that insist on
700	talking to the terminal device driver. If your system is
701	amongst those supported, this may be your best bet.
702
703	=head2 Bidirectional Communication with Yourself
704
705	If you want, you may make low-level pipe() and fork() calls
706	to stitch this together by hand. This example only
707	talks to itself, but you could reopen the appropriate
708	handles to STDIN and STDOUT and call other processes.
709
710	#!/usr/bin/perl -w
711	# pipe1 - bidirectional communication using two pipe pairs
712	# designed for the socketpair-challenged
713	use IO::Handle; # thousands of lines just for autoflush :-(
714	pipe(PARENT_RDR, CHILD_WTR); # XXX: failure?
715	pipe(CHILD_RDR, PARENT_WTR); # XXX: failure?
716	CHILD_WTR->autoflush(1);
717	PARENT_WTR->autoflush(1);
718
719	if ($pid = fork) {
720	close PARENT_RDR; close PARENT_WTR;
721	print CHILD_WTR "Parent Pid $$ is sending this\n";
722	chomp($line = <CHILD_RDR>);
723	print "Parent Pid $$ just read this: `$line'\n";
724	close CHILD_RDR; close CHILD_WTR;
725	waitpid($pid,0);
726	} else {
727	die "cannot fork: $!" unless defined $pid;
728	close CHILD_RDR; close CHILD_WTR;
729	chomp($line = <PARENT_RDR>);
730	print "Child Pid $$ just read this: `$line'\n";
731	print PARENT_WTR "Child Pid $$ is sending this\n";
732	close PARENT_RDR; close PARENT_WTR;
733	exit;
734	}
735
736	But you don't actually have to make two pipe calls. If you
737	have the socketpair() system call, it will do this all for you.
738
739	#!/usr/bin/perl -w
740	# pipe2 - bidirectional communication using socketpair
741	# "the best ones always go both ways"
742
743	use Socket;
744	use IO::Handle; # thousands of lines just for autoflush :-(
745	# We say AF_UNIX because although *_LOCAL is the
746	# POSIX 1003.1g form of the constant, many machines
747	# still don't have it.
748	socketpair(CHILD, PARENT, AF_UNIX, SOCK_STREAM, PF_UNSPEC)
749	or die "socketpair: $!";
750
751	CHILD->autoflush(1);
752	PARENT->autoflush(1);
753
754	if ($pid = fork) {
755	close PARENT;
756	print CHILD "Parent Pid $$ is sending this\n";
757	chomp($line = <CHILD>);
758	print "Parent Pid $$ just read this: `$line'\n";
759	close CHILD;
760	waitpid($pid,0);
761	} else {
762	die "cannot fork: $!" unless defined $pid;
763	close CHILD;
764	chomp($line = <PARENT>);
765	print "Child Pid $$ just read this: `$line'\n";
766	print PARENT "Child Pid $$ is sending this\n";
767	close PARENT;
768	exit;
769	}
770
771	=head1 Sockets: Client/Server Communication
772
773	Sockets are not limited to Unix-derived operating systems (e.g., WinSock on PCs
774	provides socket support, as do some VMS libraries), but you may not have
775	sockets on your system, in which case this section probably isn't going to do
776	you much good. With sockets, you can do both virtual circuits (i.e., TCP
777	streams) and datagrams (i.e., UDP packets). You may be able to do even more
778	depending on your system.
779
780	The Perl function calls for dealing with sockets have the same names as
781	the corresponding system calls in C, but their arguments tend to differ
782	for two reasons: first, Perl filehandles work differently than C file
783	descriptors. Second, Perl already knows the length of its strings, so you
784	don't need to pass that information.
785
786	One of the major problems with old socket code in Perl was that it used
787	hard-coded values for some of the constants, which severely hurt
788	portability. If you ever see code that does anything like explicitly
789	setting C<$AF_INET = 2>, you know you're in for big trouble: An
790	immeasurably superior approach is to use the C<Socket> module, which more
791	reliably grants access to various constants and functions you'll need.
792
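As a minimal sketch of what the C<Socket> module gives you, the following
packs a port and an address into the binary form that connect() and bind()
expect and then unpacks it again. The function name C<roundtrip_address> and
the sample port and address are made up for this illustration; the
C<Socket> functions themselves are the real ones.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Socket qw(AF_INET inet_aton inet_ntoa pack_sockaddr_in unpack_sockaddr_in);

# Pack a port and a dotted-quad address into a socket address structure,
# then unpack it again.  No length arguments are needed anywhere: Perl
# already knows how long its strings are.
sub roundtrip_address {
    my ($port, $dotted_quad) = @_;
    my $iaddr = inet_aton($dotted_quad)
        or die "cannot parse address: $dotted_quad";
    my $paddr = pack_sockaddr_in($port, $iaddr);
    my ($got_port, $got_iaddr) = unpack_sockaddr_in($paddr);
    return ($got_port, inet_ntoa($got_iaddr));
}

# The numeric value of AF_INET comes from the system headers, not from us.
my ($port_out, $addr_out) = roundtrip_address(2345, '127.0.0.1');
print "AF_INET is ", AF_INET, " here; address round-trips as $addr_out:$port_out\n";
```

Note that the value printed for C<AF_INET> is whatever your system defines,
which is exactly why you shouldn't hard-code it.
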
If you're not writing a server/client for an existing protocol like
NNTP or SMTP, you should give some thought to how your server will
know when the client has finished talking, and vice-versa. Most
protocols are based on one-line messages and responses (so one party
knows the other has finished when a "\n" is received) or multi-line
messages and responses that end with a period on an empty line
("\n.\n" terminates a message/response).

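Here's one hedged way to read such a multi-line message: collect lines
until one consisting of a lone period turns up. The function name
C<read_dot_terminated> is invented for this sketch, and an in-memory
filehandle stands in for a socket. Real protocols such as SMTP also escape
data lines that begin with a period, which this sketch does not attempt.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Read one multi-line message from $fh.  Lines are collected until a line
# holding a single period is seen.  Returns an array reference of lines
# (terminators removed), or undef if the stream ended before the period.
sub read_dot_terminated {
    my ($fh) = @_;
    my @lines;
    while (defined(my $line = <$fh>)) {
        $line =~ s/\015?\012\z//;       # accept CRLF or a lone LF
        return \@lines if $line eq '.';
        push @lines, $line;
    }
    return;                             # EOF before the terminating "."
}

# Demonstration with an in-memory filehandle standing in for a socket.
my $wire = "first line\nsecond line\n.\nleftover\n";
open(my $fh, '<', \$wire) or die "cannot open in-memory handle: $!";
my $message = read_dot_terminated($fh);
print "got ", scalar(@$message), " lines\n";
```
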
=head2 Internet Line Terminators

The Internet line terminator is "\015\012". Under ASCII variants of
Unix, that could usually be written as "\r\n", but under other systems,
"\r\n" might at times be "\015\015\012", "\012\012\015", or something
completely different. The standards specify writing "\015\012" to be
conformant (be strict in what you provide), but they also recommend
accepting a lone "\012" on input (but be lenient in what you require).
We haven't always been very good about that in the code in this manpage,
but unless you're on a Mac, you'll probably be ok.

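The "strict in what you provide, lenient in what you require" rule can be
sketched as a pair of tiny helpers. The names C<wire_line> and
C<strip_terminator> are hypothetical, chosen only for this example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $CRLF = "\015\012";

# Strict on output: always terminate with CRLF, whatever "\n" means locally.
sub wire_line {
    my ($text) = @_;
    return $text . $CRLF;
}

# Lenient on input: strip either CRLF or a lone LF.
sub strip_terminator {
    my ($line) = @_;
    $line =~ s/\015?\012\z//;
    return $line;
}

for my $incoming ("with CRLF\015\012", "with bare LF\012") {
    print "received: '", strip_terminator($incoming), "'\n";
}
```
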
=head2 Internet TCP Clients and Servers

Use Internet-domain sockets when you want to do client-server
communication that might extend to machines outside of your own system.

Here's a sample TCP client using Internet-domain sockets:

    #!/usr/bin/perl -w
    use strict;
    use Socket;
    my ($remote,$port, $iaddr, $paddr, $proto, $line);

    $remote = shift || 'localhost';
    $port = shift || 2345; # random port
    if ($port =~ /\D/) { $port = getservbyname($port, 'tcp') }
    die "No port" unless $port;
    $iaddr = inet_aton($remote) || die "no host: $remote";
    $paddr = sockaddr_in($port, $iaddr);

    $proto = getprotobyname('tcp');
    socket(SOCK, PF_INET, SOCK_STREAM, $proto) || die "socket: $!";
    connect(SOCK, $paddr) || die "connect: $!";
    while (defined($line = <SOCK>)) {
        print $line;
    }

    close (SOCK) || die "close: $!";
    exit;

And here's a corresponding server to go along with it. We'll
leave the address as INADDR_ANY so that the kernel can choose
the appropriate interface on multihomed hosts. If you want to sit
on a particular interface (like the external side of a gateway
or firewall machine), you should fill this in with your real address
instead.

    #!/usr/bin/perl -Tw
    use strict;
    BEGIN { $ENV{PATH} = '/usr/ucb:/bin' }
    use Socket;
    use Carp;
    my $EOL = "\015\012";

    sub logmsg { print "$0 $$: @_ at ", scalar localtime, "\n" }

    my $port = shift || 2345;
    my $proto = getprotobyname('tcp');

    ($port) = $port =~ /^(\d+)$/ or die "invalid port";

    socket(Server, PF_INET, SOCK_STREAM, $proto) || die "socket: $!";
    setsockopt(Server, SOL_SOCKET, SO_REUSEADDR,
               pack("l", 1)) || die "setsockopt: $!";
    bind(Server, sockaddr_in($port, INADDR_ANY)) || die "bind: $!";
    listen(Server,SOMAXCONN) || die "listen: $!";

    logmsg "server started on port $port";

    my $paddr;

    for ( ; $paddr = accept(Client,Server); close Client) {
        my($port,$iaddr) = sockaddr_in($paddr);
        my $name = gethostbyaddr($iaddr,AF_INET);

        logmsg "connection from $name [",
                inet_ntoa($iaddr), "] at port $port";

        print Client "Hello there, $name, it's now ",
                        scalar localtime, $EOL;
    }

And here's a multitasking version. It's multitasked in that,
like most typical servers, it spawns (forks) a child server to
handle the client request so that the master server can quickly
go back to service a new client.

    #!/usr/bin/perl -Tw
    use strict;
    BEGIN { $ENV{PATH} = '/usr/ucb:/bin' }
    use Socket;
    use Carp;
    my $EOL = "\015\012";

    sub spawn; # forward declaration
    sub logmsg { print "$0 $$: @_ at ", scalar localtime, "\n" }

    my $port = shift || 2345;
    my $proto = getprotobyname('tcp');

    ($port) = $port =~ /^(\d+)$/ or die "invalid port";

    socket(Server, PF_INET, SOCK_STREAM, $proto) || die "socket: $!";
    setsockopt(Server, SOL_SOCKET, SO_REUSEADDR,
               pack("l", 1)) || die "setsockopt: $!";
    bind(Server, sockaddr_in($port, INADDR_ANY)) || die "bind: $!";
    listen(Server,SOMAXCONN) || die "listen: $!";

    logmsg "server started on port $port";

    my $waitedpid = 0;
    my $paddr;

    use POSIX ":sys_wait_h";
    sub REAPER {
        while (($waitedpid = waitpid(-1,WNOHANG)) > 0) {
            logmsg "reaped $waitedpid" . ($? ? " with exit $?" : '');
        }
        $SIG{CHLD} = \&REAPER; # loathe sysV
    }

    $SIG{CHLD} = \&REAPER;

    for ( $waitedpid = 0;
          ($paddr = accept(Client,Server)) || $waitedpid;
          $waitedpid = 0, close Client)
    {
        next if $waitedpid and not $paddr;
        my($port,$iaddr) = sockaddr_in($paddr);
        my $name = gethostbyaddr($iaddr,AF_INET);

        logmsg "connection from $name [",
                inet_ntoa($iaddr), "] at port $port";

        spawn sub {
            $|=1;
            print "Hello there, $name, it's now ", scalar localtime, $EOL;
            exec '/usr/games/fortune' # XXX: `wrong' line terminators
                or confess "can't exec fortune: $!";
        };

    }

    sub spawn {
        my $coderef = shift;

        unless (@_ == 0 && $coderef && ref($coderef) eq 'CODE') {
            confess "usage: spawn CODEREF";
        }

        my $pid;
        if (!defined($pid = fork)) {
            logmsg "cannot fork: $!";
            return;
        } elsif ($pid) {
            logmsg "begat $pid";
            return; # I'm the parent
        }
        # else I'm the child -- go spawn

        open(STDIN, "<&Client") || die "can't dup client to stdin";
        open(STDOUT, ">&Client") || die "can't dup client to stdout";
        ## open(STDERR, ">&STDOUT") || die "can't dup stdout to stderr";
        exit &$coderef();
    }

This server takes the trouble to clone off a child version via fork() for
each incoming request. That way it can handle many requests at once,
which you might not always want. Even if you don't fork(), the listen()
call still lets up to SOMAXCONN pending connections queue up. Forking
servers have to be particularly careful about cleaning up their dead
children (called "zombies" in Unix parlance), because otherwise you'll
quickly fill up your process table.

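To see the reaping on its own, here's a small sketch that forks a few
children which exit immediately and then collects them with a non-blocking
waitpid(). It polls instead of installing a C<$SIG{CHLD}> handler, which is
a simplification for the demonstration and not what the server above does.
The function name C<fork_and_reap> is made up for this example.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use POSIX ":sys_wait_h";

# Fork $count children that exit at once, then reap every one of them.
# Returns the number of children reaped.
sub fork_and_reap {
    my ($count) = @_;
    my %kids;
    for my $n (1 .. $count) {
        my $pid = fork();
        die "cannot fork: $!" unless defined $pid;
        POSIX::_exit(0) if $pid == 0;   # child: leave without END blocks
        $kids{$pid} = 1;
    }
    my $reaped = 0;
    while (%kids) {
        my $pid = waitpid(-1, WNOHANG);
        if ($pid > 0) {                 # one zombie collected
            delete $kids{$pid};
            $reaped++;
        }
        elsif ($pid == 0) {             # children exist, none has exited yet
            select(undef, undef, undef, 0.05);
        }
        else {                          # -1: no children left at all
            last;
        }
    }
    return $reaped;
}

print "reaped ", fork_and_reap(3), " children\n";
```
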
We suggest that you use the B<-T> flag to enable taint checking (see L<perlsec>)
even if you aren't running setuid or setgid. This is always a good idea
for servers and other programs run on behalf of someone else (like CGI
scripts), because it lessens the chances that people from the outside will
be able to compromise your system.

Let's look at another TCP client. This one connects to the TCP "time"
service on a number of different machines and shows how far their clocks
differ from the system on which it's being run:

    #!/usr/bin/perl -w
    use strict;
    use Socket;

    my $SECS_of_70_YEARS = 2208988800;
    sub ctime { scalar localtime(shift) }

    my $iaddr = gethostbyname('localhost');
    my $proto = getprotobyname('tcp');
    my $port = getservbyname('time', 'tcp');
    my $paddr = sockaddr_in(0, $iaddr);
    my($host);

    $| = 1;
    printf "%-24s %8s %s\n", "localhost", 0, ctime(time());

    foreach $host (@ARGV) {
        printf "%-24s ", $host;
        my $hisiaddr = inet_aton($host) || die "unknown host";
        my $hispaddr = sockaddr_in($port, $hisiaddr);
        socket(SOCKET, PF_INET, SOCK_STREAM, $proto) || die "socket: $!";
        connect(SOCKET, $hispaddr) || die "connect: $!";
        my $rtime = ' ';
        read(SOCKET, $rtime, 4);
        close(SOCKET);
        my $histime = unpack("N", $rtime) - $SECS_of_70_YEARS;
        printf "%8d %s\n", $histime - time, ctime($histime);
    }

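The arithmetic in that client is worth a closer look. The "time" service
answers with four bytes holding a big-endian count of seconds since 1900,
so subtracting the 2208988800 seconds between 1900 and 1970 yields ordinary
Unix time. Here is that one step in isolation, as a sketch; the function
name C<time_reply_to_epoch> is hypothetical, and the reply is built by hand
rather than fetched from a server.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Seconds between 1900-01-01 (time protocol epoch) and 1970-01-01 (Unix epoch).
my $SECS_of_70_YEARS = 2208988800;

# Convert the 4-byte, big-endian reply of a time server to Unix time.
sub time_reply_to_epoch {
    my ($reply) = @_;
    die "short reply" unless length($reply) == 4;
    return unpack("N", $reply) - $SECS_of_70_YEARS;
}

# A hand-made reply meaning "one day after the Unix epoch".
my $sample = pack("N", $SECS_of_70_YEARS + 86400);
print scalar gmtime(time_reply_to_epoch($sample)), "\n";
```
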
=head2 Unix-Domain TCP Clients and Servers

That's fine for Internet-domain clients and servers, but what about local
communications? While you can use the same setup, sometimes you don't
want to. Unix-domain sockets are local to the current host, and are often
used internally to implement pipes. Unlike Internet domain sockets, Unix
domain sockets can show up in the file system with an ls(1) listing.

    % ls -l /dev/log
    srw-rw-rw- 1 root 0 Oct 31 07:23 /dev/log

You can test for these with Perl's B<-S> file test:

    unless ( -S '/dev/log' ) {
        die "something's wicked with the log system";
    }

Here's a sample Unix-domain client:

    #!/usr/bin/perl -w
    use Socket;
    use strict;
    my ($rendezvous, $line);

    $rendezvous = shift || 'catsock';
    socket(SOCK, PF_UNIX, SOCK_STREAM, 0) || die "socket: $!";
    connect(SOCK, sockaddr_un($rendezvous)) || die "connect: $!";
    while (defined($line = <SOCK>)) {
        print $line;
    }
    exit;

And here's a corresponding server. You don't have to worry about silly
network terminators here because Unix domain sockets are guaranteed
to be on the localhost, and thus everything works right.

    #!/usr/bin/perl -Tw
    use strict;
    use Socket;
    use Carp;

    BEGIN { $ENV{PATH} = '/usr/ucb:/bin' }
    sub spawn; # forward declaration
    sub logmsg { print "$0 $$: @_ at ", scalar localtime, "\n" }

    my $NAME = 'catsock';
    my $uaddr = sockaddr_un($NAME);

    socket(Server,PF_UNIX,SOCK_STREAM,0) || die "socket: $!";
    unlink($NAME);
    bind (Server, $uaddr) || die "bind: $!";
    listen(Server,SOMAXCONN) || die "listen: $!";

    logmsg "server started on $NAME";

    my $waitedpid;

    use POSIX ":sys_wait_h";
    sub REAPER {
        while (($waitedpid = waitpid(-1,WNOHANG)) > 0) {
            logmsg "reaped $waitedpid" . ($? ? " with exit $?" : '');
        }
        $SIG{CHLD} = \&REAPER; # loathe sysV
    }

    $SIG{CHLD} = \&REAPER;


    for ( $waitedpid = 0;
          accept(Client,Server) || $waitedpid;
          $waitedpid = 0, close Client)
    {
        next if $waitedpid;
        logmsg "connection on $NAME";
        spawn sub {
            print "Hello there, it's now ", scalar localtime, "\n";
            exec '/usr/games/fortune' or die "can't exec fortune: $!";
        };
    }

    sub spawn {
        my $coderef = shift;

        unless (@_ == 0 && $coderef && ref($coderef) eq 'CODE') {
            confess "usage: spawn CODEREF";
        }

        my $pid;
        if (!defined($pid = fork)) {
            logmsg "cannot fork: $!";
            return;
        } elsif ($pid) {
            logmsg "begat $pid";
            return; # I'm the parent
        }
        # else I'm the child -- go spawn

        open(STDIN, "<&Client") || die "can't dup client to stdin";
        open(STDOUT, ">&Client") || die "can't dup client to stdout";
        ## open(STDERR, ">&STDOUT") || die "can't dup stdout to stderr";
        exit &$coderef();
    }

As you see, it's remarkably similar to the Internet domain TCP server, so
much so, in fact, that the spawn(), logmsg(), and REAPER() functions are
exactly the same as in the other server.

So why would you ever want to use a Unix domain socket instead of a
simpler named pipe? Because a named pipe doesn't give you sessions. You
can't tell one process's data from another's. With socket programming,
you get a separate session for each client: that's why accept() takes two
arguments.

For example, let's say that you have a long running database server daemon
that you want folks from the World Wide Web to be able to access, but only
if they go through a CGI interface. You'd have a small, simple CGI
program that does whatever checks and logging you feel like, and then acts
as a Unix-domain client and connects to your private server.

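The "separate session for each client" point can be sketched in a single
script: a Unix-domain listener accepts several forked clients and reads
each one's line from its own accepted handle. This is only an illustration
under assumptions of ours: the function name C<collect_sessions>, the
temporary directory, and the message text are all invented here, and lexical
filehandles are used instead of the barewords in the examples above.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Socket;
use POSIX ();
use File::Temp qw(tempdir);

# Listen on the Unix-domain socket $path, fork $nclients clients that each
# send one line, and return the lines read, one accept()ed session each.
sub collect_sessions {
    my ($path, $nclients) = @_;

    socket(my $server, PF_UNIX, SOCK_STREAM, 0) or die "socket: $!";
    unlink $path;
    bind($server, sockaddr_un($path))           or die "bind: $!";
    listen($server, SOMAXCONN)                  or die "listen: $!";

    my @kids;
    for my $n (1 .. $nclients) {
        my $pid = fork();
        die "cannot fork: $!" unless defined $pid;
        if ($pid) { push @kids, $pid; next }

        # child: connect, say one line, and leave without running END blocks
        socket(my $sock, PF_UNIX, SOCK_STREAM, 0) or POSIX::_exit(1);
        connect($sock, sockaddr_un($path))        or POSIX::_exit(1);
        syswrite($sock, "client $n here\n");
        POSIX::_exit(0);
    }

    my @lines;
    for (1 .. $nclients) {
        accept(my $client, $server) or die "accept: $!";
        my $line = <$client>;       # this handle carries one client's data only
        die "client sent nothing" unless defined $line;
        chomp $line;
        push @lines, $line;
        close $client;
    }
    waitpid($_, 0) for @kids;
    unlink $path;
    my @sorted = sort @lines;
    return @sorted;
}

my $dir = tempdir(CLEANUP => 1);
print "$_\n" for collect_sessions("$dir/catsock", 3);
```

With a named pipe, the three messages would arrive on one shared stream and
could interleave; here each arrives on its own handle.
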
=head1 TCP Clients with IO::Socket

For those preferring a higher-level interface to socket programming, the
IO::Socket module provides an object-oriented approach. IO::Socket is
included as part of the standard Perl distribution as of the 5.004
release. If you're running an earlier version of Perl, just fetch
IO::Socket from CPAN, where you'll also find modules providing easy
interfaces to the following systems: DNS, FTP, Ident (RFC 931), NIS and
NISPlus, NNTP, Ping, POP3, SMTP, SNMP, SSLeay, Telnet, and Time--just
to name a few.

=head2 A Simple Client

Here's a client that creates a TCP connection to the "daytime"
service at port 13 of the host name "localhost" and prints out everything
that the server there cares to provide.

    #!/usr/bin/perl -w
    use IO::Socket;
    $remote = IO::Socket::INET->new(
                        Proto    => "tcp",
                        PeerAddr => "localhost",
                        PeerPort => "daytime(13)",
                    )
                  or die "cannot connect to daytime port at localhost";
    while ( <$remote> ) { print }

When you run this program, you should get something back that
looks like this:

    Wed May 14 08:40:46 MDT 1997

Here are what those parameters to the C<new> constructor mean:

=over 4

=item C<Proto>

This is which protocol to use. In this case, the socket handle returned
will be connected to a TCP socket, because we want a stream-oriented
connection, that is, one that acts pretty much like a plain old file.
Not all sockets are of this type. For example, the UDP protocol
can be used to make a datagram socket, used for message-passing.

=item C<PeerAddr>

This is the name or Internet address of the remote host the server is
running on. We could have specified a longer name like C<"www.perl.com">,
or an address like C<"204.148.40.9">. For demonstration purposes, we've
used the special hostname C<"localhost">, which should always mean the
current machine you're running on. The corresponding Internet address
for localhost is C<"127.0.0.1"> (C<"127.1"> for short), if you'd rather
use that.

=item C<PeerPort>

This is the service name or port number we'd like to connect to.
We could have gotten away with using just C<"daytime"> on systems with a
well-configured system services file (F</etc/services> under Unix),
but just in case, we've specified the
port number (13) in parentheses. Using just the number would also have
worked, but constant numbers make careful programmers nervous.

=back

Notice how the return value from the C<new> constructor is used as
a filehandle in the C<while> loop? That's what's called an indirect
filehandle, a scalar variable containing a filehandle. You can use
it the same way you would a normal filehandle. For example, you
can read one line from it this way:

    $line = <$handle>;

all remaining lines from it this way:

    @lines = <$handle>;

and send a line of data to it this way:

    print $handle "some data\n";

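Because an indirect filehandle is just a scalar, you can pass it to a
subroutine like any other value. The following sketch does the scalar-context
and list-context reads from above inside a function; C<first_and_rest> is a
name invented for this example, and an in-memory file stands in for the
socket.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Read from an indirect filehandle: returns the first line (chomped) and
# the number of lines that followed it.
sub first_and_rest {
    my ($handle) = @_;
    my $first = <$handle>;          # scalar context: one line
    my @rest  = <$handle>;          # list context: everything left
    chomp $first if defined $first;
    return ($first, scalar @rest);
}

my $text = "alpha\nbeta\ngamma\n";
open(my $in, '<', \$text) or die "cannot open in-memory handle: $!";
my ($first, $count) = first_and_rest($in);
print "first line '$first', then $count more\n";
```
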
=head2 A Webget Client

Here's a simple client that takes a remote host to fetch a document
from, and then a list of documents to get from that host. This is a
more interesting client than the previous one because it first sends
something to the server before fetching the server's response.

    #!/usr/bin/perl -w
    use IO::Socket;
    unless (@ARGV > 1) { die "usage: $0 host document ..." }
    $host = shift(@ARGV);
    $EOL = "\015\012";
    $BLANK = $EOL x 2;
    foreach $document ( @ARGV ) {
        $remote = IO::Socket::INET->new( Proto    => "tcp",
                                         PeerAddr => $host,
                                         PeerPort => "http(80)",
                                       );
        unless ($remote) { die "cannot connect to http daemon on $host" }
        $remote->autoflush(1);
        print $remote "GET $document HTTP/1.0" . $BLANK;
        while ( <$remote> ) { print }
        close $remote;
    }

The web server handling the "http" service is assumed to be at
its standard port, number 80. If the web server you're trying to
connect to is at a different port (like 1080 or 8080), you should specify
it as the named-parameter pair, C<< PeerPort => 8080 >>. The C<autoflush>
method is used on the socket because otherwise the system would buffer
up the output we sent it. (If you're on a Mac, you'll also need to
change every C<"\n"> in your code that sends data over the network to
be a C<"\015\012"> instead.)

Connecting to the server is only the first part of the process: once you
have the connection, you have to use the server's language. Each server
on the network has its own little command language that it expects as
input. The string that we send to the server starting with "GET" is in
HTTP syntax. In this case, we simply request each specified document.
Yes, we really are making a new connection for each document, even though
it's the same host. That's the way you always used to have to speak HTTP.
Recent versions of web browsers may request that the remote server leave
the connection open a little while, but the server doesn't have to honor
such a request.

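To make the "server's language" concrete, here is just the request string
from the webget client, built by a small function and printed with its line
terminators made visible. The name C<http10_get> is hypothetical; the
request format is the plain HTTP/1.0 GET used above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $EOL = "\015\012";

# Build the text of a minimal HTTP/1.0 GET request for $document.
# The request line is followed by an empty line, hence two terminators.
sub http10_get {
    my ($document) = @_;
    return "GET $document HTTP/1.0" . $EOL . $EOL;
}

my $request = http10_get('/index.html');
(my $visible = $request) =~ s/\015\012/<CRLF>/g;
print "$visible\n";
```

The empty line after the request line is what tells the server that the
request is complete.
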
Here's an example of running that program, which we'll call I<webget>:

    % webget www.perl.com /guanaco.html
    HTTP/1.1 404 File Not Found
    Date: Thu, 08 May 1997 18:02:32 GMT
    Server: Apache/1.2b6
    Connection: close
    Content-type: text/html

    <HEAD><TITLE>404 File Not Found</TITLE></HEAD>
    <BODY><H1>File Not Found</H1>
    The requested URL /guanaco.html was not found on this server.<P>
    </BODY>

Ok, so that's not very interesting, because it didn't find that
particular document. But a long response wouldn't have fit on this page.

For a more fully-featured version of this program, you should look to
the I<lwp-request> program included with the LWP modules from CPAN.

=head2 Interactive Client with IO::Socket

Well, that's all fine if you want to send one command and get one answer,
but what about setting up something fully interactive, somewhat like
the way I<telnet> works? That way you can type a line, get the answer,
type a line, get the answer, etc.

This client is more complicated than the two we've done so far, but if
you're on a system that supports the powerful C<fork> call, the solution
isn't that rough. Once you've made the connection to whatever service
you'd like to chat with, call C<fork> to clone your process. Each of
these two identical processes has a very simple job to do: the parent
copies everything from the socket to standard output, while the child
simultaneously copies everything from standard input to the socket.
To accomplish the same thing using just one process would be I<much>
harder, because it's easier to code two processes to do one thing than it
is to code one process to do two things. (This keep-it-simple principle
is a cornerstone of the Unix philosophy, and good software engineering as
well, which is probably why it's spread to other systems.)

Here's the code:

    #!/usr/bin/perl -w
    use strict;
    use IO::Socket;
    my ($host, $port, $kidpid, $handle, $line);

    unless (@ARGV == 2) { die "usage: $0 host port" }
    ($host, $port) = @ARGV;

    # create a tcp connection to the specified host and port
    $handle = IO::Socket::INET->new(Proto    => "tcp",
                                    PeerAddr => $host,
                                    PeerPort => $port)
           or die "can't connect to port $port on $host: $!";

    $handle->autoflush(1); # so output gets there right away
    print STDERR "[Connected to $host:$port]\n";

    # split the program into two processes, identical twins
    die "can't fork: $!" unless defined($kidpid = fork());

    # the if{} block runs only in the parent process
    if ($kidpid) {
        # copy the socket to standard output
        while (defined ($line = <$handle>)) {
            print STDOUT $line;
        }
        kill("TERM", $kidpid); # send SIGTERM to child
    }
    # the else{} block runs only in the child process
    else {
        # copy standard input to the socket
        while (defined ($line = <STDIN>)) {
            print $handle $line;
        }
    }

The C<kill> function in the parent's C<if> block is there to send a
signal to our child process (currently running in the C<else> block)
as soon as the remote server has closed its end of the connection.

If the remote server sends data a byte at a time, and you need that
data immediately without waiting for a newline (which might not happen),
you may wish to replace the C<while> loop in the parent with the
following:

    my $byte;
    while (sysread($handle, $byte, 1) == 1) {
        print STDOUT $byte;
    }

Making a system call for each byte you want to read is not very efficient
(to put it mildly) but is the simplest to explain and works reasonably
well.

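If the per-byte system calls bother you, one alternative is to ask sysread()
for a larger chunk; it returns as soon as I<some> data is available, so you
still don't wait for a newline. The sketch below is an assumption-laden
illustration, not part of the client above: C<copy_stream> is a made-up
name, the 4096-byte default is arbitrary, and a pipe stands in for the
socket.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Copy everything from $in to $out, reading up to $chunk bytes per system
# call instead of one.  Returns the total number of bytes copied.
sub copy_stream {
    my ($in, $out, $chunk) = @_;
    $chunk ||= 4096;
    my $total = 0;
    while (1) {
        my $got = sysread($in, my $buffer, $chunk);
        die "sysread: $!" unless defined $got;
        last if $got == 0;                  # end of file
        my $offset = 0;
        while ($offset < $got) {            # syswrite may write less than asked
            my $wrote = syswrite($out, $buffer, $got - $offset, $offset);
            die "syswrite: $!" unless defined $wrote;
            $offset += $wrote;
        }
        $total += $got;
    }
    return $total;
}

# Demonstration: a pipe plays the part of the remote server.
pipe(my $reader, my $writer) or die "pipe: $!";
defined(syswrite($writer, "some data from the server\n")) or die "syswrite: $!";
close $writer;
copy_stream($reader, \*STDOUT, 8);
```
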
=head1 TCP Servers with IO::Socket

As always, setting up a server is a little bit more involved than running a client.
The model is that the server creates a special kind of socket that
does nothing but listen on a particular port for incoming connections.
It does this by calling the C<< IO::Socket::INET->new() >> method with
slightly different arguments than the client did.

=over 4

=item Proto

This is which protocol to use. Like our clients, we'll
still specify C<"tcp"> here.

=item LocalPort

We specify a local
port in the C<LocalPort> argument, which we didn't do for the client.
This is the service name or port number for which you want to be the
server. (Under Unix, ports under 1024 are restricted to the
superuser.) In our sample, we'll use port 9000, but you can use
any port that's not currently in use on your system. If you try
to use one already in use, you'll get an "Address already in use"
message. Under Unix, the C<netstat -a> command will show
which services currently have servers.

=item Listen

The C<Listen> parameter is set to the maximum number of
pending connections we can accept until we turn away incoming clients.
Think of it as a call-waiting queue for your telephone.
The low-level Socket module has a special symbol for the system maximum, which
is SOMAXCONN.

=item Reuse

The C<Reuse> parameter is needed so that we can restart our server
manually without waiting a few minutes to allow system buffers to
clear out.

=back

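Those four parameters can be tried out without picking a port by hand. The
sketch below makes two assumptions of its own: it binds to the loopback
address with C<LocalAddr>, and it passes a C<LocalPort> of 0 so that the
kernel chooses any free port, which you can then read back with the
C<sockport> method. The function name C<make_listener> is invented for this
example.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use IO::Socket;

# Create a listening TCP socket on the loopback interface.  LocalPort 0
# asks the kernel for any free port, avoiding "Address already in use".
sub make_listener {
    my $server = IO::Socket::INET->new(
        Proto     => 'tcp',
        LocalAddr => '127.0.0.1',
        LocalPort => 0,
        Listen    => SOMAXCONN,
        Reuse     => 1,
    ) or die "cannot set up server: $@";
    return $server;
}

my $server = make_listener();
print "listening on port ", $server->sockport, "\n";
```

A real service would of course use a fixed, published port instead, as the
server in this section does.
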
Once the generic server socket has been created using the parameters
listed above, the server then waits for a new client to connect
to it. The server blocks in the C<accept> method, which eventually accepts a
bidirectional connection from the remote client. (Make sure to autoflush
this handle to circumvent buffering.)

To add to user-friendliness, our server prompts the user for commands.
Most servers don't do this. Because of the prompt without a newline,
you'll have to use the C<sysread> variant of the interactive client above.

This server accepts one of five different commands, sending output
back to the client. Note that unlike most network servers, this one
only handles one incoming client at a time. Multithreaded servers are
covered in Chapter 6 of the Camel.

Here's the code:

    #!/usr/bin/perl -w
    use IO::Socket;
    use Net::hostent; # for OO version of gethostbyaddr

    $PORT = 9000; # pick something not in use

    $server = IO::Socket::INET->new( Proto     => 'tcp',
                                     LocalPort => $PORT,
                                     Listen    => SOMAXCONN,
                                     Reuse     => 1);

    die "can't setup server" unless $server;
    print "[Server $0 accepting clients]\n";

    while ($client = $server->accept()) {
        $client->autoflush(1);
        print $client "Welcome to $0; type help for command list.\n";
        $hostinfo = gethostbyaddr($client->peeraddr);
        printf "[Connect from %s]\n", $hostinfo ? $hostinfo->name : $client->peerhost;
        print $client "Command? ";
        while ( <$client>) {
            next unless /\S/; # blank line
            if    (/quit|exit/i) { last; }
            elsif (/date|time/i) { printf $client "%s\n", scalar localtime; }
            elsif (/who/i )      { print $client `who 2>&1`; }
            elsif (/cookie/i )   { print $client `/usr/games/fortune 2>&1`; }
            elsif (/motd/i )     { print $client `cat /etc/motd 2>&1`; }
            else {
                print $client "Commands: quit date who cookie motd\n";
            }
        } continue {
            print $client "Command? ";
        }
        close $client;
    }

=head1 UDP: Message Passing

Another kind of client-server setup is one that uses not connections, but
messages. UDP communications involve much lower overhead but also provide
less reliability, as there are no promises that messages will arrive at
all, let alone in order and unmangled. Still, UDP offers some advantages
over TCP, including being able to "broadcast" or "multicast" to a whole
bunch of destination hosts at once (usually on your local subnet). If you
find yourself overly concerned about reliability and start building checks
into your message system, then you probably should use just TCP to start
with.

Note that UDP datagrams are I<not> a bytestream and should not be treated
as such. This makes using I/O mechanisms with internal buffering
like stdio (i.e. print() and friends) especially cumbersome. Use syswrite(),
or better send(), like in the example below.

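The difference from a bytestream is easy to see with a pair of local
datagram sockets. Be aware that this sketch substitutes a Unix-domain
datagram socketpair() for real UDP so that it needs no network; the point it
illustrates, that each send() is delivered as one whole message to one
recv(), is the datagram behavior described above. The name
C<datagram_roundtrip> is made up for this example.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Socket;

# Send each message as its own datagram over a local datagram socket pair
# and read them back with recv().  Message boundaries are kept: each
# recv() returns exactly one datagram, never a run-together stream.
sub datagram_roundtrip {
    my @messages = @_;
    socketpair(my $left, my $right, AF_UNIX, SOCK_DGRAM, PF_UNSPEC)
        or die "socketpair: $!";
    for my $message (@messages) {
        defined(send($left, $message, 0)) or die "send: $!";
    }
    my @received;
    for (1 .. @messages) {
        my $datagram = '';
        defined(recv($right, $datagram, 1024, 0)) or die "recv: $!";
        push @received, $datagram;
    }
    return @received;
}

print "[$_]\n" for datagram_roundtrip("first", "second message");
```
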
Here's a UDP program similar to the sample Internet TCP client given
earlier. However, instead of checking one host at a time, the UDP version
will check many of them asynchronously by simulating a multicast and then
using select() to do a timed-out wait for I/O. To do something similar
with TCP, you'd have to use a different socket handle for each host.

    #!/usr/bin/perl -w
    use strict;
    use Socket;
    use Sys::Hostname;

    my ( $count, $hisiaddr, $hispaddr, $histime,
         $host, $iaddr, $paddr, $port, $proto,
         $rin, $rout, $rtime, $SECS_of_70_YEARS);

    $SECS_of_70_YEARS = 2208988800;

    $iaddr = gethostbyname(hostname());
    $proto = getprotobyname('udp');
    $port = getservbyname('time', 'udp');
    $paddr = sockaddr_in(0, $iaddr); # 0 means let kernel pick

    socket(SOCKET, PF_INET, SOCK_DGRAM, $proto) || die "socket: $!";
    bind(SOCKET, $paddr) || die "bind: $!";

    $| = 1;
    printf "%-12s %8s %s\n", "localhost", 0, scalar localtime time;
    $count = 0;
    for $host (@ARGV) {
        $count++;
        $hisiaddr = inet_aton($host) || die "unknown host";
        $hispaddr = sockaddr_in($port, $hisiaddr);
        defined(send(SOCKET, 0, 0, $hispaddr)) || die "send $host: $!";
    }

    $rin = '';
    vec($rin, fileno(SOCKET), 1) = 1;

    # timeout after 10.0 seconds
    while ($count && select($rout = $rin, undef, undef, 10.0)) {
        $rtime = '';
        ($hispaddr = recv(SOCKET, $rtime, 4, 0)) || die "recv: $!";
        ($port, $hisiaddr) = sockaddr_in($hispaddr);
        $host = gethostbyaddr($hisiaddr, AF_INET);
        $histime = unpack("N", $rtime) - $SECS_of_70_YEARS;
        printf "%-12s ", $host;
        printf "%8d %s\n", $histime - time, scalar localtime($histime);
        $count--;
    }

Note that this example does not include any retries and may consequently
fail to contact a reachable host. The most prominent reason for this
is congestion of the queues on the sending host if the number of
hosts to contact is sufficiently large.

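Adding retries is mostly a matter of wrapping the send in a loop around the
timed select(). The following is a rough sketch under our own assumptions:
the function names C<wait_readable> and C<request_with_retries>, the retry
count, and the timeout are all invented, and a local datagram socketpair()
with nobody answering stands in for an unreachable time server.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Socket;

# Wait up to $timeout seconds for $socket to become readable.
sub wait_readable {
    my ($socket, $timeout) = @_;
    my $rin = '';
    vec($rin, fileno($socket), 1) = 1;
    my $nfound = select(my $rout = $rin, undef, undef, $timeout);
    return $nfound > 0;
}

# Send $request, wait for a reply, and resend up to $tries times.
# Returns the reply, or undef if every attempt timed out.
sub request_with_retries {
    my ($socket, $request, $tries, $timeout) = @_;
    for my $attempt (1 .. $tries) {
        defined(send($socket, $request, 0)) or die "send: $!";
        next unless wait_readable($socket, $timeout);
        my $reply = '';
        defined(recv($socket, $reply, 1024, 0)) or die "recv: $!";
        return $reply;
    }
    return;
}

# Demonstration: the peer never answers, so every attempt times out.
socketpair(my $client, my $silent_peer, AF_UNIX, SOCK_DGRAM, PF_UNSPEC)
    or die "socketpair: $!";
my $answer = request_with_retries($client, "time?", 2, 0.1);
print defined $answer ? "reply: $answer\n" : "no reply after 2 tries\n";
```
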
=head1 SysV IPC

While System V IPC isn't so widely used as sockets, it still has some
interesting uses. You can't, however, effectively use SysV IPC or
Berkeley mmap() to have shared memory so as to share a variable amongst
several processes. That's because Perl would reallocate your string when
you weren't wanting it to.

Here's a small example showing shared memory usage.

    use IPC::SysV qw(IPC_PRIVATE IPC_RMID S_IRWXU);

    $size = 2000;
    $id = shmget(IPC_PRIVATE, $size, S_IRWXU);
    defined($id) || die "shmget: $!";
    print "shm key $id\n";

    $message = "Message #1";
    shmwrite($id, $message, 0, 60) || die "$!";
    print "wrote: '$message'\n";
    shmread($id, $buff, 0, 60) || die "$!";
    print "read : '$buff'\n";

    # the buffer of shmread is zero-character end-padded.
    substr($buff, index($buff, "\0")) = '';
    print "un" unless $buff eq $message;
    print "swell\n";

    print "deleting shm $id\n";
    shmctl($id, IPC_RMID, 0) || die "$!";

Here's an example of a semaphore:

    use IPC::SysV qw(IPC_CREAT);

    $IPC_KEY = 1234;
    $id = semget($IPC_KEY, 10, 0666 | IPC_CREAT );
    defined($id) || die "semget: $!";
    print "sem id $id\n";

Put this code in a separate file to be run in more than one process.
Call the file F<take>:

    # look up the semaphore set created above

    $IPC_KEY = 1234;
    $id = semget($IPC_KEY, 0 , 0 );
    die if !defined($id);

    $semnum = 0;
    $semflag = 0;

    # 'take' semaphore
    # wait for semaphore to be zero
    $semop = 0;
    $opstring1 = pack("s!s!s!", $semnum, $semop, $semflag);

    # Increment the semaphore count
    $semop = 1;
    $opstring2 = pack("s!s!s!", $semnum, $semop, $semflag);
    $opstring = $opstring1 . $opstring2;

    semop($id,$opstring) || die "$!";

Put this code in a separate file to be run in more than one process.
Call this file F<give>:

    # 'give' the semaphore
    # run this in the original process and you will see
    # that the second process continues

    $IPC_KEY = 1234;
    $id = semget($IPC_KEY, 0, 0);
    die if !defined($id);

    $semnum = 0;
    $semflag = 0;

    # Decrement the semaphore count
    $semop = -1;
    $opstring = pack("s!s!s!", $semnum, $semop, $semflag);

    semop($id,$opstring) || die "$!";

The SysV IPC code above was written long ago, and it's definitely
clunky looking. For a more modern look, see the IPC::SysV module
which is included with Perl starting from Perl 5.005.

A small example demonstrating SysV message queues:

    use IPC::SysV qw(IPC_PRIVATE IPC_RMID IPC_CREAT S_IRWXU);

    my $id = msgget(IPC_PRIVATE, IPC_CREAT | S_IRWXU);

    my $sent = "message";
    my $type_sent = 1234;
    my $rcvd;
    my $type_rcvd;

    if (defined $id) {
        if (msgsnd($id, pack("l! a*", $type_sent, $sent), 0)) {
            if (msgrcv($id, $rcvd, 60, 0, 0)) {
                ($type_rcvd, $rcvd) = unpack("l! a*", $rcvd);
                if ($rcvd eq $sent) {
                    print "okay\n";
                } else {
                    print "not okay\n";
                }
            } else {
                die "# msgrcv failed\n";
            }
        } else {
            die "# msgsnd failed\n";
        }
        msgctl($id, IPC_RMID, 0) || die "# msgctl failed: $!\n";
    } else {
        die "# msgget failed\n";
    }

1647	=head1 NOTES
1648
1649	Most of these routines quietly but politely return C<undef> when they
1650	fail instead of causing your program to die right then and there due to
1651	an uncaught exception. (Actually, some of the new I<Socket> conversion
1652	functions croak() on bad arguments.) It is therefore essential to
1653	check return values from these functions. Always begin your socket
1654	programs this way for optimal success, and don't forget to add the B<-T>
1655	taint-checking flag to the #! line for servers:
1656
1657	#!/usr/bin/perl -Tw
1658	use strict;
1659	use sigtrap;
1660	use Socket;
1661
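To illustrate the advice about return values, here is a minimal
sketch in which every call that can fail is checked and the failing
call is named in the error message. Binding to the loopback address
with port 0 (letting the system pick a free port) is a choice made for
this illustration only.

```perl
use strict;
use warnings;
use Socket;

# Each call returns a false or undefined value on failure,
# so every one of them is checked.
my $proto = getprotobyname('tcp');
defined $proto or die "getprotobyname: $!";

socket(my $sock, PF_INET, SOCK_STREAM, $proto)
    or die "socket: $!";
setsockopt($sock, SOL_SOCKET, SO_REUSEADDR, pack("l", 1))
    or die "setsockopt: $!";

# Port 0 asks the system for any free port (illustrative assumption).
bind($sock, sockaddr_in(0, INADDR_LOOPBACK))
    or die "bind: $!";

my $name = getsockname($sock);
defined $name or die "getsockname: $!";
my ($port) = sockaddr_in($name);
print "bound to loopback port $port\n";

close($sock) or die "close: $!";
```
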
1662	=head1 BUGS
1663
1664	All these routines create system-specific portability problems. As noted
1665	elsewhere, Perl is at the mercy of your C libraries for much of its system
1666	behaviour. It's probably safest to assume broken SysV semantics for
1667	signals and to stick with simple TCP and UDP socket operations; e.g., don't
1668	try to pass open file descriptors over a local UDP datagram socket if you
1669	want your code to stand a chance of being portable.
1670
1671	=head1 AUTHOR
1672
1673	Tom Christiansen, with occasional vestiges of Larry Wall's original
1674	version and suggestions from the Perl Porters.
1675
1676	=head1 SEE ALSO
1677
1678	There's a lot more to networking than this, but this should get you
1679	started.
1680
1681	For intrepid programmers, the indispensable textbook is I<Unix
1682	Network Programming, 2nd Edition, Volume 1> by W. Richard Stevens
1683	(published by Prentice-Hall). Note that most books on networking
1684	address the subject from the perspective of a C programmer; translation
1685	to Perl is left as an exercise for the reader.
1686
1687	The IO::Socket(3) manpage describes the object library, and the Socket(3)
1688	manpage describes the low-level interface to sockets. Besides the obvious
1689	functions in L<perlfunc>, you should also check out the F<modules> file
1690	at your nearest CPAN site. (See L<perlmodlib> or best yet, the F<Perl
1691	FAQ> for a description of what CPAN is and where to get it.)
1692
1693	Section 5 of the F<modules> file is devoted to "Networking, Device Control
1694	(modems), and Interprocess Communication", and contains numerous unbundled
1695	modules for networking, Chat and Expect operations, CGI
1696	programming, DCE, FTP, IPC, NNTP, Proxy, Ptty, RPC, SNMP, SMTP, Telnet,
1697	Threads, and ToolTalk--just to name a few.
