Getopt::Long - Extended processing of command line options
use Getopt::Long;
my $data = "file.dat";
my $length = 24;
my $verbose;
GetOptions ("length=i" => \$length,    # numeric
            "file=s"   => \$data,      # string
            "verbose"  => \$verbose)   # flag
  or die("Error in command line arguments\n");
The Getopt::Long module implements an extended getopt function called GetOptions(). It parses the command line from @ARGV, recognizing and removing specified options and their possible values.
This function adheres to the POSIX syntax for command line options, with GNU extensions. In general, this means that options have long names instead of single letters, and are introduced with a double dash "--". Support for bundling of command line options, as was the case with the more traditional single-letter approach, is provided but not enabled by default.
Command line operated programs traditionally take their arguments from the command line, for example filenames or other information that the program needs to know. Besides arguments, these programs often take command line options as well. Options are not necessary for the program to work, hence the name 'option', but are used to modify its default behaviour. For example, a program could do its job quietly, but with a suitable option it could provide verbose information about what it did.
Command line options come in several flavours. Historically, they are preceded by a single dash - and consist of a single letter.
-l -a -c
Usually, these single-character options can be bundled:
-lac
Options can have values; the value is placed after the option character, sometimes with whitespace in between and sometimes not:
-s 24 -s24
Due to the very cryptic nature of these options, another style was developed that used long names. So instead of a cryptic -l one could use the more descriptive --long. To distinguish between a bundle of single-character options and a long one, two dashes are used to precede the option name. Early implementations of long options used a plus + instead. Also, option values could be specified either like
--size=24
or
--size 24
The + form is now obsolete and strongly deprecated.
Getopt::Long is the Perl5 successor of newgetopt.pl. This was the first Perl module that provided support for handling the new style of command line options, in particular long option names, hence the Perl5 name Getopt::Long. This module also supports single-character options and bundling.
To use Getopt::Long from a Perl program, you must include the following line in your Perl program:
use Getopt::Long;
This will load the core of the Getopt::Long module and prepare your program for using it. Most of the actual Getopt::Long code is not loaded until you really call one of its functions.
In the default configuration, option names may be abbreviated to uniqueness, case does not matter, and a single dash is sufficient, even for long option names. Also, options may be placed between non-option arguments. See "Configuring Getopt::Long" for more details on how to configure Getopt::Long.
The simplest options are the ones that take no values. Their mere presence on the command line enables the option. Popular examples are:
--all --verbose --quiet --debug
Handling simple options is straightforward:
my $verbose = ''; # option variable with default value (false)
my $all = ''; # option variable with default value (false)
GetOptions ('verbose' => \$verbose, 'all' => \$all);
The call to GetOptions() parses the command line arguments that are present in @ARGV and sets the option variable to the value 1 if the option did occur on the command line. Otherwise, the option variable is not touched. Setting the option value to true is often called enabling the option.
The option name as specified to the GetOptions() function is called the option specification. Later we'll see that this specification can contain more than just the option name. The reference to the variable is called the option destination.
GetOptions() will return a true value if the command line could be processed successfully. Otherwise, it will write error messages using die() and warn(), and return a false result.
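As a minimal sketch of the behaviour described above, the following script simulates a command line by assigning to a localized @ARGV (an assumption made only so the example is self-contained; a real program would use the arguments it was started with). The option names are the ones from the example; the file name and the undeclared --bogus option are hypothetical.

```perl
use strict;
use warnings;
use Getopt::Long;

# Simulated command line (an assumption of this sketch).
local @ARGV = ('--verbose', 'input.txt');

my $verbose = '';   # option variable with default value (false)
my $all     = '';   # option variable with default value (false)

my $ok = GetOptions('verbose' => \$verbose, 'all' => \$all);
die "Error in command line arguments\n" unless $ok;

# --verbose occurred, so $verbose is 1; $all was not touched.
# The non-option argument stays behind in @ARGV.
print "verbose=$verbose all='$all' remaining=@ARGV\n";

# An option that was never declared makes GetOptions() return false.
my @messages;
local $SIG{__WARN__} = sub { push @messages, @_ };   # capture the warning
@ARGV = ('--bogus');
my $ok_bad = GetOptions('verbose' => \$verbose);
print "second call reported an error\n" unless $ok_bad;
```

Checking the return value, as the synopsis does with "or die", is what lets the program stop before acting on a partially parsed command line.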
Getopt::Long supports two useful variants of simple options: negatable options and incremental options.
A negatable option is specified with an exclamation mark ! after the option name:
my $verbose = ''; # option variable with default value (false)
GetOptions ('verbose!' => \$verbose);
Now, using --verbose on the command line will enable $verbose, as expected. But it is also allowed to use --noverbose, which will disable $verbose by setting its value to 0. Using a suitable default value, the program can find out whether $verbose is false by default, or disabled by using --noverbose.
(If both --verbose and --noverbose are given, whichever is given last takes precedence.)
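The following sketch illustrates the "suitable default value" idea, assuming undef is chosen as the default so that three states can be told apart. The simulated @ARGV and the $state variable are illustrative assumptions, not part of the module.

```perl
use strict;
use warnings;
use Getopt::Long;

# Simulated command line: both forms given, the last one should win.
local @ARGV = ('--verbose', '--noverbose');

my $verbose;   # undef means "not mentioned on the command line"
GetOptions('verbose!' => \$verbose)
  or die "Error in command line arguments\n";

# Distinguish "never mentioned" from "explicitly disabled".
my $state = !defined($verbose) ? 'default'
          : $verbose           ? 'enabled'
          :                      'disabled';
print "verbose is $state\n";
```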
An incremental option is specified with a plus + after the option name:
my $verbose = 0; # option variable with default value (0, no verbosity)
GetOptions ('verbose+' => \$verbose);
Using --verbose on the command line will increment the value of $verbose. This way the program can keep track of how many times the option occurred on the command line. For example, each occurrence of --verbose could increase the verbosity level of the program.
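A short sketch of counting occurrences, assuming a numeric starting value of 0 and a simulated command line:

```perl
use strict;
use warnings;
use Getopt::Long;

# Simulated command line with the option repeated three times.
local @ARGV = ('--verbose', '--verbose', '--verbose');

my $verbose = 0;   # start counting from zero
GetOptions('verbose+' => \$verbose)
  or die "Error in command line arguments\n";

# Each occurrence incremented the counter by one.
print "verbosity level: $verbose\n";
```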
Usually programs take command line options as well as other arguments, for example, file names. It is good practice to always specify the options first, and the other arguments last. Getopt::Long will, however, allow the options and arguments to be mixed and 'filter out' all the options before passing the rest of the arguments to the program. To stop Getopt::Long from processing further arguments, insert a double dash -- on the command line:
--size 24 -- --all
In this example, --all will not be treated as an option, but passed to the program unharmed, in @ARGV.
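The filtering described above can be sketched as follows, assuming the default configuration. The file names in.txt and out.txt are hypothetical, and the size=i specification uses the value syntax introduced in the next part of this document.

```perl
use strict;
use warnings;
use Getopt::Long;

# Options mixed with other arguments, then a double dash.
local @ARGV = ('in.txt', '--size', '24', 'out.txt', '--', '--all');

my $size = 0;
my $all  = 0;
GetOptions('size=i' => \$size, 'all' => \$all)
  or die "Error in command line arguments\n";

# --size and its value were removed; the double dash itself is consumed;
# everything after it is left alone, so --all did not set $all.
print "size=$size all=$all\n";
print "remaining: @ARGV\n";
```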
For options that take values it must be specified whether the option value is required or not, and what kind of value the option expects.
Three kinds of values are supported: integer numbers, floating point numbers, and strings.
If the option value is required, Getopt::Long will take the command line argument that follows the option and assign this to the option variable. If, however, the option value is specified as optional, this will only be done if that value does not look like a valid command line option itself.
my $tag = ''; # option variable with default value
GetOptions ('tag=s' => \$tag);
In the option specification, the option name is followed by an equals sign = and the letter s. The equals sign indicates that this option requires a value. The letter s indicates that this value is an arbitrary string. Other possible value types are i for integer values, and f for floating point values. Using a colon : instead of the equals sign indicates that the option value is optional. In this case, if no suitable value is supplied, string valued options get an empty string '' assigned, while numeric options are set to 0.
(If the same option appears more than once on the command line, the last given value is used. If you want to take all the values, see below.)
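To make the difference between required and optional values concrete, here is a small sketch. The --level and --name options and all default values are hypothetical, chosen only to show what is assigned when an optional value is missing.

```perl
use strict;
use warnings;
use Getopt::Long;

# --level and --name are given without a value; --tag is given twice.
local @ARGV = ('--tag', 'release', '--level', '--name', '--tag', 'final');

my $tag   = 'none';    # required string value
my $level = 5;         # optional integer value
my $name  = 'unset';   # optional string value

GetOptions('tag=s'   => \$tag,
           'level:i' => \$level,
           'name:s'  => \$name)
  or die "Error in command line arguments\n";

# The last --tag wins; the optional values fall back to 0 and ''.
print "tag=$tag level=$level name='$name'\n";
```

Note that the defaults 5 and 'unset' are overwritten even though no value was supplied: using the option at all triggers the assignment.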
Options sometimes take several values. For example, a program could use multiple directories to search for library files:
--library lib/stdlib --library lib/extlib
To accomplish this behaviour, simply specify an array reference as the destination for the option:
GetOptions ("library=s" => \@libfiles);
Alternatively, you can specify that the option can have multiple values by adding a "@", and pass a reference to a scalar as the destination:
GetOptions ("library=s@" => \$libfiles);
Used with the example above, @libfiles (or @$libfiles) would contain two strings upon completion: "lib/stdlib" and "lib/extlib", in that order. It is also possible to specify that only integer or floating point numbers are acceptable values.
Often it is useful to allow comma-separated lists of values as well as multiple occurrences of the options. This is easy using Perl's split() and join() operators:
GetOptions ("library=s" => \@libfiles);
@libfiles = split(/,/,join(',',@libfiles));
Of course, it is important to choose the right separator string for each purpose.
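A runnable sketch of the split/join idiom, assuming a simulated command line that mixes a comma-separated list with a repeated option (lib/local is a hypothetical third directory):

```perl
use strict;
use warnings;
use Getopt::Long;

# One occurrence carries a comma-separated list, the other a single value.
local @ARGV = ('--library', 'lib/stdlib,lib/extlib', '--library', 'lib/local');

my @libfiles;
GetOptions('library=s' => \@libfiles)
  or die "Error in command line arguments\n";

# Flatten both styles into one list of directories.
@libfiles = split(/,/, join(',', @libfiles));
print "$_\n" for @libfiles;
```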
Warning: What follows is an experimental feature.
Options can take multiple values at once, for example
--coordinates 52.2 16.4 --rgbcolor 255 255 149
This can be accomplished by adding a repeat specifier to the option specification. Repeat specifiers are very similar to the {...} repeat specifiers that can be used with regular expression patterns. For example, the above command line would be handled as follows:
GetOptions('coordinates=f{2}' => \@coor, 'rgbcolor=i{3}' => \@color);
The destination for the option must be an array or array reference.
It is also possible to specify the minimal and maximal number of arguments an option takes. foo=s{2,4} indicates an option that takes at least two and at most four arguments. foo=s{1,} indicates one or more values; foo:s{,} indicates zero or more option values.
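The command line shown earlier can be tried with the sketch below. Since the feature is flagged as experimental, treat this as an illustration of the documented syntax rather than a guarantee for every version of the module.

```perl
use strict;
use warnings;
use Getopt::Long;

# Each option is followed by a fixed number of values.
local @ARGV = ('--coordinates', '52.2', '16.4',
               '--rgbcolor', '255', '255', '149');

my (@coor, @color);
GetOptions('coordinates=f{2}' => \@coor,
           'rgbcolor=i{3}'    => \@color)
  or die "Error in command line arguments\n";

print "coordinates: @coor\n";
print "colour:      @color\n";
```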
If the option destination is a reference to a hash, the option will take, as value, strings of the form key=value. The value will be stored with the specified key in the hash.
GetOptions ("define=s" => \%defines);
Alternatively you can use:
GetOptions ("define=s%" => \$defines);
When used with command line options:
--define os=linux --define vendor=redhat
the hash %defines (or %$defines) will contain two keys, "os" with value "linux" and "vendor" with value "redhat". It is also possible to specify that only integer or floating point numbers are acceptable values. The keys are always taken to be strings.
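A minimal sketch of a hash destination, using the command line from the example and a simulated @ARGV:

```perl
use strict;
use warnings;
use Getopt::Long;

# Simulated command line with two key=value pairs.
local @ARGV = ('--define', 'os=linux', '--define', 'vendor=redhat');

my %defines;
GetOptions('define=s' => \%defines)
  or die "Error in command line arguments\n";

# Print the collected pairs in a stable order.
print "$_ => $defines{$_}\n" for sort keys %defines;
```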
Ultimate control over what should be done when (actually: each time) an option is encountered on the command line can be achieved by designating a reference to a subroutine (or an anonymous subroutine) as the option destination. When GetOptions() encounters the option, it will call the subroutine with two or three arguments. The first argument is the name of the option. (Actually, it is an object that stringifies to the name of the option.) For a scalar or array destination, the second argument is the value to be stored. For a hash destination, the second argument is the key to the hash, and the third argument the value to be stored. It is up to the subroutine to store the value, or do whatever it thinks is appropriate.
A trivial application of this mechanism is to implement options that are related to each other. For example:
my $verbose = ''; # option variable with default value (false)
GetOptions ('verbose' => \$verbose,
            'quiet'   => sub { $verbose = 0 });
Here --verbose and --quiet control the same variable $verbose, but with opposite values.
If the subroutine needs to signal an error, it should call die() with the desired error message as its argument. GetOptions() will catch the die(), issue the error message, and record that an error result must be returned upon completion.
If the text of the error message starts with an exclamation mark ! it is interpreted specially by GetOptions(). There is currently one special command implemented: die("!FINISH") will cause GetOptions() to stop processing options, as if it encountered a double dash --.
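Both uses of die() inside a handler can be sketched together. The --port and --stop options and the range check are hypothetical; the warning is captured through $SIG{__WARN__} only so the example can inspect it.

```perl
use strict;
use warnings;
use Getopt::Long;

my @messages;
local $SIG{__WARN__} = sub { push @messages, @_ };   # capture error messages

# The first --port is out of range, --stop ends option processing early.
local @ARGV = ('--port', '99999', '--stop', '--port', '80');

my $port = 8080;
my $ok = GetOptions(
    'port=i' => sub {
        my ($name, $value) = @_;
        # Signal an error; GetOptions() catches it and keeps going.
        die "$name must be between 1 and 65535\n"
          if $value < 1 || $value > 65535;
        $port = $value;
    },
    # Stop processing as if a double dash had been seen.
    'stop' => sub { die "!FINISH" },
);

print "ok=", ($ok ? 1 : 0), " port=$port remaining=@ARGV\n";
```

Because processing stopped at --stop, the second --port is never handled and remains in @ARGV, while the recorded error makes GetOptions() return false.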
Here is an example of how to access the option name and value from within a subroutine:
GetOptions ('opt=i' => \&handler);
sub handler {
    my ($opt_name, $opt_value) = @_;
    print("Option name is $opt_name and value is $opt_value\n");
}
Often it is user friendly to supply alternate mnemonic names for options. For example --height could be an alternate name for --length. Alternate names can be included in the option specification, separated by vertical bar | characters. To implement the above example:
GetOptions ('length|height=f' => \$length);
The first name is called the primary name, the other names are called aliases. When using a hash to store options, the key will always be the primary name.
Multiple alternate names are possible.
Without additional configuration, GetOptions() will ignore the case of option names, and allow the options to be abbreviated to uniqueness.
GetOptions ('length|height=f' => \$length, "head" => \$head);
This call will allow --l and --L for the length option, but requires at least --hea and --hei for the head and height options.
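A sketch of aliases combined with abbreviation and case-insensitivity, assuming the default configuration and a simulated command line:

```perl
use strict;
use warnings;
use Getopt::Long;

# --hei abbreviates the alias "height"; --HEA abbreviates "head".
local @ARGV = ('--hei', '1.5', '--HEA');

my $length = 0;
my $head   = 0;
GetOptions('length|height=f' => \$length, 'head' => \$head)
  or die "Error in command line arguments\n";

# The alias stores into the same destination as the primary name.
print "length=$length head=$head\n";
```

A shorter abbreviation such as --he would match both head and height, which is why at least three letters are needed for these two.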
Each option specifier consists of two parts: the name specification and the argument specification.
The name specification contains the name of the option, optionally followed by a list of alternative names separated by vertical bar characters. The name is made up of alphanumeric characters, hyphens, and underscores. If pass_through is disabled, a period is also allowed in option names.
length          option name is "length"
length|size|l   name is "length", aliases are "size" and "l"
The argument specification is optional. If omitted, the option is considered boolean; a value of 1 will be assigned when the option is used on the command line.
The argument specification can be
!
The option does not take an argument and may be negated by prefixing it with "no" or "no-". E.g. "foo!" will allow --foo (a value of 1 will be assigned) as well as --nofoo and --no-foo (a value of 0 will be assigned). If the option has aliases, this applies to the aliases as well.
Using negation on a single letter option when bundling is in effect is pointless and will result in a warning.
+
The option does not take an argument and will be incremented by 1 every time it appears on the command line. E.g. "more+", when used with --more --more --more, will increment the value three times, resulting in a value of 3 (provided it was 0 or undefined at first).
The + specifier is ignored if the option destination is not a scalar.
= type [ desttype ] [ repeat ]
The option requires an argument of the given type. Supported types are:
s
String. An arbitrary sequence of characters. It is valid for the argument to start with - or --.
i
Integer. An optional leading plus or minus sign, followed by a sequence of digits.
o
Extended integer, Perl style. This can be either an optional leading plus or minus sign, followed by a sequence of digits, or an octal string (a zero, optionally followed by '0', '1', .. '7'), or a hexadecimal string (0x followed by '0' .. '9', 'a' .. 'f', case insensitive), or a binary string (0b followed by a series of '0' and '1').
f
Real number. For example 3.14, -6.23E24 and so on.
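The extended integer type (written o in a specification) can be illustrated with the sketch below. The option names are hypothetical; the point is only that octal, hexadecimal, binary and plain decimal strings are all converted to ordinary numbers.

```perl
use strict;
use warnings;
use Getopt::Long;

# One value in each of the accepted notations.
local @ARGV = ('--mode', '0755', '--mask', '0xFF',
               '--bits', '0b101', '--count', '42');

my ($mode, $mask, $bits, $count);
GetOptions('mode=o'  => \$mode,
           'mask=o'  => \$mask,
           'bits=o'  => \$bits,
           'count=o' => \$count)
  or die "Error in command line arguments\n";

# All four are stored as plain decimal numbers.
print "mode=$mode mask=$mask bits=$bits count=$count\n";
```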
The desttype can be @ or % to specify that the option is list or hash valued. This is only needed when the destination for the option value is not otherwise specified. It should be omitted when not needed.
The repeat specifies the number of values this option takes per occurrence on the command line. It has the format { [ min ] [ , [ max ] ] }.
min denotes the minimal number of arguments. It defaults to 1 for options with = and to 0 for options with :, see below. Note that min overrules the = / : semantics.
max denotes the maximum number of arguments. It must be at least min. If max is omitted, but the comma is not, there is no upper bound to the number of argument values taken.
: type [ desttype ]
Like =, but designates the argument as optional. If omitted, an empty string will be assigned to string-valued options, and the value zero to numeric options.
Note that if a string argument starts with - or --, it will be considered an option in itself.
: number [ desttype ]
Like :i, but if the value is omitted, the number will be assigned.
If the number is octal, hexadecimal or binary, behaves like :o.
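A sketch of an optional value with an explicit fallback number, assuming hypothetical --jobs and --retries options that both fall back to 4:

```perl
use strict;
use warnings;
use Getopt::Long;

# --jobs is given without a value, --retries with one.
local @ARGV = ('--jobs', '--retries', '3');

my $jobs    = 1;
my $retries = 1;
GetOptions('jobs:4' => \$jobs, 'retries:4' => \$retries)
  or die "Error in command line arguments\n";

# The omitted value is replaced by the number from the specification.
print "jobs=$jobs retries=$retries\n";
```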