Test::Harness::Beyond - Beyond make test
Test::Harness is responsible for running test scripts, analysing their output and reporting success or failure. When I type make test (or ./Build test) for a module, Test::Harness is usually used to run the tests (not all modules use Test::Harness but the majority do).
To start exploring some of the features of Test::Harness I need to switch from make test to the prove command (which ships with Test::Harness). For the following examples I'll also need a recent version of Test::Harness installed; 3.14 is current as I write.
For the examples I'm going to assume that we're working with a 'normal' Perl module distribution. Specifically I'll assume that typing make or ./Build causes the built, ready-to-install module code to be available below ./blib/lib and ./blib/arch and that there's a directory called 't' that contains our tests. Test::Harness isn't hardwired to that configuration but it saves me from explaining which files live where for each example.
Back to prove; like make test it runs a test suite - but it provides far more control over which tests are executed, in what order and how their results are reported. Typically make test runs all the test scripts below the 't' directory. To do the same thing with prove I type:
prove -rb t
The switches here are -r to recurse into any directories below 't' and -b which adds ./blib/lib and ./blib/arch to Perl's include path so that the tests can find the code they will be testing. If an earlier version of the module I'm testing is already installed I need to be careful about the include path to make sure my tests run against the new code I'm working on rather than the installed version.
Unlike make test, typing prove doesn't automatically rebuild my module. If I forget to run make before prove I will be testing against stale copies of the files below blib - which inevitably leads to confusion. I either get into the habit of typing
make && prove -rb t
or - if I have no XS code that needs to be built - I use the modules below lib instead:
prove -Ilib -r t
So far I've shown you nothing that make test doesn't do. Let's fix that.
If I have failing tests in a test suite that consists of more than a handful of scripts and takes more than a few seconds to run, it rapidly becomes tedious to run the whole suite repeatedly as I track down the problems.
I can tell prove just to run the tests that are failing like this:
prove -b t/this_fails.t t/so_does_this.t
That speeds things up but I have to make a note of which tests are failing and make sure that I run those tests. Instead I can use prove's --state switch and have it keep track of failing tests for me. First I do a complete run of the test suite and tell prove to save the results:
prove -rb --state=save t
That stores a machine readable summary of the test run in a file called '.prove' in the current directory. If I have failures I can then run just the failing scripts like this:
prove -b --state=failed
I can also tell prove to save the results again so that it updates its idea of which tests failed:
prove -b --state=failed,save
As soon as one of my failing tests passes it will be removed from the list of failed tests. Eventually I fix them all and prove can find no failing tests to run:
Files=0, Tests=0, 0 wallclock secs ( 0.00 usr + 0.00 sys = 0.00 CPU)
Result: NOTESTS
As I work on a particular part of my module, the tests that cover that code are the ones most likely to fail. I'd like to run the whole test suite but have it prioritise these 'hot' tests. I can tell prove to do this:
prove -rb --state=hot,save t
All the tests will run but those that failed most recently will be run first. If no tests have failed since I started saving state all tests will run in their normal order. This combines full test coverage with early notification of failures.
The --state switch supports a number of options; for example, to run failed tests first, followed by all remaining tests ordered newest first by the timestamps of the test scripts - and save the results - I can use
prove -rb --state=failed,new,save t
See the prove documentation (type prove --man) for the full list of state options.
When I tell prove to save state it writes a file called '.prove' ('_prove' on Windows) in the current directory. It's a YAML document so it's quite easy to write tools of your own that work on the saved test state - but the format isn't officially documented so it might change without (much) warning in the future.
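For example, here's a rough sketch of a tool that reads the saved state with the CPAN YAML module and lists the scripts it knows about. Because the format is unofficial, the 'tests' key it assumes isn't guaranteed - check your own .prove file before relying on it:

#!/usr/bin/perl
use strict;
use warnings;
use YAML qw(LoadFile);

# Read the saved state. The layout is unofficial; recent versions
# appear to keep a 'tests' hash keyed by script name (an assumption,
# so inspect your own .prove file first).
my $state = LoadFile('.prove');

for my $script ( sort keys %{ $state->{tests} || {} } ) {
    print "$script\n";
}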
If my tests take too long to run I may be able to speed them up by running multiple test scripts in parallel. This is particularly effective if the tests are I/O bound or if I have multiple CPU cores. I tell prove to run my tests in parallel like this:
prove -rb -j 9 t
The -j switch enables parallel testing; the number that follows it is the maximum number of tests to run in parallel. Sometimes tests that pass when run sequentially will fail when run in parallel. For example if two different test scripts use the same temporary file or attempt to listen on the same socket I'll have problems running them in parallel. If I see unexpected failures I need to check my tests to work out which of them are trampling on the same resource and rename temporary files or add locks as appropriate.
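One common fix is to stop hard-coding scratch file names and let File::Temp choose a unique one for each script; a minimal sketch (the file contents here are just a placeholder):

use strict;
use warnings;
use File::Temp qw(tempfile);

# Each test script gets its own temporary file, so two scripts
# running in parallel can't trample on a shared, hard-coded name.
# UNLINK => 1 removes the file when the script exits.
my ( $fh, $filename ) = tempfile( UNLINK => 1 );
print {$fh} "scratch data\n";
close $fh or die "Can't close $filename: $!";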
To get the most performance benefit I want to have the test scripts that take the longest to run start first - otherwise I'll be waiting for the one test that takes nearly a minute to complete after all the others are done. I can use the --state switch to run the tests in slowest to fastest order:
prove -rb -j 9 --state=slow,save t
The Test Anything Protocol (http://testanything.org/) isn't just for Perl. Just about any language can be used to write tests that output TAP. There are TAP-based testing libraries for C, C++, PHP, Python and many others. If I can't find a TAP library for my language of choice it's easy to generate valid TAP. It looks like this:
1..3
ok 1 - init OK
ok 2 - opened file
not ok 3 - appended to file
The first line is the plan - it specifies the number of tests I'm going to run so that it's easy to check that the test script didn't exit before running all the expected tests. The following lines are the test results - 'ok' for pass, 'not ok' for fail. Each test has a number and, optionally, a description. And that's it. Any language that can produce output like that on STDOUT can be used to write tests.
Recently I've been rekindling a two-decades-old interest in Forth. Evidently I have a masochistic streak that even Perl can't satisfy. I want to write tests in Forth and run them using prove (you can find my gforth TAP experiments at https://svn.hexten.net/andy/Forth/Testing/). I can use the --exec switch to tell prove to run the tests using gforth like this:
prove -r --exec gforth t
Alternatively, if the language used to write my tests allows a shebang line I can use that to specify the interpreter. Here's a test written in PHP:
#!/usr/bin/php
<?php
print "1..2\n";
print "ok 1\n";
print "not ok 2\n";
?>
If I save that as t/phptest.t the shebang line will ensure that it runs correctly along with all my other tests.
Subtle interdependencies between test programs can mask problems - for example an earlier test may neglect to remove a temporary file that affects the behaviour of a later test. To find this kind of problem I use the --shuffle and --reverse options to run my tests in random or reversed order.
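For example, to shuffle the whole suite:

prove -rb --shuffle t

or to run it back to front:

prove -rb --reverse t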
If I need a feature that prove doesn't provide I can easily write my own.
Typically you'll want to change either how TAP gets into the parser or what happens to the results that come out of it.
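As a starting point, here's a minimal sketch that uses the TAP::Parser module (which ships with Test::Harness) to run a single script and walk through the TAP it produces; the script name t/basic.t is just a placeholder:

#!/usr/bin/perl
use strict;
use warnings;
use TAP::Parser;

# Feed one test script to the parser and look at each line of TAP
# (plan, test result, comment) as it comes out.
my $parser = TAP::Parser->new( { source => 't/basic.t' } );
while ( my $result = $parser->next ) {
    print $result->as_string, "\n";
}
print $parser->has_problems ? "FAILED\n" : "PASSED\n";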