Chapter 8. Testing

Table of Contents

1. Why Bother?
1.1. Projects for which Aegis' Testing is Most Suitable
1.2. Projects for which Aegis' Testing is Useful
1.3. Projects for which Aegis' Testing is Least Useful
2. Writing Tests
2.1. Contributors
2.2. General Guidelines
2.3. Bourne Shell
2.4. Perl
2.5. Batch Testing

This chapter discusses testing, and using Aegis to manage your tests and testing.

1. Why Bother?

Writing tests is extra work, compared to the way many small (and some not-so-small) software shops operate. For this reason, the testing requirement may be turned off.

The win is that the tests hang around forever, catching minor and major slips before they become embarrassing "features" in a released product. Prevention is cheaper than cure: in this case, the tests save work down the track.

All of the "extra work" of writing tests is a long-term win, where old problems never again reappear. All of the "extra work" of reviewing changes means that another pair of eyes sees the code and finds potential problems before they manifest themselves in shipped product. All of the "extra work" of integration ensures that the baseline always works, and is always self-consistent. All of the "extra work" of having a baseline and separate development directories allows multiple parallel development, with no inter-developer interference; and the baseline always works, it is never in an "in-between" state. In each case, not doing this "extra work" is a false economy.

The existence of these tests, though, is what determines which projects are most suited to Aegis and which are not. It should be noted that suitability is a continuous scale, not black-and-white. With effort and resources, almost anything fits.

1.1. Projects for which Aegis' Testing is Most Suitable

Projects most suited to supervision by Aegis are straight programs: what non-systems programmers call "tools" and sometimes "applications". These are programs which take a pile of input, chew on it, and emit a pile of output. The tests can then compare actual outputs with expected outputs.

As an example, you could be writing a sed look-alike, a public domain clone of the Unix™ sed utility. You could write tests which exercise every feature (insertion, deletion, etc.) and generate the expected output with the real Unix™ sed. You write the code, and run the tests; you can immediately see if the output matches expectations.
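
A minimal sketch of such a test might look like this (mysed stands in for the hypothetical look-alike under test; the temporary directory and interrupt handling described later in this chapter are omitted for brevity):

# construct the test input from in-line data
cat > test.in <<EOF
hello world
EOF

# generate the expected output with the real sed...
sed -e 's/world/there/' test.in > expected.out

# ...then run the look-alike and compare
mysed -e 's/world/there/' test.in > test.out
if test $? -ne 0 ; then exit 1 ; fi

diff expected.out test.out
if test $? -ne 0 ; then exit 1 ; fi

exit 0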

This is a simple example. More complex examples exist, such as Aegis itself. The Aegis program is used to supervise its own development. Its tests consist of sequences of Aegis commands, and the actual results are compared against the expected results.

Other types of software have been developed using Aegis: compilers and interpreters, client-server model software, magnetic tape utilities, graphics software such as a ray-tracer. The range is vast, but it does not include all types of software.

1.2. Projects for which Aegis' Testing is Useful

For many years there have been full-screen applications on text terminals. More recently, there has been increasing use of graphical interfaces.

In developing these types of programs it is still possible to use Aegis, but several options need to be explored.

1.2.1. Testing Via Emulators

Screen emulators are available for both full-screen text and X11 applications. Using these emulators, it is possible to test the user interface, and to test via the user interface. As yet, the author knows of no freely available emulators suitable for testing via Aegis. If you find one, please let me know.

1.2.2. Limited Testing

You may choose to use Aegis simply for its ability to provide controlled access to a large source. You still get the history and change mechanisms, the baseline model, the enforced review. You simply don't test all changes, because figuring out what is on the screen, and testing it against expectations, is too hard.

If the program has a command line interface, in addition to the full-screen or GUI interface, the functionality accessible from the command line may be tested using Aegis.

It is possible that "limited testing" actually means "no testing", if you have no functionality accessible from the command line.

1.2.3. Testing Mode

Another alternative is to provide hooks into your program allowing you to substitute a file for user input, and to be able to trigger the dump of a "screen image". The simulated user input can then be fed to the program, and the screen dump (in some terminal-independent form) can be compared against expectations.

This is easier for full-screen applications than for X11 applications. You need to judge the cost-benefit trade-off: the cost of development, the cost of storage space for X11 images, and the cost of not testing.
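
As a rough sketch, suppose the program under test accepted a --replay option to read simulated user input from a file, and a --dump-screen option to write a terminal-independent screen image; both options are hypothetical and merely illustrate the idea. A test could then be as simple as:

# drive the program from canned input and capture the screen image
myapp --replay session.input --dump-screen actual.dump
if test $? -ne 0 ; then exit 1 ; fi

# compare the captured image against expectations
diff expected.dump actual.dump
if test $? -ne 0 ; then exit 1 ; fi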

1.2.4. Manual Tests

The Aegis program provides a manual test facility. It was originally intended for programs which required some physical action from a user, such as "unplug Ethernet cable now" or "mount tape XG356B now". It can also be used to have a user confirm that some on-screen activity has happened.

The problem with manual tests is that they simply don't happen. It is far more pleasant to say "run the automatic tests" and go for a cup of coffee, than to wait while the computer thinks of mindless things to ask you to do. This is human nature: if it can be automated, it is more likely to happen.

1.2.5. Unit Tests

Many folks think of testing as taking the final product and testing it. It is also possible to build specialized unit tests, which exercise specific portions of the code. These tests can then be administered by Aegis, even if the full-blown GUI cannot be.

1.3. Projects for which Aegis' Testing is Least Useful

Another class of software includes things like operating system kernels and firmware: things which are "stand alone". This isolated nature makes them the most difficult to test: to test them you want to provide physical input and watch the physical output. By their very nature, such systems are hard to drive from a shell script, and thus hard to write Aegis tests for.

The above was written in 1991. At this writing (1999) there are embedded Linux projects and operating systems like VxWorks. These are all embedded, and all have excellent network and download support. It is entirely possible (with design support!) to write automatically testable embedded systems.

1.3.1. Operating Systems

It is not impossible, just that few of us have the resources to do it. You need to have a test system and a testing system: the test system has all of its input and outputs connected to the outputs and inputs of the testing system. That is, the testing system controls and drives the test system, and watches what happens.

For example, in the olden days before everyone had PCs and graphics terminals, there were only serial interfaces available. Many operating system vendors tested their products by using computers connected to each serial line to simulate "user input". The system can be rebooted this way, and using dual-ported disks allows different versions of a kernel to be tried, or other test conditions created.

For software houses which write kernels, or device drivers for kernels, or some other kernel work, this is bad news: the Aegis program is probably not for you. It is possible, but there may be more cost-effective development strategies. Of course, you could always use the rest of Aegis, and ignore the testing part.

However, Aegis has been used quite successfully to develop Linux kernel modules. With suitable sudo(1) configuration to permit access to insmod(1) and friends, developers can write test scripts which load device drivers, try them out, and unload them again, all without needing universal root access.
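
Such a test script might be sketched as follows (mydriver and its device node are hypothetical, and the exact insmod(1) and rmmod(1) arguments will depend on your kernel):

# load the freshly built module; sudo must be configured to allow this
sudo insmod ./mydriver.o
if test $? -ne 0 ; then exit 1 ; fi

# exercise the driver through its device node
echo 'ping' > /dev/mydriver
if test $? -ne 0 ; then sudo rmmod mydriver; exit 1 ; fi

# unload it again
sudo rmmod mydriver
if test $? -ne 0 ; then exit 1 ; fi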

The advent of modern tools, such as VMware, which allow one operating system to "host" another, may also permit straightforward testing of kernels and operating systems.

1.3.2. Firmware

Firmware is a similar deal: you need some way to download the code to be tested into the test system, write-protect it to simulate ROM, and have the necessary hardware to drive the inputs and watch the outputs.

As you can see, this is generally not available to run-of-the-mill software houses, but then they rarely write firmware, either. Those that do write firmware usually have the download capabilities, and some kind of remote operation facility.

However, this overlooks the possibility of not only cross-compiling your code for the target system, but also compiling it to run natively on the build system. The firmware (in its host incarnation) then falls into one of the categories above, and may be readily tested. This does not relieve you of the need to test the firmware on the target as well, but it increases the probability that the firmware isn't completely useless before you download it.

By using an object oriented language, such as C++, the polymorphism necessary to cope with multiple environments can be elegantly hidden behind a pure abstract base class. Alternatively, by using a consistent API, you can accomplish the necessary sleight-of-hand at link time.

The unit test method mentioned earlier is also very useful for firmware, even if the device "as a whole" cannot be tested.

2. Writing Tests

This section describes a number of general guidelines for writing better tests, and some pitfalls to be avoided.

There are also a number of suggestions for portability of tests in specific scripting languages; this will definitely be important if you are writing software to publish on the WWW or to distribute via FTP. Portability is often required within an organization, too. Examples include a change in company policy from one 386 Unix™ to another (e.g. the company doesn't like Linux, so now you must use AT&T's SVR4 offering), the development team using gcc until the company finds out and forces you to use the prototype-less compiler supplied with the operating system, or even a requirement that the software being developed must run under both Unix™ and Windows NT.

Note, also, that when using Aegis' heterogeneous build support, portability will again feature prominently.

2.1. Contributors

I'd like to thank Steven Knight <knight@baldmt.com> for writing portions of this information.

If other readers have additional testing techniques, or use other scripting languages, contributions are welcome.

2.2. General Guidelines

This section lists a number of general guidelines for all aegis tests, regardless of implementation language. Use this section to guide how you write tests if the scripting language you choose is not specifically covered in greater detail below.

2.2.1. Choice of Scripting Language

The aegis program uses the test_command field of the project aegis.conf file to specify how tests are executed. The default value of the test_command field:

test_command = "$shell $file_name";

specifies that tests be Bourne shell scripts. You may, however, change the value of test_command to specify some other scripting language interpreter, which allows you to write your test scripts in whatever scripting language is appropriate for your project. The Perl or Python scripting languages, for example, could be used to create test scripts that are portable to systems other than Unix™ systems.
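
For example, to have aegis run your tests with Perl instead, the field could be set to something like:

test_command = "perl $file_name";

Later sections show more elaborate forms of this command which add Perl search-path options.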

This means that if you can write it in your scripting language of choice, you can test it. This includes such things as client-server model interfaces, and multi-user synchronization testing.

2.2.2. No Execute Permission

Under aegis, script files do not have execute permission set, so they should always be invoked by passing the script file to the interpreter, not executing the test directly:

sh filename
perl filename

2.2.3. No Command-Line Arguments

Tests should not expect command line arguments. Tests are not passed the name of the project nor the number of the change.

2.2.4. Identifying the Scripting Language

Even though aegis does not execute the test script directly, it is a good idea to put some indication of its scripting language into the test script. See the sections below for suggested "magic number" identification of scripts in various languages.

2.2.5. Current Directory

Tests are always run with the current directory set to the development directory of the change under test when testing a change, the integration directory when integrating a change, or the baseline when performing independent tests.

A test must not make assumptions about where it is being executed from, except to the extent that it is somewhere a build has been performed. A test must not assume that the current directory is writable, and must not try to write to it, as this could damage the source code of a change under development, potentially destroying weeks of work.

2.2.6. Check Exit Status and Return Values

A test script should check the exit status or return value of every single command or function call, even those which cannot fail. Checking the exit status or return value of every statement in the script ensures that strange permission settings, or disk space problems, will cause the test to fail, rather than plow on and produce spurious results. See the sections below for specific suggestions on checking exit status or return values in various scripting languages.

2.2.7. Temporary Directory

Tests should create a temporary subdirectory in the operating system's temporary directory (typically /tmp on Unix™ systems) and then change its working directory (cd) to this directory. This isolates any vandalism that the program under test may indulge in, and serves as a place to write temporary files.

At the end of the test, it is sufficient to change directory out of the temporary subdirectory and then remove the entire temporary subdirectory hierarchy, rather than track and remove all test files which may or may not be created.

Some Unix™ systems provide other temporary directories, such as /var/tmp, which may provide a better location for a temporary subdirectory for testing (more file system space available, administrator preference, etc.). Test scripts wishing to accommodate alternate temporary directories should use the TMPDIR environment variable (or some other environment variable appropriate to the operating system hosting the tests) as the location for creating their temporary subdirectory, with /tmp as a reasonable default if TMPDIR is not set.

2.2.8. Trap Interrupts

Test scripts should catch appropriate interrupts (1 2 3 and 15 on Unix™ systems) and cause the test to fail. The interrupt handler should perform any cleanup the test requires, such as removing the temporary subdirectory.

2.2.9. PAGER

If the program under test invokes pagers on its output, a la more(1) et al, it should be coded to use the PAGER environment variable. Tests of such programs should always set PAGER to cat so that tests always behave the same, irrespective of invocation method (either by aegis or from the command line).
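
In a Bourne shell test script, for example, this is simply:

PAGER=cat
export PAGER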

2.2.10. Auxiliary Files

If a test requires extra files as input or output to a command, it must construct them itself from in-line data. (See the sections below for more specific information about how to use in-line data in various scripting languages to create files.)

It is almost impossible to determine the location of an auxiliary file, if that auxiliary file is part of the project source. It could be in either the change under test or the baseline.

2.2.11. New Test Templates

Regardless of your choice of scripting language, it is possible to specify most of the repetitious items above in a file template used every time a user creates a new test. See the aent(1) command for more information.

Having the machine do it for you means that you are more likely to do it.

2.3. Bourne Shell

The Bourne shell is available on all flavors of the Unix™ operating system, which allows Bourne shell scripts to be written portably across those systems. Here are some specific guidelines for writing aegis tests using Bourne shell scripts.

2.3.1. Magic Number

Some indication that the test is a Bourne shell script is a good idea. While many systems accept that a first line starting with a colon is a Bourne shell "magic number", a more widely understood "magic number" is

#! /bin/sh

as the first line of the script file.

2.3.2. Check Exit Status

A Bourne shell test script should check the exit status of every single command, even those which cannot fail. Do not rely on, or use, the set -e shell option (it provides no ability to clean up on error).

Checking the exit status involves testing the contents of the $? shell variable. Do not use an if statement wrapped around an execution of the program under test as this will miss core dumps and other terminations caused by signals.
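
A sketch of this style of checking (myprog stands in for the program under test, and the $here and $tmp variables are set up as described in the next section):

fail()
{
    echo 'FAILED' 1>&2
    cd $here
    rm -rf $tmp
    exit 1
}

# check the exit status of every command explicitly
myprog test.in > test.out 2>&1
if test $? -ne 0 ; then fail ; fi

diff expected.out test.out
if test $? -ne 0 ; then fail ; fi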

2.3.3. Temporary Directory

Bourne shell test scripts should create a temporary subdirectory in /tmp (or the directory specified by the TMPDIR environment variable) and then cd into this directory. At the end of the test, or on interrupt, the script should cd out of the temporary subdirectory and then rm -rf it.
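
A typical opening and closing sequence looks something like this sketch (an exit status of 2 tells aegis the test produced no result):

here=`pwd`
tmp=${TMPDIR-/tmp}/$$
mkdir $tmp
if test $? -ne 0 ; then exit 2 ; fi
cd $tmp
if test $? -ne 0 ; then exit 2 ; fi

# ... the body of the test goes here ...

cd $here
rm -rf $tmp
exit 0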

2.3.4. Trap Interrupts

Use the trap statement to catch interrupts 1 2 3 and 15 and cause the test to fail. This should perform any cleanup the test requires, such as removing the temporary directory.
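
For example, assuming the $here and $tmp variables from the previous section:

trap "cd $here; rm -rf $tmp; exit 1" 1 2 3 15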

2.3.5. Auxiliary Files

If a test requires extra files as input or output to a command, it must construct them itself, using here documents:

cat <<EOF >file
contents
of the
file
EOF

See sh(1) for more information.

2.3.6. [ test ]

If you publish to USENET or for FTP, you should always use the test command rather than the square bracket form, as many systems do not have the square bracket form.
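
That is, prefer

if test -f core
then
    fail
fi

to the equivalent square bracket form

if [ -f core ]
then
    fail
fi

(here fail is a cleanup-and-exit function such as the one sketched earlier).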

2.3.7. Other Bourne Shell Portability Issues

The above list covers the most common Bourne shell issues that are relevant to most aegis tests. The documentation for the GNU autoconf utility, however, contains a more exhaustive list of Bourne shell portability issues. If you want (or need) to make your tests as portable as possible, see the documentation for GNU autoconf.

2.4. Perl

Perl is a popular open-source scripting language available on a number of operating systems. Here are some specific guidelines for writing aegis tests using Perl scripts.

2.4.1. Magic Number

Some indication that the test is a Perl script is a good idea. Because Perl is not installed in the same location on all Unix™ systems, a first-line "magic number" such as:

#! /usr/local/bin/perl

that hard-codes the Perl path name will not be portable if you publish your tests.

If the env(1) program is available, a more portable "magic number" for Perl is:

#! /usr/bin/env perl

2.4.2. Check Return Values

A Perl test script should check the return value from every subroutine, even those which cannot fail.

A Perl test script should also check the exit status of every command it executes. Checking the exit status involves testing the contents of the $? variable. See the Perl documentation on "Predefined Variables" for details.

2.4.3. Temporary Directory

Perl test scripts should create a temporary subdirectory in /tmp (or the directory specified by the $ENV{TMPDIR} environment variable) and then chdir into this directory. At the end of the test, or on interrupt, the script should chdir out of the temporary subdirectory and then remove it and its hierarchy. A portable way to do this within a Perl script:

use File::Find;
finddepth(sub {
        if (-d $_) {
            rmdir($_);
        } else {
            unlink($_);
        }
    }, $dir);

2.4.4. Trap Interrupts

Use Perl's $SIG hash to catch interrupts for HUP, INT, QUIT and TERM and cause the test to fail. This should perform any cleanup the test requires, such as removing the temporary directory. A very simple example:

$SIG{HUP} =
$SIG{INT} =
$SIG{QUIT} =
$SIG{TERM} =
	sub { &cleanup; exit(2) };

2.4.5. Auxiliary Files

If a test requires extra files as input or output to a command, it must construct them itself, using in-line data such as here documents. See the Perl documentation for more information.

2.4.6. Exit Values

Aegis expects tests to exit with a status of 0 for success, 1 for failure, and 2 for no result. The following code fragment will map all failed (non-zero) exit values to an exit status of 1, regardless of what Perl module called exit:

END { $? = 1 if $? }

A more complete example could check conditions and set the exit status to 2 to indicate NO RESULT.

2.4.7. Modules

Perl supports the ability to re-use modules of common routines, and to search several directories for modules. This makes it convenient to write modules to share code among the tests in a project.

Any modules that are used by your test scripts (other than the standard modules included by Perl) should be checked in to the project as source files. Test scripts should then import the module(s) via the normal Perl mechanism:

use MyTest;

When a test is run, the module file may actually be in the baseline directory, not the development or integration directories. To make sure that the test invocation finds the module, the test_command field in the project aegis.conf file should use the Perl -I option to search first the local directory and then the baseline:

test_command =
    "perl -I. -I${BaseLine} \
    ${File_Name}";

or, alternatively, if you had created your Perl test modules in a subdirectory named aux:

test_command =
    "perl -I./aux -I${BaseLine}/aux \
    ${File_Name}";

For details on the conventions involved in writing your own modules, consult the Perl documentation or other reference work.

Actually, you need to use the ${search_path} substitution. I'll have to fix this one day.

2.4.8. The Test::Cmd Module

A Test::Cmd module is available on CPAN (the Comprehensive Perl Archive Network) that makes it easy to write Perl scripts that conform to aegis test requirements. The Test::Cmd module supports most of the guidelines mentioned above, including creating a temporary subdirectory, cleaning up the temporary subdirectory on exit or interrupt, writing auxiliary files from in-line contents, and provides methods for exiting on success, failure, or no result. The following example illustrates some of its capabilities:

#! /usr/bin/env perl
use Test::Cmd;
$test = Test::Cmd->new(prog => 'program_under_test',
                       workdir => '');
$ret = $test->write('aux_file', <<EOF);
contents of file
EOF
$test->no_result(! $ret,
    sub { print STDERR "Couldn't write file: $!\n" });
$test->run(args => 'aux_file');
$test->fail($? != 0);
$test->pass;

The various methods supplied by the Test::Cmd module have a number of options to control their behavior.

The Test::Cmd module manipulates file and path names using the operating-system-independent File::Spec module, so the Test::Cmd module can be used to write tests that are portable to any operating system that runs Perl and the program under test.

The Test::Cmd module is available on CPAN. See the module's documentation for details.

2.4.9. The Test and Test::Harness Modules

Perl supplies two modules, Test and Test::Harness, to support its own testing infrastructure. Perl's tests use different conventions than aegis tests; specifically, Perl tests do not use the exit status to indicate the success or failure of the test, as aegis expects. The Test::Harness module expects Perl tests to report the success or failure of individual sub-tests on standard output, and to always exit with a status of 0 to indicate that the script tested everything it was supposed to.

This difference makes it awkward to use the Test and Test::Harness modules for aegis tests. In some circumstances, though, you may be forced to write tests using the Test and Test::Harness modules (for example, if you use aegis to develop a Perl module for distribution) but still wish to have the tests conform to aegis conventions during development.

This can be done by writing each test to use an environment variable to control whether its exit status should conform to aegis or Perl conventions. This is easy when using the Test module to write tests, as its onfail method provides an appropriate place to set the exit status to non-zero if the appropriate environment variable is set. The following code fragment at or near the beginning of each Perl test script accomplishes this:

use Test;
BEGIN { plan tests => 3,
    onfail => sub {
        $? = 1 if $ENV{AEGIS_TEST}
        }
    }

(See the documentation for the Test module for information about using it to write tests.)

There then needs to be a wrapper Perl script around the execution of the tests to set the environment variable. The following script (called mytest.pl for the sake of example) sets the AEGIS_TEST environment variable expected by the previous code fragment:

use Test::Harness;
$ENV{AEGIS_TEST} = 1;
open STDOUT, ">/dev/null" or exit 2;
runtests(@ARGV);
END { $? = 1 if $?;
    print STDERR $? ? "FAILED" : "PASSED", "\n";
}

It also makes its output conform more closely to aegis conventions by redirecting standard output to /dev/null and restricting its reporting of results to a simple FAILED or PASSED on the standard error output.

The last piece of the puzzle is to modify the test_command field of the project aegis.conf file to have the mytest.pl script call the test script:

test_command =
    "perl -I. -I${BaseLine} mytest.pl \
    ${File_Name}";

The Test and Test::Harness modules are part of the standard Perl distribution and do not need to be downloaded from anywhere. Because these modules are part of the standard distribution, they can be used by test scripts without being checked in to the project.

2.4.10. Granularity

By Steven Knight <knight@baldmt.com>

The granularity of Perl and Aegis tests meshes very well at the individual test file (.t) level. Aegis and Test::Harness are simply different harnesses that expect slightly different conventions from the tests they execute: Aegis uses the exit code to communicate an aggregate pass/fail/no-result status, while Test::Harness examines the output from tests to decide whether a failure occurred.

It's actually pretty easy to accommodate both conventions. You can do this as easily as setting the test_command variable in the project configuration file to something like the following:

test_command =
	"perl -MTest::Harness -e 'runtests(\"$fn\"); \
	END {$$? = 1 if $$? }'";

In reality, you'll likely need to add variable expansions to generate -I or other Perl options for the full Aegis search path. The END block takes care of mapping any non-zero Test::Harness exit code to the '1' that Aegis expects to indicate a failure.

The only thing you really lose here is the Test::Harness aggregation of results and timing at the end of a multi-test run. This is more than offset by having Aegis track which tests need to be run for a given change.

Alternatively, you can execute the .t files directly, not through Test::Harness::runtests. This is easily accommodated using the onfail method from the standard Perl Test module in each test. Here's a standard opening block for .t tests:

use Test;
BEGIN { $| = 1; plan tests => 19,
	onfail => sub { $? = 1 if $ENV{AEGIS_TEST} }
}
END {print "not ok 1\n" unless $loaded;}
use Test::Cmd;
$loaded = 1;
ok(1);

That's it (modulo specifying the appropriate number of tests). My .t tests now use the proper exit status to report a failure back to Aegis. The only other piece is configuring the project's "test_command" value to set the AEGIS_TEST environment variable.

You can also use an intermediate script that redirects the tests' STDOUT to /dev/null, if you are used to and like the coarser PASSED/FAILED status.

2.5. Batch Testing

The usual test_command field of the project aegis.conf file runs a single test at a time. When you have a multi-CPU machine, or are able to distribute the testing load across a range of machines, it is often desirable to run several tests at once. The batch_test_command field of the project configuration file exists for this purpose. See aepconf(5) for more information.
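
As a rough sketch only (my-batch-test is a hypothetical driver script which farms the named tests out across several CPUs or machines and writes the combined results; the ${File_Names} and ${Output} substitutions used here are assumptions to be checked against aepconf(5), which is the authoritative reference):

batch_test_command =
    "my-batch-test --output=${Output} ${File_Names}";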