Chapter 8. How to use Aegis with Python

Table of Contents

1. Handling Aegis search paths
1.1. The Aegis model vs. the Python model
1.2. The solution
1.3. Why setting PYTHONPATH to the Aegis search path will not work
2. The build step
3. Testing
4. Running your programs

This section describes how to use Aegis to supervise the development of Python programs. Some of the remarks in this section may also be helpful to people who use Aegis to supervise development in other non-compiled languages.

This section is contributed courtesy of Tangible Business Software, www.tbs.co.za. Python-specific questions relating to this section may be sent to Pieter Nagel at <pnagel@tbs.co.za>.

1. Handling Aegis search paths

1.1. The Aegis model vs. the Python model

Aegis' view of a project is that it consists of a hierarchy of project baselines. Each baseline consists of only those files that were modified as part of that (sub)project, plus all files that were built by the DMT (see the section of the User Guide called The Dependency Maintenance Tool). Aegis expects the DMT to be able to collect the entire project into one piece by searching up this baseline search path for all needed files.

This works fine when using statically compiled languages such as C. The build process "projects" source files from various Aegis baselines onto one or more executables. When these are run they do not need to search through the Aegis search path for parts of themselves; they are already complete.

Python programs, however, are never compiled and linked into a single executable. One could say that a Python program is re-linked each time it is run. This means that the Python program will need to be able to find its components at run-time. More importantly, it will need to avoid importing the old versions of files from the baseline when newer versions are present in the development or integration directories.

1.2. The solution

The simplest (and only recommended) way to marry Aegis and Python is to configure Aegis to keep all of the project's files visible in the development and integration directories, at all times. That way Aegis' search path becomes irrelevant to Python.

Use Aegis version 3.23 or later, and set the following in the project config file:

create_symlinks_before_build
	= true;
remove_symlinks_after_integration_build
	= false;

The second directive is not available in earlier versions of Aegis.

If you keep your Python files in a subdirectory of your project, such as src/python, you will need that directory's relative path in your PYTHONPATH whenever Aegis executes your code for testing, i.e. by setting

test_command="\
PYTHONPATH=$$PYTHONPATH:src/python \
python ...";

in your project configuration file (example split across multiple lines for formatting only).

It may seem strange to you that we are not substituting the Aegis Search_Path variable into PYTHONPATH at all - it does at first seem to be the solution that is called for. The reason why we don't is very simple: it does not work. It is worth stressing the following rule:

Never inject any absolute path of any Aegis baseline into the Python search path.

1.3. Why setting PYTHONPATH to the Aegis search path will not work

The reason why PYTHONPATH does not work as Aegis expects lies in the way Python searches for packages. For a full explanation of what packages are, see Section 6.4 of the Python Tutorial; the crucial point is that a Python package is a directory containing an __init__.py file, which marks that directory as a package and is executed when the package is first imported.

When Python imports anything from a package, Python first searches for the __init__.py file and remembers the absolute path of the directory where it found it. It will thereafter search for all other parts of the package within the same directory. Unless the create_symlinks_before_build and remove_symlinks_after_integration_build settings are enabled, however, the needed files are not guaranteed to all be present in one directory at all times; they will most likely be spread out over the entire Aegis search path.

The result is that if you were to try and use the approach of setting the PYTHONPATH to the Aegis search path, package import will mysteriously fail under (at least) two conditions:

  • Whenever you modify a file in a package without modifying the accompanying __init__.py, Python will find the __init__.py file in the baseline and import the old files from there.

  • Whenever you modify the __init__.py and leave some other file in the package unmodified, Python will find the __init__.py in the development/integration directory but fail to find the unmodified files there.
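
Both failure modes can be reproduced outside Aegis. The following is a minimal sketch, not part of Aegis: it builds two throwaway directories standing in for a development directory and a baseline, puts both on sys.path (development directory first), and reports what Python actually imports. The directory names and the package names aegis_demo_one and aegis_demo_two are made up for illustration.

```python
import importlib
import os
import sys
import tempfile


def write(path, text=""):
    """Create a file (and its parent directories) with the given text."""
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "w") as handle:
        handle.write(text)


def demo():
    """Return (version seen in case 1, outcome of case 2)."""
    with tempfile.TemporaryDirectory() as root:
        baseline = os.path.join(root, "baseline")
        dev = os.path.join(root, "dev")

        # Case 1: mod.py was modified, __init__.py exists only in the baseline.
        write(os.path.join(baseline, "aegis_demo_one", "__init__.py"))
        write(os.path.join(baseline, "aegis_demo_one", "mod.py"), "VERSION = 'old'\n")
        write(os.path.join(dev, "aegis_demo_one", "mod.py"), "VERSION = 'new'\n")

        # Case 2: __init__.py was modified, other.py exists only in the baseline.
        write(os.path.join(dev, "aegis_demo_two", "__init__.py"))
        write(os.path.join(baseline, "aegis_demo_two", "__init__.py"))
        write(os.path.join(baseline, "aegis_demo_two", "other.py"), "VERSION = 'old'\n")

        saved_path = list(sys.path)
        sys.path[:0] = [dev, baseline]  # development directory is searched first
        importlib.invalidate_caches()
        try:
            case_one = importlib.import_module("aegis_demo_one.mod").VERSION
            try:
                importlib.import_module("aegis_demo_two.other")
                case_two = "imported"
            except ImportError:
                case_two = "ImportError"
        finally:
            sys.path[:] = saved_path
        return case_one, case_two


if __name__ == "__main__":
    print(demo())
```

In case 1 the package is bound to the baseline directory, because that is where its __init__.py lives, so the old mod.py is imported even though a newer one sits earlier in the search path. In case 2 the package is bound to the development directory, and other.py is never looked for in the baseline.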

2. The build step

Python programs do not need to be built, compiled, or linked before they can be run, but Aegis requires a build step as part of the development cycle.

One perfectly valid option is to explicitly declare the build step to be a no-op, by setting

build_command = "true";

in the project configuration file. true(1) is a Unix command which is guaranteed to always succeed.

In practice, however, there often are housekeeping chores that can be done as part of the build process, so you can just as well go ahead and create a Makefile, Cook recipe, or script that performs these tasks and make that your build step.

Here are some examples of tasks that can be performed during the build step:

  • Setting the executable flag on your main scripts. Aegis does not store file modes, but it is often convenient to have one or more of the Python source files in your project be executable, so that one does not need to invoke Python explicitly to run them.

  • Deleting unwanted Python object files (.pyc and .pyo files). These could arise when you aerm and delete a Python script, but forget to delete the accompanying Python object file(s). Other files will then mysteriously succeed in importing the removed scripts, where you would expect them to fail. Your build process could use ael -cf and ael -pf to get a list of 'allowed' scripts, and remove all Python object files which do not correspond to any of these.

  • Auto-generating your packages' __init__.py files. Python wants you to declare your intent to have a directory treated as a package by creating the __init__.py file (otherwise a stray directory with a common name like 'string', 'test', 'common' or 'foo' could shadow like-named packages later on in the search path). But since Aegis is, by definition, an authoritative source on what is and what is not part of your program, it can just as well declare your intent for you.
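
The three chores above could be sketched in Python roughly as follows. This is an illustration under stated assumptions, not a script shipped with Aegis: the function names are hypothetical, allowed_sources is assumed to be a list of source paths relative to the Python source root that you have already obtained from Aegis' file listings (parsing that output is not shown), and object files are assumed to sit next to their sources (spam.pyc beside spam.py).

```python
import os
import stat


def make_executable(path):
    """Grant execute permission to whoever already has read permission."""
    mode = os.stat(path).st_mode
    os.chmod(path, mode | ((mode & 0o444) >> 2))


def remove_stale_bytecode(root, allowed_sources):
    """Delete .pyc/.pyo files under root that have no allowed .py source."""
    allowed = {os.path.normpath(p) for p in allowed_sources}
    removed = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Leave interpreter-managed cache directories alone.
        dirnames[:] = [d for d in dirnames if d != "__pycache__"]
        for name in filenames:
            base, ext = os.path.splitext(name)
            if ext not in (".pyc", ".pyo"):
                continue
            source = os.path.relpath(os.path.join(dirpath, base + ".py"), root)
            if os.path.normpath(source) not in allowed:
                os.remove(os.path.join(dirpath, name))
                removed.append(os.path.relpath(os.path.join(dirpath, name), root))
    return removed


def ensure_init_files(root, allowed_sources):
    """Create an empty __init__.py in each subdirectory leading to an allowed .py file."""
    created = []
    for source in allowed_sources:
        if os.path.isabs(source):
            raise ValueError("expected a path relative to root: %r" % source)
        if not source.endswith(".py"):
            continue
        package_dir = os.path.dirname(os.path.normpath(source))
        while package_dir:
            init = os.path.join(root, package_dir, "__init__.py")
            if os.path.isdir(os.path.dirname(init)) and not os.path.exists(init):
                open(init, "w").close()
                created.append(os.path.join(package_dir, "__init__.py"))
            package_dir = os.path.dirname(package_dir)
    return created
```

The source root itself never receives an __init__.py, since it is the directory that goes on the Python path rather than a package. Whether generated __init__.py files should be empty is a project decision; an empty file is simply the smallest thing that declares a package.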

3. Testing

Testing under Aegis using Python is no different from any other language, only much more fun. Python's run-time type checking makes it much easier to develop software from loosely-coupled components. Such components are much more suited to unit testing than strongly-coupled components are.

If the testing script which Aegis invokes is part of your project, there is one important PYTHONPATH-related caveat: when Aegis runs the tests, it specifies them with an absolute path. When Python runs any scripts with an absolute path, it prepends that path to its search path, and the danger is that the baseline directory (with the old, unchanged versions of files) is prepended to the search path when doing regression testing.
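
The behaviour described above is easy to observe. The sketch below writes a one-line script into a temporary directory (standing in for a baseline; the file name t0001a.py is made up), runs it by absolute path in a fresh interpreter, and checks that the first entry of that interpreter's search path is the script's own directory. The function names are hypothetical.

```python
import os
import subprocess
import sys
import tempfile


def first_path_entry_for(script_path):
    """Run script_path in a fresh interpreter and return its sys.path[0]."""
    env = dict(os.environ)
    # Recent Python versions can be told not to prepend the script directory;
    # make sure that setting is not inherited by the child interpreter.
    env.pop("PYTHONSAFEPATH", None)
    output = subprocess.check_output([sys.executable, script_path], env=env)
    return output.decode().strip()


def demo():
    """Return True if the script's directory was prepended to the search path."""
    with tempfile.TemporaryDirectory() as baseline:
        script = os.path.join(baseline, "t0001a.py")
        with open(script, "w") as handle:
            handle.write("import sys\nprint(sys.path[0])\n")
        reported = first_path_entry_for(script)
        return os.path.realpath(reported) == os.path.realpath(baseline)


if __name__ == "__main__":
    print(demo())
```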

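The solution is to use code like this to remove the test's own directory from the Python path:

import os
import sys
# Python prepends the directory containing the script, not the script itself.
selfdir = os.path.dirname(os.path.abspath(sys.argv[0]))
if selfdir in sys.path:
	sys.path.remove(selfdir)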

Instead of copying these lines into each new test file, you may want to centralize that code in a test harness which imports and runs the tests on Aegis' behalf. This harness can also serve as a central place where you can translate test success or failure into the exit statuses Aegis expects.

The test harness must take care to import the file to be tested without needing to add the absolute path of the file to sys.path. Use imp.find_module and imp.load_module.
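
The imp module has been removed from recent Python releases (3.12 and later), so a harness written today would more likely use importlib. The following is a minimal sketch of loading a test file by its path without touching sys.path; the function name load_module_from_path and the default module name are hypothetical.

```python
import importlib.util
import sys


def load_module_from_path(path, name="aegis_test_module"):
    """Load the Python file at path as a module, leaving sys.path untouched."""
    spec = importlib.util.spec_from_file_location(name, path)
    if spec is None or spec.loader is None:
        raise ImportError("cannot load %r as a Python module" % path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)  # runs the file's top-level code
    return module


if __name__ == "__main__":
    # Usage: python harness.py /absolute/path/to/test_file.py
    loaded = load_module_from_path(sys.argv[1])
    print("loaded", loaded.__name__, "from", loaded.__file__)
```

Because the file is located by path rather than by searching, its directory never needs to appear in sys.path, and imports made from inside the test file still resolve through the normal (relative) search path.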

I can strongly recommend PyUnit, the Python Unit Testing Framework by Steve Purcell, available from http://pyunit.sourceforge.net. It is based on Kent Beck and Erich Gamma's JUnit framework for Java, and has been part of the Python standard library, as the unittest module, since Python 2.1.

One bit of advice when using PyUnit: like Aegis, PyUnit also distinguishes between test failures and test errors, but I find it best to report PyUnit test errors as Aegis test failures. This is to ensure that baseline tests fail as Aegis expects them to. PyUnit will consider a test which raises anything other than an AssertionError to be an 'error', but in practice baseline test failures are often AttributeError exceptions which arise when the test invokes methods not present in the old code. This is a legitimate way to verify, as Aegis wants us to, that the test does indeed invoke and test functionality which is not present in the old code.

4. Running your programs

Of course you will at some stage want to run the program(s) you are developing.

The simplest approach is to have your program's main script be located at the top of your Python source tree (src/python in our example). Whenever you run that script, Python will automatically add the directory it was found in to the Python path, and will find all your other files from there.

You can safely let your shell's PATH environment variable point to that script's location, since the shell PATH and the PYTHONPATH do not mutually interfere.

Just avoid the temptation to put the absolute path of that script's directory into your PYTHONPATH; otherwise your development code and baseline code will interfere with each other. This is not an Aegis-specific problem, though, since there would be potential confusion on any system, in any language, where two versions of one program are simultaneously visible from the same search path.