Some notes on Python relevant to ALMA and ALMA data reduction

Python path vs system path

Both of these are search paths, i.e., lists of directories that are searched in sequence for a file with a specific name. The Python path (sys.path, a Python list) is used when searching for Python modules, i.e., programs written in Python and dynamic libraries that can be loaded into Python. However, if you want to execute a program as a sub-process, it is searched for in the system path (a string of directories separated by the colon character), not the Python path.

You can see these paths as follows:

import sys
# This is the Python path
sys.path

import os
# This is the normal, system, path
os.environ["PATH"]

The Python path is initialised based on the PYTHONPATH environment variable and some standard system directories. The normal, system, path is based on the PATH environment variable.

All this can be relevant to the ALMA data reduction process because the WVR phase calibration program (wvrgcal) is an external command line program. If you want to call it from Python, it will be searched for in os.environ["PATH"] and not in the sys.path list.

There are two options for telling Python where to look for the wvrgcal program:

  1. You can adjust the PATH environment variable before starting Python or CASA:

    PATH=$PATH:/users/bnikolic/p/wvr-0-15-1/bin/ casapy
    

    This way of calling casapy adjusts the environment variable and executes casapy in one go; the change applies only to that invocation.

  2. Adjust the system PATH after starting casapy:

    os.environ["PATH"]+=":/users/bnikolic/p/wvr-0-15-1/bin/"
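
Either way, it can be useful to check whether the program will actually be found before starting a long reduction. The following is a minimal sketch, assuming Python 3.3 or later (shutil.which may not be available in older casapy installations); the helper name find_program is made up for this illustration:

```python
import os
import shutil


def find_program(name, extra_dir=None):
    """Return the full path of the executable `name`, or None if not found.

    The search uses the system PATH (not sys.path). `extra_dir` is an
    optional directory to search in addition to PATH.
    """
    search_path = os.environ.get("PATH", "")
    if extra_dir is not None:
        # os.pathsep is ":" on Unix-like systems
        search_path = search_path + os.pathsep + extra_dir
    return shutil.which(name, path=search_path)


location = find_program("wvrgcal")
if location is None:
    print("wvrgcal was not found on the system PATH")
else:
    print("wvrgcal found at", location)
```

Adding the install directory to sys.path would make no difference to this lookup, because only the system PATH is consulted.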
    

Use relative paths when opening files

If you use absolute paths in your scripts (e.g., /users/bnikolic/test.dat) then almost certainly the script will not be runnable by others without some modification.

It is usually better to use paths relative to the current directory (e.g., test.dat, or anything else that does not start with /) or relative to the user's home directory:

os.path.join(os.environ["HOME"], "test.dat")
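
As a small illustration of the above, the sketch below builds a path under the home directory with os.path.expanduser, which is an alternative to reading the HOME variable directly and does not fail if HOME is unset. The helper name home_file is an assumption made for this example:

```python
import os


def home_file(name):
    """Return the path of `name` inside the current user's home directory."""
    return os.path.join(os.path.expanduser("~"), name)


# A relative path is interpreted with respect to the current directory
relative = "test.dat"
print("relative path resolves to:", os.path.abspath(relative))

# A path under the home directory works for any user running the script
print("home path:", home_file("test.dat"))
```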

Documenting functions

Functions should be documented using the doc string facility rather than code comments. In other words:

def testfn(p1, p2):
     """
     A test function
     """

Is preferable to:

def testfn(p1, p2):
     # A test function
     pass

The reason for this is that Python stores the doc string with the function (as its __doc__ attribute), so the user can retrieve the documentation by typing help(testfn) without looking at the source code. Comments, by contrast, are discarded when the code is parsed.
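
A short sketch of the difference follows. The function bodies are invented for this illustration; only the doc string mechanism matters:

```python
def documented(p1, p2):
    """
    Return the sum of p1 and p2.
    """
    return p1 + p2


def commented(p1, p2):
    # Return the sum of p1 and p2.
    return p1 + p2


# The doc string is available at run time; the comment is not
print(documented.__doc__)
print(commented.__doc__)
```

Calling help(documented) in an interactive session shows the same text together with the function signature.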

The getattr function

The getattr function is very useful in some situations. Here is the help string:

>>> help(getattr)
Help on built-in function getattr in module __builtin__:

getattr(...)
    getattr(object, name[, default]) -> value

    Get a named attribute from an object; getattr(x, 'y') is equivalent to x.y.
    When a default argument is given, it is returned when the attribute doesn't
    exist; without it, an exception is raised in that case.

What it allows you to do is get a member (“attribute”) of an object with a name to be determined at run-time. So, instead of:

exec('j=ephem.%s()' % src)

One can do:

j=getattr(ephem, src)()

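Why is this so much better? Imagine for example that src is “J1000+100” – if you use the exec approach it will be misinterpreted as an addition operation! With getattr, the whole string is treated as a single attribute name, so an unknown name fails cleanly with an AttributeError.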
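
The sketch below shows the same pattern without requiring the ephem package. The catalogue object and its Sun and Moon classes are hypothetical stand-ins for a module such as ephem, and make_body is a name made up for this example:

```python
import types


class Sun:
    """Hypothetical stand-in for a source class."""


class Moon:
    """Hypothetical stand-in for a source class."""


# Stand-in for a module whose attributes are source classes
catalogue = types.SimpleNamespace(Sun=Sun, Moon=Moon)


def make_body(src):
    """Return an instance of the class named `src`, or None if unknown."""
    # The third argument is the default returned when the attribute is missing
    cls = getattr(catalogue, src, None)
    if cls is None:
        return None
    return cls()


print(make_body("Sun"))
# The name is looked up as a whole; nothing is evaluated as an expression
print(make_body("J1000+100"))
```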

Avoid imports within functions

In most situations it is best to import all of the required modules at the beginning of the script or module and not in individual functions. In this way missing modules will be detected as soon as the script/module is loaded and not only after a (possibly computationally expensive) part of it has been run.
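
The failure mode can be illustrated with a small sketch. The module name hypothetical_missing_module is deliberately nonexistent, and the summation is only a stand-in for an expensive computation:

```python
def compute_then_import():
    # Stand-in for a long-running computation
    total = sum(range(1000))
    # The import is attempted only at this point, after the work is done
    import hypothetical_missing_module  # deliberately nonexistent
    return total


# Defining the function raised no error; the problem appears only on the call
try:
    compute_then_import()
    failure = None
except ImportError as err:
    failure = str(err)

print("import failed late:", failure)
```

Had the import been at the top of the file, the same error would have been raised immediately, before any computation started.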