A Better Interface Between Scientists and Data Reduction Software

This is a talk that I gave at the NRAO ALMA Software Development
workshop
(https://science.nrao.edu/facilities/alma/naasc-workshops/almasoft2011/index). The
slides are here:

http://www.mrao.cam.ac.uk/~bn204/publications/2011/2011-10-cv-workshop-slides.pdf

And this is the abstract:

In this talk I will argue that, although it has much improved in recent
years, the typical way we interact with data reduction software (that
is, the software that turns the bulky observed measurements into a
smaller data set of images, spectra, etc., that is then further
interpreted) is still not ideal. Some particularly important
shortcomings are:

- Operations on data are only ever performed in exactly the sequence
supplied by the user (either interactively or through
functions/scripts)

- Commands available to the user often combine instructions about what
needs to be done with the details of how this is done

- Information on (potential) problems in the data is difficult to
share across different observations

The consequence of these apparently simple shortcomings is that data
reduction is much more labour- and compute-intensive than it needs to
be. Since with ALMA, EVLA and forthcoming SKA precursors we are
likely to be short of both available labour and computing power, I
suggest that investing in better interfaces is essential.

In the second part of this talk I will present a set of straw-man
requirements for a better interface between scientists and data
reduction software and show how these requirements would resolve the
shortcomings listed above. I will discuss a possible way of
implementing these and show that such a system can easily build on top
of existing data reduction systems such as CASA. I will argue that the
three benefits of such a system would be:

- Increasing the quantity of science from each scientist and kWh of
compute power by improving their efficiency

- Improving the quality of science by easing repeatability and
reducing scope for error

- Making it easier for user scientists to send instructions to the
observatory on how their data needs to be reduced -- an essential
feature for forthcoming telescopes with extremely high data rates
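
As a rough illustration of the first shortcoming (and not the system
presented in the talk), the sketch below assumes that reduction steps
declare what they depend on instead of being listed in a fixed order.
The step names are hypothetical and are not CASA tasks. Given such
declarations, the software can work out a valid execution order itself
and, after a change, redo only the steps that are affected.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical reduction steps. Each entry says WHAT a step depends on,
# not WHEN it should run; these names are illustrative assumptions.
DEPENDS_ON = {
    "flag_bad_data": [],
    "bandpass_cal": ["flag_bad_data"],
    "gain_cal": ["flag_bad_data", "bandpass_cal"],
    "image": ["gain_cal"],
    "spectrum": ["gain_cal"],
}


def execution_order(depends_on):
    """Return one valid order that respects the declared dependencies."""
    return list(TopologicalSorter(depends_on).static_order())


def stale_steps(depends_on, changed):
    """Return the set of steps that must be redone after `changed` is modified."""
    stale = {changed}
    # Walk in dependency order so a step's inputs are classified before it is.
    for step in execution_order(depends_on):
        if any(dep in stale for dep in depends_on[step]):
            stale.add(step)
    return stale


if __name__ == "__main__":
    print(execution_order(DEPENDS_ON))
    print(sorted(stale_steps(DEPENDS_ON, "gain_cal")))
```

In this sketch, revising the gain calibration marks only the imaging
and spectrum steps for recomputation, while flagging and bandpass
calibration are left alone; in a strictly sequential script the user
would have to work that out by hand or rerun everything.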
