Chapter 1
Introduction

 1.1 This cookbook
 1.2 Before you start: computing resources
 1.3 Before you start: software
  1.3.1 Data formats
  1.3.2 Initialising Starlink
  1.3.3 KAPPA and SMURF for data processing
  1.3.4 GAIA for viewing your map
  1.3.5 ORAC-DR for running the pipeline
  1.3.6 PICARD for post-reduction processing
  1.3.7 How to get help
 1.4 Processing options

1.1 This cookbook

This guide is designed to instruct SCUBA-2 users on the best ways to reduce and visualise their data using the Starlink packages Smurf [4], Kappa [7], Gaia [10], ORAC-DR [3] and Picard [11].

This guide covers the topics listed in the contents above.

Throughout this document, a percent sign (%) is used to represent the Unix shell prompt. The text following each % is what you should type to initiate the described action.

1.2 Before you start: computing resources

Before reducing SCUBA-2 data using the Dynamic Iterative Map-Maker you should confirm you have sufficient computing resources for your type of map.

We recommend the following:

  Reduction type       Recommended memory
  Large maps (pong)    96 GB
  Small maps (daisy)   32–64 GB
  Blank fields         32–64 GB


Why these recommendations?

For large-area maps it is important to process a full observation in a single chunk; see the text box on chunking for an explanation of chunking. For large maps run with normal map-maker parameters, a machine with 96 GB of memory is sufficient. The memory should also be as fast as you can afford: because the time-series data are continually being funnelled through the CPU, RAM speed has a direct, linear effect on processing time.

For blank-field surveys or smaller regions of the sky you can usefully run the map-maker with less memory; 32 to 64 GB is reasonable, depending on the specifics of your data set. Smurf is multi-threaded, so multiple cores do help, although beyond eight cores the price/performance gains tend to drop off.

If you have a very large machine (128 GB of memory and 24 cores) you may be able to run two instances of the map-maker in parallel without chunking, depending on the nature of the data. Set the SMURF_THREADS environment variable to an integer value to restrict each map-maker to half the available cores.
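
For example, on a 24-core machine you might limit each map-maker instance to 12 threads before launching it (the value 12 is illustrative; choose half your core count). In a C shell:

  % setenv SMURF_THREADS 12

or, in a Bourne-type shell:

  % export SMURF_THREADS=12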

1.3 Before you start: software

This manual uses software from the Starlink packages: Smurf [4], Kappa [7], Gaia [10], ORAC-DR [3] and Picard [11]. Starlink software must be installed on your system, and Starlink aliases and environment variables must be defined before attempting to reduce any SCUBA-2 data.

1.3.1 Data formats

Data files for SCUBA-2 use the Starlink N-dimensional Data Format (NDF, see Jenness et al. 2014[14]), a hierarchical format which allows additional data and metadata to be stored within a single file. Kappa contains many commands for examining and manipulating NDF structures. The introductory sections of the Kappa document (SUN/95) contain much useful information on the contents of an NDF structure and how to manipulate them.

A single NDF structure describes a single data array with associated meta-data. NDFs are usually stored within files of type “.sdf”. In most cases (but not all), a single .sdf file will contain just one top-level NDF structure, and the NDF can be referred to simply by giving the name of the file (with or without the “.sdf” extension). In many cases, a top-level NDF containing JCMT data will contain other “extension” NDFs buried inside it at a lower level. For instance, raw files contain a number of NDF components which store observation-specific data necessary for subsequent processing. The contents of these (and other) NDF files may be listed with Hdstrace. Each file holding raw JCMT data on disk is also known as a ‘sub-scan’.
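
For example, to list the top-level components of a raw sub-scan file (the file name below is purely illustrative):

  % hdstrace s8a20120501_00001_0001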

The main components of any NDF structure are the data array, optional variance and quality arrays, axis and world co-ordinate system (WCS) information, character components such as the title, label and units, history records, and any extensions (including the FITS airlock, which holds the FITS headers).

The Convert package contains the commands fits2ndf and ndf2fits, which allow interchange between the FITS and NDF formats.
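
As a minimal sketch (map.sdf is an illustrative file name), initialise Convert and then run whichever conversion you need:

  % convert
  % ndf2fits in=map out=map.fits
  % fits2ndf in=map.fits out=map_from_fits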

1.3.2 Initialising Starlink

The commands and environment variables needed to start up the required Starlink packages (Smurf [4], Kappa, etc.) must first be defined. For C shells (csh, tcsh), do:

  % setenv STARLINK_DIR <path to the starlink installation>
  % source $STARLINK_DIR/etc/login
  % source $STARLINK_DIR/etc/cshrc

before using any Starlink commands. For Bourne shells (sh, bash, zsh), do:

  % export STARLINK_DIR=<path to the starlink installation>
  % source $STARLINK_DIR/etc/profile
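
For example, if your Starlink installation is under /usr/local/star (an illustrative path; substitute your own):

  % export STARLINK_DIR=/usr/local/star
  % source $STARLINK_DIR/etc/profile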

1.3.3 KAPPA and SMURF for data processing

The Sub-Millimetre User Reduction Facility, or Smurf, contains the Dynamic Iterative Map-Maker, which will process raw SCUBA-2 data into images (see SUN/258). Kappa, meanwhile, is an application package comprising general-purpose commands, mostly for manipulating and visualising NDF data (see SUN/95). Before starting any data reduction you will want to initialise both Smurf and Kappa.

  % smurf
  % kappa

After entering the above commands, you can access the help information for either package by typing smurfhelp or kaphelp respectively in a terminal, or by using the showme facility to access the hypertext documentation. See Section 1.3.7 for more information.

Tip:
The .sdf extension on filenames need not be specified when running most Starlink commands (the exception is Picard).
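
For example, both of the following commands give identical results for a map stored in map.sdf (an illustrative name):

  % stats map
  % stats map.sdf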

1.3.4 GAIA for viewing your map

Image visualisation can be done with Gaia (see SUN/214). Gaia is a GUI-driven image and data-cube display and analysis tool, which incorporates facilities such as source detection, three-dimensional visualisation, photometry and the ability to query and overlay on-line or local catalogues.

  % gaia map.sdf

Alternatively, the Kappa package includes many visualisation commands that can be run from the shell command line or incorporated easily into your own scripts (see the appendix “Classified KAPPA commands” in SUN/95). These tools are particularly useful for creating more complex composite plots including multiple images, line-plots, etc., such as the multi-image plots in Section 5.3.2.
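
As a minimal sketch (map.sdf is an illustrative file name), you could select an X-windows graphics device and display the image scaled between the 2nd and 98th percentiles:

  % gdset xwindows
  % display map mode=percentiles percentiles=[2,98]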

1.3.5 ORAC-DR for running the pipeline

The ORAC-DR Data Reduction Pipeline [3] (hereafter just Orac-dr) is an automated reduction pipeline. Orac-dr uses Smurf and Kappa (along with other Starlink tools) to perform an automated reduction of the raw data following pre-defined recipes to produce calibrated maps. The following commands initialise Orac-dr ready to process 850 μm and 450 μm data respectively.

  % oracdr_scuba2_850
  % oracdr_scuba2_450

For more information on available recipes and instructions for running the pipeline see Chapter 4.
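
As an illustrative sketch only (mylist.lis is an assumed text file listing the raw files to reduce, and Chapter 4 describes the options in full), a typical 850 μm reduction initialises the pipeline with output directed to the current working directory and then runs it on the file list:

  % oracdr_scuba2_850 -cwd
  % oracdr -loop file -files mylist.lis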

1.3.6 PICARD for post-reduction processing

Picard uses a pipeline system similar to ORAC-DR for post-processing and analysis of reduced data. Picard documentation can be found on the Orac-dr web page, or in SUN/265. All Picard recipes follow the same structure and are run like so:

  % picard -recpars <recipe_params_file> RECIPE <input_files>

where <recipe_params_file> is a text file containing the relevant recipe parameters, RECIPE is the name of the recipe to be run (note the capitalisation), and <input_files> is the list of files to be processed. These files must be in the current directory or in a directory defined by the environment variable ORAC_DATA_IN. A number of Picard recipes are demonstrated in Chapter 8.

Other command-line options include -log xsf, which writes the log to any combination of the screen [s], a file [f] and an X-window [x]. Using s or sf is recommended, as the recipes are short and the X-window closes automatically upon completion.

You do not specify an output filename for Picard; instead, the output name is generated by adding a recipe-dependent suffix to the input filename. If there is more than one input file, the name of the last file is used.
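
For example, running the SCUBA2_MATCHED_FILTER recipe on a file called map.sdf (an illustrative name) writes its result to map_mf.sdf, the _mf suffix being that recipe's identifier:

  % picard -log s SCUBA2_MATCHED_FILTER map.sdf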

You can create a file listing the input files to be passed to Picard for processing. This file is read by Picard via the Unix cat command. For example:

  % picard -log s RECIPE_NAME `cat myfilestoprocess.lis`

To execute the cat command you must enclose it in back quotes. You must also include the .sdf extension on any files passed to Picard.
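
One straightforward way to create such a list is with the shell itself (the wildcard pattern below is illustrative):

  % ls *_reduced.sdf > myfilestoprocess.lis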

Tip:
Unlike other Starlink packages, the .sdf extension must be included when supplying the names of Starlink data files to Picard.

Tip:
If the environment variable ORAC_DATA_OUT is defined, any files created by Picard will be written in that location. Check there if new files are expected but do not appear in your working directory.

1.3.7 How to get help

showme

If you know the name of the Starlink document you want to view, use showme. When run, it launches a new web page or tab displaying the hypertext version of the document.

  % showme sun95

findme

findme searches Starlink documents for a keyword. When run, it launches a new web page or tab listing the results.

  % findme kappa

docfind

docfind searches the internal list files for keywords, and then searches the document titles. The result is displayed using the Unix more command.

  % docfind kappa

Run routines with prompts

You can run any routine with the option prompt after the command name. This will prompt for every available parameter. If you then want a further description of any parameter, type ? at the relevant prompt.

  % makemap prompt
  REF - Ref. NDF /!/ > ?

Google

A simple Google search such as “starlink kappa fitslist” will usually return links to the appropriate documents. However, be aware that the results may include links to out-of-date versions of the document hosted at non-Starlink sites. Always look for results at www.starlink.ac.uk/docs (or www.starlink.ac.uk/devdocs for the cutting-edge development version of the document).


1.4 Processing options

You have two options for processing your data:

(1) running the automated pipeline (Orac-dr), or
(2) performing each step manually.

The pipeline approach is simpler and works well if you have a lot of data to process. Performing each step by hand allows more fine-tuned control of certain processing and analysis steps, and is especially useful for refining the parameters used by the map-maker. However, once the optimal parameters have been determined, it is possible to pass them to the pipeline to process other observations using the same configuration. Chapter 3 and Chapter 5 discuss the manual approach; to use the science pipeline, skip straight to Chapter 4.

The JCMT produces pipeline-reduced files each night for every observation and for each group of repeat observations. These are reduced using the ORAC-DR pipeline with the recipe specified in the MSB. Chapter 4 gives instructions on retrieving reduced data from the JCMT Science Archive at CADC.
