## Chapter 4 The SCUBA-2 Pipeline

### 4.1 Pipeline overview

SCUBA-2 data reduction pipelines have been developed based on the existing Orac-dr pipeline (Cavanagh et al., 2008[3]) used for ACSIS. SCUBA-2 currently uses three distinct pipelines. Users will likely only need to run the science pipeline. The other two, the quick-look (QL) and summit pipelines, are run in real time at the JCMT during data acquisition.

• The science pipeline has access to all the data observed for a given project and adopts a best-possible reduction approach. An image is made for each complete observation, and these images are combined to create the final image. Users wishing to reduce their own data should use this pipeline. This pipeline is responsible for producing the reduced data that are accessible to users via CADC.
• The QL runs quality assurance checks on the data as it comes in. For science data it calculates the noise between 2 Hz and 10 Hz, along with the NEP and effective NEP, for each 30-second scan. These values undergo quality-assurance checks to ensure SCUBA-2 is within an acceptable operating range.
• The summit pipeline is designed to provide a quick map of the data. It does this by running fewer iterations and chunking the data more. The map is a useful guide for observers who wish to check the quality of their data.

The manual for the SCUBA-2 pipeline can be found at SUN/264, while the pipeline software comes as part of the Starlink suite.

### 4.2 The Science Pipeline

The science pipeline will perform the following:

• Run the iterative map-maker.
• Apply the FCF to calibrate to mJy/beam.
• Co-add multiple observations of the same object.
• Apply the matched-filter (blank-field configuration file only).
• Run a source-finding algorithm.

#### 4.2.1 Pipeline recipes

When a project is initially created and MSBs (Minimum Scheduling Blocks) are created in the Observing Tool, the PI can select a pipeline recipe to assign to the data. When the data are run through the science pipeline this recipe is then called by default. This can be overridden on the command line (see Section 4.4). Described below are the main Orac-dr science recipes.

#### 4.2.2 REDUCE_SCAN

Configuration file: dimmconfig_jsa_generic.lis

This recipe uses the configuration file dimmconfig_jsa_generic.lis for makemap, unless the source is identified as a calibrator. After all observations have been processed the data are co-added and calibrated in mJy/beam using the default FCF. The noise and NEFD properties for the co-add are calculated and written to log files (log.noise and log.nefd respectively). Finally, the Cupid task findclumps is run using the FellWalker algorithm (Berry, 2015[2]) to create a source catalogue.

For calibrators, dimmconfig_bright_compact.lis is used and FCFs are derived from the map.

#### 4.2.3 REDUCE_SCAN_CHECKRMS

Configuration file: dimmconfig_jsa_generic.lis

This recipe is the same as REDUCE_SCAN, but includes extra performance estimations determined by SCUBA2_CHECK_RMS (see Picard’s SCUBA2_CHECK_RMS). These extra metrics are written to a log file log.checkrms. Running SCUBA2_CHECK_RMS in the pipeline, rather than as a standalone Picard recipe, allows it to calculate results for co-added maps.

#### 4.2.4 REDUCE_SCAN_EXTENDED_SOURCES

Configuration file: dimmconfig_bright_extended.lis

This is the recipe for processing extended sources. Multiple observations are co-added and the output map is calibrated in units of mJy/arcsec${}^{2}$. This recipe also performs a source-finder routine; the results are written as a FITS catalogue (with file extension .FIT) which can be read as a local catalogue into Gaia.

#### 4.2.5 REDUCE_SCAN_FAINT_POINT_SOURCES

Configuration file: dimmconfig_blank_field.lis

This is the recipe for processing maps containing faint compact sources. This time the configuration file called by makemap is dimmconfig_blank_field.lis and the map is calibrated in mJy/beam. The output map is further processed with a matched filter, then the S/N is taken to enhance point sources. A map is written out at each step. This recipe also performs a source-finder routine; the results are written as a FITS catalogue (with file extension .FIT) which can be read as a local catalogue into Gaia.

#### 4.2.6 REDUCE_SCAN_ISOLATED_SOURCE

Configuration file: dimmconfig_bright_compact.lis

This is the recipe used for processing calibrator data. It can also be used for any map of a single bright, isolated source at the tracking position.

This reduction constrains the map to zero beyond a radius of 1 arc-min from the source centre. See Section 3.7.3.

#### 4.2.7 FAINT_POINT_SOURCES_JACKKNIFE

Configuration file: dimmconfig_blank_field.lis

This recipe uses a jack-knife method to remove residual low-spatial frequency noise and create an optimal matched-filtered output map. The map-maker is run twice, first as a standard reduction using dimmconfig_blank_field.lis (and calibrated in mJy/beam), and the second time with a fake source added to the time series. This creates a signal map and an effective PSF map. A jack-knife map is generated from two halves of the dataset and the maps are ‘whitened’ by the removal of the residual 1/f noise. The whitened signal map is processed with the matched filter using the whitened PSF map as the PSF input. The data are calibrated in mJy/beam using a corrected FCF. See Section 7.1.2 for a more-detailed description of this recipe and the files produced.
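The recipe-to-configuration pairings listed in Sections 4.2.2 to 4.2.7 can be collected into a small lookup table. The sketch below is only a convenience for scripts; the names `RECIPE_CONFIG` and `default_config` are hypothetical and are not part of Orac-dr. Note that calibrators reduced with REDUCE_SCAN use dimmconfig_bright_compact.lis instead, which this table does not model.

```python
# Default makemap configuration file for each recipe, as listed in
# Sections 4.2.2 to 4.2.7. The dictionary itself is a hypothetical helper.
RECIPE_CONFIG = {
    "REDUCE_SCAN": "dimmconfig_jsa_generic.lis",
    "REDUCE_SCAN_CHECKRMS": "dimmconfig_jsa_generic.lis",
    "REDUCE_SCAN_EXTENDED_SOURCES": "dimmconfig_bright_extended.lis",
    "REDUCE_SCAN_FAINT_POINT_SOURCES": "dimmconfig_blank_field.lis",
    "REDUCE_SCAN_ISOLATED_SOURCE": "dimmconfig_bright_compact.lis",
    "FAINT_POINT_SOURCES_JACKKNIFE": "dimmconfig_blank_field.lis",
}


def default_config(recipe):
    """Return the default configuration file for a recipe name."""
    try:
        return RECIPE_CONFIG[recipe]
    except KeyError:
        raise ValueError(f"unknown recipe: {recipe}") from None


print(default_config("REDUCE_SCAN_EXTENDED_SOURCES"))
```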

### 4.3 Running the Science Pipeline

Step 1: Initialise ORAC-DR
For 850 micron data, this is done by:

% oracdr_scuba2_850 -cwd

For 450 micron data, this is done by:

% oracdr_scuba2_450 -cwd

Step 2: Set environment variables
These ensure the data are read from and written to the right places. Many are set automatically when the pipeline is initialised but others must be set manually. Details of the optional variables are given in SUN/264 but the three main ones are:

• STARLINK_DIR – Location of your Starlink installation.
• ORAC_DATA_IN – The location where the data should be read from. If you are supplying a text file listing the raw data this should be the location of that file.
• ORAC_DATA_OUT – The location where the data products should be written. Also used as the location for a user-specified configuration file.

Step 3: Run the pipeline
This is done by:

% oracdr -loop file -files <list_of_files>

where the list of files that you wish to reduce can cover a single observation or multiple observations.
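The file list and the two data variables described above can be prepared from a script. The following is a minimal sketch under stated assumptions: the list is taken to hold one file path per line, and the helper names, the list name `mylist.txt` and the `*.sdf` pattern are hypothetical choices for illustration. The demonstration uses empty stand-in files rather than real raw data.

```python
import os
import tempfile
from pathlib import Path


def write_file_list(data_in, list_name="mylist.txt", pattern="*.sdf"):
    """Write the raw files found in data_in to a text file in that directory.

    Assumption: one file path per line. The list name and glob pattern
    are hypothetical choices for this sketch.
    """
    data_in = Path(data_in)
    raw_files = sorted(p for p in data_in.glob(pattern) if p.is_file())
    list_path = data_in / list_name
    list_path.write_text("".join(f"{p}\n" for p in raw_files))
    return list_path


def pipeline_environment(data_in, data_out, base_env=None):
    """Return a copy of the environment with ORAC_DATA_IN and ORAC_DATA_OUT set."""
    env = dict(os.environ if base_env is None else base_env)
    env["ORAC_DATA_IN"] = str(data_in)
    env["ORAC_DATA_OUT"] = str(data_out)
    return env


# Demonstration with empty stand-in files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    raw_dir = Path(tmp) / "raw"
    out_dir = Path(tmp) / "reduced"
    raw_dir.mkdir()
    out_dir.mkdir()
    for name in ("s8a20120725_00045_0003.sdf", "s8a20120725_00045_0001.sdf"):
        (raw_dir / name).touch()
    demo_list = write_file_list(raw_dir)
    demo_lines = demo_list.read_text().splitlines()
    demo_env = pipeline_environment(raw_dir, out_dir)
    print(demo_lines)
```

The list is written into the ORAC_DATA_IN directory because that is where the pipeline is described as looking for it.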

Tip:
If you run with -verbose on the command line then you will obtain all messages from the Starlink engines (rather than just ORAC-DR messages). This is particularly useful for understanding what is occurring during the map-maker stage of reduction and is recommended for new users.

When the Orac-dr command is executed, a new X window will appear containing the pipeline output, as shown in Figures 4.1 to 4.4.

### 4.4 Changing the defaults

#### 4.4.1 Changing ORAC-DR’s behavior

ORAC-DR’s behavior can be changed on the command line. For help simply type

% oracdr -help

To run the pipeline and obtain all messages from the Starlink engines (rather than just ORAC-DR messages) you will need to run with verbose (recommended)

% oracdr -loop file -files <list_of_files>  -verbose

The -log option controls where the pipeline results are sent: s sends them to the screen and f to a file. The file is usually called .oracdr_NNNN.log, where NNNN is the current process ID; it is a hidden file written to \$ORAC_DATA_OUT. To send the results to both the screen and a file:

% oracdr -files <list_of_files> -loop file  -log sf -verbose

#### 4.4.2 Changing the pipeline recipe

You can override the recipe set in the header by listing any different one on the command line when starting Orac-dr. For example

% oracdr -files <list_of_files> -loop file -log sf REDUCE_SCAN_CHECKRMS

You can find out which recipe is set in the data header via the FITS header RECIPE keyword in any of your raw files. For example both of these options will return the same result:

% fitsval s8a20120725_00045_0003 RECIPE
% fitslist s8a20120725_00045_0003 | grep RECIPE

#### 4.4.3 Changing the configuration file

Although each recipe calls one of the standard configuration files you can specify your own. You will need to create a recipe parameter file. This file will set the parameter MAKEMAP_CONFIG to be your new configuration file. The first line must be the name of the recipe used in the reduction.

For example, to run the pipeline with REDUCE_SCAN_CHECKRMS with a configuration file called myconfig.lis, the recipe parameter file (mypars.ini) will look like this.

[REDUCE_SCAN_CHECKRMS]
MAKEMAP_CONFIG = myconfig.lis

Then run the pipeline calling the parameter file via the -recpars option.

% oracdr -files <list_of_files> -loop file -log sf -recpars mypars.ini REDUCE_SCAN_CHECKRMS
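When the pipeline is launched from a script, the command-line options described in this section can be assembled programmatically. The sketch below only builds and prints the argument list; it does not run the pipeline. The function name `build_oracdr_command` is hypothetical, and only options that appear in this chapter are used.

```python
import shlex


def build_oracdr_command(file_list, recipe=None, recpars=None, log=None, verbose=False):
    """Assemble an oracdr argument list from the options described in Section 4.4."""
    cmd = ["oracdr", "-loop", "file", "-files", str(file_list)]
    if log:
        cmd += ["-log", log]              # e.g. "sf" for screen and file
    if recpars:
        cmd += ["-recpars", str(recpars)]  # recipe parameter file
    if verbose:
        cmd.append("-verbose")
    if recipe:
        cmd.append(recipe)                # overrides the recipe in the header
    return cmd


cmd = build_oracdr_command(
    "mylist.txt", recipe="REDUCE_SCAN_CHECKRMS", recpars="mypars.ini", log="sf"
)
print(shlex.join(cmd))
```

The resulting list could be passed to `subprocess.run` in a shell where ORAC-DR has already been initialised.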

#### 4.4.4 Parameter file options

To supply both a new configuration file and a different set of clump-finding parameters we would update the parameter file mypars.ini to look like:

[REDUCE_SCAN]
MAKEMAP_CONFIG = mynewconfig.lis
FINDCLUMPS_CFG = myfellwalkerparams.lis

Other options we can change in the parameter file include changing the pixel size

[REDUCE_SCAN]
MAKEMAP_PIXSIZE = 2

changing output units to mJy/beam

[REDUCE_SCAN]
CALUNITS = beam

changing output units to mJy/arcsec$^{2}$

[REDUCE_SCAN]
CALUNITS = arcsec
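A recipe parameter file in the layout shown above can also be generated from a script. This is a minimal sketch using Python's standard `configparser`; the function name `render_recpars` is hypothetical, and placing several options in one section is an assumption based on the two-parameter example above.

```python
import configparser
import io


def render_recpars(recipe, **params):
    """Render a recipe parameter file with a single [RECIPE] section as text."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep parameter names in upper case
    parser[recipe] = {key: str(value) for key, value in params.items()}
    buffer = io.StringIO()
    parser.write(buffer)
    return buffer.getvalue()


text = render_recpars(
    "REDUCE_SCAN",
    MAKEMAP_CONFIG="mynewconfig.lis",
    FINDCLUMPS_CFG="myfellwalkerparams.lis",
    MAKEMAP_PIXSIZE=2,
    CALUNITS="beam",
)
print(text)
```

Setting `optionxform` matters here because `configparser` otherwise converts option names to lower case.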

### 4.5 What to look out for

Once the map-maker has completed you can open your output map using Gaia (see Figure 4.5). The excerpt in Chapter 5 shows the output written to the terminal as you run the map-maker. There are a number of clues in this output that indicate the status of the reduction.

The number of input files
The first to note is the number of input files; it is worth checking this matches your expected number. Also summarised are the source name, UT date and scan number.
Map dimension
Next the basic dimensions of the data being processed are listed near the start of the first iteration. The example above has 4 arcsec pixels—the default at 850$\mu$m.
Chunking
The map-maker then determines if the raw data should be split and processed in more than one chunk. In this map the data are reduced in one continuous piece: Continuous chunk 1 / 1. Chunking is where the map-maker processes sub-sections of the time-series data independently and should be avoided if possible—see the text box on Chunking.
##### Quality statistics

At the beginning of the reduction, the main purpose of QUALITY flagging is to indicate how many bolometers are being used. In the example above you can see that from a total of 5120 bolometers, 1842 were turned off during data acquisition (BADDA). In addition, 136 bolometers exceeded the acceptable noise threshold (NOISE), while tiny fractions of the data were flagged because the telescope was moving too slowly (STAT) or the samples are adjacent to a step that was removed (DCJUMP).

The total number of bad bolometers (BADBOL) is 1984. Accounting for these, and the small numbers of additionally flagged samples, 3128.22 effective bolometers are available after initial cleaning.

After each subsequent iteration a new ‘Quality’ report is produced, indicating how the flags have changed. An important flag that appears in the ‘Quality’ report following the first iteration is COM: the DIMM rejects bolometers (or portions of their time series) if they differ significantly from the common-mode (average) of the remaining bolometers.

You may note that compared with the initial report, the total number of samples with good ‘Quality’ (Total samples available for map) has dropped from 18634826 to 18273302 (about a 2 per cent decrease) as additional samples were flagged in each iteration.
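The figures quoted in this section can be checked with a few lines of arithmetic. The sketch below uses only the numbers given in the text; the variable names are illustrative.

```python
total_bolometers = 5120
bad_bolometers = 1984          # BADBOL
initial_samples = 18634826     # good samples in the initial report
final_samples = 18273302       # good samples in the later report

# Bolometers left once whole bad bolometers are excluded. The quoted
# 3128.22 effective bolometers is slightly lower because it also
# accounts for individually flagged samples.
working_bolometers = total_bolometers - bad_bolometers

samples_lost = initial_samples - final_samples
percent_lost = 100.0 * samples_lost / initial_samples

print(working_bolometers)       # 3136
print(samples_lost)             # 361524
print(f"{percent_lost:.2f}")    # 1.94
```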

Be aware that some large reductions may take many iterations to reach convergence, and you may find significantly fewer bolometers remaining, resulting in higher noise than expected.

##### Convergence

The map change that is tested against the convergence criterion maptol is recalculated at each iteration. The convergence can be checked from the line reporting
smf_iteratemap: *** NORMALIZED MAP CHANGE: 0.10559 (mean) 2.81081 (max)

The number to look out for is the mean value. This will have to drop below your required maptol for convergence to be achieved.

The default configuration file used in this example executes a maximum of five iterations (numiter = $-$5), but stops sooner if the mean normalised map change drops below the maptol value of 0.05. In this example it stops after five iterations.
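If the terminal output has been saved to a log, the convergence line can be extracted and compared with maptol automatically. The following sketch parses the example line quoted above; the function names and the regular expression are hypothetical helpers, and the line format is assumed to match that example.

```python
import re

# Matches the mean and max values in a NORMALIZED MAP CHANGE line.
MAP_CHANGE_RE = re.compile(
    r"NORMALIZED MAP CHANGE:\s*([0-9.eE+-]+)\s*\(mean\)\s*([0-9.eE+-]+)\s*\(max\)"
)


def parse_map_change(line):
    """Return (mean, max) from a NORMALIZED MAP CHANGE line, or None if absent."""
    match = MAP_CHANGE_RE.search(line)
    if match is None:
        return None
    return float(match.group(1)), float(match.group(2))


def has_converged(mean_change, maptol):
    """Convergence requires the mean map change to drop below maptol."""
    return mean_change < maptol


line = "smf_iteratemap: *** NORMALIZED MAP CHANGE: 0.10559 (mean) 2.81081 (max)"
mean_change, max_change = parse_map_change(line)
print(mean_change, max_change, has_converged(mean_change, maptol=0.05))
```

For the quoted line the mean change of 0.10559 is still above 0.05, so this iteration has not converged.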

Tip:
You can interrupt the processing at any stage with a single Ctrl-C. The map-maker will complete the iteration then write out a final science map. Entering Ctrl-C twice will kill the process immediately.

### 4.6 Pipeline output

The pipeline will produce a group file for each object being processed. If the pipeline is given data from multiple nights, all those data will be included in the group co-add using inverse variance weighting.
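Inverse variance weighting gives each input a weight of one over its variance, so noisier maps contribute less to the co-add. The sketch below applies the standard formula to a single pixel; it illustrates the weighting scheme only and is not the pipeline's own implementation. The function name and the input values are hypothetical.

```python
def inverse_variance_coadd(values, variances):
    """Combine measurements of one pixel with weights w_i = 1 / variance_i.

    Returns (weighted mean, variance of the weighted mean).
    """
    if len(values) != len(variances) or not values:
        raise ValueError("need equally long, non-empty inputs")
    weights = [1.0 / v for v in variances]
    weight_sum = sum(weights)
    mean = sum(w * x for w, x in zip(weights, values)) / weight_sum
    return mean, 1.0 / weight_sum


# Two hypothetical measurements of the same pixel and their variances.
coadd, coadd_var = inverse_variance_coadd([10.0, 20.0], [1.0, 4.0])
print(coadd, coadd_var)
```

In this example the result lies closer to the first value because its variance is four times smaller, and the combined variance is lower than that of either input.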

The final maps in your output directory will have the suffix _reduced. Maps of individual observations start with an s (e.g. s20140620_00030_850_reduced.sdf). Group maps, which may contain co-added observations from a single night, are also produced; these have the prefix gs and the date/scan of the first input file (e.g. gs20140620_30_850_reduced.sdf).

Note: A group file is always created, even if only a single observation is being processed.
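The naming convention above makes it straightforward to sort output files into observation and group maps. This is a minimal sketch based only on the two example names given; the pattern, the function name and the field names are hypothetical, and reading the third field as the wavelength in microns is an assumption.

```python
import re

# prefix (s or gs), UT date, scan number, wavelength field, _reduced suffix
REDUCED_RE = re.compile(
    r"^(?P<prefix>gs|s)(?P<utdate>\d{8})_(?P<scan>\d+)_(?P<band>\d+)_reduced\.sdf$"
)


def parse_reduced_name(filename):
    """Split a reduced-map file name into its parts, or return None if it does not match."""
    match = REDUCED_RE.match(filename)
    if match is None:
        return None
    return {
        "is_group": match.group("prefix") == "gs",
        "utdate": match.group("utdate"),
        "scan": int(match.group("scan")),
        "band": int(match.group("band")),  # assumed to be the wavelength in microns
    }


for name in ("s20140620_00030_850_reduced.sdf", "gs20140620_30_850_reduced.sdf"):
    print(name, parse_reduced_name(name))
```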

Additionally, PNG images are made of the reduced files at a variety of resolutions.

Another useful feature is that the pipeline will generate log files to record various useful quantities. The standard log files from reducing science data are:

• log.noise—noise in the map for each observation and the co-add (calculated from the median of the error array), and
• log.nefd—NEFD calculated for each observation and for the co-added map(s).