Chapter 6
Running the Pipeline

 6.1 Checking and changing the science recipe
 6.2 Setting recipe parameters (optional)
  6.2.1 Setting recipe parameters by object name
  6.2.2 Setting recipe parameters by ORAC-DR internal header
 6.3 Setting quality-assurance parameters (optional)
 6.4 Specifying bad receptors (recommended)
 6.5 Starting the pipeline
 6.6 Pipeline output
  6.6.1 Why have multiple reduced files been generated?

6.1 Checking and changing the science recipe

You can find out which recipe is assigned to your data from the RECIPE keyword in the FITS header of any of your raw files. Either of the commands below will report it:

  % fitsval a20130609_00059_01_0001 RECIPE
  % fitslist a20130609_00059_01_0001 | grep RECIPE

You can override the recipe set in the FITS header by naming a different recipe on the command line when starting Orac-dr. For example:

  % oracdr -files mylist -log x REDUCE_SCIENCE_GRADIENT

6.2 Setting recipe parameters (optional)

You can tailor the recipe parameters by supplying a .ini file (called myparams.ini in the following example). This file contains the recipe name (which must match the one assigned to your data, whether from the header or any different one specified on the command line) followed by the options you wish to specify in the following format:

  [REDUCE_SCIENCE_NARROWLINE]
  MOMENTS_LOWER_VELOCITY = -30.0
  MOMENTS_UPPER_VELOCITY = 155.0
  PIXEL_SCALE = 6.0
  SPREAD_METHOD = gauss
  SPREAD_WIDTH = 9
  SPREAD_FWHM_OR_ZERO = 6
  REBIN = 0.635,2.0

Notice the two values given for the REBIN option. This means that two maps will be produced, one at each of the specified velocity resolutions. See Appendix G for a full list of recipe parameters.
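To make the file format concrete, the following Python sketch parses the example above with the standard-library configparser module and splits REBIN into its two values. It only illustrates the layout of the file: Orac-dr reads the file itself, and the variable names used here are hypothetical.

```python
import configparser

# Contents of the example recipe-parameter file shown above.
MYPARAMS = """\
[REDUCE_SCIENCE_NARROWLINE]
MOMENTS_LOWER_VELOCITY = -30.0
MOMENTS_UPPER_VELOCITY = 155.0
PIXEL_SCALE = 6.0
SPREAD_METHOD = gauss
SPREAD_WIDTH = 9
SPREAD_FWHM_OR_ZERO = 6
REBIN = 0.635,2.0
"""

parser = configparser.ConfigParser()
parser.optionxform = str  # keep parameter names in upper case
parser.read_string(MYPARAMS)

section = parser["REDUCE_SCIENCE_NARROWLINE"]

# A comma-separated REBIN value requests one map per velocity resolution.
rebin = [float(value) for value in section["REBIN"].split(",")]
print(len(rebin), "maps requested, at velocity resolutions", rebin)
```

A quick check of this kind can catch a misspelt section name or a stray separator before a long reduction is started.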

The .ini file can contain any number of [] sections, one for each recipe, and each section can optionally be qualified by object name or by Orac-dr internal headers.

While the use of recipe parameters is optional, once you have some experience with the pipeline you are encouraged to tailor the recipe parameters for your reductions to get the best out of the recipe and your data.

6.2.1 Setting recipe parameters by object name

You can further select the recipe parameters by object name, as given by the OBJECT FITS header converted to uppercase with spaces removed. Here are some examples.

  [REDUCE_SCIENCE_BROADLINE:NGC253]
  REBIN = 10.0,20.0,30.0
  PIXEL_SCALE = 15.0

This would apply the two recipe parameters to the NGC 253 target when processed by REDUCE_SCIENCE_BROADLINE, while

  [REDUCE_SCIENCE_BROADLINE:ARP*]
  REBIN = 10.0,20.0,30.0
  PIXEL_SCALE = 15.0

would apply the same parameter values to any object whose name begins with the Arp designation.
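The name conversion and wildcard selection described above can be sketched in Python as follows. The function names are hypothetical, and the shell-style interpretation of * is an assumption made for illustration; the pipeline performs its own matching.

```python
from fnmatch import fnmatchcase


def normalise_object(name: str) -> str:
    """Convert an OBJECT header value to upper case with spaces removed."""
    return name.upper().replace(" ", "")


def selector_matches(selector: str, object_name: str) -> bool:
    """Test a selector such as ARP* against an OBJECT value.

    Assumption: * behaves like a shell wildcard.
    """
    return fnmatchcase(normalise_object(object_name), selector)


print(normalise_object("NGC 253"))
print(selector_matches("ARP*", "Arp 220"))
print(selector_matches("ARP*", "NGC 253"))
```

The point to take away is that the name in the section header must be written in its converted form (NGC253, not NGC 253).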

6.2.2 Setting recipe parameters by ORAC-DR internal header

The pipeline translates a selection of metadata from the FITS headers into Orac-dr internal headers. You can select different sets of recipe parameters according to the values of these internal headers. The available headers can be inspected by running translatehdr on one of your raw data files.

  % $STARLINK_DIR/Perl/bin/translatehdr a20151213_00067_01_0001.sdf

Here is an example of how to select by internal header.

  [REDUCE_SCIENCE_NARROWLINE#SAMPLE_MODE=GRID]
  LV_IMAGE = 1
  LV_AXIS = skylon

This would apply these parameters only to data whose sample mode was grid.

You can select by an arbitrary number of internal headers to narrow the selection. The internal headers must follow any object name as in the following example.

  [REDUCE_SCIENCE_NARROWLINE:HH21#SPECIES=C-18-O#BANDWIDTH_MODE=250MHzx4096]
  FINAL_LOWER_VELOCITY = -100
  FINAL_UPPER_VELOCITY = 100
  MOMENTS_LOWER_VELOCITY = 1.2
  MOMENTS_UPPER_VELOCITY = 6
  FLATFIELD = 0

SPECIES defines the molecule. The rationale for this example might be to disable estimation of the flat field because the signal is much weaker for the molecule C18O. Other common SPECIES values are CO for 12CO and 13-CO for 13CO. There is also a TRANSITION header to select by the transition.

The velocity limits for the moments might be set narrower than for other molecules. You might want to set a different velocity range in the final spectral cubes (set by the FINAL_LOWER_VELOCITY and FINAL_UPPER_VELOCITY parameters) depending on the spectral resolution given by BANDWIDTH_MODE (see Figure 3.2.7 for a list).
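The ordering rule for section headers (recipe name, then an optional object name after a colon, then any internal headers each introduced by #) can be summarised with a small Python sketch. The helper section_header is hypothetical and simply assembles the string; it is not part of Orac-dr.

```python
def section_header(recipe, object_name=None, headers=None):
    """Assemble a [RECIPE:OBJECT#KEY=VALUE...] section header.

    The object name, if given, is converted to upper case with spaces
    removed, and must come before any internal headers.
    """
    text = recipe
    if object_name:
        text += ":" + object_name.upper().replace(" ", "")
    for key, value in (headers or {}).items():
        text += f"#{key}={value}"
    return f"[{text}]"


print(section_header(
    "REDUCE_SCIENCE_NARROWLINE",
    object_name="HH 21",
    headers={"SPECIES": "C-18-O", "BANDWIDTH_MODE": "250MHzx4096"},
))
```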

6.3 Setting quality-assurance parameters (optional)

There is a set of quality-assurance (QA) parameters that are applied by default when you run the pipeline. These are found in the text file $ORAC_DATA_CAL/qa.ini.

You can set your own QA parameters by creating a local .ini file (called myqa.ini in the following examples). You can then call this local QA file via -calib qaparams=myqa.ini when starting the pipeline.

The example below illustrates the format of this QA file and highlights some of the main parameters you may consider tweaking.

  [default]
  BADPIX_MAP=0.1
  GOODRECEP=8
  TSYSBAD=550
  FLAGTSYSBAD=0.5
  TSYSMAX=550
  TSYSVAR=1.0
  RMSVAR_RCP=0.5

The text in the square brackets describes the data to which the QA parameters should be applied. This is followed by the QA parameters and their values.

The parameters listed under the [default] header will get picked up for all observations unless overridden by other header descriptions. Additional headers may describe a frequency range (e.g. [default 200:300]), a particular molecule and transition (e.g. [default CO32]), an instrument (e.g. [RXA]), or a legacy survey (e.g. [GBS]). You can also include combinations (e.g. [GBS_13CO32]). Note that for any data taken by RxA or RxW, GOODRECEP must be set to 1.

The example file below sets up a different set of parameters for two transitions of CO. Any QA parameters not explicitly set will revert to those specified in the default qa.ini file.

  [C18O32]
  RES_CHAN=10
  VELRES=1.0
  TSYSBAD=650
  
  [CO21]
  RES_CHAN=10
  VELRES=0.5
  TSYSBAD=550
  GOODRECEP=1

See Appendix H for a description of all available QA parameters.
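The fallback behaviour, where a specific section overrides only the parameters it names, can be pictured with the following Python sketch. It is a conceptual illustration only, not the pipeline's own code: the dictionaries copy values from the examples above, and the real defaults are those in $ORAC_DATA_CAL/qa.ini.

```python
from collections import ChainMap

# Stand-in for the default QA parameters (values from the [default] example).
defaults = {
    "BADPIX_MAP": 0.1,
    "GOODRECEP": 8,
    "TSYSBAD": 550,
    "FLAGTSYSBAD": 0.5,
    "TSYSMAX": 550,
    "TSYSVAR": 1.0,
    "RMSVAR_RCP": 0.5,
}

# Parameters set explicitly in the [C18O32] section of the example file.
c18o32 = {"RES_CHAN": 10, "VELRES": 1.0, "TSYSBAD": 650}

# Look in the specific section first, then fall back to the defaults.
effective = ChainMap(c18o32, defaults)

print("TSYSBAD  :", effective["TSYSBAD"])    # taken from [C18O32]
print("GOODRECEP:", effective["GOODRECEP"])  # falls back to the default
```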

6.4 Specifying bad receptors (recommended)

If certain receptors are known to be bad for the full length of an observation (e.g. dead receptors), processing time can be reduced by not having the pipeline attempt to reduce those data. The ability to mask bad receptors also provides an additional tool for those wishing to examine data in more detail (e.g. when looking at data from an instrument that has yet to be fully commissioned).

Information regarding bad receptors is stored in a master text file called index.bad_receptors, located in your ORAC_DATA_CAL directory. This file lists dates and the corresponding receptors which are not operational. By default, when the pipeline runs it searches this file and discards the appropriate receptors. However, this master list is grossly incomplete so you should select another method for specifying bad receptors.

We recommend using the index option, called via -calib bad_receptors=index when starting the pipeline. This creates an index file called index.bad_receptors_qa in your ORAC_DATA_OUT directory, where any bad receptors identified during your reductions are indexed. This file is appended to each time you run the pipeline, and in this way you build up your own independent list.

If an index file exists, then, by default, the pipeline will look for bad-receptor information in both $ORAC_DATA_CAL/index.bad_receptors and $ORAC_DATA_OUT/index.bad_receptors_qa.


bad_receptors   Description

masterorindex   Use both the master index.bad_receptors and pipeline-generated
                index.bad_receptors_qa files [Default].

master          Use the master index.bad_receptors file in $ORAC_DATA_CAL.

index           Use the index.bad_receptors_qa index file in $ORAC_DATA_OUT as
                generated by the pipeline.

file            Read a file called bad_receptors.lis, which should contain a
                space-separated list of receptor names in the first line. This
                file must be located in $ORAC_DATA_OUT and is invoked by
                -calib bad_receptors=file.

list            A colon-separated list of receptor names can be supplied, such
                as -calib bad_receptors=H01:H06, or
                -calib bad_receptors=NU1L:NU1U. You can append any of the other
                options to the end of your list, such as
                -calib bad_receptors=H14:index.

Table 6.1: The options available for the -calib bad_receptors flag when running the pipeline.
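The two ways of naming receptors explicitly differ only in their separator: colons on the command line, spaces in bad_receptors.lis. A minimal Python sketch of both follows; the helper names are hypothetical and the functions only build strings, leaving it to you to pass them to the pipeline or write them to $ORAC_DATA_OUT.

```python
def bad_receptors_arg(receptors, extra=None):
    """Build the value for -calib from a list of receptor names.

    extra may name one of the other options (e.g. "index") to append.
    """
    items = list(receptors)
    if extra:
        items.append(extra)
    return "bad_receptors=" + ":".join(items)


def bad_receptors_lis(receptors):
    """Return the first line of bad_receptors.lis (space-separated names)."""
    return " ".join(receptors) + "\n"


print(bad_receptors_arg(["H01", "H06"]))
print(bad_receptors_arg(["H14"], extra="index"))
print(bad_receptors_lis(["H01", "H06"]), end="")
```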

You can inquire which receptors are present in your data by inspecting a data file’s FITS header. This can be done using fitslist or fitsval to look at the RECPTORS keyword.

  % fitsval a20210520_00060_02_raw0001 RECPTORS
  H01 H02 H03 H04 H05 H06 H07 H08 H09 H10 H11 H12 H13 H15

These receptors may not necessarily all contain good data. For reduced spectral cubes generated by makecube, the RECPUSED keyword reports only the receptors actually used to form the spectral cube. See Section 7.2.3.
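As a hedged illustration, the RECPTORS value above can be compared with the full set of receptor names to see which are absent from the file. The sketch assumes the instrument is HARP with sixteen receptors named H00 to H15, which is not stated in this section; adjust the expected list for other instruments.

```python
# Value reported by fitsval in the example above.
recptors = "H01 H02 H03 H04 H05 H06 H07 H08 H09 H10 H11 H12 H13 H15"

present = set(recptors.split())

# Assumption: a HARP-like array with receptors H00 to H15.
expected = [f"H{i:02d}" for i in range(16)]

absent = [name for name in expected if name not in present]
print("Receptors absent from the file:", " ".join(absent))
```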

6.5 Starting the pipeline

(1) Initialise the pipeline software. This may be as simple as
  % oracdr_acsis
(2) Define environment variables to ensure the data are read from, and written to, the right place. Many are set automatically when the pipeline is initialised but others must be set manually. Details of the optional variables are given in SUN/260 but you should specify where to find the raw data and where to write any files that are created.
  % setenv ORAC_DATA_IN <path to raw data>
  % setenv ORAC_DATA_OUT <where you want reduced data to go>

These extra definitions can be avoided if you initialise with the following command.

  % oracdr_acsis -cwd

The -cwd option requests that the current working directory is used for output, which is most likely what you require.

If you are supplying a text file listing the raw data, $ORAC_DATA_IN should be the location of the files listed, unless they are given as absolute paths.

If you wish to keep the intermediate files produced by the pipeline you should also set ORAC_KEEP.

  % setenv ORAC_KEEP 1

However, this still excludes temporary files made during a recipe step, whose names have the oractemp prefix. To retain these set

  % setenv ORAC_KEEP temp

Retention of temporary and intermediate files will require plenty of storage, but it can be invaluable during debugging.

(3) Now you are ready to run the pipeline. In this example, the recipe name on the end ensures that all data are processed using REDUCE_SCIENCE_NARROWLINE.
  % oracdr -files list.txt -batch -log xf -calib qaparams=myqa.ini \
    bad_receptors=index -recpars myparams.ini -nodisplay REDUCE_SCIENCE_NARROWLINE
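The text file passed to -files lists one observation file per line. One possible way to produce it is sketched below in Python. The function write_file_list and the pattern a*.sdf are assumptions made for illustration (raw file names in this chapter begin with a), and bare file names are written on the assumption that $ORAC_DATA_IN points at the directory holding them.

```python
import tempfile
from pathlib import Path


def write_file_list(data_dir, list_path, pattern="a*.sdf"):
    """Write the names of matching raw files to list_path, one per line."""
    names = sorted(path.name for path in Path(data_dir).glob(pattern))
    Path(list_path).write_text("".join(f"{name}\n" for name in names))
    return names


# Demonstration with empty placeholder files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("a20130609_00060_01_0001.sdf", "a20130609_00059_01_0001.sdf"):
        (Path(tmp) / name).touch()
    listed = write_file_list(tmp, Path(tmp) / "list.txt")
    contents = (Path(tmp) / "list.txt").read_text()

print(contents, end="")
```

In practice you would call write_file_list with your raw-data directory and the list name you intend to give to -files.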

Below is a list of commonly used command line options when running the Orac-dr pipeline from outside the EAO. For a full list of all possible options and their descriptions see SUN/230.


Flag               Description

Supplying data

-files <mylist>    Input data provided by a text file. Supply the file name
                   (relative to the current directory, or the full path) of an
                   ASCII text file containing a list of all observation files
                   to be reduced, one file per line.

Looping

-loop file         Loop over all the lines in the file supplied by the -files
                   option. (This is the default if a list of files has been
                   given.)

Group processing

-batch             Delays group processing until the individual files have
                   been reduced. (This is the default if not in QL or summit
                   mode, or processing today’s data.)

-onegroup          Forces all the observations into the same group. Useful if
                   building up a map from rasters with different target
                   co-ordinates or from different dates.

Supplying parameter files

-calib             Under the calibration flag you can specify QA parameters by
                   including qaparams=<myqa.ini> and bad receptors via
                   bad_receptors=index. If you do not set the qaparams option
                   but have a file called qa.ini in your output directory, the
                   QA settings will come from this file.

-recpars           Allows you to provide recipe parameters as a .ini file.

Logs/Display

-nodisplay         Shorten processing times by turning off all graphical
                   output.

-log sxfh          Send the log to a screen (s), an xwindow (x), file (f) or
                   html (h). The logfile (f) has the name .oracdr_NNNN.log
                   where NNNN is the current process ID. Only include the
                   options you want. This defaults to xf.

Table 6.2: The options available on the command line when running the Orac-dr pipeline from outside the EAO.

6.6 Pipeline output

The pipeline will produce a group file for each object being processed. If the pipeline is given data from multiple nights, all those data will be included in the group co-add using inverse variance weighting.

Files created from individual observations are prefixed with a, while group (co-added) maps are prefixed with ga. Table 6.3 describes the files created. In addition, PNG images of the reduced files are made at a variety of resolutions.

Tip:
Even if just a single observation is processed, a group file is created so long as the observation passes the QA check.
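When sorting through an output directory, the prefix convention can be applied mechanically. The short Python sketch below is illustrative only; the function name and the example file names are hypothetical, patterned on names used elsewhere in this chapter. Note that ga must be tested before a, since every group file name also begins with a letter that could otherwise be confused with the longer prefix.

```python
def classify_output(filename: str) -> str:
    """Classify a pipeline product by its file-name prefix."""
    if filename.startswith("ga"):
        return "group"
    if filename.startswith("a"):
        return "observation"
    return "other"


for name in ("ga20140601_15_1_reduced001.sdf",
             "a20130609_00059_01_reduced001.sdf",
             "log.qa"):
    print(f"{name}: {classify_output(name)}")
```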

6.6.1 Why have multiple reduced files been generated?

You will find the pipeline splits large reduced cubes (e.g. from raster maps) into smaller tiles of <512 MB. This is to save on computer memory. It does this from the central position outwards so you may find some of the smaller edge tiles are strange, often narrow, rectangles. These tiles can be pasted together with the Kappa command paste but beware of the final file size. For example,

  % paste ga20140601_15_1_reduced\*.sdf tiledmap

Note, there is a recipe parameter CHUNKSIZE which adjusts the maximum tile size. If you make it larger than 512 it is possible to generate a single reduced cube.




Default Files

cube001        Baselined cube
integ          Integrated intensity image
rimg           Representative image (same as integ file), used to form rimg PNG
sp001          Spectrum taken from position of peak intensity in the integ file
rsp            Representative spectrum (same as sp001), used to form rsp PNG
iwc            Intensity weighted co-ordinate image
noise          Noise map
reduced00n     Final trimmed, baselined cube of the nth tile. There may be a
               few of these as large maps will be split into tiles of 512 MB.
rmslo          Low-frequency noise
rmshi          High-frequency noise

Extra files kept with ORAC_KEEP = 1

em001          Ends of the frequency axis have been trimmed off
thr001         Time-series data with large spikes removed
ts001          Time-sorted time-series data
tss001         Time-series data with median DC signal removed
blmask001      Baseline mask cube. Regions of emission have been masked out.
bl001          Baselined cube
linteg00n      Integrated intensity image formed around the nth line.

Table 6.3: Table of the files written out by the science pipeline. Each of these files is generated for both the individual observations and the group files.

In addition to these NDFs a number of log files are created.




Default Files

.oracdr_*.log    The Orac-dr log file.
log.flat         Flat-field ratios for each receptor and observation.
log.group        The files contributing to each group.
log.noisestats   Noise statistics for each observation and group.
log.qa           Quality assurance reports.
log.removedobs   Observations removed from each group (e.g. due to failing QA).

Table 6.4: Table of the log files written by the science pipeline.

Finally there are HDS scratch t*.sdf files, and temporary files oractemp* which can be NDFs or small text files such as file lists. These should be removed automatically at the end of processing, unless something has gone wrong.