9 The data reduction process

 9.1 Preliminaries
 9.2 Skydips
 9.3 Noise measurements
 9.4 Common data reduction
 9.5 Despiking
 9.6 Map making
 9.7 Photometry
 9.8 Scan maps
 9.9 Polarimetry data reduction

This section describes the steps needed to process SCUBA data. For more detailed examples please consult the two cookbooks [9][10]. For this example I will use map and photometry observations of 3C279 from a commissioning night9.

9.1 Preliminaries

The data are not in the working directory, so DATADIR should be set:

  % setenv DATADIR /scuba/observe/apr8/dem

assuming a C-shell type environment. I can now make a log of a subset of the night’s data to find out which observations should be processed:

  % sculog -summary --begin=54 --end=63
  Log for directory: /scuba/observe/apr8/dem
  
   #    HST    Obsmode   Source   Meas/Int Airmass  Filter  Bolometers
  ---- -----   --------- -------  -------- ------- -------- ----------
  54   23:02   SKYDIP    SKY        10/20          450N:850 SHORT_DC,LONG_DC
  55   23:13   POINTING  3c279       1/2   1.14464 450N:850 LONG
  56   23:17   PHOTOM    3c279       1/10  1.13858 450N:850 H7
  57   23:22   PHOTOM    3c279       1/10  1.13217 450N:850 H7
  58   23:28   PHOTOM    3c279       1/10  1.12641 450N:850 H7
  59   23:35   MAP       3c279       1/3   1.12019 450N:850 SHORT,LONG
  60   23:44   MAP       3c279       1/3   1.11481 450N:850 SHORT,LONG
  61   23:53   MAP       3c279       1/3   1.11101 450N:850 SHORT,LONG
  62   00:02   MAP       3c279       1/3   1.10965 450N:850 SHORT,LONG
  63   00:03   POINTING  3c273       1/4   1.05483 450N:850 SHORT
   :                                  :                :
   :                                  :                :
  98   03:18   SKYDIP    SKY        10/20          450N:850 SHORT_DC,LONG_DC

In order to save time typing in the filename every time we wish to access demodulated data, we also set the SCUBA_PREFIX environment variable:

  % setenv SCUBA_PREFIX apr8

This variable allows demodulated data to be referenced by observation number. It is also possible to set the style for the default output filename provided by the individual tasks. In this example, we will use the default convention (’LONG’). More information on the Surf environment variables can be found in §7.

9.2 Skydips

From the listing in the previous section we can see that Skydip data were taken at scans 54 and 98. From the RO file (either by using mapsum or photsum on the RO data, or by using hdstrace) we can see that the fitted taus were 1.140 (short) and 0.220 (long) for scan 54, and 1.042 (short) and 0.187 (long) for scan 98.

In most cases these numbers will be sufficient for use by extinction but it is possible to recalculate the tau by using the skydip task. As an example here is the result of skydip on scan 54:

  % skydip 54
  SURF: Opening apr8_dem_0054 in /scuba/observe/apr8/dem
  SURF: run 54 was a SKYDIP observation
  SURF: observation started at sidereal time 11 47 24 and ended at 11 54 07
  SURF: file contains data for the following sub-instrument(s)
   - SHORT with filter 450
   - LONG with filter 850
  SUB_INSTRUMENT - Name of sub-instrument to be analysed /’SHORT’/ > l
  SURF: file contains data for 20 integration(s) in 10 measurement(s)
  T_HOT - Temperature of hot load (K) /278/ >
  T_COLD - Temperature of cold load for LONG_DC /73.6/ >
  ETA_TEL - Telescope efficiency /0.91/ >
  B_VAL - B parameter /-1/ >
  SCULIB: fit for filter 850 and sub-instrument LONG_DC
   eta =  0.91 +/-  0.00  b =  0.86 +/-  0.00  tau =   0.220 +/- 0.002
   Standard Deviation of fit residual =   0.74 K (X=     1.0 N=    9)


Figure 5: The Skydip result for scan 54. The crosses are the measured sky temperatures and the line is the fitted model.


The results of the fit are displayed in figure 5. Points worth noting: the local sidereal time of the observation is printed (this is useful later when running extinction); a fixed ηtel and a floating value of B were used (the default value for ηtel is read from the file header); and the tau agrees with the on-line system (not surprising, since the same code is used on-line as in Surf). The errors derived for the fit can sometimes be suspect since the parameters are not completely independent. The standard deviation of the fit residual gives a measure of the scatter of the points about the model fit. Note also that the ‘X’ indicates the reduced χ2 of the fit (forced to be approximately 1.0 by the program when it determines the errors) and the ‘N’ indicates the number of iterations required to converge on the fit.

Occasionally, it is necessary to remove some points from the fit. This can be achieved by using reduce_switch and change_quality before running skydip. An example of this can be found in §I.2.

The sdip script can be used to automate the procedure of running skydip and displaying the results with Kappa’s linplot. More information on skydipping can be found in Appendix I.

9.3 Noise measurements

Noise observations are reduced on-line and written to an ASCII text file. In some cases this text file is not available and the reduce_noise task can be used to recreate it (as well as generating an NDF file containing the results) from the raw demodulated data file.

  % reduce_noise 19981113_dem_0001
  SURF: Opening 19981113_dem_0001 in /scuba/observe/19981113/dem
  SURF: run 1 was a NOISE observation
  OUT - Name of container file to hold map and time-sequence data /’o1’/ >
  FILE - Name of ASCII file to contain results summary /’noise_981113_1.dat’/ >

9.4 Common data reduction

Now the data processing can begin. We will start by running reduce_switch in order to subtract the off from the on and to split the data array of the raw data into separate components.

  % reduce_switch 59
  SURF: Opening apr8_dem_0059 in /scuba/observe/apr8/dem
  SURF: run 59 was a MAP observation of object 3c279
  SURF: file contains data for 2 switch(es) in 4 exposure(s) in 3 integration(s)
  in 1 measurement(s)
  OUT - Name of output file to contain reduced switch data /’o59’/ >

or with the full file specification:

  % reduce_switch apr8_dem_0059
  SURF: Opening apr8_dem_0059 in /scuba/observe/apr8/dem
  SURF: run 59 was a MAP observation of object 3c279
  SURF: file contains data for 2 switch(es) in 4 exposure(s) in 3 integration(s)
  in 1 measurement(s)
  OUT - Name of output file to contain reduced switch data /’o59’/ >

In this example the calibrator signal has not been used and any datum from which more than 5 spikes were removed by the transputers is marked bad (these are the default settings). The processed data are then written to file o59.sdf.

In this case we need to change the flatfield file (since the flatfield was updated after the data were taken) using change_flat:

  % change_flat
  IN - Name of input file containing demodulated map data /@o59/ >
  SURF: run 59 was a MAP observation of 3c279
  NEW_FLAT - The name of the file containing the new flat-field > photflat1.dat

The next task is to flatfield the data:

  % flatfield o59 o59_flat
  SURF: run 59 was a MAP observation of 3c279
  SURF: applying flatfield from photflat1.dat

If the input and output files are not specified on the command line they will be requested.

The data can now be corrected for airmass (elevation) and sky opacity by using extinction. According to the skydip observation taken prior to the map, the tau at 850 μm was 0.220, and a skydip taken after the map showed 0.187. If extinction is given two τ values from different times then the actual τ for each jiggle will be calculated by linear interpolation – obviously this assumes that the τ varied linearly with time. For this example we will assume that a linear variation is a good description in order to demonstrate the principle:

  % extinction
  IN - Name of NDF containing demodulated data /@o59_flat/ >
  SURF: run 59 was a MAP observation with JIGGLE sampling of object 3c279
  SURF: file contains data for 4 exposure(s) in 3 integration(s) in 1
  measurement(s)
  SURF: observation started at sidereal time 12 19 59 and ended at 12 28 40
  SURF: file contains data for the following sub-instrument(s)
   - SHORT with filter 450
   - LONG with filter 850
  SUB_INSTRUMENT - Name of sub-instrument to be extinction corrected /’SHORT’/ > l
  FIRST_TAU - First zenith sky opacity measured /0/ > 0.22
  FIRST_LST - Sidereal time of first opacity measurement; hh mm ss.ss /’0.0’/ > 11 54
  SECOND_TAU - Second zenith sky opacity measured /0.22/ > 0.187
  SECOND_LST - Sidereal time of second opacity measurement; hh mm ss.ss /’11 54’/ > 16 10
  OUT - Name of output NDF /’o59_lon_ext’/ >

The arrays are separated at this point (since the extinction correction would be different). In this case the LONG-wave array was selected; extinction would have to be re-run to select the SHORT-wave array (the question is not asked if only one sub-instrument is present).
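
For reference, the interpolation between the two opacity measurements is the obvious linear one (a sketch in standard notation, where t is the LST of a given sample and (t1, τ1) and (t2, τ2) are the two measurements):

  τ(t) = τ1 + (τ2 − τ1) (t − t1) / (t2 − t1),   t1 ≤ t ≤ t2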

Some comment is probably required for the use of LST for the tau measurements – this is not as bad as it sounds. The skydip task prints the LST of the skydip and extinction prints the LST of the observation; in many cases it is known that the tau value was taken a certain time before or after the observation so this value can simply be added. In addition, most of the time a constant tau value is used and for the case of a constant tau the LST is irrelevant.

If desired, sky noise can now be removed. Sky signal can be identified in two ways: firstly, using bolometers that are known to be looking at sky (implemented in remsky); secondly, using a model of the source structure so that the sky signal can be calculated once the source has been subtracted from the data (implemented in calcsky). In this section we will examine the first method since it is the simplest and can be used for jiggle observations; the more complex approach will be dealt with in §9.8.2, where sky removal from scan map data is discussed.

remsky works in a very simplistic way: sky bolometers are specified, each jiggle is analysed in turn, and the average value of the sky bolometers (either MEDIAN or MEAN) is removed from the entire array. At present it is not possible to specify sky regions; only sky bolometers can be specified. This may cause problems with extended sources (rebin the map in NA coordinates initially to find the sky bolometers). remsky should normally be run after rebin so that bolometers that really are looking at sky can be chosen; for this example we will skip that step. remsky is sufficient for the mapping of compact sources in jiggle mode and for photometry; if the source structure is complex then calcsky should be considered (§9.8.2).

  % remsky
  IN - Name of input file containing demodulated map data /@w48_newrebin/ > o59_lon_ext
  SURF: run 59 was a MAP observation with JIGGLE sampling of object 3c279
  OUT - Name of output file /’o59_lon_sky’/ >
  BOLOMETERS - The Sky bolometers, [a1,a2] for an array /[’all’,’-h7’]/ >
  SURF: Using 36 sky bolometers
  MODE - Sky removal mode /’median’/ >
  Adding mean background level onto data (value=1.5316721E-6)

In this example we have used all the bolometers except for the central pixel (H7) and then used median sky removal for each jiggle. The average background level has also been added back onto the data.


Figure 6: The 3C279 data after processing through extinction and remsky. The next stage is to regrid the data using rebin. The source can clearly be seen in bolometer 19 (H7). The negative stripes indicate that the chop throw was smaller than the array.


The output of extinction or remsky can be displayed using, say, Kappa display to see whether some integrations or bolometers should be marked bad.10 Sometimes bad bolometers can only be identified after a rebin/scuover phase. The output so far can be seen in figure 6 – the axes are labeled with bolometer number along the X-axis and integration number up the Y-axis.

Now that the data have been extinction corrected and, optionally, processed with remsky, the data reduction path diverges according to the type of observation. Map making11 (jiggle and scan) and photometry will be dealt with separately. Note that scuquick can be used to automate some of the more repetitive tasks during this stage of the data reduction process.

Before diverging though, we should first take a diversion into the question of despiking.

9.5 Despiking

This section describes the different techniques available for despiking SCUBA data.

9.5.1 Manual despiking

Manual despiking simply involves examining the data with linplot and display, identifying bad regions by eye and then running change_quality to turn off the bad points. In general this is very time consuming (especially working out the pixel number of a spike so that change_quality can be told the exact location) so two interactive techniques are available:

(1)
dspbol

This script automates the linplot-change_quality cycle – bolometers can be plotted in turn, with spikes identified and removed, all within a few seconds.

(2)
Figaro sclean

The sclean task allows users to simply click on bad points to remove them. This routine has been designed for SCUBA despiking and therefore understands SCUBA quality. sclean provides an integrated despiking environment showing the 2-D image and a 1-D slice, allowing points to be marked bad (or good) in either window. More information can be found in the Figaro documentation.

Additionally, the rlinplot command (a wrapper for the Kappa mlinplot command) and the pltbol command (a wrapper for the Kappa linplot command) can be used to identify spikes and noisy bolometers rapidly, without requiring knowledge of NDF sections or of the specifics of each command.

9.5.2 Automatic despiking

At first sight, the automatic despiking of SCUBA data may seem somewhat daunting since there are 4 different tasks provided for this: despike, despike2, scuclip and sigclip. Detailed information on these can be found in the appendix (§C) but a direct comparison of the four is provided below:

sigclip

Originally intended for the final clipping of photometry data, this task finds the statistics of the entire data file and clips any point lying more than SIGMA standard deviations from the mean. This task knows nothing about SCUBA data.

Disadvantages: Should not be used where bolometers see differing signals (i.e. most of the time) since the clipping is then invalid.

Advantages: Will clip any data file. Can be used on reduced photometry data (output of scucat) for clipping since only data for a single bolometer will be present.

scuclip

This task processes each bolometer in turn, finding the mean and removing any points lying more than NSIGMA standard deviations from the mean for the current bolometer. The clip is iterative: the mean is recalculated after each pass and points are removed until no point lies more than NSIGMA from the mean. No knowledge of SCUBA is required by this task (except that it knows which quality bit to use for the despiking).

Disadvantages: For JIGGLE/MAPS, on-source bolometers jiggle on and off the source and therefore see a large change in signal (if the source signal is well above the noise level); the mean and standard deviation calculations therefore have a tendency to remove peak signals from the data. (This can be partly overcome by setting the source bolometer bad, clipping the remaining bolometers and then setting the source bolometer back to good.)

Advantages: Can be used for PHOTOM data and weak signals since each bolometer always sees approximately the same signal. Can be used for detecting large spikes on strong sources if a sufficiently large value is chosen for NSIGMA.

despike

This task places each point into a grid cell corresponding to the actual position of the datum on the sky. Each cell is then analysed in turn; any point further than NSIGMA from the mean for a given cell is treated as a spike12. All modes are supported, with the caveat that SCAN/MAP data must not have been restored (spikes must be removed before the single-beam restoration phase – also, EKH data can not strictly be processed in this way because the chop angle is not fixed on the sky).

Disadvantages: For small data sets the number of points per bin is not sufficient to perform accurate statistics calculations.

Advantages: Small spikes can be detected in the presence of strong sources since the actual location on the sky is used for the calculation.

despike2

This task is designed specifically for SCAN/MAP data. Each scan for each bolometer is analysed in turn and spikes are detected using a running mean calculation.

Advantages: Finds the large spikes in SCAN/MAP data.

Disadvantages: Care must be taken when despiking bright sources (e.g. planets).

In summary, each mode should probably use different despiking techniques:

photom

scuclip can be used before scuphot and remsky. sigclip should be used after scuphot (or scucat).

scan/maps

despike2 should be used initially. For “Emerson II” data it is also possible to use despike since the chop angle is fixed on the sky (only despike data that were taken with the same chop configuration).

Jiggle maps of strong sources

Initially scuclip can be used with a large NSIGMA to remove the obvious spikes. Then despike should be used for the smaller spikes (i.e. those comparable with the source signal); see the sketch after this list.

Jiggle maps of weak sources

Can probably run scuclip as for PHOTOM observations. Here ‘weak source’ means data where the source is not far above the noise level.
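
As an illustration, a strong-source jiggle map might be despiked as follows. This is only a sketch: the parameter names (NSIGMA, OUT) are taken from the task descriptions above, the thresholds are arbitrary, and the remaining despike parameters will be prompted for as described in §C:

  % scuclip o59_lon_sky nsigma=10 out=o59_lon_clip
  % despike o59_lon_clip nsigma=4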

9.6 Map making

All that is required now is that the data be rebinned onto a rectangular grid with the rebin task. If necessary it is possible to enter Az/El pointing corrections by using change_pointing:

  % change_pointing n59_sky_lon
  SURF: run 59 was a MAP observation of 3c279
  SURF: observation started at LST 12 19 59 and ended at 12 28 40
  SURF: no pointing corrections found
  CHANGE_POINT - Do you want to change the pointing correction data > y
  POINT_LST - The sidereal time of the pointing offset (hh mm ss.ss) /!/ > 12 00
  POINT_DAZ - The azimuth pointing correction to be added (arcsec) > 0
  POINT_DEL - The elevation pointing correction to be added (arcsec) > 0
  POINT_LST - The sidereal time of the pointing offset (hh mm ss.ss) /!/ > 12 50
  POINT_DAZ - The azimuth pointing correction to be added (arcsec) > 1.1
  POINT_DEL - The elevation pointing correction to be added (arcsec) > -0.9
  POINT_LST - The sidereal time of the pointing offset (hh mm ss.ss) /!/ >

The time for the pointing corrections must be in LST (they also must be entered in chronological order). The pointing offset is assumed to vary linearly with time. Here I have assumed good pointing at an LST of 12h (the pointing observation before the map) and a small shift 50 minutes later when another pointing observation was taken (the shift can be found by using pointsum). It is probably best that the pointing offset is measured directly from the image by first regridding the data in Az/El coordinates and then using, for example, the Kappa centroid task to find any offset.

A number of questions need to be answered before regridding the data:

What rebin method should be used?

Currently, three methods are available: the data can be regridded with a weighting function, interpolated using spline fitting, or binned by taking the median data value in each cell of the output grid.

Bessel, Gaussian and linear weighting functions are available; in theory the Bessel function interpolation should give the best results but in practice the Gaussian or linear functions should be used (they are much faster and less affected by edge effects). The Gaussian function should be used if you are interested in beam shape (since it is easier to work out what is going on when you convolve a JCMT beam with a Gaussian than when you convolve it with a cone).

The MEDIAN regridding technique can be used if many data points are available, since at least one input data point (preferably many more) must fall in each cell of the output grid to avoid bad pixels. If fewer points are available (only a few integrations), consider using larger cells, or use the Kappa routines fillbad or glitch to interpolate over the holes.

The spline fitting algorithms are experimental and have not been thoroughly tested – please use with care.

What coordinate system should be used?

The data can be rebinned in the following coordinate systems:

NA Nasmyth (SCUBA) coordinate frame
AZ Azimuth-Elevation offsets
PL Moving source (e.g. planet)
RB RA/Dec B1950
RJ RA/Dec J2000
RD RA/Dec epoch of observation
GA Galactic coordinates (J2000)

The first two coordinate systems are fixed on the telescope so that the source rotates during long observations. They are most useful for taking beam maps (AZ) or examining the properties of the SCUBA bolometers (NA). Obviously AZ and NA contain no astrometry information. The PL coordinate system should be used for moving sources (e.g. planets or comets) where the RA and Dec of the source are changing with time; offsets from this moving centre are calculated and no astrometry information is stored. The remaining coordinate systems correct for source rotation and do have associated FITS World-Coordinate-System (WCS) astrometry information [19][20].

Map centre

The default map centre will be the map centre of the first map entered into rebin, modified to the epoch of the output map if necessary. This question is not asked if an NA, AZ or PL coordinate system is being used.

Pixel size

The regridded image can use any pixel size. The main point is that account should be taken of the beam sizes: approximately 7 arcsec at 450 microns and 14 arcsec at 850 microns. The on-line system regrids with 3 arcsec pixels. Obviously the regridding takes longer the smaller the requested pixel size, but this only becomes a real problem if BESSEL regridding is used. Linear interpolation should be fast (less than 10 seconds) for most reasonable pixel sizes.

The map can now be made with rebin (in this case using linear interpolation, J2000 coordinates, 1 arcsec pixels and default map centre, looping is turned off since I am only regridding one map):

  % rebin noloop
  REBIN_METHOD - Rebinning method to be used /’LINEAR’/ >
  SURF: Initialising LINEAR weighting functions
  OUT_COORDS - Coordinate sys of output map; PL,AZ,NA,RB,RJ,RD or GA /’RJ’/ >
  SURF: output coordinates are FK5 J2000.0
  REF - Name of first data file to be rebinned /’n59_sky_lon’/ >
  SURF: run 59 was a MAP observation of 3c279 with JIGGLE sampling
  SURF: file contains data for 4 exposure(s) in 3 integrations(s) in 1
  measurement(s)
  
  WEIGHT - Weight to be assigned to input dataset /1/ >
  SHIFT_DX - X shift to be applied to input dataset on output map (arcsec) /0/ >
  SHIFT_DY - Y shift to be applied to input dataset on output map (arcsec) /0/ >
  SURF Input data: (name, weight, dx, dy)
     -- 1: n59_sky_lon (1, 0, 0)
  
  LONG_OUT - Longitude of output map centre in hh (or dd) mm ss.ss format /’+12
   56 11.17’/ >
  LAT_OUT - Latitude of output map centre in dd mm ss.ss format /’- 05 47 22.1’/ >
  OUT_OBJECT - Object name for output map /’3c279’/ >
  PIXSIZE_OUT - Size of pixels in output map (arcsec) /3/ > 1
  OUT - Name of file to contain rebinned map > n59_reb_lon
  WTFN_REGRID: Entering second rebin phase (T = 0.9061 seconds)
  WTFN_REGRID: Entering third rebin phase (T = 3.682912 seconds)
  WTFN_REGRID: Regrid complete. Elapsed time = 4.055644 seconds.

If more than one map is available the extinction corrected data (with or without sky removal) can all be added into a single map at this stage. The parameter IN can be supplied with one new map at a time or via a text file (§9.6.1). Each input data set can be shifted by setting SHIFT_DX and SHIFT_DY (this shift is in arcseconds on the output grid cf. pointing corrections which are in Az/El offsets) and assigned a relative weight with the WEIGHT parameter. rebin does understand SCUBA sections (§6) so it is possible to select part of an observation for regridding at this time.

In addition to rebin there are three closely related tasks (in fact they all use the same code): bolrebin will regrid each bolometer individually, intrebin will regrid each integration into a separate file and extract_data will write the data to a text file before regridding. Note that the output file for bolrebin and intrebin is an HDS container [21] rather than a simple NDF. For example, if the OUT file is test.sdf the images will be accessible as NDFs via test.h7, test.h8 etc (or test.i1, test.i2 for intrebin).
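
Since the images inside the container are ordinary NDFs, they can be examined by giving the HDS path directly to any Kappa task, for example:

  % stats test.h7
  % display test.h7 axes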


Figure 7: An 850 micron image of 3C279 rebinned in RJ coordinates with the long wave array overlaid. The two negative sources indicate the nodding and chopping that are part of a SCUBA jiggle/map.


At this point the map can be displayed with, say, Kappa display. Fig. 7 shows the 850 micron image of 3C279 rebinned in RJ coordinates with the long wave bolometer array overlaid (note that scuover displays the array at zero jiggle offset). Fig. 7 was made as follows (note that this requires psmerge [22] in addition to Kappa’s display):

  % display n59_reb_lon axes lut=$KAPPA_DIR/bgyrw_lut device=epsfcol_p
  MODE - Method to define the scaling limits /’SCALE’/ >
  LOW - Low value for display /-0.01095889788121/ >
  HIGH - High value for display /0.022901531308889/ >
  
  % scuover prompt
  MSG_FILTER - Messaging level /’NORM’/ >
  DEVICE - Name of graphics device /@xwindows/ > epsfcol_p
  Current picture has name: DATA, comment: KAPPA_DISPLAY.
  Using /scuba/maps/sun217/n59_reb_lon as the input NDF.
  EXT - Name of (extinction corrected) demodulated data file /’n59_sky_lon’/ >
  SURF: file contains data for 4 exposure(s) in 3 integration(s) in 1
  measurement(s)
  INTEGRATION - Integration number /1/ >
  EXPOSURE - Exposure number /1/ >
  COL - Colour of annotation /’red’/ > white
  NAME - Display bolometer name (else number)? /TRUE/ >
  
  
  % psmerge -e gks74.ps gks74.ps.1 > 3c279.eps

In general, for faint sources it would now be necessary to go back to the extinction corrected (or sky-removed) data so that any bad bolometers and integrations can be turned off (using change_quality and SCUBA sections – rebin can be used to test a section before committing the change), different sky bolometers chosen or new pointing corrections added. Once complete the data can be calibrated – planet fluxes can be obtained using the Fluxes package and work is progressing on a list of secondary calibrators (see e.g. [24]).

9.6.1 Rebinning multiple datasets

On many occasions it is necessary to combine multiple observations into one regridded image to attain the desired signal-to-noise for a source map. One way of doing this is to enter each map into rebin (or a related task) in turn, along with the WEIGHT, SHIFT_DX and SHIFT_DY. For a small number of input data sets this approach is fine, but for larger numbers (n > 2) it becomes tedious and error prone. In order to overcome this problem the rebin tasks can accept an ASCII text file as input as well as an NDF.

This text file contains information on one input data set per line. Each line must contain either a space-separated list giving the NDF name, weight, shift_dx and shift_dy, or the name of another text file:

  # Regrid text file for 3c279
  
  # Format of text file should be
  #  NDF     WEIGHT     SHIFT_DX    SHIFT_DY
  
  n59_reb_lon  1.0   0.0   0.0    # Map 59
  n60_reb_lon  1.0   1.0   0.0    # Shift 1.0 relative to n59
  
  n61_reb_lon{i2} 1.02            # Only want the second integration from
                                  # this -- shifts will be requested when
                                  # the text file is included
  n62_reb_lon   0.98 1.0   2.0
  
  3c279_old.bat                   # Include previous 3c279 data via a text file.

From this example we can see that blank lines are ignored and a ‘#’ indicates the start of a comment; all text on the line after the ‘#’ is ignored. Not all the parameters need to be specified on the input line; if they are missing the software will simply ask the user for the values. The order of these parameters is important, so it is not possible to specify map shifts without specifying a weight – similarly, SHIFT_DY can not be given without SHIFT_DX. Also note that the NDF name can include SCUBA sections. Even though text files can include other text files, a recursion depth of 5 has been hard-wired into the code to prevent abuse – it was felt that this should be sufficient in most cases.

With the default messaging level, rebin tasks always show a summary of all the input data before proceeding to the final regridding – this can be used to check that the correct files (and associated parameters) have been read in.

9.6.2 Output coordinate frames

The output map generated by rebin will contain at least 3 output coordinate frames. They are the GRID, PIXEL and AXIS coordinate frames. For maps regridded in RJ/RB/GA or RD coordinates there will be an additional SKY coordinate frame.

They can be listed with the ndftrace command:

  lapaki[M82/short]>ndftrace m82 fullwcs
  
     NDF structure ..../m82:
        Title:  SURF:remdbm
        Label:  Extinction corrected
  
     Shape:
        No. of dimensions:  2
        Dimension size(s):  256 x 256
        Pixel bounds     :  -127:128, -127:128
        Total pixels     :  65536
  
     Axes:
        Axis 1:
           Label : R.A. offset
           Units : arcsec
           Extent: 127.5 to -128.5
  
        Axis 2:
           Label : Declination offset
           Units : arcsec
           Extent: -127.5 to 128.5
  
     Data Component:
        Type        :  _REAL
        Storage form:  SIMPLE
        Bad pixels may be present
  
     Quality Component:
        Storage form :  SIMPLE
        Bad-bits mask:  3 (binary 00000011)
  
     World Coordinate Systems:
        Number of coordinate Frames      : 4
        Index of current coordinate Frame: 4
  
        Frame index: 1
          Title               : "Data grid indices; first pixel at (1,1)"
          Domain              : GRID
  
        Frame index: 2
          Title               : "Pixel coordinates; first pixel at (-127.5,-1..."
          Domain              : PIXEL
  
        Frame index: 3
          Title               : "Axis coordinates; first pixel at (127,-127)"
          Domain              : AXIS
  
        Frame index: 4
          Title               : "FK5 equatorial coordinates; mean equinox J20..."
          Domain              : SKY
  
     Extensions:
        FITS             <_CHAR*80>
        REDS             <SURF_EXT>
  
  
     History Component:
        Created    :  1999 Feb 07 17:34:41
        No. records:  9
        Last update:  1999 Jun 16 17:00:07 (MATHS           (KAPPA 0.13-6))
        Update mode:  NORMAL

In the above example, the SKY frame is the current frame (this will be used by Kappa display) and is set to FK5 J2000. The AXIS frame contains arcsecond offsets from the regrid centre. The PIXEL frame uses pixel indices and rebin ensures that the regrid centre is always at pixel coordinate 0,0 unlike the GRID frame where the pixel origin is always at the bottom left hand corner. The PIXEL frame is used by all Kappa commands that combine images and also the makemos command in Ccdpack.
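
If a different frame is required as the current frame (for example, so that display annotates the image in pixel coordinates), it can be selected with the Kappa wcsframe command (Kappa V0.13 or later); a minimal sketch:

  % wcsframe m82 pixel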

For more information on coordinate frames please see Using World Coordinate Systems in SUN/95.

9.6.3 Exporting maps

After the data have been regridded with rebin the image can then be analysed with an image-analysis tool. Obviously Kappa, Figaro [25] or Gaia can be used immediately since they support the NDF standard.

In order to use packages such as iraf, aips or miriad the data must first be converted to FITS format by using either the Convert task ndf2fits or the Figaro task wdfits. The ndf2fits task is recommended since it understands FITS tables, floating-point FITS, and the new AST extension [20]13.

For images rebinned with an older version of Surf (pre-1.3), or if using wdfits or a version of Convert that does not understand AST (pre-1.1), it is necessary to remove the AXIS components from the NDF before converting, since the axis information (arcsec offsets from the map centre) otherwise takes priority; to propagate the WCS astrometry information from the NDF FITS extension into the FITS file, the axis information must first be removed by using Figaro’s delobj or Kappa’s setaxis. For example, if the image is stored in file scuba_image.sdf and we wish to convert it to an integer FITS file scuba_image.fits, with Kappa/Convert we would do:

  % kappa
  % convert
  % setaxis scuba_image mode=delete
  % ndf2fits bitpix=32 profits scuba_image scuba_image.fits

The PROFITS parameter is there to ensure that all the FITS information in the NDF is propagated to the FITS file. With Figaro we would do:

  % figaro
  % delobj scuba_image.axis
  % wdfits scuba_image scuba_image.fits

Note that wdfits always writes integer FITS whereas ndf2fits by default writes REAL (floating-point) FITS (bitpix=-32). ndf2fits also writes the Variance and Quality arrays to FITS tables in the output file (this can be turned off by specifying just the data component with COMP=D).

From Kappa V0.13 the WCS information stored in the header is used when manipulating the NDF. As of Surf version 1.4, the astrometry information is no longer stored in IRAS90 or FITS extensions; all astrometry information can be found in the AST/WCS component which is understood by Kappa, Gaia and ndf2fits.

9.7 Photometry

For photometry data all that is required after extinction/remsky is that the jiggle pattern be processed to determine the signal for each integration and bolometer. It is possible to derive the signal by taking the AVERAGE of the jiggle data or by fitting a PARABOLA to the data. Parabola fitting probably should not be used unless the sky was exceptionally stable – the individual jiggle maps rarely look like they can be fitted by a parabola.

For this example I will use photometry data on 3C279 taken just before the example used for mapping. The data have been processed in the same way as scan 59.

  % scuphot n56_sky_lon
  SURF: run 56 was a PHOTOM observation of 3c279
  SURF: file contains data for 1 exposure(s) in 10 integrations(s) in 1
  measurement(s)
  ANALYSIS - Which reduction method /’AVERAGE’/ >
  OUT - Name of container file to hold map and time-sequence data > n56_pht_lon
  FILE - Name of ASCII file to contain results summary /!/ > n56.txt

In this example n56_sky_lon.sdf is processed with scuphot. This observation consisted of 10 integrations and used a 9-point jiggle pattern. The value of each integration was determined by taking the average of the jiggle pattern. In some cases a better signal-to-noise can be achieved by processing the individual two-second samples rather than averaging over the nine samples that comprise an integration. For these cases – usually short observations, where the scatter on the averaged data is not representative of the standard deviation of the raw data (small number statistics) – ANALYSIS=SAMPLE is recommended.
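
For example, the same observation could be re-reduced in sample mode by supplying the parameters on the command line (the parameter names are those shown in the transcript above; the output names here are arbitrary):

  % scuphot n56_sky_lon analysis=sample out=n56_smp_lon file=n56_smp.txt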

Information on samples or integrations is written to a text file (n56.txt in this case) and also to n56_pht_lon.sdf. Since photometry observations can use multiple bolometers, n56_pht_lon.sdf is in fact an HDS container [21] which contains two NDFs per bolometer: <BOL>_peak contains the photometry data for each integration and <BOL>_map contains the integrated jiggle pattern (assuming the jiggle pattern was on a regular grid – irregular jiggle patterns are written as 1-D images and no map is written for zero-offset jiggles). In this example the bolometer used was H7, so n56_pht_lon.h7_peak is the NDF containing the integration data (an ASCII version of which can be found in n56.txt) and n56_pht_lon.h7_map contains the integrated jiggle pattern.

Since many photometry observations are usually combined to give the final result, the scucat task can be used to concatenate data files that have been produced with scuphot (scucat knows about the _peak NDFs). In this case we combine the three photometry observations listed in §9.1:

  % scucat
  METHOD - Concatenation method /’SEPARATE’/ >
  OUT - Rootname of files to contain concatenated data > 3c279
  IN - Name of input file containing photometry data /’n56_pht_lon’/ >
  SURF: Found data for the following bolometers: h7
  SURF: This is a PHOTOM observation of 3c279. There are 10 integrations
  IN - Name of input file containing photometry data /!/ > n57_pht_lon
  SURF: Found data for the following bolometers: h7
  SURF: This is a PHOTOM observation of 3c279. There are 10 integrations
  IN - Name of input file containing photometry data /!/ > n58_pht_lon
  SURF: Found data for the following bolometers: h7
  SURF: This is a PHOTOM observation of 3c279. There are 10 integrations
  IN - Name of input file containing photometry data /!/ >

scucat continues to request input data until a null value (!) is given for the IN parameter. Since different bolometers should be processed independently, a new file is created for each bolometer. In this example scucat produces one file called 3c279_h7.sdf; if these data had been taken with 2-bolometer chopping there would have been another file called 3c279_h9.sdf (for example). These files can now be analysed with standard statistics packages (e.g. Kappa stats and kstest).
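
For example, Kappa stats will report the mean and standard deviation of the concatenated data:

  % stats 3c279_h7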

An alternative to the above for scucat is to use a text file to contain the list of filenames to be processed (useful for scripts):

  % scucat noloop
  METHOD - Concatenation method /’SEPARATE’/ >
  OUT - Rootname of files to contain concatenated data > 3c279
  IN - Name of input file containing photometry data /’n56_pht_lon’/ > ^in.lis
  SURF: Found data for the following bolometers: h7
  SURF: This is a PHOTOM observation of 3c279. There are 10 integrations
  SURF: Found data for the following bolometers: h7
  SURF: This is a PHOTOM observation of 3c279. There are 10 integrations
  SURF: Found data for the following bolometers: h7
  SURF: This is a PHOTOM observation of 3c279. There are 10 integrations

where in.lis contains the 3 filenames to be processed (a comma-separated list is also allowed).

If you do not want to process different bolometers independently, the METHOD parameter can be set to CATALL, in which case all data will be concatenated together regardless of bolometer and the output filename will match that specified in OUT (rather than being OUT + bolometer name).
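
For example (a sketch, assuming the parameters are supplied on the command line in the usual ADAM fashion):

  % scucat method=catall out=3c279_all in=^in.lis noloop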

9.7.1 Surf photometry and Kappa

The photometry data reduction system produces one flux measurement per integration per bolometer. Further analysis simply involves finding a self-consistent mean of the merged data set (multiple measurements with a given bolometer can be concatenated together using scucat).


Figure 8: Photometry data of 3C279. This is the concatenated data from three separate observations.


The Surf package supplies two Kappa scripts to aid with this step of the analysis.

The Kappa kstest routine can also be used to check the self-consistency of the photometry data by performing a Kolmogorov-Smirnov test on the data (e.g. [26]).

9.8 Scan maps

Scan map data can be taken using two techniques (both based on chopping). The first technique is to chop in the direction of the scan and deconvolve each scan independently (the EKH method [11]). This technique must be used for single pixel mapping although it can also be used for array scan mapping. Problems with this technique are that it is very sensitive to spikes, every scan must be completely off-source at both ends and correlations with adjacent scans/pixels are ignored.

For array scan maps we scan the array across the source whilst chopping in a fixed direction on the sky. Following the work of Emerson [12] we take data using a number of different chop configurations in order to sample as many spatial frequencies as possible (we are not sensitive to structure that is larger than the chop throw). Multiple chop throws in 2 orthogonal directions are used, with chop amplitudes chosen so that, except at the origin, the zeroes in the Fourier transform of one do not coincide with the zeroes in the FT of the other up to the spatial frequency limit of the telescope beam. For SCUBA, it is recommended that 6 different chop configurations be used: chop throws of 20, 30 and 65 arcsec, each with chop position angles of 0 (Dec chopping) and 90 degrees (RA chopping) in a coordinate frame fixed on the sky. This gives the best coverage of spatial frequencies, but reasonable maps can also be obtained by combining four of the chop configurations. This mode also has the advantage that the deconvolution occurs after the images have been regridded; this means that data can be salvaged even if a scan did not go completely off source (by combining with data that does), and small spikes will be averaged out.
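
The zero-coincidence requirement can be made explicit (a sketch in standard notation, with u the spatial frequency along the chop direction and c the chop throw). Chopping turns a source distribution S(x) into D(x) = S(x + c/2) − S(x − c/2), whose Fourier transform is

  D̃(u) = 2i sin(π c u) S̃(u)

which vanishes at u = n/c. Two throws c1 and c2 together therefore recover all spatial frequencies up to the telescope cutoff provided n/c1 = m/c2 has no solution in that range other than n = m = 0.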

During commissioning it has been shown that the new method can result in a substantial improvement in signal-to-noise over the EKH method [13].

9.8.1 Baseline removal

The EKH method guarantees that the mean of every scan should be zero (the transputers remove the mean on-line). In the absence of spikes the data would not need baseline removal, but a large spike can shift the mean of a scan, in which case the baseline should be recalculated after spike removal (with despike2).

For the “Emerson II” method the situation is more complicated since the mean of each scan is now not guaranteed to be zero (and in fact the transputers do not attempt to remove a baseline in this case). scan_rlb must be run in order to remove the baseline (each bolometer sees a slightly different background). For data where the scans are long enough to be off-source LINEAR baseline removal can be used. For more complicated source structure MEDIAN is worth a try although extremely complicated regions (e.g. OMC-1) may cause problems.

In order to overcome this problem it is also possible to specify specific scans that can be used for calculating the offset level since the DC level appears to be fairly constant during an integration. When the SECTION baseline removal method is selected a SCUBA section (§6) can be used to specify exposure (scan) numbers or actual positions in the data stream. Usually the first and last exposures are used since these are most likely to be ‘off-source’. The appendix on scan_rlb contains some examples on the use of SCUBA sections to select baseline regions.
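
A typical invocation might look like this (a sketch only: the file names are arbitrary, and the baseline-removal method – LINEAR, MEDIAN or SECTION – is selected via a parameter whose exact name is given in the scan_rlb appendix):

  % scan_rlb o62_lon_ext o62_lon_rlb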

9.8.2 Sky removal

remsky can not be used to calculate the sky contribution for scan map data because it is no longer possible to select bolometers that are guaranteed to be on sky (since most bolometers will see ‘source’ at some point during the observation).

In order to overcome this problem the source signal must be removed from the data before attempting to calculate the sky. This is achieved with the calcsky task.

For each point in the input datasets calcsky finds the expected flux at that position by comparison with a model of the source and removes that flux from the input data. The source model can be calculated internally by calcsky or an external image can be supplied (usually generated from the same input data using rebin).


Figure 9: Sky noise calculated by calcsky for a short interval of the M82 data.


The source model is calculated in exactly the same way as for MEDIAN rebinning and despike: the input data are placed in bins related to position on the output image, and the median of each bin is then taken to be a good measure of the flux in that region of sky. This approach is an approximation, since a bin (a quarter of the beam width) can contain large flux gradients near point sources, but in general these errors are smoothed out by the average taken over the whole array. An alternative approach is to rebin the input data on a fine grid (e.g. 1 arcsec or finer) and use that as the input model (this is especially useful for scan maps since calcsky can add the dual beam response to the data when calculating the model).

Once the source has been removed, a sky signal is calculated from the residual by finding the average signal across the array at each time step. In addition, the time series can be smoothed since scan data are sampled much faster (approx. 8 Hz) than the sky emission is expected to vary (a few seconds). The time series is then stored in an extension inside the file (.MORE.REDS.SKY). Once the sky has been calculated it can be removed by using remsky; remsky recognises the presence of a SKY extension and removes this signal from the main data array.
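
Schematically, then, scan-map sky removal is a two-step process (a sketch; the file names are arbitrary and any remaining parameters will be prompted for):

  % calcsky o62_lon_rlb
  % remsky o62_lon_rlb o62_lon_sky

calcsky stores the sky time series in the .MORE.REDS.SKY extension of the data file and remsky then subtracts it.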

Fig. 9 shows the sky signal for some of the M82 data. In general the sky noise on scan data is below the instrumental noise, but correlations are visible in the smoothed time series.

This technique is not limited to scan map data. Jiggle maps can benefit from using calcsky in cases where sky bolometers can not be identified or when the sky removal needs to be automated. One caveat is that the quality of the sky removal depends critically on the quality of the sky model. For extreme cases of sky noise in jiggle data where the individual switches are visible as hexagonal patterns across the image, calcsky can not disentangle the source from the hexagonal pattern (no other data are available for that position on the sky) and sky removal will fail.

More information on sky removal for jiggle and scan data can be found in Jenness et al. [13].

9.8.3 Dual beam deconvolution

Chopping whilst scanning results in an image that contains two beams (a plus and a minus image of the source). To restore the source profile we must deconvolve the chop from the measured map. The problems associated with this step can best be appreciated by considering the Fourier transform (FT) of the chop function, which is a sine wave with zeroes at the origin and at harmonics of the inverse chop throw. Deconvolving the chop function is equivalent to dividing the FT of the measured map by the FT of the chop and then transforming back to image space. Clearly, problems arise at spatial frequencies where the sine wave of the chop FT has a low value or is zero. Noise at these frequencies is blown up and significantly reduces the signal-to-noise of the restored map [11].

EKH method

The restore task must be used to remove the dual beam response when chopping in the scan direction. This must be run before rebinning. Since the chop direction rotates slowly on the sky (it depends on the scan direction, which is in Nasmyth coordinates, not on the sky orientation), tasks that try to map the chopped data onto a sky plane (despike, calcsky, rebin) can not be used before the dual beam has been removed. Note that despike is useless after restoration, since spikes propagate through the entire scan and show up as sine waves after restore.

Emerson II method

To remove the dual beam signature from “Emerson II” data, rebinned images of each chop configuration must be generated (these can be coadds of many observations). The images must have the same map centre, the same pixel scale and the same dimensions, and must be regridded in the same coordinate frame as the chop (RJ for RJ, RB and GA data; PL or RD for moving sources). Fig. 10 shows examples of four dual beam images of M82.


Figure 10: 4 dual beam images of M82. The chop throws are 20 arcsec RA chopping, 30 arcsec RA chopping, 20 arcsec dec chopping and 30 arcsec dec chopping.

Once these images have been generated, they can be processed by remdbm:
  % remdbm o8?_lon_reb.sdf -out=m82
  Starting monoliths...Done
  Loop number 1
  Chop: PA=90 THROW=20
  
  Doing forward transformation
  
  Loop number 2
  Chop: PA=90 THROW=30
  
  Doing forward transformation
  
  Loop number 3
  Chop: PA=0 THROW=20
  
  Doing forward transformation
  
  Loop number 4
  Chop: PA=0 THROW=30
  
  Doing forward transformation
  
  Maximum difference between estimates of the same Fourier component is
  0.02414273.
  
  Doing inverse transformation
  
  Result stored in m82
  %

Note the use of shell wildcards. The final image can be seen in Fig. 11.


Figure 11: Final image of M82 after single beam restoration.


9.9 Polarimetry data reduction

Polarimetry observations are similar to map and photometry observations except that they are broken into a number of measurements (usually of one integration each), one for each position of a half-wave plate. The wave plate normally steps in 22.5 degree increments. The initial data reduction scheme is identical to the standard reduction except that the remip task must be run after extinction to remove the instrumental polarisation. For map observations, the intrebin task must then be used to generate an image for each integration (in practice this means an image per wave-plate position). intrebin ensures that the sky rotation angle and the waveplate angle are stored in the FITS headers (using the ANGROT and WPLATE keywords respectively). At this point the images can either be processed by scripts to calculate the Q and U images14, or be processed with the Polpack data reduction system (version 2 or higher), which fits a sine wave to each pixel in the input images. The following example uses Polpack and is similar to the approach used by the ORAC-DR [27] polarimetry recipes [28].

Assuming the output of intrebin is stored in file omc1_reb.sdf (this file will contain an image per waveplate position, named .i1, .i2, etc.; there are 16 images in this example), Polpack must first be told where to find the rotation angle and waveplate position:

  % polimp table=$SURF_DIR/polimp.scuba omc1_reb
    16 input images to process...
  
    Processing ’omc1_reb.I1’
       Setting WPLATE to -2.5
       Setting ANGROT to 70.27795
       Setting IMGID to ’omc1_reb.I1’
       Setting FILTER to ’850_-2.5’
  
  <cut intervening information>
  
    Processing ’omc1_reb.I16’
       Setting WPLATE to 335
       Setting ANGROT to 75.46666
       Setting IMGID to ’omc1_reb.I16’
       Setting FILTER to ’850_335’

Surf provides a suitable import table.

The next stage is to generate the I, Q and U images from these individual waveplate images. This can be done by the polcal task directly:

  % polcal weights=3 ilevel=2 omc1_reb
     Processing 16 images in single-beam mode...
  
  OUT - Output Stokes cube > omc1_cube
  
   Iteration: 1...
     Total number of aberrant input pixels rejected: 199
  
   Iteration: 2...
     Total number of aberrant input pixels rejected: 219
  
   Iteration: 3...
     Total number of aberrant input pixels rejected: 219
  
   Iteration: 4...
     Total number of aberrant input pixels rejected: 219
  
   None of the output pixels failed the test specified by parameter MINFRAC.

In this case, we use WEIGHTS=3 to generate the variance information from the fit, since the SCUBA variances are unreliable (although in many cases these variances are not under-estimated). polcal can combine images from separate overlapping fields all in one go if desired. If the intention is to mosaic separate fields within polcal, intrebin should be run with the TRIM parameter set to some non-zero value to prevent problems with edge effects during the mosaicing. Also, polcal expects all the images to be referenced to the same pixel origin. This can be achieved using the Kappa wcsalign command, but it is easier to do it in intrebin by making sure that all the images are regridded relative to the same RA/Dec centre – the resulting images will then all be aligned on the same pixel grid.


Figure 12: Polarisation E vectors around OMC-1 at 850 microns.


In many cases, a more reliable approach to calculating the IQU cubes with satisfactory variance information is to use the polstack task on a set of 16 images to generate 4 images with associated variances. These variances are determined directly from the data rather than from the fitting. We have found that running polcal on the resulting 4 images (with WEIGHTS=1, since we now wish to use the supplied variance information) and then mosaicing the resultant IQU cubes (e.g. via Ccdpack makemos) to improve signal-to-noise gives the most robust results and provides variance information that agrees with theory.

Once the IQU cube is made polvec can be used to generate the vectors:

  % polvec omc1_cube
  CAT - Output catalogue > omc1_cat
  
     2530 vectors written to the output catalogue.

This catalogue can then be binned using polbin, sections selected using catselect (part of Cursa) and plotted using polplot. An example image of OMC-1 can be seen in Fig. 12.

For more information on Polpack see SUN/223 [29].

For photometry observations the output from scuphot can be exported to a single-pixel data reduction system, or alternatively, processed as under-sampled images and reduced through Polpack as described above.

9Note that the naming convention has now changed to YYYYMMDD instead of MMMDD as used at the time the data were taken.

10Note that the PHOTOM data array is 3-dimensional; use an NDF section with display in order to examine these data.

11It is possible to rebin photometry data although, obviously, the image will not be fully sampled.

12More details on despike can be found in appendix G.

13As of release v1.3 it is no longer necessary to remove the .AXIS extension before processing with ndf2fits because rebin now writes WCS information using the AST library.

14Available from your support scientist if required, although the ORAC-DR pipeline recipes are now preferred.