4 File formats

 4.1 The GMOS working format
 4.2 The new IRAF spectral format
 4.3 The UK data-cube format
 4.4 MEF to data-cube format
 4.5 Format conversion
 4.6 GMOS vs. TEIFU format
 4.7 FITS header manipulation
 4.8 FITS I/O with IDL
 4.9 NDF I/O with IDL

There are two main file formats for IFS final data products, and one further prospective format. The first is a MOS-style multi-extension FITS file, which is being put forward by the GEMINI group (see Section 4.1). The other main format is an x,y,λ data cube, which is used by the Durham group for SMIRFS and TEIFU data (see Section 4.3).

Of these two formats the more natural for data analysis is the TEIFU-style data cube. It has therefore been adopted as the standard format by the major IFS groups in the UK: Durham (SMIRFS and TEIFU), Cambridge (CIRPASS) and the ATC (UIST). Conversion between the GMOS/CIRPASS MEF and UK standard data-cube formats (see Section 4.4) will therefore need to be implemented before these instruments are brought into general use. The ESO VIMOS instrument will provide its final data product in both MEF and data-cube formats.

The third, prospective, format is the new IRAF spectroscopic file format (see Section 4.2), which is still in draft. The specification is intended to provide a general description for two-dimensional spectroscopic image data, and should be able to represent long-slit, multi-object (MOS), integral-field unit (IFU) and slitless spectroscopy. Conversion between this format and the standard data-cube format will be implemented if the standard is adopted.

4.1 The GMOS working format

Currently the final data product of the GMOS and CIRPASS data-reduction software is a multi-extension FITS (MEF) file. However, this format may be replaced by the new IRAF spectral format (see Section 4.2) which is currently in development by the IRAF group at NOAO. The MEF is similar to the standard NIRI format now used with GEMINI, and has a binary FITS table with separate data, variance and quality planes.


  No.  Type      Name           Format              BITPIX  INH
  ---  --------  -------------  ------------------  ------  ---
  0              ifs_data.fits                      16
  1    BINTABLE  TAB            16×num. of fibres   8
  2    IMAGE     SCI            λ×num. of fibres    32      F
  3    IMAGE     VAR            λ×num. of fibres    32      F
  4    IMAGE     DQ             λ×num. of fibres    16      F

Table 1: GEMINI MEF file format

The first extension is a binary FITS table with columns: ID, RA, DEC, and SKY. This table would hold information specific to individual lenslets/fibres like relative fibre positions on the sky (RA, DEC), whether the fibre is a sky or object spectrum (SKY), etc.

The three image planes are like the IRAF multispec format; each row is a separate spectrum. This is a compact and efficient way of storing the extracted spectra, avoiding the need for a separate extension for each individual spectrum. From IRAF, ONEDSPEC tasks like splot can be used on individual planes, and ldisplay should be able to work directly on the MEF.
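As a rough illustration of this row-per-spectrum layout, the Python sketch below treats the SCI plane as a list of rows and uses the SKY column of the table extension to pick out the sky fibres. The array values, the function name mean_sky_spectrum and the convention that a non-zero SKY entry marks a sky fibre are all assumptions made for the example; they are not taken from the GMOS software.

```python
def mean_sky_spectrum(sci, sky_flags):
    """Average the rows of sci whose sky flag is non-zero."""
    sky_rows = [row for row, flag in zip(sci, sky_flags) if flag]
    if not sky_rows:
        raise ValueError("no sky fibres flagged")
    n_pix = len(sky_rows[0])
    return [sum(row[i] for row in sky_rows) / len(sky_rows)
            for i in range(n_pix)]

# Toy SCI plane: 4 fibres (rows) by 3 wavelength pixels (columns).
sci = [
    [10.0, 12.0, 11.0],   # object fibre
    [2.0, 4.0, 3.0],      # sky fibre
    [20.0, 22.0, 21.0],   # object fibre
    [4.0, 6.0, 5.0],      # sky fibre
]
sky_flags = [0, 1, 0, 1]  # hypothetical SKY column: 1 = sky, 0 = object

sky = mean_sky_spectrum(sci, sky_flags)

# Subtract the mean sky spectrum from every object row.
object_rows = [[v - s for v, s in zip(row, sky)]
               for row, flag in zip(sci, sky_flags) if not flag]
```

Because every spectrum is one row, selecting fibres reduces to selecting rows with the matching table entries.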

4.2 The new IRAF spectral format

A draft document has been written by the NOAO describing the new IRAF spectroscopic file format. This may, eventually, replace the GMOS working format as the final science data product for GMOS and CIRPASS observations. An example header block is shown below for the CIRPASS instrument.

  OBJECT  = 'CIRPASS: m51 V 600s' / Observation title
  OBJNAME = 'M 51    '           / Target object
  OBJRA   = '13:29:24.00'        / Right ascension of object (hr)
  OBJDEC  = '47:15:34.00'        / Declination of object (deg)
  OBJEPOCH=               2000.1 / Epoch of object coordinates (yr)
  EQUINOX =               2000.0 / Default coordinate equinox (yr)
  RADECSYS= 'FK5     '           / Default coordinate system
  RAUNIT  = 'hr      '           / Right ascension unit
  DECUNIT = 'deg     '           / Declination unit
  APERTURE= 'CIRPASS IFU'        / Aperture identification
  APTYPE  = 'hexlens+fiber'      / Aperture type
  APERDIA =                 0.36 / Aperture diameter (arcsec)
  APERPA  =                 90.0 / Hexagon angle (deg)
  APUNIT  = 'arcsec  '           / Aperture dimension unit
  APPAUNIT= 'deg     '           / Aperture position angle unit
  APEPOCH =               2000.1 / Aperture coordinate epoch (yr)
  CRVAL1  =                  1.1 / Spectrum dispersion center (um)
  CRVAL2  =                   0. / Spectrum cross-dispersion center (pixel)
  CRPIX1  =               1024.0 / Spectrum center (pixel)
  CRPIX2  =               1024.0 / Spectrum center (pixel)
  CMIN1   =                  0.9 / Spectrum dispersion limit (um)
  CMAX1   =                  1.3 / Spectrum dispersion limit (um)
  CMIN2   =                 -1.5 / Spectrum cross-dispersion limit (pixel)
  CMAX2   =                  1.5 / Spectrum cross-dispersion limit (pixel)
  CTYPE1  = 'WAVE-WAV'           / Spectrum coordinate type
  CTYPE2  = 'LINEAR  '           / Spectrum coordinate type
  CUNIT1  = 'um'                 / Spectrum coordinate unit
  CUNIT2  = 'pixel'              / Spectrum coordinate unit
  CD1_1   =              0.00022 / Spec coord matrix (um/pixel)
  CD1_2   =                  0.0 / Spec coord matrix (um/pixel)
  CD2_1   =                  0.0 / Spec coord matrix (pixel/pixel)
  CD2_2   =                  1.0 / Spec coord matrix (pixel/pixel)
  SPECFWHM=                  2.0 / Fiber FWHM (pixel)
  ARA0001 = '13:29:24.00'        / Aperture right ascension (hr)
  ADEC0001= '47:15:34.00'        / Aperture declination (deg)
  CRP20001=                500.0 / Spectrum center (pixel)
  ARA0002 = '13:29:24.00'        / Aperture right ascension (hr)
  ADEC0002= '47:15:34.36'        / Aperture declination (deg)
  CRP20002=                504.0 / Spectrum center (pixel)

The aperture identification, APERTURE, specifies the IFU. The aperture type APTYPE, aperture diameter APERDIA, and aperture position angle APERPA are the same for each spectrum. This information can be used to construct a data cube (see Section 4.4) or spatial/dispersion displays in conjunction with the aperture centres. The position angle is for one of the hexagonal edges and would orient the hexagons (IFU lenslets) when reconstructing a spatial display or data cube. For a purely fibre IFU (such as SAURON) much the same description would be used except the position angle would be eliminated.

The centre of each spectrum in world co-ordinates is given by the CRVAL keywords. In this example each spectrum is centred at about 1.1 μm in the dispersion direction (CRVAL1) and zero pixels in the cross-dispersion direction (CRVAL2). The cross-dispersion co-ordinates are defined as pixels from the centre of the fibre profile, since there is no real spatial information. The region the spectra cover in world co-ordinates is given by the CMIN and CMAX keywords. In this example the spectra cover the range 0.9 to 1.3 μm along the dispersion and -1.5 to 1.5 pixels relative to the fibre profile centre.

The CD keywords define the conversion between world co-ordinates and pixels on the detector. They also define any possible tilt of the dispersion path relative to the detector pixels. In this example the dispersion is 0.22 nm per pixel along the first image axis (detector rows) and there is no tilt.

The CRP keywords override the CRPIX keyword and provide the positions of the fibre spectra on the detector.
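The linear mapping implied by these keywords can be sketched in Python as below, using the values from the example header above. The function name pixel_to_world is hypothetical, pixel co-ordinates are taken to be 1-based as in FITS, and the per-aperture case assumes that CRP2nnnn is a plain substitute for CRPIX2; the draft specification should be consulted for the definitive rules.

```python
def pixel_to_world(px, py, crpix, crval, cd):
    """Linear WCS: world = CRVAL + CD . (pixel - CRPIX)."""
    dx = px - crpix[0]
    dy = py - crpix[1]
    w1 = crval[0] + cd[0][0] * dx + cd[0][1] * dy
    w2 = crval[1] + cd[1][0] * dx + cd[1][1] * dy
    return w1, w2

# Values copied from the example CIRPASS header.
crpix = (1024.0, 1024.0)          # CRPIX1, CRPIX2
crval = (1.1, 0.0)                # CRVAL1 (um), CRVAL2 (pixel)
cd = ((0.00022, 0.0),             # CD1_1, CD1_2
      (0.0, 1.0))                 # CD2_1, CD2_2

# At the reference pixel the world co-ordinates equal CRVAL.
wave, offset = pixel_to_world(1024.0, 1024.0, crpix, crval, cd)

# 100 pixels further along the dispersion axis.
wave_100, _ = pixel_to_world(1124.0, 1024.0, crpix, crval, cd)

# Aperture 1: CRP20001 = 500.0 is used in place of CRPIX2 (assumption).
_, offset_ap1 = pixel_to_world(1024.0, 501.0, (crpix[0], 500.0), crval, cd)
```

With a zero off-diagonal CD matrix the two axes are independent; a non-zero CD1_2 or CD2_1 would express a tilt of the dispersion path.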

The fibre full width at half maximum (SPECFWHM) gives the fibre profile FWHM at the detector in the units of the spatial WCS, in this case pixels. This is used to guide the tracing and extraction of blended fibre profiles.

The ARA and ADEC keywords give the centre positions of each lenslet element or fibre. While it is desirable for the absolute co-ordinates to be accurate, it is more important that the relative positions be fairly precise. It is these keywords that determine the reconstructed field and give the IFU sampling pattern and orientation. The relative positions of the lenslets or fibres on the sky should be well known for each IFU instrument.
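To show how relative positions follow from these keywords, the Python sketch below converts the sexagesimal strings of the two apertures in the example header and computes the offset of aperture 2 from aperture 1. The function names are hypothetical, and the small-angle approximation used is an assumption that should be adequate over an IFU field.

```python
import math

def sexagesimal_to_float(text):
    """Convert 'dd:mm:ss.ss' (or 'hh:mm:ss.ss') to a decimal number."""
    text = text.strip()
    sign = -1.0 if text.startswith("-") else 1.0
    d, m, s = (abs(float(part)) for part in text.split(":"))
    return sign * (d + m / 60.0 + s / 3600.0)

def aperture_offset_arcsec(ra_ref, dec_ref, ra, dec):
    """Offset of (ra, dec) from (ra_ref, dec_ref) in arcsec on the sky.

    RA strings are in hours and Dec strings in degrees, as in the header.
    """
    ra0 = sexagesimal_to_float(ra_ref) * 15.0   # hours -> degrees
    ra1 = sexagesimal_to_float(ra) * 15.0
    dec0 = sexagesimal_to_float(dec_ref)
    dec1 = sexagesimal_to_float(dec)
    d_ra = (ra1 - ra0) * math.cos(math.radians(dec0)) * 3600.0
    d_dec = (dec1 - dec0) * 3600.0
    return d_ra, d_dec

# ARA0001/ADEC0001 and ARA0002/ADEC0002 from the example header.
d_ra, d_dec = aperture_offset_arcsec("13:29:24.00", "47:15:34.00",
                                     "13:29:24.00", "47:15:34.36")
```

For these two apertures the offset is purely in declination and equals the aperture diameter APERDIA of 0.36 arcsec.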

4.3 The UK data-cube format

The MOS-style MEF format, which is the end product of the GMOS and CIRPASS data-reduction software, is not a particularly natural way of handling IFS data. Indeed, under the longslit paradigm (used to reduce TEIFU data) these files cannot be generated. The TEIFU-style data-cube format has therefore been adopted as the standard UK IFS file format and will be used in the analysis stages for both CIRPASS and UIST. This adoption allows the use of many of the generic applications within the Starlink Software Collection (SSC), which has many tasks that can process N-dimensional data because NDF is the standard file-interchange format for Starlink applications.

A conversion program for GMOS and CIRPASS data to a more easily analysed data cube, which will involve re-binning the input spectra on to a rectangular array, is therefore desirable (see Section 4.4).


  No.  Type   Name           Format  BITPIX  Comment
  ---  -----  -------------  ------  ------  ----------------------
  0           ifs_data.fits
  1    IMAGE  SCI            x×y×λ   32      3-D science array
  2    IMAGE  VAR            x×y×λ   32      3-D variance array
  3    IMAGE  DQ             x×y×λ   16      3-D data-quality array

Table 2: TEIFU data-cube format

In the case of this format the IFU geometry information is no longer needed, as the input spectra have already been rebinned, but it is likely that (in the finalised file format) such information will be included as a FITS binary table.

An example of the FITS header block from a TEIFU data cube is shown below.

  SIMPLE  =                    T /  file does conform to FITS standard
  BITPIX  =                   16 /  number of bits per data pixel
  NAXIS   =                    3 /  number of data axes
  NAXIS1  =                   59 /  length of data axis 1
  NAXIS2  =                  110 /  length of data axis 2
  NAXIS3  =                  961 /  length of data axis 3
  EXTEND  =                    T /  FITS dataset may contain extensions
  OBJECT  = 'T/S MOVING field  ' /  Title of the dataset
  DATE    = '2000-10-05T17:50:52'/  file creation date (YYYY-MM-DDThh:mm:ss UTC)
  BSCALE  =         3.126180E-02 /  True_value = BSCALE * FITS_value + BZERO
  BZERO   =         6.089249E+02 /  True_value = BSCALE * FITS_value + BZERO
  BLANK   =               -32768 /  Bad value
  CD1_1   =               0.0625 / Axis rotation and scaling matrix
  CD2_2   =               0.0625 / Axis rotation and scaling matrix
  CD3_3   = 5.795317000000068219 / Axis rotation and scaling matrix
  CRVAL1  =             -0.03125 / Axis 1 reference value
  CRVAL2  =             -0.03125 / Axis 2 reference value
  CRVAL3  = 7302.291864499999065 / Axis 3 reference value
  CRPIX1  =                 29.5 / Axis 1 pixel value
  CRPIX2  =                 55.0 / Axis 2 pixel value
  CRPIX3  =                480.5 / Axis 3 pixel value
  WCSDIM  =                    3
  CTYPE1  = 'LINEAR  '           / Quantity represented by axis 1
  CTYPE2  = 'LINEAR  '           / Quantity represented by axis 2
  CTYPE3  = 'LAMBDA  '           / Quantity represented by axis 3
  CD1_2   =                  0.0 / Axis rotation and scaling matrix
  CD1_3   =                  0.0 / Axis rotation and scaling matrix
  CD2_1   =                  0.0 / Axis rotation and scaling matrix
  CD2_3   =                  0.0 / Axis rotation and scaling matrix
  CD3_1   =                  0.0 / Axis rotation and scaling matrix
  CD3_2   =                  0.0 / Axis rotation and scaling matrix
  LTV3    =                -39.0
  LTM1_1  =                  1.0
  LTM2_2  =                  1.0
  LTM3_3  =                  1.0
  WAT0_001= 'system=image'
  END

Here the number and size of the cube dimensions are specified by the NAXIS keywords and, as with the IRAF spectral format, the CD keywords define the conversion between world co-ordinates and pixels, along with any tilt of the dispersion path relative to the pixels. The CRVAL keywords define the value of each axis in world co-ordinates at the reference pixel given by the corresponding CRPIX keyword; in this case the spectral axis is centred on 7302 Å (CRVAL3).
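The header above also stores the cube as scaled 16-bit integers, with the scaling given in the BSCALE and BZERO comments and the bad-pixel value in BLANK. A minimal Python sketch of recovering physical values is given below; the function name unscale and the choice of NaN to represent blank pixels are assumptions made for the example.

```python
import math

def unscale(stored, bscale, bzero, blank=None):
    """Return BSCALE * stored + BZERO, or NaN where stored equals BLANK."""
    if blank is not None and stored == blank:
        return math.nan
    return bscale * stored + bzero

# Values copied from the example TEIFU header.
BSCALE = 3.126180E-02
BZERO = 6.089249E+02
BLANK = -32768

# Three stored integers: zero, a typical value, and the blank value.
values = [unscale(v, BSCALE, BZERO, BLANK) for v in (0, 1000, -32768)]
```

Most FITS readers apply this scaling automatically, so it normally only matters when the raw integers are read directly.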

A full dictionary defining FITS header keywords which can be generated by the data-acquisition system is provided on the web by the National Optical Astronomy Observatories (NOAO) at http://iraf.noao.edu/projects/ccdmosaic/imagedef/fitsdic.html.

4.4 MEF to data-cube format

In the Gemini IRAF package the command gfcube converts the GMOS working format, which is currently being used as the science end product file format for the GMOS and CIRPASS instruments, to a UK standard x,y,λ data cube in FITS format.
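The internals of gfcube are not described here. Purely to illustrate the kind of re-binning such a conversion involves, the Python sketch below drops each fibre spectrum into the nearest cell of a regular x,y grid and averages fibres that share a cell. The function name, the grid parameters and the nearest-cell scheme are all assumptions for illustration; a real converter would normally interpolate and would also propagate the VAR and DQ planes.

```python
def spectra_to_cube(spectra, xs, ys, x0, y0, step, nx, ny):
    """Bin fibre spectra onto an nx by ny grid of spatial cells.

    spectra : list of equal-length spectra, one per fibre
    xs, ys  : fibre positions, in the same units as x0, y0 and step
    Returns cube[iy][ix], a spectrum (list) or None for an empty cell.
    """
    n_wave = len(spectra[0])
    sums = [[[0.0] * n_wave for _ in range(nx)] for _ in range(ny)]
    counts = [[0] * nx for _ in range(ny)]
    for spec, x, y in zip(spectra, xs, ys):
        ix = int(round((x - x0) / step))
        iy = int(round((y - y0) / step))
        if 0 <= ix < nx and 0 <= iy < ny:
            counts[iy][ix] += 1
            for k, value in enumerate(spec):
                sums[iy][ix][k] += value
    cube = [[None] * nx for _ in range(ny)]
    for iy in range(ny):
        for ix in range(nx):
            if counts[iy][ix]:
                cube[iy][ix] = [s / counts[iy][ix] for s in sums[iy][ix]]
    return cube

# Three toy fibres with two wavelength pixels each; fibres 0 and 2
# fall in the same grid cell and are averaged.
spectra = [[1.0, 2.0], [10.0, 20.0], [5.0, 6.0]]
xs = [0.0, 0.36, 0.02]
ys = [0.0, 0.0, 0.01]
cube = spectra_to_cube(spectra, xs, ys, x0=0.0, y0=0.0, step=0.36, nx=2, ny=1)
```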

4.5 Format conversion

The Starlink CONVERT package (see SUN/55) can be used to convert to and from the Starlink NDF format. On-the-fly conversion of supported file formats (such as FITS and IRAF) can also be done by most Starlink applications if the CONVERT package has been initialised.

4.5.1 GMOS MEF to NDF

The CONVERT package handles the GMOS/CIRPASS MEF working format without complaint, as in the following example.

  % fits2ndf
  IN - Input FITS file(s) > gmos.fits
  1 file selected.
  OUT - Output NDF data structure(s) > out
  %

When the MEF is converted to a Starlink standard NDF, the FITS binary table is converted into a normal NDF extension. An example of a resulting NDF is illustrated below. The <  > indicate the data type of a component. Those beginning with an underscore are primitive types; others are structures.

  IFS_FILE  <NDF>
  
     DATA_ARRAY     <ARRAY>
        ORIGIN(2)      <_INTEGER>
        DATA(2010,750)  <_REAL>
  
     MORE           <EXT>
        FITS_EXT_1      <TABLE>
           NROWS          <_INTEGER>
           COLUMNS        <COLUMNS>
              ID              <COLUMN>
                 DATA(750)    <_INTEGER>
  
              RA          <COLUMN>
                 COMMENT      <_CHAR*19>
                 DATA(750)    <_REAL>
  
              DEC         <COLUMN>
                 COMMENT      <_CHAR*19>
                 DATA(750)    <_REAL>
  
              SKY         <COLUMN>
                 COMMENT      <_CHAR*19>
                 DATA(750)    <_INTEGER>
  
        FITS(790)       <_CHAR*80>
  
     VARIANCE       <ARRAY>
        DATA(2010,750)  <_REAL>
        ORIGIN(2)       <_INTEGER>
  
     QUALITY        <QUALITY>
        QUALITY         <ARRAY>
           DATA(2010,750)   <_UBYTE>
           ORIGIN(2)        <_INTEGER>

Here the right ascension and declination of each fibre are preserved in the FITS_EXT_1 NDF extension, along with an array indicating whether the fibre is 'on sky'.

4.5.2 TEIFU FITS to NDF

The CONVERT package also handles the UK standard data cube format (i.e. TEIFU style data) without complaint, such as in the example below.

  % fits2ndf
  IN - Input FITS file(s) > teifu.fits
  1 file selected.
  OUT - Output NDF data structure(s) > out
  %

The FITS file is converted into a three-dimensional NDF which, as discussed earlier (see Section 4.3), can be read by many existing applications in the software collection.

  IFS_FILE  <NDF>
  
     DATA_ARRAY     <ARRAY>
        ORIGIN(3)      <_INTEGER>
        DATA(59,110,961)  <_DOUBLE>
        BAD_PIXEL      <_LOGICAL>
  
     MORE           <EXT>
        FITS(45)       <_CHAR*80>
  
     TITLE          <_CHAR*18>
     WCS            <WCS>
        DATA(99)       <_CHAR*32>

4.6 GMOS vs. TEIFU format

While both GMOS (MOS style) and TEIFU (data cube) representations of IFS data are perfectly valid, there are several advantages to using the data-cube format in preference to other options. First, and perhaps most importantly, for 'longslit' paradigm instruments such as TEIFU (and perhaps also CIRPASS) a MOS-style data reduction is not possible, and it is therefore impossible to produce the first file type without major problems. Additionally, a data-cube format is considered by most people to be intrinsically easier to visualise. Both these reasons were taken into consideration when the data-cube format was adopted, in consultation with Durham, Cambridge and the ATC, as the UK standard format for this data.

4.7 FITS header manipulation

Due to the still developing nature of the IFU file formats it is possible that your data may have missing FITS header keywords, or keywords which contain incorrect information. If this is the case you may need to manually edit your FITS file headers.

4.7.1 Native FITS files

A good package to use for FITS header (and data) manipulation is the FTOOLS software, which is released along with XANADU, as part of the HEASOFT package from GSFC. Further information about HEASOFT, along with detailed installation instructions, user manuals and a development guide, can be found at http://heasarc.gsfc.nasa.gov/docs/software/lheasoft/.

4.7.2 The NDF FITS extension

When a FITS file is converted to an NDF a FITS extension—sometimes called the ‘airlock’ to avoid confusion with extensions within FITS files—is created. This comprises a one-dimensional array of character strings containing the imported FITS header information. On exporting a file from NDF format back to FITS using ndf2fits the airlock contents will be propagated back to the FITS file. However, since the FITS extension is not updated when an NDF is manipulated, any information that can be derived directly from the NDF structure such as dimensionality, units and axis information will replace any equivalent information held in the FITS extension when it is exported.
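Since the airlock is simply an array of 80-character header cards, its contents can be inspected with very little code. The Python sketch below splits a card into keyword, value and comment; the function name parse_card is hypothetical, and only the simple 'KEYWORD = value / comment' form is handled, which is an assumption that will not cover every card in a real header.

```python
def parse_card(card):
    """Split one 80-character header card into (keyword, value, comment).

    COMMENT, HISTORY and other cards without '= ' in columns 9-10 are
    returned with a value of None.
    """
    keyword = card[:8].strip()
    if card[8:10] != "= ":
        return keyword, None, card[8:].strip()
    rest = card[10:].strip()
    if rest.startswith("'"):
        chars, i = [], 1
        while i < len(rest):
            if rest[i] == "'":
                if rest[i + 1:i + 2] == "'":   # doubled quote -> literal '
                    chars.append("'")
                    i += 2
                    continue
                break                           # closing quote
            chars.append(rest[i])
            i += 1
        value = "".join(chars).rstrip()
        comment = rest[i + 1:].partition("/")[2].strip()
        return keyword, value, comment
    text, _, comment = rest.partition("/")
    text = text.strip()
    if text in ("T", "F"):
        value = (text == "T")
    else:
        try:
            value = int(text)
        except ValueError:
            try:
                value = float(text)
            except ValueError:
                value = text
    return keyword, value, comment.strip()

# Three cards taken from the TEIFU header in Section 4.3.
airlock = [
    "NAXIS3  =                  961 /  length of data axis 3".ljust(80),
    "OBJECT  = 'T/S MOVING field  ' /  Title of the dataset".ljust(80),
    "CRVAL3  = 7302.291864499999065 / Axis 3 reference value".ljust(80),
]
header = {key: value for key, value, _ in map(parse_card, airlock)}
```

Note that the OBJECT value contains a '/' inside the quotes, which is why string values have to be scanned before the comment separator is looked for.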

The KAPPA package (SUN/95) provides tools that allow you to read from, and write to, an NDF FITS extension. Example code using some of these tools is shown later in this document (see Section 6.5), and detailed documentation on these tasks is available in SUN/95.

4.8 FITS I/O with IDL

FITS I/O with IDL can be accomplished using the IDL Astronomy Library from the GSFC. The IDL Astronomy Library contains four different sets of procedures for reading, writing, and modifying FITS files. The reason for having four different methods of FITS I/O with IDL is partly historical, as different groups developed the software independently. However, each method also has its own strengths and weaknesses for any particular task. For example, the function MRDFITS(), which can read a FITS table into an IDL structure, is the easiest way of analyzing FITS files at the IDL prompt level (provided that one is comfortable with IDL structures). But mapping a table into an IDL structure includes extra overhead, so that when performing FITS I/O at the procedure level, it may be desirable to use more efficient procedures such as FITS_READ and FTAB_EXT.

For example, a data cube can be read into an IDL array using the FXREAD procedure.

  ; Read the FITS file
  fxread, 'ifu_file.fit', DATA, HEADER
  
  ; Determine the size of the image
  SIZEX=fxpar(header, 'NAXIS1')
  SIZEY=fxpar(header, 'NAXIS2')
  SIZEZ=fxpar(header, 'NAXIS3')
  
  ; Find the data type being read
  DTYPE=fxpar(header, 'BITPIX')

As can be seen, various values contained within the FITS header of the original file can be obtained using the FXPAR function.
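The BITPIX and NAXISn values retrieved in this way are enough to work out the stored data type and the size of the data array. The Python sketch below lists the BITPIX codes defined by the FITS standard and applies them to the TEIFU header values from Section 4.3; the dictionary and function names are hypothetical.

```python
# BITPIX codes from the FITS standard, with matching Python struct codes.
BITPIX_TYPES = {
    8: ("unsigned 8-bit integer", "B"),
    16: ("signed 16-bit integer", "h"),
    32: ("signed 32-bit integer", "i"),
    64: ("signed 64-bit integer", "q"),
    -32: ("32-bit IEEE float", "f"),
    -64: ("64-bit IEEE float", "d"),
}

def data_size_bytes(bitpix, *axes):
    """Size in bytes of the unpadded data array described by a header."""
    n_pixels = 1
    for length in axes:
        n_pixels *= length
    return abs(bitpix) // 8 * n_pixels

# BITPIX = 16 and NAXIS1..3 = 59, 110, 961 in the TEIFU example.
description, struct_code = BITPIX_TYPES[16]
size = data_size_bytes(16, 59, 110, 961)
```

On disk the data are padded up to a multiple of 2880 bytes, so the file itself will be slightly larger than this figure plus the header.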

Alternatively, the MRDFITS function can be used, as in this example.

  ; Read the FITS file
  data = mrdfits('ifu_file.fit',0,header)

In both examples the image data is read into an IDL array called DATA, while the FITS header information is read into another array, of type STRING, called HEADER.

In addition FITS files can be read into IDL using the CONVERT package’s on-the-fly file conversion ability (see SUN/55 for more details) and the READ_NDF IDL function.

4.9 NDF I/O with IDL

There are several methods for reading an NDF file into IDL. First the NDF can be converted to a FITS file using the ndf2fits application in the CONVERT package,

  % ndf2fits comp=D
  IN - Input NDF data structure(s) /@section/ >
  1 NDF selected.
  OUT - Output FITS file(s) /@out/ > section.fit
  %

and then read into IDL using the IDL Astronomy Library (as in Section 4.8). However, there are several other approaches that can be taken.

The easiest approach is to use the READ_NDF IDL procedure available with the CONVERT package. When CONVERT is installed, both the IDL procedures READ_NDF and WRITE_NDF are placed in $CONVERT_DIR so, to make them available to IDL, that directory must be added to the IDL search path. This will be done if the environment variable IDL_PATH has been set.

For example, assuming we have a data cube called file.sdf which is of type _REAL,

  IDL> data_array = read_ndf('file')

creates an IDL floating array, data_array, with the same dimensions as the NDF and containing the values from its DATA component.

  IDL> data_array = read_ndf('file', !values.f_nan)

does the same, except that any occurrence of a bad value (VAL__BADR as defined by the Starlink PRIMDAT package) in the NDF will be replaced by NaN in the IDL array.

  IDL> var_array = read_ndf('file',comp='v')

creates an IDL floating array from the VARIANCE component of the same NDF. Output of an IDL array is achieved using the corresponding WRITE_NDF procedure. For example, assuming data_array is an IDL floating array,

  IDL> write_ndf, data_array, 'file'

creates the NDF file.sdf with the same dimensions as the IDL array data_array, and writes the array to its DATA component (of type _REAL). No checks on bad values are made by default, but they can be requested, e.g.

  IDL> write_ndf, data_array, 'file', !values.f_nan

Here any occurrence of the value NaN in the array will be replaced by the VAL__BADR value as defined by the Starlink PRIMDAT package, while

  IDL> write_ndf, var_array, 'file', comp='v'

writes the IDL array var_array to the VARIANCE component of the NDF created above. A check is made that the size of the array corresponds with the size of the NDF.
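The bad-value substitution described above amounts to swapping one sentinel for another in each direction. A minimal Python sketch follows; it assumes that VAL__BADR is the most negative finite 32-bit IEEE float, which should be checked against the PRIMDAT documentation, and the function names are hypothetical.

```python
import math
import struct

# Assumed value of VAL__BADR: the most negative finite 32-bit float.
VAL__BADR = -struct.unpack("<f", struct.pack("<I", 0x7F7FFFFF))[0]

def bad_to_nan(values, bad=VAL__BADR):
    """Replace the Starlink bad value with NaN (as when reading)."""
    return [math.nan if v == bad else v for v in values]

def nan_to_bad(values, bad=VAL__BADR):
    """Replace NaN with the Starlink bad value (as when writing)."""
    return [bad if math.isnan(v) else v for v in values]

data = [1.5, VAL__BADR, 3.0]
for_idl = bad_to_nan(data)        # the bad pixel becomes NaN
round_trip = nan_to_bad(for_idl)  # and is restored on the way back
```

NaN cannot be found with an equality test, since it never compares equal to itself, which is why the second function uses math.isnan.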

There is yet another approach to reading NDF data into IDL. Again we make use of the CONVERT package, but this time we use the ndf2ascii application to convert the NDF to an ASCII text file. We can then use the IDL read_ascii function, after generating an associated data template using the ascii_template GUI. For instance, to read the file file.dat in the subdirectory ifu_data,

  IDL> data_file = filepath('file.dat', SUBDIR='ifu_data')
  IDL> data_template = ascii_template(data_file)
  IDL> data = read_ascii(data_file, TEMPLATE=data_template)

we create an associated data template using the ascii_template GUI and read the data into the IDL structure data.

We can similarly use the ndf2unf application to create a sequential unformatted binary file, and use read_binary and the associated binary_template GUI to read the data into IDL, e.g.

  IDL> udata_file = filepath('binary.dat', SUBDIR='ifu_data')
  IDL> udata_template = binary_template(udata_file)
  IDL> udata = read_binary(udata_file, TEMPLATE=udata_template)

will return an IDL structure variable udata.

Alternatively, a lower-level approach to reading either the ASCII or the unformatted binary files can be taken, allowing you to bypass the template GUIs. More details can be found in the CONVERT documentation (see SUN/55).
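To give an idea of what such a low-level read involves, the Python sketch below walks through the records of a Fortran-style sequential unformatted file. It assumes that each record is framed by 4-byte little-endian length markers and holds 32-bit floats; both are compiler- and platform-dependent assumptions, not details taken from the ndf2unf documentation, and the function names are hypothetical.

```python
import io
import struct

def read_fortran_records(stream, marker="<i"):
    """Yield the payload of each record in a sequential unformatted file."""
    size = struct.calcsize(marker)
    while True:
        head = stream.read(size)
        if not head:
            return                      # clean end of file
        if len(head) < size:
            raise ValueError("truncated record marker")
        (n_bytes,) = struct.unpack(marker, head)
        payload = stream.read(n_bytes)
        tail = stream.read(size)
        if len(payload) != n_bytes or tail != head:
            raise ValueError("corrupt or truncated record")
        yield payload

def write_record(values):
    """Build one record of 32-bit floats, for the demonstration only."""
    body = struct.pack("<%df" % len(values), *values)
    mark = struct.pack("<i", len(body))
    return mark + body + mark

# Two records of three values each, held in memory instead of a file.
raw = write_record([1.0, 2.0, 3.0]) + write_record([4.0, 5.0, 6.0])
rows = [list(struct.unpack("<%df" % (len(rec) // 4), rec))
        for rec in read_fortran_records(io.BytesIO(raw))]
```

For a real file the stream would be opened with open(name, "rb"), and the marker format adjusted if the file was written on a big-endian machine.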