3 Naming, Types and Variants

 3.1 Character Set
 3.2 The Rôle of TYPE
 3.3 Notation and Pseudo-types

3.1 Character Set

NAMEs, TYPEs and the contents of strings in HDS must consist only of printable ASCII characters (i.e. hexadecimal 20 through 7E). Beyond this, HDS itself (see SUN/92) imposes restrictions on NAMEs and TYPEs, to which we add the requirement that the first character of a (non-primitive) TYPE must not be an underscore.

It is strongly recommended by Starlink that NAMEs and TYPEs be limited to letters, numbers, and the underscore character, and that the first character be a letter. This is to prevent inconvenience to users, who will find themselves having to resort to special syntax (extra quote marks, for example) in order to resolve command-line ambiguity where unconventional NAMEs or TYPEs are present. Worse, the possibility cannot be ruled out that unexpected interactions between command-language features and imaginative NAMEs or TYPEs will, at some stage, cause insurmountable difficulties.
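The rules above can be sketched as a few small checks. This is only an illustration in Python, not part of HDS or any Starlink library: the function names are invented here, and the additional restrictions that HDS itself imposes (see SUN/92) are not modelled. The recommendation check is meant for NAMEs and non-primitive TYPEs, since primitive TYPEs begin with an underscore.

```python
import string

# Printable ASCII as defined in the text: hexadecimal 20 through 7E.
PRINTABLE = {chr(code) for code in range(0x20, 0x7F)}
# Recommended characters: letters, digits and the underscore.
RECOMMENDED = set(string.ascii_letters + string.digits + "_")


def is_permitted(text):
    """True if every character lies in the printable ASCII range 20-7E."""
    return all(ch in PRINTABLE for ch in text)


def is_recommended(text):
    """True if text follows the Starlink recommendation: only letters,
    digits and underscores, with a letter as the first character."""
    return (
        len(text) > 0
        and text[0] in string.ascii_letters
        and all(ch in RECOMMENDED for ch in text)
    )


def is_valid_structure_type(type_name):
    """True if a non-primitive TYPE is printable and does not start with '_'."""
    return (
        len(type_name) > 0
        and is_permitted(type_name)
        and not type_name.startswith("_")
    )
```

A NAME such as DATA_ARRAY passes both checks, whereas one containing a space is permitted but not recommended, and is the kind of NAME that would need extra quoting on a command line.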

The names of HDS container files should be chosen to be the same as the names of their top-level data structures, or at least to be closely related. Note that filenames must not appear in applications code (see SGP/16).

3.2 The Rôle of TYPE

TYPE describes the form of a data structure and the general rules for interpreting or processing all structures of that form. The NAME identifies the particular object. The TYPE tells the program what to expect and how to interpret the object.

It is possible to take the view that the TYPE attribute of HDS objects is important only for primitives (e.g. <_REAL >) and is not needed for higher-level structures. Such structures would thus have a NAME but no TYPE, and applications would interpret them by the presence or absence of NAMEd components, as well as by data values within the structure. Note, however, that the concept of type will be needed even if the TYPE facility is not itself used. TYPE is already available for this purpose, and its use eliminates the need to define naming schemes or other conventions, which could differ between packages and ultimately cause clashes. TYPE is supported directly by the HDS subroutines and is likely to be more efficient than other methods. It assists both the user (who will see it in a standard place when structures are listed) and the programmer (who can expect a structure of registered TYPE to conform to a given set of rules, and to have associated interface routines). A further objection to naming schemes comes from the limited length of HDS names (15 characters, chosen to maintain adequate storage efficiency).

In order to minimise the total number of TYPEs to be recognised, the TYPE may be supplemented by a variant, an HDS object of TYPE <_CHAR > called [VARIANT]. In such cases, the TYPE will define the general processing rules for a given structure, while [VARIANT] will specify the detailed interpretation. The variant also enables a data structure to be developed and modified. Therefore, an application must ascertain the value of [VARIANT] in every structure to determine whether or not it can process that version of the structure. In most cases, this will amount to no more than checking whether [VARIANT] exists.

To avoid incompatibilities between programs, it will be necessary to register both TYPEs and variants with the Starlink Head of Applications.

How do you decide whether a structure should have a new TYPE or be a variant of a TYPE? The guideline is as follows. [VARIANT] should be used if an algorithm written to process an existing TYPE could also process the new structure, merely by interposing an automatic conversion routine which requires no additional external information. If [VARIANT] is not present within a structure, or it has the value ‘SIMPLE’, then the structure is in its most basic and readily comprehensible form.
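As an illustration of how an application might apply the variant rules, the following Python sketch models a structure's components as a plain dictionary. This is an assumption made for the example only; real applications would read components through the HDS subroutines. The function names and the SUPPORTED table are hypothetical.

```python
def effective_variant(components):
    """Return the value of [VARIANT], treating an absent component as 'SIMPLE'.

    The text states that a missing [VARIANT] and the value 'SIMPLE' both
    denote the most basic form of the structure.
    """
    return components.get("VARIANT", "SIMPLE")


def can_process(structure_type, components, supported):
    """True if the (TYPE, variant) pair is one the application handles.

    `supported` maps each recognised TYPE to the set of variants that the
    application understands.
    """
    return effective_variant(components) in supported.get(structure_type, set())


# Hypothetical application that understands two forms of <ARRAY >.
SUPPORTED = {"ARRAY": {"SIMPLE", "SCALED"}}
```

With this table, an <ARRAY > structure with no [VARIANT] component would be accepted, while one whose [VARIANT] is 'SPARSE' would be rejected rather than misinterpreted.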

Any general utility program (e.g. one from the KAPPA package) can use TYPE when available, but must also be prepared to rummage within a data structure and identify the components it needs by other methods if the particular TYPE is not recognised.
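One way to picture this fallback behaviour is the Python sketch below. It is a sketch under stated assumptions, not how KAPPA is implemented: components are modelled as a dictionary, the TYPE "IMAGE" is invented for the illustration, and the list of fallback NAMEs is an arbitrary choice.

```python
def locate_data(structure_type, components, handlers, fallback_names):
    """Return the component holding the data, or None if none is identified.

    handlers maps a recognised TYPE to a function that extracts the data.
    fallback_names lists the component NAMEs to search for, in order,
    when the TYPE is not recognised.
    """
    handler = handlers.get(structure_type)
    if handler is not None:
        # TYPE is recognised: rely on its registered rules.
        return handler(components)
    # TYPE is not recognised: rummage through the components by NAME.
    for name in fallback_names:
        if name in components:
            return components[name]
    return None


# Hypothetical registry: the TYPE "IMAGE" is invented for this illustration.
HANDLERS = {"IMAGE": lambda components: components["DATA_ARRAY"]}
FALLBACK_NAMES = ["DATA_ARRAY", "DATA"]
```

The design point is that recognised TYPEs take priority, so the name-based search is used only when nothing better is known about the structure.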

3.3 Notation and Pseudo-types

A component of an HDS structure is characterised by its dimension(s), NAME, TYPE and meaning. The notation adopted in this document to describe these attributes is as follows.

The dimensions are given in a FORTRAN-like way. For example:

[DATA_ARRAY(NAXIS1,NAXIS2)]

is an NAXIS1 × NAXIS2 array with name [DATA_ARRAY]. The square brackets are a notation convention in this document to indicate the NAME of a data object. They are not part of the NAME. As in this example, use is made of symbolic constants (e.g. NAXIS1) in specifying array dimensions. The meanings of these constants are described for each structure.
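To make the bracket notation concrete, here is a small Python sketch that splits an entry written in this document's notation into its NAME and dimensions. The function name parse_entry and the regular expression are assumptions introduced for this example; the notation is a documentation convention, not something HDS parses.

```python
import re

# Matches [NAME] or [NAME(dim1,dim2,...)]; the brackets are not part of NAME.
_ENTRY = re.compile(r"^\[([^\[\]()]+)(?:\(([^()]+)\))?\]$")


def parse_entry(text):
    """Split '[NAME(d1,d2)]' into (name, dims).

    dims is an empty tuple when no dimensions are given. Numeric dimensions
    become integers; symbolic constants such as NAXIS1 stay as strings.
    """
    match = _ENTRY.match(text.strip())
    if match is None:
        raise ValueError(f"not in [NAME(dims)] notation: {text!r}")
    name, dims = match.group(1), match.group(2)
    if dims is None:
        return name, ()
    parsed = tuple(
        int(dim) if dim.strip().isdigit() else dim.strip()
        for dim in dims.split(",")
    )
    return name, parsed
```

Applied to the entry above, the sketch yields the NAME DATA_ARRAY and the two symbolic dimensions NAXIS1 and NAXIS2.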

A NAME in capitals is the NAME the component must take; where the NAME is in lowercase it is generic, i.e. it may be chosen by the programmer provided it does not duplicate a Starlink-reserved NAME. NAME is not globally significant—its meaning is only defined within the context of a containing structure.

TYPE may be one of the primitives or one allotted to a structure, denoted <_TYPE > and <TYPE > respectively. As mentioned earlier, the < and > signs form a notation convention used in the text of this document to signify the TYPE of a data object, and they are not themselves part of the TYPE. Details of HDS primitive TYPEs can be found in SUN/92.

Within Starlink standard data structures TYPE is recognised globally—a given TYPE always means the same thing no matter where it appears in a structure.

In order to allow for flexibility in the way in which data are represented, certain structure components are allowed to take one of a number of options. This is indicated in this document by “pseudo-types”, which are denoted by <type >; the standard pseudo-types are listed in Table 6.


Table 6: Standard Pseudo-types

Pseudo-type   Allowed Types

<various >    structured or primitive (to be specified)
<numeric >    one of the primitive numeric TYPEs
<integer >    one of the primitive integer TYPEs
<float >      one of the primitive floating-point TYPEs
<narray >     a <numeric > array
<farray >     a <float > array
<iarray >     an <integer > array
<c_array >    an array of complex numbers or an <ARRAY > TYPE
<p_array >    a <narray > or an <ARRAY > TYPE
<s_array >    a <numeric > scalar or <p_array >



The <various > notation is used where the options do not neatly fit into one of the categories in Table 6. Sections 10 and 11 contain tables of the contents of standard structures, followed by descriptions of the individual items. The possible options for each <various > can be found within the description of the associated data object.

There is one difficulty with this notation. Certain data objects are arrays whose number of dimensions and dimension sizes are mostly not fixed, and so cannot be specified explicitly in a structure’s table of contents. To overcome this problem, the array pseudo-types <narray >, <farray > and <iarray > are used instead of the scalars <numeric >, <float > and <integer > respectively in cases where the number and size of an array’s dimensions cannot be specified a priori. Conversely, if a vector object has its dimensions specified in a table, it will not have an array pseudo-type assigned to it, because its dimensions are fixed.

The first letters of the last three pseudo-types in Table 6 may serve as an aide-mémoire to their meanings: c stands for complex, p for primitive and s for scalar.

The notation <c_array > can mean either an <ARRAY > or a <COMPLEX_ARRAY > (described below). The <ARRAY > TYPE illustrates some of the terminology used in this document. <ARRAY > has POLYNOMIAL, SCALED, SPARSE and SPACED variants. These forms are all methods for expressing an array of numbers, intended for use where a primitive array would not be suitable. Whenever one of these special forms is used, the primitive TYPE of the equivalent array (for example _REAL), which we will call the equivalent primitive type, must also be specified to allow application programs to select the most efficient form of processing.

Some examples to clarify the meanings of the pseudo-types are presented in Table 7. Parameterised dimensions in a table-of-contents entry indicate that the number and sizes of the dimensions are fixed. The numerical dimensions shown in the examples are arbitrary.


Table 7: Pseudo-type Examples

Entry in a Table of Contents          Example
Name                 Pseudo-type      Name and dimensions   Type

[BASE]               <numeric >       [BASE]                <_INTEGER >
[LIST(NAXIS,NDATA)]  <integer >       [LIST(2,450)]         <_WORD >
[TMIN(NAXIS)]        <float >         [TMIN(3)]             <_DOUBLE >
[DATA]               <narray >        [DATA(512,100,3)]     <_REAL >
[VARIANCE]           <farray >        [VARIANCE(100)]       <_DOUBLE >
[DATA_ARRAY]         <iarray >        [DATA_ARRAY(512)]     <_INTEGER >
[DATA_ARRAY]         <c_array >       [DATA_ARRAY]          <ARRAY >
                                   or [DATA_ARRAY]          <COMPLEX_ARRAY >
[QUALITY]            <p_array >       [QUALITY(384,512)]    <_UBYTE >
                                   or [QUALITY]             <ARRAY >
[WIDTH]              <s_array >       [WIDTH]               <_REAL >
                                   or [WIDTH(2000)]         <_INTEGER >
                                   or [WIDTH]               <ARRAY >
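
The pseudo-type rules of Table 6 can be checked against the examples of Table 7 with a short Python sketch. Several things here are assumptions rather than part of this document: the grouping of primitive TYPEs into integer and floating-point sets follows the HDS primitives described in SUN/92, the function name conforms is invented, and <various > is deliberately left out because its options are specified per object.

```python
# Assumed grouping of HDS primitive TYPEs (see SUN/92); not listed in this section.
INTEGER_TYPES = {"_BYTE", "_UBYTE", "_WORD", "_UWORD", "_INTEGER"}
FLOAT_TYPES = {"_REAL", "_DOUBLE"}
NUMERIC_TYPES = INTEGER_TYPES | FLOAT_TYPES

SCALAR_PSEUDO = {"numeric": NUMERIC_TYPES, "integer": INTEGER_TYPES, "float": FLOAT_TYPES}
ARRAY_PSEUDO = {"narray": NUMERIC_TYPES, "iarray": INTEGER_TYPES, "farray": FLOAT_TYPES}


def conforms(pseudo, type_name, ndim=0):
    """True if an object of TYPE type_name with ndim dimensions fits pseudo.

    ndim is 0 for a scalar or for a structure such as <ARRAY >.
    """
    if pseudo in SCALAR_PSEUDO:
        # Any dimensions are fixed by the table entry, so only TYPE is checked.
        return type_name in SCALAR_PSEUDO[pseudo]
    if pseudo in ARRAY_PSEUDO:
        return ndim >= 1 and type_name in ARRAY_PSEUDO[pseudo]
    if pseudo == "c_array":
        return type_name in {"ARRAY", "COMPLEX_ARRAY"}
    if pseudo == "p_array":
        return type_name == "ARRAY" or conforms("narray", type_name, ndim)
    if pseudo == "s_array":
        is_numeric_scalar = ndim == 0 and type_name in NUMERIC_TYPES
        return is_numeric_scalar or conforms("p_array", type_name, ndim)
    raise ValueError(f"unsupported pseudo-type: {pseudo!r}")
```

For instance, the [QUALITY(384,512)] example of TYPE <_UBYTE > would be checked as conforms("p_array", "_UBYTE", 2), and the scalar [WIDTH] of TYPE <_REAL > as conforms("s_array", "_REAL").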