### 13 Creating new structures

13.1 Definitions
13.2 Algorithm
13.3 Explanatory Notes
13.4 Extensions

The methods presented in this section are intended to ensure that the rules given in Section 2 are adhered to. In particular, they ensure that standard structures are used wherever possible, and they encourage building new structures (when needed) by gathering together existing structures, thereby ensuring maximum commonality.

New structures must not be constructed simply by adding new components into standard ones (which is illegal), but instead by adding a new layer. The HDS hierarchy then provides a natural barrier between separate structures, and ensures that further components can later be added at any level without risking naming conflicts.
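A minimal sketch of the layering rule, using plain Python dictionaries as a stand-in for HDS objects. This is not the HDS API; the TYPE names `SPECTRUM` and `OBS_LOG`, the helper `make_structure` and all component names are hypothetical. The new structure contains the standard one as a component rather than adding components to it:

```python
# Illustrative model only: a structure is a TYPE plus named components.
# "SPECTRUM", "OBS_LOG" and the component names are invented for this sketch.

def make_structure(type_name, **components):
    """Return a dict standing in for one level of an HDS hierarchy."""
    return {"TYPE": type_name, "components": dict(components)}

# A stand-in for an existing standard structure; its definition is fixed.
standard = make_structure("SPECTRUM", DATA=[1.0, 2.0, 3.0], TITLE="Raw spectrum")

# Not allowed: standard["components"]["OBSERVER"] = ... would alter a standard type.
# Allowed: add a new layer that contains the standard structure unchanged.
new_layer = make_structure("OBS_LOG", SPECTRUM=standard, OBSERVER="A. N. Other")

# Each level has its own name space, so TITLE can be used again at the
# outer level without clashing with the TITLE inside the standard structure.
new_layer["components"]["TITLE"] = "Observing log"
```

Software written for the hypothetical `SPECTRUM` type can still be handed `new_layer["components"]["SPECTRUM"]` as it stands, because nothing inside it has changed.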

#### 13.1 Definitions

In the description of the design process the total hierarchy of structures is called a dataset to distinguish it from a single component structure. It is equivalent to the contents of an HDS container file. The term structure refers to a set of related data items; it corresponds to a single level of a dataset. Often a dataset will consist simply of a single structure.

#### 13.2 Algorithm

A summary of the algorithm to be used when creating a new dataset is given below. The circled numbering refers to expanded notes in the next subsection.

    Define what the software is going to do and not going to do   ①
    Identify, in concept, the datasets required
    Determine their interrelations, perhaps via a tree diagram   ②
    Start at the most deeply nested level of the hierarchy
    for each structure
        Identify the data components   ③
        if an existing standard structure can be used then
            Use it
            Place remaining associated items in a new structure or in an extension
        else
            Assign a unique HDS TYPE to the new structure   ④
            Assign a NAME to each component   ⑤
            Determine the rules and restrictions governing the way the data will be stored in the various components
            Assign a TYPE to each component   ⑥
            Identify the sorts of operation to be performed on the structure and ensure they are meaningfully defined   ⑦
            if the processing of a component cannot be defined in some cases then remove the component from the structure
            Implement and document the software needed to process the new structure   ⑧
            if the new structure is to become a standard type then
                Submit it and its software to Starlink for approval   ⑨
            endif
        endif
    endfor
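The control flow of the algorithm above can be sketched in Python as a level-by-level walk, deepest level first. This is only an illustration under stated assumptions: the `Node` class, the function `design_plan` and the TYPE names are hypothetical, and all the design work in the `else` branch is collapsed into a single "define" entry.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    """One structure in the planned dataset (hypothetical model, not HDS)."""
    name: str
    type_name: str
    children: list = field(default_factory=list)

def design_plan(root, standard_types):
    """List the design decisions in bottom-up order, one level at a time."""
    # Group the structures by depth: levels[0] is the top of the dataset.
    levels, frontier = [], [root]
    while frontier:
        levels.append(frontier)
        frontier = [child for node in frontier for child in node.children]

    known = set(standard_types)   # TYPEs that may be used as building blocks
    plan = []
    for level in reversed(levels):            # most deeply nested level first
        for node in level:                    # finish a level before moving up
            if node.type_name in known:
                plan.append((node.name, "reuse " + node.type_name))
            else:
                plan.append((node.name, "define " + node.type_name))
                known.add(node.type_name)     # now available to higher levels
    return plan

# Example dataset: a run holding a spectrum and a log, the log holding a history.
dataset = Node("RUN", "OBS_RUN", [
    Node("SPEC", "SPECTRUM"),
    Node("LOG", "OBS_LOG", [Node("HIST", "HISTORY")]),
])
plan = design_plan(dataset, standard_types={"SPECTRUM", "HISTORY"})
```

In this example the plan starts with the `HISTORY` and `SPECTRUM` structures, which are reused, and ends with the definitions of `OBS_LOG` and `OBS_RUN`, which by then can rely on every TYPE they contain.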

#### 13.3 Explanatory Notes

(1)
• It is important to define the scope of the software initially and not to let it expand arbitrarily during implementation. If the software design does subsequently need to be revised, then the dataset may also need re-designing.
• During the following stages of the design process, the original outline of the dataset may prove to be incorrect or inadequate, especially in more-complex hierarchies. In such cases you have to start again.
(2)
• The interrelations between structures specify how they should be organised hierarchically. Drawing a dendrogram should help.
• The design process is bottom-up. Multiple-level datasets are built up from the lowest (most deeply nested) level of the hierarchy. Design each and every structure at the current HDS level before going to the next higher level.
(3)
• Check to see whether any of the required data components are standard structures or are components of standard structures. If suitable standard structures already exist, then use them. If not, and you have to design new structures, try to make them general so that they might later become standard structures themselves.
• The original dendrogram design of the dataset may grow some extra branches if standard structures can be used, because there may be a net increase in the number of structures.
• Certain standard structures include provision for extension structures, and may thus be used even if there is no appropriate place in the standard structure itself for some of the items to be stored.
• Using existing standard structures gives the obvious advantage of being able to use existing software. (Starlink will maintain a list of standard structures and their conventions.) The standards and conventions associated with standard structures must be observed by all new software which uses them.
(4)
• The TYPE should reflect the sort of data to be held in the structure, but must not conflict with the TYPE of any other standard data structure. Starlink management should be consulted when defining TYPEs.
(5)
• The names should preferably identify the rôle which each component plays. Although the name will have no global significance outside the structure, it may still be sensible to have a naming convention for certain common types of structure to avoid confusion.
• There may be any number of rules and conventions governing use of the structure. For instance, some components may be optional, and the presence of some components may depend on others (as with the VARIANT concept). These rules must be explicitly stated and obeyed by all software which uses the structure. If this software is likely to be written by many different people, then the rules should obviously be kept simple.
(6)
• Only primitives or structures of a TYPE already defined may be used.
• Since only defined TYPEs (which include primitives) may be used, any substructures must already have been defined along with the rules for processing them. It might also occasionally be appropriate to define structures “recursively” by including components of the same TYPE as that being defined.
• Often, standard subroutines will already exist for processing the data components from which the new structure is being built, and these can therefore be used to process the components of the new structure.
(7)
• Ensure that all the operations are meaningfully defined in terms of what will happen to each component when the structure is processed. Consider all valid combinations of structure components.
• Many packages “grow” indefinitely, so it may not be possible to enumerate all possible operations. However, if the initial (global) stage of the design was followed, it should be possible to identify them as broad classes, such as [image display, arithmetic, spatial smoothing, …] or [create history, append history, search for history record, …].
• It may be necessary to reject some components if you cannot meaningfully define what will happen to them in all circumstances.
(8)
• The software should obey all the conventions appropriate to the new structure (and any other structures it uses). When accessing a structure, software should first check its TYPE—this specifies how the structure contents are to be interpreted. Any component not covered by the structure definition should be completely ignored.
• From time to time, ignorance and independence of spirit will no doubt lead implementors and users into inserting extra components into a structure, but these are illegal and will be ignored. This is not a valid way of defining a new structure.
(9)
• If the new structure is to become a standard type, submit the design (providing details of the NAME, TYPE, meaning and processing rules for each data object) to the Starlink Head of Applications for approval and registration. If appropriate, a subroutine interface should be written for handling the structure; this would ensure that the conventions governing its use are enforced. Any associated software should also be submitted to the Head of Applications.
• Once a new standard structure has been accepted, anyone is free to use the structure and to incorporate it in any new structures he or she may create. Once this point is reached, it may be difficult to change the structure definition without upsetting somebody; changes in the form of additions to the structure are the least likely to cause trouble.
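Notes (4) and (8) can be illustrated with a small Python sketch. It assumes a hypothetical in-memory registry of structure definitions (`REGISTRY`, `register_type` and `read_structure` are invented for this example and are not part of HDS or of any Starlink library). A TYPE may be registered only once, and a reader checks the TYPE first and then ignores any component the definition does not cover:

```python
# Hypothetical registry: TYPE -> the component names its definition covers.
REGISTRY = {}

def register_type(type_name, component_names):
    """Record a structure definition; refuse a TYPE that is already taken."""
    if type_name in REGISTRY:
        raise ValueError(f"TYPE {type_name!r} is already defined")
    REGISTRY[type_name] = frozenset(component_names)

def read_structure(structure):
    """Check the TYPE first, then return only the components it defines."""
    type_name = structure["TYPE"]
    if type_name not in REGISTRY:
        raise ValueError(f"unknown TYPE {type_name!r}")
    defined = REGISTRY[type_name]
    return {name: value
            for name, value in structure["components"].items()
            if name in defined}

register_type("OBS_LOG", ["OBSERVER", "DATE"])

# A structure into which someone has inserted an extra, undefined component.
found = {"TYPE": "OBS_LOG",
         "components": {"OBSERVER": "A. N. Other", "MY_EXTRA": 42}}
visible = read_structure(found)   # MY_EXTRA is ignored, DATE is simply absent
```

Whether a missing component such as `DATE` is an error depends on the rules stated for the structure; this sketch treats every component as optional.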

#### 13.4 Extensions

Once a “core” of fairly simple standard structures exists, the process of designing more specialised structures will be devolved to the various SIGs, who can use the simpler structures as building blocks. This avoids the problems of the ‘all or nothing’ monolithic approach. When a more complex (and therefore highly specialised) structure is built out of simpler ones, software will then automatically exist for processing all its substructures in a more general way. This should give a high degree of flexibility.
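As a rough illustration of how software for the simpler structures remains usable inside a specialised one, the following Python sketch dispatches on TYPE and recurses into any structure it has no handler for. The handlers, TYPE names and component names are all hypothetical, and plain dictionaries stand in for HDS objects:

```python
# Hypothetical handlers keyed by TYPE; each processes one kind of structure.
def describe_spectrum(s):
    return f"spectrum with {len(s['components']['DATA'])} points"

def describe_history(s):
    return f"history with {len(s['components']['RECORDS'])} records"

HANDLERS = {"SPECTRUM": describe_spectrum, "HISTORY": describe_history}

def describe(structure):
    """Use the handler for this TYPE, or else process the substructures."""
    handler = HANDLERS.get(structure["TYPE"])
    if handler is not None:
        return [handler(structure)]
    results = []
    for value in structure["components"].values():
        # Only nested structures are followed; primitives are skipped.
        if isinstance(value, dict) and "TYPE" in value:
            results.extend(describe(value))
    return results

# A specialised structure built from two simpler building blocks.
specialised = {"TYPE": "OBS_RUN", "components": {
    "SPEC": {"TYPE": "SPECTRUM", "components": {"DATA": [1.0, 2.0, 3.0]}},
    "HIST": {"TYPE": "HISTORY", "components": {"RECORDS": ["created", "scaled"]}},
    "OBSERVER": "A. N. Other",
}}
summary = describe(specialised)
```

No handler was written for the hypothetical `OBS_RUN` type, yet its substructures are still processed by the general-purpose handlers that already exist.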

There will be independent extensions, each having a uniquely defined TYPE together with rules for its interpretation. Though many extensions will be independent and self-contained, some will form hierarchies. The design of each extension should be kept straightforward and appropriate to the kind of software which will use it. Simple and specialised, simple and general, and complex and specialised are all acceptable, but implementors should beware of attempting to design extensions which are both complex and general. Applying a strict criterion to decide whether a given component is acceptable (“do we know how to process it?”) ensures that the problem is broken into manageable pieces whose complexity does not exceed our software-writing abilities.
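One possible way to apply the “do we know how to process it?” criterion is to tabulate what each operation does to each component and reject any component with a gap in the table. The sketch below assumes this tabular approach; the function `acceptable_components`, the component names and the operations are hypothetical examples.

```python
def acceptable_components(components, operations, rules):
    """Split components into those with a defined behaviour under every
    operation and those without.

    rules maps (operation, component) to a short description of what happens.
    """
    kept, rejected = [], []
    for comp in components:
        if all((op, comp) in rules for op in operations):
            kept.append(comp)
        else:
            rejected.append(comp)
    return kept, rejected

components = ["DATA", "VARIANCE", "COMMENT"]
operations = ["add", "smooth"]
rules = {
    ("add", "DATA"): "sum the two arrays",
    ("smooth", "DATA"): "convolve with the kernel",
    ("add", "VARIANCE"): "sum the two arrays",
    ("smooth", "VARIANCE"): "propagate through the kernel",
    ("add", "COMMENT"): "keep the first comment",
    # No rule for ("smooth", "COMMENT"): its processing is undefined.
}
kept, rejected = acceptable_components(components, operations, rules)
```

Here `COMMENT` is rejected because nothing says what smoothing should do to it. Either a rule is added or the component is left out of the structure.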