C Routines for accessing NDF Provenance

 C.1 Functions Provided:
 C.2 The PROVENANCE Extension:
 C.3 The Pre V6 PROVENANCE Extension:
 C.4 Full Fortran Function Specifications
 C.5 Full C Function Specifications

This section describes all the functions used for reading, writing, modifying and querying provenance information in an NDF.

The provenance information in an NDF encapsulates details of all the other NDFs that were used in the creation of the NDF. The information is hierarchical and includes parents, grandparents, great-grandparents, etc., all the way back to “root ancestors” (a root ancestor is an ancestor NDF that has no recorded parents).

On disk, the provenance information is stored in an NDF extension called “PROVENANCE” (for details see the section “The PROVENANCE Extension” below). The ndgReadProv function reads this information and copies it into an in-memory structure for faster access. All the other public functions defined by this module accept an identifier for such an in-memory structure as their first argument. The ndgWriteProv function can be used to write the in-memory structure back out to disk as a PROVENANCE extension in an NDF. The in-memory structure should be freed when no longer needed, using ndgFreeProv.

C.1 Functions Provided:

The following public functions are available. There is an equivalent set of F77 routines with names formed by converting the C name to upper case and inserting an underscore after the initial “NDG” string (C and F77 versions are - for the most part - documented individually in separate prologues below):

Note that, within a provenance block, selected NDFs can be prevented from receiving provenance by calling ndgHltpv (NDG_HLTPV) to stop the recording of NDF names before those NDFs are accessed, and then calling ndgHltpv again afterwards to re-establish the recording of NDF names.

C.2 The PROVENANCE Extension:

This section describes the format of the NDF extension used to store provenance information in NDG version 6.0 and later (for the pre-V6.0 format, see “The Pre V6 PROVENANCE Extension:”). The PROVENANCE extension in an NDF contains the following components:

DATA
— A one-dimensional integer array containing descriptions of all the NDFs that were used to create the main NDF. These descriptions are encoded into an opaque set of integer values in order to save time and space, but represent the same set of items described in “The Pre V6 PROVENANCE Extension:” below.

In version 6 of NDG, the provenance extension contained an additional component called “MORE”, which held an optional one-dimensional array of structures containing arbitrary extra information about selected ancestor NDFs. If present, each element of this array contained supplemental information for a single ancestor NDF, and the DATA array contained indices into the MORE array for those ancestors which had additional information.

As of version 7, the information previously held in MORE is held in the main DATA array.

C.3 The Pre V6 PROVENANCE Extension:

The format in which provenance information is stored within an NDF’s PROVENANCE extension changed radically at NDG version 6.0, with another more minor change at version 7. Prior to v6.0, the separate numerical values, strings, etc., that form the provenance information were stored in separate HDS components. But for large provenance systems this proved to be inefficient in terms of both processing time and disk space. Therefore, as of NDG v6.0, the numerical values, strings, etc., forming the information are encoded into a single array of integers as described in the previous section. The current version of NDG will read both formats of provenance extension, but always writes the new integer-encoded format.

The rest of this section describes the old format. In addition to documenting the old format, this description serves to illustrate the concepts behind the provenance system. These concepts have not changed - the only thing that has changed is how these concepts are stored within an HDS object.

Pre-V6.0 PROVENANCE extensions in an NDF contain five components: “PARENTS”, “ANCESTORS”, “CREATOR”, “DATE” and “HASH”. The DATE component is a character string holding the date and time at which the information in the provenance extension was last modified. The date is in UTC, formatted by PSX_ASCTIME. The ANCESTORS component is a 1D array of “PROV” structures (described below). Each element describes a single NDF that was used in the creation of the main NDF, either directly or indirectly. The PARENTS component is a 1D integer array holding the indices within the ANCESTORS array of the NDFs that are the direct parents of the main NDF. The CREATOR component holds an arbitrary identifier for the software that created the main NDF. The HASH component is an integer that identifies the contents of the current History record in the NDF at the time the PROVENANCE extension was created. This is used to determine which history records to copy into the PROVENANCE extension if the main NDF is used in the creation of another NDF.
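As an illustration of the DATE string described above, the following is a minimal sketch in plain C. It assumes that the PSX_ASCTIME format corresponds to that of the standard C asctime function with the trailing newline removed. The helper name format_utc_date is hypothetical and is not part of NDG or PSX.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <time.h>

/* Hypothetical helper (not an NDG or PSX routine): write the UTC date for
   time "t" into "buf" in asctime style, e.g. "Thu Jan  1 00:00:00 1970".
   Returns 1 on success, or 0 if the time cannot be converted or the
   buffer is too small. */
static int format_utc_date(time_t t, char *buf, size_t buflen) {
    assert(buf != NULL);

    /* Break the time down as UTC rather than local time. */
    struct tm *utc = gmtime(&t);
    if (utc == NULL) return 0;

    /* asctime appends a newline, which is dropped here (an assumption
       about how the stored DATE string looks). */
    const char *text = asctime(utc);
    if (text == NULL) return 0;

    size_t len = strcspn(text, "\n");
    if (len + 1 > buflen) return 0;

    memcpy(buf, text, len);
    buf[len] = '\0';
    return 1;
}
```

A caller would supply a buffer of at least 25 characters, for instance `char date[32]; format_utc_date(time(NULL), date, sizeof date);`.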

Each PROV structure describes a single NDF that was used in the creation of the main NDF, and can contain the following components: “PARENTS”, “DATE”, “PATH”, “CREATOR”, “HISTORY” and “MORE”. If present, the PARENTS component is a 1D integer array holding the indices within the ANCESTORS array of the direct parents of the ancestor NDF. If PARENTS is not present, the ancestor NDF is a “root” NDF (that is, it has no known parents). If present, the DATE component is a string holding the formatted UTC date at which the provenance information for the ancestor NDF was determined. If this date is not known, the DATE component will not be present (this will be the case, for instance, for all root NDFs). The PATH component will always be present, and is a string holding the full path to the ancestor NDF. This includes any HDS path within the container file, but will not include any NDF or HDS section specifier. Neither will it include the trailing “.sdf” suffix. If present, the MORE component is an arbitrary HDS structure in which any extra information about the ancestor NDF can be stored. The CREATOR component holds an arbitrary identifier for the software that created the ancestor NDF. The HISTORY component is an array of “HISREC” structures, each containing a copy of a single History record from the NDF described by the PROV structure. Only History records that describe operations performed on the NDF itself are stored (including the record that describes the creation of the NDF). That is, History records inherited from the NDF’s own parents are not included.
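The way PARENTS indices tie the ANCESTORS array into a hierarchy can be sketched with a small in-memory model. This is an illustration only: the ProvModel type, the MAX_PARENTS limit and both functions are hypothetical and are not the NDG data structures or API. The sketch also assumes zero-based indices (the convention used on disk is not stated here) and that the parent links contain no cycles.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_PARENTS 4   /* arbitrary limit chosen for this sketch */

/* Hypothetical model of one PROV structure (HISTORY and MORE omitted). */
typedef struct {
    const char *path;          /* PATH: always present */
    const char *creator;       /* CREATOR: NULL if absent */
    const char *date;          /* DATE: NULL if not known */
    int parents[MAX_PARENTS];  /* PARENTS: indices into the ancestors array */
    int nparents;              /* zero means PARENTS is absent: a root NDF */
} ProvModel;

/* Store the indices of all root ancestors in "roots" (which must have room
   for "nanc" entries) and return how many were found. */
static int find_roots(const ProvModel *ancestors, int nanc, int *roots) {
    int nroots = 0;
    assert(ancestors != NULL && roots != NULL);
    for (int i = 0; i < nanc; i++) {
        if (ancestors[i].nparents == 0) roots[nroots++] = i;
    }
    return nroots;
}

/* Return the number of generations between the ancestor at "index" and its
   most distant root ancestor (zero if it is itself a root). */
static int generations(const ProvModel *ancestors, int nanc, int index) {
    int deepest = 0;
    assert(index >= 0 && index < nanc);
    for (int i = 0; i < ancestors[index].nparents; i++) {
        int depth = 1 + generations(ancestors, nanc,
                                    ancestors[index].parents[i]);
        if (depth > deepest) deepest = depth;
    }
    return deepest;
}
```

For an array in which element 0 lists elements 1 and 2 as its parents, and those two elements have no parents, find_roots reports elements 1 and 2 as roots and generations gives 1 for element 0.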

Each HISREC structure contains the following components (all taken from the corresponding items in the NDF History record): DATE, COMMAND, USER and TEXT. If the history record was created by the default NDF history writing mechanism, the TEXT component will contain a list of environment parameter values used by (or created by) the corresponding command, and another statement of the software that performed the action.

C.4 Full Fortran Function Specifications