AVO Metadata-tree : Overview and Documentation
==============================================

François Bonnarel and Mark Allen (CDS)
Version 1.1
February, 2004

Introduction
============

Browsing and visualization of image datasets will be an important part
of Virtual Observatory operations. Such datasets may range from a small
set of images stored on a local disk to the terabyte collections of
modern surveys made accessible via various Web Services. Standardized
and scalable descriptions of image metadata will be required to enable
dataset browsing, selection and visualization.

The Metadata Tree, developed for the AVO first-year demonstration and
enhanced since then, is a prototype implementation of a scalable,
hierarchical metadata description (in VOTABLE/XML) of an image dataset.
This component of the AVO prototype was built as a customization of the
Load feature of the Aladin sky atlas, and was designed initially for
the GOODS dataset. It has been extended to describe the full content of
the Aladin server. CADC and ST-ECF have developed a test WFPC2 server,
for which they have also implemented the protocol. Support for spectra
and datacubes was added for the AVO second-year demo.

Using a VOTABLE description, the Metadata Tree displays a tree view of
the data available within a specified region of the sky (or within a
given range of parameters). The implementation in the prototype also
displays field-of-view outlines, and provides feedback between the tree
and the visualization window. The metadata description includes the
location and retrieval parameters for the data; in the case of GOODS,
this is essentially an HTTP GET template for the GOODS data server at
the CDS.

This function is now included in the public version of the Aladin sky
atlas (from version 2.0). The generalized metadata tree allows other
datasets (local or remote) to be described in the same way as the GOODS
dataset was in the AVO demos.
This document outlines how to define a VOTABLE description of a dataset
for use with Aladin.

Relationship to the IDHA Data Model
===================================

The metadata description of the GOODS data used for the AVO demo was
based on a generic description of astronomical data being developed by
the IDHA data model project. The goal of that project is to describe
the information content of image archives and datasets. In the IDHA
data model the metadata are coded as objects, with logical links
between them. The graph of the IDHA data model is shown at
http://alinda.u-strasbg.fr/IDHA/lastmodel.

The data model is not required to generate the VOTABLE for a general
dataset, but it is instructive to see how the general form of the
VOTABLE description relates to the data model graph.

Various tree structures may be extracted from the model graph. In
general this is done by starting at a node and following links, while
avoiding reverse and circular links. In the case of GOODS, the tree was
chosen with a hierarchy organized in terms of:

    ObservingProgram - ObservationGroups - Observations

This can be described as one "view of the data model", but the model
can also be used to define trees with different hierarchies and
different starting points. For example, starting from Observation one
may generate a tree with branches describing the instrument used, the
observing conditions, etc.

VOTABLE description of a dataset
================================

Here we describe in detail how to structure a VOTABLE description of a
dataset for use with the Aladin Metadata Tree.

VOTABLE can describe not only flat table structures but also a
hierarchical tree. The Metadata Tree is expressed in VOTABLE by
recursive use of the <RESOURCE> element. Each <RESOURCE> element
represents a node of the tree. The children of a node appear as new
<RESOURCE> elements nested inside their parent.
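
The nesting described above can be sketched in a few lines of Python.
This is only an illustration, not the structure Aladin requires: it
assumes the recursive element is <RESOURCE> and, as a purely
hypothetical convention, stores the node kind in a "name" attribute.
The helper names (add_node, build_tree, walk) are invented for this
sketch.

```python
import xml.etree.ElementTree as ET


def add_node(parent, kind):
    """Append a nested RESOURCE element under `parent`.

    Assumption: the node kind is stored in the "name" attribute.
    The document does not state how the kind is actually encoded.
    """
    return ET.SubElement(parent, "RESOURCE", {"name": kind})


def build_tree():
    """Build one root node with one group holding two observations."""
    votable = ET.Element("VOTABLE")
    program = add_node(votable, "ObservingProgram")
    group = add_node(program, "Observation_Group")
    add_node(group, "Observation")
    add_node(group, "Observation")
    return votable


def walk(element, depth=0):
    """Yield (depth, kind) for every nested RESOURCE, parents first."""
    for child in element.findall("RESOURCE"):
        yield depth, child.get("name")
        yield from walk(child, depth + 1)


tree = build_tree()
outline = list(walk(tree))
for depth, kind in outline:
    # Indent each node by its depth to show the hierarchy.
    print("  " * depth + kind)
```

Walking the elements recursively mirrors the way the tree itself is
defined: each node is visited before the nodes nested inside it.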

NODES
-----

In this description we make use of three different kinds of nodes:
"ObservingProgram", "Observation_Group", and "Observation". These
relate to the parts of the data model with the same names.

"ObservingProgram" nodes are intended to be the root nodes. Several
root nodes may occur in a single document.

"Observation_Group" nodes are the children of "ObservingProgram" nodes
or of other "Observation_Group" nodes. These nodes are intended to be
the main way of subsetting the dataset into a hierarchy.

"Observation" nodes are children of the "Observation_Group" nodes. They
are intended to contain the subset of the image metadata that is useful
to display, for example field centre coordinates and sizes, that is,
information relevant to an astronomer.

"StorageMapping" and "StoredImage" nodes are linked to the
"Observation". They are intended to contain the detailed image metadata
that an interface needs in order to use the images.

NODE ATTRIBUTES
---------------

The attributes of a node are defined within the <TABLE> part of the
<RESOURCE> element. Each attribute of the model class is coded as a
<FIELD> in the <TABLE>. These attributes are needed to enable the full
functionality of the Metadata Tree within Aladin. The tree will,
however, still function provided the main node definitions are
included.

We now list the attributes we are using, and mark each as recommended
or optional. A tree with all the recommended attributes is called an
IDHA metadata tree (Image Data Hierarchical Access tree).

For ObservingProgram, the <PARAM> or <FIELD> attributes are:

* Name (recommended), the ObservingProgram name.
* Organisation (optional), the name of the Organisation(s) performing
  the Observing Program.
* Begin (optional), the begin date of the Observing Program.
* End (optional), the end date of the Observing Program.
* SpectralDomain (recommended), the general spectral domain (Optical,
  X-ray, ...).
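
As a minimal sketch of how the recommended/optional distinction might
be checked for an ObservingProgram node, the following Python function
compares a list of attribute names against the list above. The function
name and the shape of its return value are assumptions made for this
illustration; they are not part of Aladin or of the IDHA model.

```python
# Attribute names taken from the ObservingProgram list above.
RECOMMENDED = {"Name", "SpectralDomain"}
OPTIONAL = {"Organisation", "Begin", "End"}


def check_observing_program(field_names):
    """Report recommended attributes that are absent, plus any names
    that do not appear in the ObservingProgram list.

    Hypothetical helper: the report format is invented for this sketch.
    """
    present = set(field_names)
    return {
        "missing_recommended": sorted(RECOMMENDED - present),
        "unknown": sorted(present - RECOMMENDED - OPTIONAL),
    }


report = check_observing_program(["Name", "Begin"])
print(report)
```

A node whose report has an empty "missing_recommended" list carries all
the recommended attributes for this node kind.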

For Observation_Group, the <PARAM> or <FIELD> attributes are:

* Selection_Criterion (recommended), the sampled parameter. A sampled
  parameter can be any attribute of any class of the data model that is
  considered useful, for example filter name (or filter range), epoch
  of observation, polarizer state, etc. In practice we have only
  experimented with selection on the names of various classes (Filter,
  HST_Association, Grating, Telescope, Instrument, etc.). Pseudo
  criteria can also be used to create and name additional levels (for
  example epochs in HST-ACS data).
* Selection-Range (recommended), the constraint on the sampled
  parameter.

(Inside the Observation_Group resource we can find any resource
describing one aspect of the ObservingConfiguration on which the
selection has been made: in our annotated example, this is the case for
the filter resource.)

For Observation, the <PARAM> or <FIELD> attributes are:

* Observation_Name (recommended), the name of the Observation.
* ReferenceNumber (optional), an internal or reference number of the
  Observation.
* ObservingProgram_Name (optional), for the reverse link to the
  Observing Program.
* BandpassRestriction(ID)/FilterName, for the link to the Filter or
  Bandpass information.
* X_LIM (recommended; if lacking, Size_alpha will be used), a vector
  giving the X coordinates of the corners of a polygonal field of view.
  If one value is given, a rectangular FOV is assumed.
* Y_LIM (recommended; if lacking, Size_delta will be used), a vector
  giving the Y coordinates of the corners of a polygonal field of view.
  If one value is given, a rectangular FOV is assumed.
* Size_alpha (recommended), the size in Right Ascension of the bounding
  box enclosing the observation.
* Size_delta (recommended), the size in Declination of the bounding box
  enclosing the observation.
* PixelSize (recommended), the pixel size measured in sky units. The
  actual unit (arcsec, arcmin, deg) is given in the FIELD definition.
* Origin (optional), the Organisation that provided the data.
* OriginalCoding (optional), the image format provided.
* COMPRESSION(ID)/AvailableCodings (optional), the possible codings of
  these data.
* MODE(ID)/AvailableProcessings (optional), the possible on-line
  processings.
* alpha (recommended), the RA position of the center.
* delta (recommended), the DEC position of the center.
* date(ID)/DateAndTime (optional), the date and time of the observation
  (beginning).
* Position Angle (optional, assumed 0 if lacking), the position angle
  of the Observation: Y axis East of North.
* resourceType(ID)/ObservationType (optional for 2DIMAGE, else
  recommended), the type of the observation: 2DIMAGE, SPECTRUM,
  DATACUBE.

The meaning of all the attributes will be clarified by an annotated
example, which is presented below.

DATA MANAGEMENT
---------------

The (various) management possibilities of an observation are described
in the StorageMapping <RESOURCE>. Several StorageMapping instances can
be created for a single observation. The FIELDS are:

* Cutout(ID)/Organisation, which gives the mode of organisation of the
  provided data (possible values: CUTOUT, STANDALONE, RETRIEVAL, SLICE,
  PREVIEW, MULTICHIP, etc.).
* desc(ID)/OrganisationDescription, which gives a literal description
  of this organisation.
* RESOLUTION(ID)/ResolutionType, which gives the level of relative
  resolution of the provided output (possible values: FULL, LOW, PLATE,
  etc.).
* number(ID)/NumberOfPatches
* Indexing, which gives the indexing mode of the pixel links (stored in
  StoredImage). Possible values: PATH, URL, TEMPLATE, GLU, HTML, the
  latter for HTML front pages to services (instead of direct access to
  pixels).
* Mapping Parameters, which contains a vector of numerical values whose
  meaning depends on the Organisation. For SLICE, it can be a velocity
  zero point and an increment.

DATA LOCATION
-------------

The data location parameters are set within StoredImage elements.
The main attributes (<FIELD>) are Location and Glink (name =
LinktoPixels, in StoredImage), the value of which can be a file path, a
URL or, in the second case, a GLU mark.

The file path is intended for referring to local files only.

* Each file path must be preceded by "file:"

URL and GLU marks are intended to provide access to remote files.

* URL marks are preceded by "http:"

The URL and GLU marks allow for inclusion of parametrized URL
templates. To enable the GLU facility the service must be registered in
a GLU database.
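
The prefix rules above can be sketched as a small Python classifier.
This is an illustration only: the function name, the category labels
and the example values are hypothetical. The text gives no prefix for
GLU marks, so anything without a "file:" or "http:" prefix is simply
reported as "unprefixed" (it might be a GLU mark, but that is an
assumption).

```python
def classify_location(value):
    """Classify a location value by its prefix.

    "file:" marks a local file path and "http:" marks a URL, as stated
    in the text. Any other value is returned as "unprefixed"; treating
    it as a GLU mark would be an assumption, not something the text
    guarantees.
    """
    if value.startswith("file:"):
        return "file"
    if value.startswith("http:"):
        return "url"
    return "unprefixed"


# Hypothetical example values, not taken from a real server.
examples = [
    "file:/data/goods/image1.fits",
    "http://example.org/getimage?id=1",
    "Some.Glu.Mark",
]
kinds = [classify_location(v) for v in examples]
print(kinds)
```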