How to publish data to the VO?
Rev. 1.12, 25 March 2004
Overview
1. To Scientific Users of the AVO prototype client tool: directly go here
2. To Data Providers: please keep reading
This page explains how to publish data to the VO, a mainly technical process relevant to data providers. If you instead intend to use the AVO prototype as a research tool, go directly here.
The necessary standards are worked out in the global forum of the International VO Alliance (IVOA), whereas prototype software and use cases are implemented by individual VO projects, depending on available expertise and resources.
There are three steps to take when providing data products in a VO compliant manner:
- find an archive or data center that puts the data on-line
- team up with this data center and decide which VO specifications apply
- together implement the bits that are applicable to your data collection
IVOA specifications are at various stages of maturity. There is no single recipe covering all possible scenarios. A core IVOA document about data quality is in the making and is expected to be ready in May 2004.
The paragraphs below describe the building blocks of the VO infrastructure. They do not form a linear text. See the architecture diagram for the interrelationships.
Click on individual items on the map of the VO architecture to learn more. Sample implementations are biased towards the AVO project.
Credits for diagram: Building the Framework for the National Virtual Observatory by Szalay, Williams, Hanisch, et al.
Who maintains data collections
The responsibility for creating and maintaining a data collection is with the research team and its archive or data center that provides the infrastructure for putting the data online.
To be clear: VO projects like AVO are conducting studies and do not provide archive facilities.
A few institutions such as CDS and ADS deal with certain types of information e.g. tabular or bibliographic data. But again, there is no VO facility in charge of maintaining data collections.
In Europe, the Euro-VO project will provide some resources to help existing archives with this task, starting in mid 2004. Again, archives will adopt VO standards to improve their services but stay in charge of their data.
Portals, User Interfaces, Tools
AVO Prototype
The AVO prototype is a freely available suite of tools used to test and demonstrate new functionality. It can also serve as a template for developers. Building user interfaces is not a core task of the VO as such, but simply a necessity to promote new features. It is the plumbing underneath, the VO infrastructure, that is subject to innovation. Non-VO projects are encouraged to re-use individual components and build their own user interfaces on top of them and, in turn, to release their own special-purpose components.
See How to provide data in Aladin and the AVO prototype by P. Fernique.
More details about supported formats and procedures:
2d FITS images, Specview plug-in, SED plotter, cross match tool, column calculator, catalog filters, and the paragraph below about format conversion.
Further documentation is available from the download page.
Format Conversion
There is an on-line VOTable format conversion service that the ESO/ST-ECF contributed to AVO:
Another conversion tool is conVOT by VO-India.
VOPlot
VOPlot is a lightweight plotter for VOTable documents. See the tutorial.
Mirage is another frequently used analysis and visualization tool available from Bell Labs.
ACE - Astronomy Catalogue Extractor
ACE is a Web Service wrapper around the SExtractor source extraction tool. It is integrated into the AVO prototype, but also works stand-alone. It was one of the first utilities to return results in VOTable format. Support is limited, and meaningful output requires thorough knowledge of the data at hand and of the SExtractor configuration parameters (e.g. samples for HST/ACS). Please inquire with the Astrogrid technical lead Keith Noddle for further information.
Resource Discovery, IDHA
Automation of resource discovery will keep the VO community busy for a couple more years. One approach for hierarchical browsing of heterogeneous data collections available today is based on the IDHA (Images Distribuées Hétérogènes pour l'Astronomie) data model. IDHA views can be serialized into VOTable documents. The AVO prototype uses them to merge the results of distributed queries. IDHA is not (yet) a standard, but there is currently no readily available alternative either. Services that return hierarchical views of their data encoded as IDHA/VOTable documents can be integrated into the AVO prototype in the same way as SIA or SSA services. See also registries.
Registries
VO registries support the OAI-PMH (Open Archives Initiative Protocol for Metadata Harvesting). Browse OAI libraries here.
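As a rough illustration of what an OAI-PMH request looks like, the sketch below builds two request URLs: one for the `Identify` verb and one for harvesting records with `ListRecords`. The verbs and the `oai_dc` metadata prefix come from the OAI-PMH protocol itself; the registry base URL and the helper name `oai_request` are made up for this example and must be replaced with the values of a real registry.

```python
from urllib.parse import urlencode

# Hypothetical registry endpoint (an assumption, not a real service).
BASE_URL = "http://registry.example.org/oai"


def oai_request(base_url, verb, **arguments):
    """Build an OAI-PMH request URL for the given verb and its arguments."""
    query = {"verb": verb}
    query.update(arguments)
    return base_url + "?" + urlencode(query)


# Ask the registry to describe itself.
identify_url = oai_request(BASE_URL, "Identify")

# Harvest all records in the mandatory Dublin Core format.
harvest_url = oai_request(BASE_URL, "ListRecords", metadataPrefix="oai_dc")

print(identify_url)
print(harvest_url)
```

The responses are XML documents; a registry will typically also offer a richer, VO-specific metadata format alongside `oai_dc`.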
Some portals of individual registries are:
An organization may decide to maintain its own registry and have its content mirrored to other, so-called full, registries. The mechanism is quite similar to that of DNS servers managing Internet resources. Here's the latest IVOA registry specification.
A number of dedicated new registries have been set up. They haven't yet replaced resource registries such as CDS/GLU and mission/observation logs. Currently (Feb. 04) it is possible to
- request an AuthorityID for an organization from exactly one VO registry of your choice
- ingest resources of various types for a given AuthorityID
- browse existing entries
- have registries harvest each other (=> ingest once, find it everywhere)
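The "ingest once, find it everywhere" idea can be pictured with a toy model. The sketch below is not a registry implementation: the `Registry` class, its methods and the sample names are invented for illustration, and real registries exchange records over OAI-PMH rather than by direct object access. The `ivo://authority/resource-key` identifier form is assumed here to follow the IVOA identifier convention.

```python
class Registry:
    """Toy in-memory registry (hypothetical, for illustration only)."""

    def __init__(self, name):
        self.name = name
        self.records = {}  # identifier -> resource description

    def ingest(self, authority_id, resource_key, description):
        """Register a resource under an AuthorityID and return its identifier."""
        identifier = f"ivo://{authority_id}/{resource_key}"
        self.records[identifier] = description
        return identifier

    def harvest(self, other):
        """Copy records from another registry that are not yet known here."""
        new = {k: v for k, v in other.records.items() if k not in self.records}
        self.records.update(new)
        return len(new)


local = Registry("local")  # registry run by one organization
full = Registry("full")    # registry that mirrors the others

resource_id = local.ingest("example.org", "survey-images", "2d FITS images")
first_pass = full.harvest(local)   # picks up the new record
second_pass = full.harvest(local)  # nothing new to copy

print(resource_id, first_pass, second_pass)
```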
Full integration (registry harvesting language) into browsing tools and automated resource discovery features incl. invocation of services is still subject to standardization efforts. See also resource discovery.
Data Access Layer
Belonging to this conceptual layer are several access protocols:
| ConeSearch | search catalogues by position (i.e. returns objects in a cone shaped slice of the universe) |
| ADQL | more powerful query mechanism for catalogues |
| SIA | protocol for 2d images |
| SSA | protocol for 1d spectra and SEDs |
The protocols share many similarities and are likely to be merged at a later stage.
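To make the "cone" in ConeSearch concrete: a service keeps every catalogue row whose angular distance from the requested position is at most the search radius. The sketch below shows that selection on a tiny made-up catalogue. The function names and the sample rows are illustrative assumptions; a real service would run the selection inside its database.

```python
import math


def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees between two positions given in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    # Haversine formula, which stays accurate for small separations.
    h = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(h))))


def cone_select(catalogue, ra, dec, radius):
    """Return the rows lying within `radius` degrees of the position (ra, dec)."""
    return [row for row in catalogue
            if angular_separation_deg(ra, dec, row["ra"], row["dec"]) <= radius]


# Made-up catalogue rows, positions in decimal degrees.
catalogue = [
    {"id": "A", "ra": 10.00, "dec": 20.0},
    {"id": "B", "ra": 10.05, "dec": 20.0},
    {"id": "C", "ra": 15.00, "dec": 20.0},
]

hits = cone_select(catalogue, ra=10.0, dec=20.0, radius=0.1)
print([row["id"] for row in hits])
```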
Access Methods
Apart from classic access methods such as HTTP GET and CGI scripts, the new methods of the web and grid service world in particular are being explored. See the material presented at the VO Tutorial ADASS XIII on how to write Web services.
ConeSearch
ConeSearch was the first VO experiment in which many interested parties decided to implement a basic service in a common way. Its significance is mainly of a sociological and experimental nature. It helped to pave the way for the International VO Alliance. ConeSearch summary page.
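A ConeSearch request is a plain HTTP GET with three parameters: `RA`, `DEC` and `SR` (search radius), all in decimal degrees. The service answers with a VOTable. The sketch below only builds such a request URL; the base URL and the function name `cone_search_url` are assumptions made for this example.

```python
from urllib.parse import urlencode


def cone_search_url(base_url, ra_deg, dec_deg, radius_deg):
    """Build a ConeSearch GET URL; all angles are in decimal degrees."""
    params = {"RA": ra_deg, "DEC": dec_deg, "SR": radius_deg}
    # Some services already carry fixed arguments in their base URL.
    separator = "&" if "?" in base_url else "?"
    return base_url + separator + urlencode(params)


# Hypothetical service address, used for illustration only.
url = cone_search_url("http://archive.example.org/cone", 180.0, -30.5, 0.25)
print(url)
```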
SIA Protocol
See the specification of the Simple Image Access Protocol. The SIA and SSA protocols have three modes: a query mode for browsing, a retrieval mode for getting selected observations, and a metadata mode in which the service describes itself.
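The query and metadata modes can be sketched as two GET requests. The parameter names `POS`, `SIZE` and `FORMAT`, and the special value `FORMAT=METADATA`, are taken from our reading of the SIA specification and should be checked against the current version; the base URL and function names are invented for this example. The retrieval mode needs no URL building, because the query response lists an access URL for each image.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def sia_query_url(base_url, ra_deg, dec_deg, size_deg, image_format="image/fits"):
    """Query mode: ask which images cover a region of the sky."""
    params = {
        "POS": f"{ra_deg},{dec_deg}",  # centre of the region, decimal degrees
        "SIZE": size_deg,              # angular size of the region, degrees
        "FORMAT": image_format,        # desired image format
    }
    return base_url + "?" + urlencode(params)


def sia_metadata_url(base_url):
    """Metadata mode: ask the service to describe itself."""
    return base_url + "?" + urlencode({"FORMAT": "METADATA"})


# Hypothetical service address, used for illustration only.
BASE_URL = "http://archive.example.org/sia"

query_url = sia_query_url(BASE_URL, 83.63, 22.01, 0.2)
metadata_url = sia_metadata_url(BASE_URL)

print(query_url)
print(metadata_url)
print(parse_qs(urlsplit(query_url).query))
```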
SSA Protocol
The initial release of the Simple Spectrum Access protocol is expected in May 2004. This is an abstract. Prototype services were implemented by the ISO data centre and AVO and then included in the AVO portal.
ADQL
Astronomical Data Query Language (ADQL)* is an XML language for constructing queries. It is based on SQL. The mechanics of passing a query are described in the Skynode interface document*. Visit the VO Query Language Work Group page to learn more about writing your own VO compliant web services. ADQL is far more feature-rich than ConeSearch.
There is an on-line SQL to ADQL translator at JHU.
VOTable
VOTable is an XML format used for data as well as metadata. A number of APIs for reading and writing VOTable exist.
It is best suited to tabular data. All the data access layer protocols use it to return query responses or service metadata. A number of analysis tools already read and write the VOTable format.
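To give an impression of the format, the sketch below reads a very small, hand-written VOTable with Python's standard XML parser. The sample document is made up and deliberately stripped down: it has no XML namespace and no type conversion is done, so the values come back as strings. For real work one of the dedicated VOTable APIs is the better choice.

```python
import xml.etree.ElementTree as ET

# Hand-written sample, reduced to the bare table structure.
SAMPLE = """
<VOTABLE version="1.0">
  <RESOURCE>
    <TABLE>
      <FIELD name="RA" datatype="double" unit="deg"/>
      <FIELD name="DEC" datatype="double" unit="deg"/>
      <FIELD name="NAME" datatype="char" arraysize="*"/>
      <DATA>
        <TABLEDATA>
          <TR><TD>10.68</TD><TD>41.27</TD><TD>M31</TD></TR>
          <TR><TD>83.63</TD><TD>22.01</TD><TD>M1</TD></TR>
        </TABLEDATA>
      </DATA>
    </TABLE>
  </RESOURCE>
</VOTABLE>
"""


def read_votable(text):
    """Return (field names, rows) of the first table; each row is a dict of strings."""
    root = ET.fromstring(text.strip())
    table = root.find("RESOURCE/TABLE")
    # FIELD elements carry the column metadata (name, datatype, unit).
    fields = [field.get("name") for field in table.findall("FIELD")]
    rows = []
    for tr in table.iter("TR"):
        cells = [td.text for td in tr.findall("TD")]
        rows.append(dict(zip(fields, cells)))
    return fields, rows


fields, rows = read_votable(SAMPLE)
print(fields)
for row in rows:
    print(row)
```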
Computational Services
Certain data products are computed on demand. The output, which is tagged Virtual Data in our diagram, is treated like any other datum.
The Astrogrid project provides a package of tools and libraries that form the framework for a data center that offers computational services.
Models - Simulated Data
G. Lemson and J. Colberg summarized the status of simulated data in the VO in this paper.
Need more help
*Caution: Do not bookmark the documents flagged with a star *. They are about to be moved to a different location. The links in this page will be adjusted accordingly.