WP3.2 - Storage and Compute Technology
Work Plan : March 6th 2003
This plan results from recent conversations between Andy Lawrence (overall WA3 leader) and Andreas Wicenec (WA3.2 leader). A timeline is at the bottom of this page.
Reminder of key deliverables from original project plan :
- 3.2.1 Deploy trial hardware (Nov 2002)
- 3.2.2 Storage assessment report and recommendations (Nov 2003)
- 3.2.3 Architecture assessment report (May 2004)
Status of deliverables.
Deployment of trial hardware (3.2.1) was abandoned at the start of the project. The original idea was to deploy something like the NGAS system, or a standard Beowulf system, at several sites for each group to experiment with. However, we received somewhat less than our full request and decided to keep staff levels as high as possible while cutting hardware provision to a minimum. This means that we have funded personal computer equipment for AVO staff but no archive installation equipment at all, and we expect constituent centres to provide their own hardware as necessary.
The "storage assessment report" (3.2.2) is still very much the intention, except that its remit will be interpreted more widely, to include other hardware issues affecting datamining: the speed (and cost) of running simple queries, complex queries, and N**2 computations such as on-the-fly Fourier transforms. (To reflect this, I have renamed the report below as the "Archive Centre Hardware Report".) Most of the work we will report on is within the NGAS project, but significant work has been going on elsewhere, both inside and outside the consortium. What we need, therefore, is to co-ordinate this work and bring it together to make our final report.
The "architecture assessment report" (3.2.3) should be taken as a placeholder until further work has been completed. We will certainly achieve one more formal deliverable, but should debate its nature once we get closer to completing the November 2003 report.
Strands of work
The main strands of work we want to pull together are :
- the NGAS experience in designing and implementing a cheap and robust scaleable storage system, and an associated management system
- similar Canadian work based on a proprietary system
- the experiments undertaken by Szalay, Gray, Kunszt et al in making DBMS queries run as fast as possible on a commodity cluster system, and their experience in implementing SkyServer as a working system for external users (see report)
- recent similar experiments by the WFAU group in Edinburgh on loaned hardware, during the design work for the WFCAM science archive
- experience of the Jodrell Bank group in using a large Beowulf cluster to do on-the-fly image construction from fringe data
- the experiments planned by the Edinburgh AstroInformatics group to do very fast DB trawls in RAM on an SMP machine
- any other relevant published work
- any other work we can locate for which reports are available privately.
An issue is whether we initiate new tasks explicitly aimed at the AVO reports. Andreas and I decided to delay this decision. In the first instance we will commission or collect reports on the above tasks for collation into a single report. But we will also arrange some kind of workshop in which we debate the issues and decide whether to undertake new tasks for the remainder of the AVO project.
Plan for "Archive Centre Hardware Report".
The report will be written by Andreas Wicenec. It is likely to have sections along these lines :
- Aims
- Requirements Analysis
- Technology alternatives / variable parameters
- Implementation Experiences
- Tests undertaken
- Assessment
- Recommendations for Data Centres
- Plans for further work
The requirements section, as well as having a basic analysis of the issues, should probably have some quantitative detail on data volumes for key datasets (VISTA, ALMA, Gaia, etc) and target query speeds and trawl rates.
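As an illustration of the kind of quantitative detail meant here, the following is a minimal back-of-envelope sketch in Python relating data volume, number of disks, and trawl time. It assumes an idealised case where a trawl is limited only by sequential disk reads spread perfectly across all disks. The function names and all numbers in the example are hypothetical placeholders, not measurements or agreed targets for any of the datasets named above.

```python
import math


def trawl_time_seconds(volume_tb, n_disks, mb_per_s_per_disk):
    """Time to read a dataset once from disk.

    Assumes (idealised) perfectly parallel sequential reads, so the
    aggregate rate is simply n_disks * mb_per_s_per_disk.
    """
    if n_disks <= 0 or mb_per_s_per_disk <= 0:
        raise ValueError("n_disks and mb_per_s_per_disk must be positive")
    volume_mb = volume_tb * 1_000_000  # decimal units: 1 TB = 10^6 MB
    aggregate_mb_per_s = n_disks * mb_per_s_per_disk
    return volume_mb / aggregate_mb_per_s


def disks_needed(volume_tb, target_seconds, mb_per_s_per_disk):
    """Smallest number of disks that meets a target trawl time,
    under the same idealised assumption as above."""
    if target_seconds <= 0 or mb_per_s_per_disk <= 0:
        raise ValueError("target_seconds and mb_per_s_per_disk must be positive")
    volume_mb = volume_tb * 1_000_000
    return math.ceil(volume_mb / (target_seconds * mb_per_s_per_disk))


if __name__ == "__main__":
    # Placeholder inputs only: a 10 TB table on 20 disks at 40 MB/s each.
    seconds = trawl_time_seconds(volume_tb=10, n_disks=20, mb_per_s_per_disk=40)
    print(f"Full trawl: {seconds / 3600:.2f} hours")
    # How many such disks would a one-hour trawl of the same table need?
    print("Disks for a 1-hour trawl:", disks_needed(10, 3600, 40))
```

Real systems will fall short of this bound because of controller, bus, and network limits, which is exactly what the strand reports should quantify.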
The report needs to cover a variety of issues. (i) Storage, its scaleability, and cost (including "total cost of ownership"). Although NGAS is looking good, we should step back and report on SAN, NAS, disk arrays versus clusters, etc. (ii) Bulk data access, i.e. pixel images and similar data. (iii) Short queries on database tables, i.e. those at which indexing is targeted. (iv) Long queries, requiring bulk trawls. (v) Computations which aren't just SQL queries, e.g. cluster analysis, Fourier transforms, etc. All of these will have software issues as well, of course.
Crudely, the world seems to divide into pixel storage, query engines, and analysis engines. There has been considerable experimentation with storage options and with query speed versus number of disks, and experience with SMPs is just starting. A couple of new technical issues also look interesting. (i) Has anybody tried a cluster with Myrinet? There are also other new low-latency interconnects, such as InfiniBand, which might strongly reduce the need for SMP machines. (ii) Synchronised disk systems may hugely improve disk-memory bandwidth, although this may be very expensive.
The general plan is that people working on the "strands" listed above will be asked to deliver reports to Andreas. These don't have to be in a standard style, as they are material for Andreas to use as he sees fit in assembling the report. Andreas will then write a preliminary report and circulate it. We will then have a workshop at which people will present their experiences, debate new tasks needed, and produce a plan for the remainder of the AVO programme. We will then finish the report in time for November, with any subsequent report to follow next year.
Timeline
- March 6th 2003 : This plan
- April 2003 : Andreas posts skeleton of report to Twiki
- May 1st 2003 : Strand Reports to Andreas
- June 1st 2003 : Preliminary "Archive Centre Hardware Report"
- Jul-Aug 2003 : Workshop, location TBD
- Sept 2003 : Revised Plan
- Oct 1st 2003 : Draft Report
- Nov 1st 2003 : Final "Archive Centre Hardware Report"
- May 1st 2004 : Stage II deliverable TBD