Jul 09 2013

At the 2013 National Astronomy Meeting in St Andrews, I presented a review of the University of Hertfordshire Research Archive (UHRA), including the plans to preserve data in very long term cloud storage managed by Arkivum.  I also described the recent discovery of photographs depicting the development of the Bayfordbury observatory, and how these and the observational data will be released for reuse via the UHRA.  The audience was a mere 10 people: half were interested in the history of astronomy, and the other half were invited speakers from museums (the Science Museum in London and National Museums Scotland) and from Jodrell Bank, where they are petitioning for Heritage Site status.

The aim of the session was to discuss the strategy that the Royal Astronomical Society (RAS) should take to preserve astronomy heritage for the future.  It focused on historical artefacts from the 1900s: what should be kept, and how, for the next 100 years.  It was enlightening that the issues surrounding objects such as telescopes, computers, instruments, and software are similar to those we are used to with digital data.

Storage for the vast number of objects is an issue, with warehouses filling up and only 7-10% of objects being exhibited.  Storage is expensive, objects have to be catalogued and looked after, and access to them is difficult because the storage spaces are overcrammed.  This could describe both digital data and historical objects; deciding what should be saved, what can be saved, and where it should be kept is challenging.

The museums operate reactively, saving what they can when items are donated from private collections, families, and universities, or can be purchased at auction.  However, without documentation, an item may not be identifiable, repairable, or recognised as historically important.  A grey box could be anything or nothing without evidence: who owned it, what it was used for, or, at the most basic, what it is and when it was made.  This is the metadata that puts the item in context, and without it the object could be destroyed; the same is true of digital data.

While these issues of storage and metadata are important in astronomy heritage, retention is the major concern.  The audience recoiled at the prospect of retaining data for only 10 years, as this is nothing compared to the 100-year timescale that they consider.  A hundred years.  The idea that digital data would still be useful in 100 years is incredible.  It is understandable that photos, interviews, and videos of people, places, and events would have value to future historians, but would astronomical observations also be useful?

We know that stars are born and die, that objects move, and this temporal information is crucial to science, but how could these data be preserved in a useful format?

Just 50 years ago, images from optical telescopes were recorded on glass plates.  These images show fine details of nebulae and galaxies, but cannot be used for modern scientific work as they contain no data on intensities, wavelengths, or spectra.  Newer electronic data, from 30 years ago when the Very Large Array (VLA) in New Mexico was commissioned, are still accessible and can be processed using the radio image processing software AIPS.  Recently, the array has been upgraded and the software is now being maintained by users, so in another 10 years these data may not be processable.  This raises the question: if there are new images, should the old data be kept?  Also, should we continue to keep only the unprocessed ‘raw’ data?

To keep these data useful, we also need to keep the software and the processing instructions, and ensure that current operating systems can run the software – is this reasonable and cost effective for 100 years?  Perhaps it would be worthwhile if there were many data sets that needed the same software, but it may be more beneficial to keep the data in a processed form.  The question then is who processes them, and how should the processing be done?  Some calibration methods are quite subjective.  With historical objects, deciding how much information should be kept is equally difficult; is it sufficient to keep the documentation and photos and discard the item itself?

The group described some objects as ‘Rich’, where there is obvious importance behind an object – this may be something that is worth considering.  The racket with which Andy Murray won Wimbledon this weekend would be a good example of a Rich object, but with digital data, the PI who requested the observation will think their data are far more important than everyone else’s.  In this respect we have a far more complicated choice to make for the future of digital data in Astronomy.

It was also interesting that many museums and universities with lots of historical instruments keep catalogues of these objects.  These catalogues are not currently open access, and while you can ask if something is in the catalogue, you can’t search for an object or compare it with those at other sites.  It is a comfort that institutional repositories are making their data catalogues open access, as comparisons are a vital part of research.  There was discussion about making a national list that would pull together the catalogues; the fact that so many museums and institutes keep catalogues shows there’s a demand for one.  The same is likely to happen with data repositories, once subject communities show that there is a demand for national data catalogues.

In conclusion, the strategic plan produced by the RAS should provide some guidance on preservation selection criteria and retention periods.  I for one have learnt that those brassy, old-looking bits of equipment in our lab are worth keeping, and I’m going to get a sticker put on them saying to contact the Science Museum if they’re at risk of being discarded.