An audit of research data holdings within the University of Hertfordshire was conducted between May and July 2012.
The online survey (described in more detail here) was circulated to around 600 research staff, first via their regular monthly newsletter, then through follow-up reminders sent by our information managers to schools and research centres, and through our continuing programme of RDTK awareness meetings and interviews. There were 67 responses, which represents about 12% of those invited to take part. Most research-active disciplines were represented among the respondents, albeit with a strong showing from the STEM subjects.
The survey has given us insight into the extent of our research data. It allows us to estimate that we hold approximately 2PB across the whole research landscape, a factor of ten larger than our current central provision. However, around 80-90% of this belongs to a very few research groups, who are relatively well organised and funded for RDM. It also tends to be working data for those that crunch numbers, so it may not necessarily be data that requires retention. The remaining 10-20% of research data, which belongs to the other 80% of researchers, looks like a manageable quantity. This suggests that cultural change, rather than capacity, may be the predominant issue when it comes to migrating the majority of researchers to a more robust infrastructure for working data. Likewise, we should expect to be able to manage the data that could be preserved, if we can build the culture and processes to make that possible.
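The arithmetic behind this estimate can be sketched as follows. This is only an illustration built from the figures quoted above (2PB in total, a tenfold gap to central provision, a 10-20% share for the majority of researchers); the function names and the 1PB = 1000TB convention are assumptions made for the example, not part of the survey analysis.

```python
# Figures quoted in the text. Names and the PB -> TB convention are
# illustrative assumptions, not part of the original survey analysis.
TOTAL_TB = 2000            # approx. 2PB, taking 1PB = 1000TB
CENTRAL_FACTOR = 10        # total is about ten times central provision
MAJORITY_SHARE_PCT = (10, 20)  # share held by the majority of researchers


def majority_volume_tb(total_tb=TOTAL_TB, share_pct=MAJORITY_SHARE_PCT):
    """Return the (low, high) volume in TB held by the majority of researchers."""
    low_pct, high_pct = share_pct
    return (total_tb * low_pct / 100, total_tb * high_pct / 100)


def central_provision_tb(total_tb=TOTAL_TB, factor=CENTRAL_FACTOR):
    """Return the implied current central provision in TB."""
    return total_tb / factor


if __name__ == "__main__":
    low, high = majority_volume_tb()
    print(f"Majority of researchers: {low:.0f}-{high:.0f} TB")
    print(f"Implied central provision: {central_provision_tb():.0f} TB")
```

Under these assumptions the majority of researchers would hold roughly 200-400TB against an implied central provision of about 200TB. That is the same order of magnitude, which is consistent with the view that capacity is not the main obstacle.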
In addition to requirements that have already been resolved (such as easy-to-use encryption and more flexible provision of storage for mixed staff/student/external research groups), the survey revealed some previously unvoiced requirements, such as centralised version control for source code, CAD and design files.
Perhaps influenced by the STEM respondents, the survey also showed that venerable FTP is alive and still working well among the new (and rebranded) offerings of the cloud. This indicates that there continues to be profit in exploring an FTP-based cloud storage pilot.
The key messages from the survey support the anecdotal evidence acquired to date; in the main, there was no big news. However, the subtext obtained is valuable, and it underlines that considerable help and resources are needed over the whole project lifecycle, from planning to preservation, if we are to satisfy the demands of a rapidly developing (some would say hardening) funders' policy regime.