It was back to serious business in January with lots of activity in engagement, service development, and recruitment. All these strands took off at once in mid-month, leaving me struggling to keep up, but it was good to see progress across all active work packages.
Work package WP1 – Audit current RDM practice at University of Hertfordshire
Cathy Tong and I continue to work on the audit in the interim before our new project officer becomes established. A picture is emerging with regard to existing centralised facilities. This shows a gap in provision for research groups where there are non-staff members involved, but also that there is under-utilisation of existing resources because researchers do not understand how to use or access them. I have blogged on this issue and started on a FAQ to counter the knowledge gap.
Cathy established two really interesting engagements, both valuable, but in different ways. One, in the Humanities, was very much forward-looking and about possibilities, whilst the other, in Health, focused on solving an urgent Research Data Management problem in the here and now.
Our conversations with Owen Davies, Professor of Social History, revealed that he and his colleagues are holding a lot of what might be called ‘collateral data’ that has resulted from their research portfolio. Although they are taking the trouble to maintain it, they don’t have the capacity or technical know-how to get further value from it. Hopefully rdtk_herts can help with both robust maintenance and improved access to this valuable social history resource. In a moment of serendipitous synchronicity this conversation coincided with the @jiscmail thread about the History DMP project at Hull. I joined Martin Donnelly and Richard Green on Skype and agreed that University of Hertfordshire would participate in the production of a History-specific template for the new version of DMPonline. All I have to do now is persuade Owen to go through the full DCC data management planning checklist and identify the gaps and irrelevancies with respect to his discipline.
Wendy Wills, Reader in Food and Public Health, looked to rdtk_herts to answer some very particular and stringent storage needs. Wendy’s problems are a catalogue of frustration for her, but a perfect illustration of why the Research Data Toolkit is needed. Wendy is working with the Food Standards Agency, who have very strict requirements for the exchange of research data. She is required to deliver conditioned data by encrypted CD or DVD, and send the encryption key by a separate secure route. It doesn’t matter that the size of the dataset is small or that the information within has been anonymised and has a negligible confidentiality risk – the rules apply and have a knock-on effect on the conduct of the whole project. The best advice available locally was to use Truecrypt, with which portable ‘containers’, holding encrypted file systems, can be created. Unfortunately, like many still-maturing open source solutions, Truecrypt is difficult for a non-rocket scientist to use. Eventually a procedure for Wendy’s project was worked out, but the Truecrypt element was just one of a set of interventions that were required, and this case exemplifies the challenge that RDM compliance brings to many researchers.
Work packages WP2 and WP3 – Cloud storage and document management pilots
Most of the month has been spent working on the first of our pilot services in the cloud. We are working in partnership with Herts Regional Consortium (HRC3) to develop these services, but I have spent a lot of time working directly with their datacentre in Iceland as we struggled to understand each other’s business position. HRC3 uses the Thor datacentre, which was recently acquired by the large Nordic IT services company Advania, and the relationship between the three parties is taking some time to settle.
I made the mistake of assuming the HRC3/Thor offer was much more mature than it turned out to be, and I think they assumed I had a substantial queue of research groups elbowing to be first at the feast.
The feedback I was getting from Work Package 1 suggested our researchers are open to the idea of the cloud, but find it difficult to say ‘we could use that’ without something tangible to work with. So I decided to create three simple services based on their informally expressed requirements. The services were: 1TB storage and backup; 1GB document management; and a 1GB MySQL database. I pushed HRC3/Thor hard to come up with costs, regulatory provenance, and the other instruments of a mature offer, and got carried away with this, neglecting the warning signs of continued requests for specifications. I had also been trying out storage and software as a service (SaaS) from the likes of Dropbox, Livedrive, Rackspace, etc., and these kinds of services come with a well-developed, non-negotiable specification. So I was seeing the world through this lens and it took a while for it to sink in that HRC3/Thor were not ready to compete in this market. As I understand it, Advania bought Thor so they have a vehicle to deliver their existing ‘Big IT’ services but are also using it to expand into the SaaS and IaaS market. rdtk_herts can benefit from this by being part of the journey.
Some people may question why we are taking this tack with the Janet Cloud brokerage just around the corner. I would argue that there is value in pursuing an alternative, given that the Brokerage’s offer is unclear, and the details which have been resolved – in the form of Eduserv’s offer – seem to be focused on Infrastructure as a Service (IaaS) and not, for example, storage. I know that I have colleagues here at University of Hertfordshire who are currently working up a plan with one of the brokerage’s approved suppliers, but such an agreement will be focused on looking after the mainstream requirements of our business. The idea of exploring other services to satisfy the edge cases found in research is still valid. There are other factors worth investing some effort in too, such as Thor’s Power Usage Effectiveness (PUE) ratio of 1.07.
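For anyone unfamiliar with PUE, it is the ratio of the total energy a datacentre consumes to the energy consumed by its IT equipment alone, so a value close to 1.0 means very little is spent on cooling and other overheads. The sketch below is only an illustration of that arithmetic: the function names and the kWh figures are hypothetical, chosen to reproduce the quoted ratio, and are not Thor’s actual measurements.

```python
def pue(total_facility_kwh: float, it_equipment_kwh: float) -> float:
    """Power Usage Effectiveness: total facility energy / IT equipment energy."""
    if it_equipment_kwh <= 0:
        raise ValueError("IT equipment energy must be positive")
    return total_facility_kwh / it_equipment_kwh


def overhead_fraction(pue_value: float) -> float:
    """Extra energy (cooling, power distribution, etc.) per unit of IT energy."""
    return pue_value - 1.0


# Hypothetical figures, picked only so that the ratio comes out at 1.07.
example_pue = pue(total_facility_kwh=107.0, it_equipment_kwh=100.0)
print(f"PUE = {example_pue:.2f}, overhead = {overhead_fraction(example_pue):.0%}")
```

Read this way, a PUE of 1.07 implies roughly 7 units of overhead energy for every 100 units delivered to the servers and storage.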
‘Get on with it’, I hear you cry.
I am pleased to report that, perhaps a month later than planned, we have begun testing our pilot services.
These consist of:
- desktop storage: file system volumes mounted on Windows 7 and OS X
- managed backup: not yet taken out of the box
- Microsoft SharePoint: ‘Foundation’ version for document and working data management
- MySQL database: as a standalone remote service and also via a Virtual Private Server (VPS) with web application facilities
Since our ‘developer’ post is still vacant, my Geek mode, never far away, has kicked in. I have been down among the ports and protocols, testing performance and comparing features with our existing infrastructure. More on this next month.
Work package WP6 – Review data protection and IPR issues
HRC3 provided some detailed information about service integrity and compliance with European tendering processes. This was beyond me so I have filed it in the drawer marked work package 6, to be revisited in the summer.
Work package WP8 – Review data protection and IPR issues
I have also started a conversation about data publication using the public portal component of our CRIS (Current Research Information System), Atira Pure.
WP11 – JISC Managing Research Data (MRD) programme activity
January was the first month in which we have not travelled to a programme-related event. Prior to getting involved in the discipline-specific work in History mentioned above, I explored and blogged about the DCC DMPonline tool.
WP12 – Project Management
After a lengthy recruitment process we interviewed a pool of candidates for two posts, hoping to be flexible with the skills presented to us. We advertised for a Project Officer (technical development) and a Project Officer (analyst and training). Unfortunately, we were only able to fill the latter position, though of the two it is the more important at this stage of the project. By the next time I report I hope to be able to introduce a new name to the team.
With assistance from our Research Office we are nearing the point of gathering enough researchers to form a Stakeholder Forum. This group will advise the Project Team, help identify further engagement opportunities, identify Research Data Management (RDM) requirements, and disseminate the project’s outputs. It will receive and evaluate research community feedback. It is expected that its members will also be among the first users to benefit from new RDM facilities. More details can be found in the Stakeholder Forum Terms of Reference (PDF, 200 KB).