It has been a while since I have committed one of these management reports to the blog, but this one seems to gel well and wrap around our recent articles, not to mention providing a linkfest opportunity, so it seems worth it this time around. Enjoy it if you can.
WP2 Cloud Services Pilot
The focus for this activity has been on looking at options for a collaborative workspace in which researchers from within and outside the University of Hertfordshire can share data. Several threads have been pursued:
iFolder and SparkleShare (Dropbox alternatives). This work complements that done on ownCloud by the Orbital project. iFolder originated from Novell and runs on SUSE Linux. We looked at it because its provenance and requirements seem to match our local infrastructure. However, its feature set is inferior to more modern alternatives and it looks like it may be moribund. SparkleShare is open source software which puts a team sharing layer on top of the Git version control system. It is a good candidate for a cloud hosting service and looks promising, but it would require considerable technical investment to operationalise at UH. Further investigation is required.
DataStage. DataStage offers WebDAV and a web browser GUI, independent user allocation (albeit via the command line), and a bridge to data repositories. We have conducted tests running on desktop machines and within our virtual server environment. The release candidates we have tested are not yet stable enough to support an operational service. Development of the v1.0 release of DataStage, which has been talked about for several months, seems to have stalled.
SharePoint has returned to our thinking, since it is widely available as a cloud service and offers WebDAV, a web browser GUI, and version control. When combined with Active Directory services, it seems to offer a cloud service to complement our existing networked service in a hybrid service model. Further investigation is required.
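For readers unfamiliar with what WebDAV support means in practice, here is a minimal sketch of how a client might list a shared folder over WebDAV using only the Python standard library. It is an illustration rather than anything we have deployed: the folder URL and the sample server reply are hypothetical, authentication is omitted, and real DataStage or SharePoint servers will return more properties than shown.

```python
import urllib.request
import xml.etree.ElementTree as ET

DAV = "{DAV:}"  # XML namespace used by WebDAV elements

# Request body asking for two standard WebDAV properties.
PROPFIND_BODY = b"""<?xml version="1.0" encoding="utf-8"?>
<D:propfind xmlns:D="DAV:">
  <D:prop><D:displayname/><D:getlastmodified/></D:prop>
</D:propfind>"""


def build_listing_request(folder_url):
    """Build (but do not send) a PROPFIND request for a folder and its children."""
    return urllib.request.Request(
        folder_url,
        data=PROPFIND_BODY,
        headers={"Depth": "1", "Content-Type": "application/xml"},
        method="PROPFIND",
    )


def list_hrefs(multistatus_xml):
    """Extract the resource paths from a WebDAV 207 Multi-Status reply."""
    root = ET.fromstring(multistatus_xml)
    return [resp.findtext(DAV + "href") for resp in root.findall(DAV + "response")]


# Hypothetical reply, trimmed to the parts this sketch reads.
SAMPLE_REPLY = b"""<D:multistatus xmlns:D="DAV:">
  <D:response><D:href>/shared/project/</D:href></D:response>
  <D:response><D:href>/shared/project/survey.csv</D:href></D:response>
</D:multistatus>"""

# Hypothetical folder URL; urllib.request.urlopen(request) would send it.
request = build_listing_request("https://files.example.org/shared/project/")
paths = list_hrefs(SAMPLE_REPLY)
```

Because the protocol is plain HTTP with a few extra methods, the same client code should in principle work against any of the WebDAV-capable candidates above, which is part of their appeal.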
The utility of the main offerings in the cloud files market is being assessed. This has been less a technical appraisal and more a review of costs (over and above the initial free offer), terms and conditions, and options for keeping data within specified jurisdictions. Further investigation continues.
RDTK is attempting to increase the usage of centrally managed network storage at UH. We continue to regularly encounter researchers who don’t know how to use their storage allocation, or that they may request shared storage at a research group or project level (R: drive). We are producing new advice about, and encouraging the use of, the secure access client for our virtual private network, which is not well understood but gives much more effective access to the networked file store than the usually advertised method. We intend to offer a ‘request R: drive’ feature on the RDM web pages and to facilitate the adoption of this facility, which, again, is not known to many research staff.
A lengthy technical blog Files in the cloud ( http://bit.ly/R583If ) and a presentation A view over cloud storage (http://bit.ly/SB2cK8 ) bring the issues encountered in this work together. This presentation gave vent to the widespread interest amongst many projects at the recent JISCMRD Workshop in the issues around the use of ‘dropbox like’ cloud services. I represented these voices in a session with the JANET brokerage, underlining the importance of nationally negotiated deals with Amazon, Microsoft, Google and particularly Dropbox, for cloud storage, in addition to the infrastructure agreements already in place. A reflection on the JISCMRD Programme Progress Workshop (http://bit.ly/SdTqCz) refers to this encounter.
WP3 Document Management Pilot
A part-time member of staff has been recruited to scan legacy documents into an electronic Trial Master File (eTMF) for work carried out by the Centre for Lifespan and Chronic Illness Research (CLiCIR). The work is progressing extremely well under the direction of the Principal Investigator. A journal of activity, issues and time and motion is being kept. After 8 x 0.5FTE weeks the first phase of scanning, covering the trial documentation, is all but complete. Only the original anonymous patient surveys remain. At this point there are ethical issues to consider, along with a debate about whether this remaining material is useful, publishable data.
This is proving to be a valuable collaboration between the researcher, the University Records Manager, and the EDRMS system consultant. The draft Research Project File Plan has been updated in the light of practical experience and the work has attracted two further potential ‘clients’ from within the School of Life and Medical Sciences.
WP6 Review data protection, IPR and licensing issues
CT and SJ have begun reviewing literature on licensing data, including that from the DCC.
WP8 Data repository
An instance of DSpace 1.8 has been installed on a desktop machine with the aim of testing data deposit via the SWORD v2 protocol. Data deposit has been achieved using several mechanisms, including Linux shell commands, a simple shareware application, and the deposit component of DataStage. This last route became possible after we generated interest at a programme workshop, whereupon a collaboration with other projects helped modify our instances of DSpace and DataStage so as to allow the SWORD protocol to work.
SWORD deposit has also been demonstrated into the development system for the University Research Archive (UHRA – this is the old one). This is not working with DataStage as yet, as it needs further modification. Further progress awaits the rollout of the new system, which in turn has been delayed due to its dependencies on our newly re-engineered Atira Pure Research Information System (UH access only).
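To give a flavour of what a SWORD deposit involves, the sketch below builds the kind of HTTP request a SWORD v2 client sends when depositing a zip file into a repository collection. It is a minimal illustration and not the commands or tools we actually used: the collection URL, filename and payload are hypothetical, authentication is left out, and the use of a hex MD5 checksum follows common client practice rather than anything confirmed for our DSpace instance.

```python
import hashlib
import urllib.request

# Packaging identifier from the SWORD v2 profile for a plain zip of files.
SIMPLE_ZIP = "http://purl.org/net/sword/package/SimpleZip"


def build_sword_deposit(collection_url, filename, payload, in_progress=False):
    """Build (but do not send) a SWORD v2 binary deposit request."""
    headers = {
        "Content-Type": "application/zip",
        "Content-Disposition": "attachment; filename=" + filename,
        # Checksum so the server can verify the upload (hex form is an assumption).
        "Content-MD5": hashlib.md5(payload).hexdigest(),
        "Packaging": SIMPLE_ZIP,
        # "true" leaves the item open for further additions before it is archived.
        "In-Progress": "true" if in_progress else "false",
    }
    return urllib.request.Request(
        collection_url, data=payload, headers=headers, method="POST"
    )


# Hypothetical collection URL on a local test repository.
deposit = build_sword_deposit(
    "http://localhost:8080/swordv2/collection/123456789/2",
    "dataset.zip",
    b"placeholder bytes standing in for a real zip file",
)
# urllib.request.urlopen(deposit) would send it; credentials would be needed first.
```

The point to take away is that a deposit is an ordinary HTTP POST with a handful of extra headers, which is why it can be driven equally well from shell commands, a desktop application, or another system.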
A presentation DataStage to DSpace, progress on a workflow for data deposit ( http://bit.ly/Xvmidr ) refers to this work.
Inter alia, there was considerable interest generated by a blog and submission to JISCMail with regard to organising a practical workshop on how to acquire Digital Object Identifiers for datasets. Since that time Data.Bris have minted their first DOI using the BL/Datacite API and service, and the Bodleian Library are expected to do the same shortly. The current position is that a workshop might be arranged as part of the JISC-sponsored BL/Datacite series. Simon Hodson is facilitating. The article DOIs for Datasets (http://bit.ly/QonFoN ) produced the largest spike in traffic seen by this blog so far (175 page views, 55 bit.ly clicks).
Expressions of interest to publish datasets are beginning to be cultivated, including, for example, datasets of oral histories and historical socio-economic data.
WP9 Research Data Toolkit
Content for the Research Data Toolkit is progressing well, with all RDM team members contributing and refining the product. We have decided to restrict the ‘toolkit’ brand to use within the project and adopt a more generic RDM brand for the published material and activity, so as to create a foothold for a sustainable service. ToolKit resources will appear at herts.ac.uk/rdm which may re-direct to herts.ac.uk/research-data-management or herts.ac.uk/research/research-data-management. We are still in negotiation with the University web team over a content delivery platform.
The content is being developed in a platform agnostic way as a set of self-contained pages of advice, which could be delivered under different overarching models, and/or re-purposed for print. There is still some debate as to whether to arrange the pages in groups by activity or by research lifecycle stage. The draft table of contents and sample content are under wraps for now.
WP11 Programme Engagement
RDM team members have made 5 presentations and participated in 14 days of programme events including:
- DCC London Data Management Roadshow, London
- BL/Datacite: Working with DataCite: a technical introduction, London
- BL/Datacite: Describe, disseminate, discover: metadata for effective data citation, London
- JISC Managing Research Data Programme / DCC Institutional Engagements Workshop, Nottingham (3 presentations)
- RDM Training Strand Launch Meeting, London (1 presentation)
- JISC Managing Research Data Evidence Gathering Workshop, Bristol (1 presentation)
The project blog (including 4 new articles) has received over 800 visits and nearly 1500 page views in the previous quarter.
Other Activity: recruitment
The project has been recruiting for 3 x RDM Champions to work in the University’s research institutes for six months at 0.4 FTE. We were looking for an established member of each institute with sufficient experience to quickly embed sustainable, good-practice RDM among their peers. The vehicle for this will be the objective of assisting PIs to prepare or improve a significant number of Data Management Plans within each institute. Recruitment has been only partially successful. One person started on 1 December in the Health and Human Sciences Research Institute (HHSRI); another is due, subject to final agreement, to start on 28 January in the Social Sciences, Arts and Humanities Research Institute (SSAHRI). The remaining post, in the Science and Technology Research Institute (STRI), has not been filled.
Other Activity: data encryption workshop
We have produced a guide (http://bit.ly/QHyN2y), blog (http://bit.ly/XxDoEM) and workshop (http://bit.ly/11rwLXA) with regard to encryption of sensitive data for sharing, transport, and security on removable media. Over 20 university staff participated in the first workshop, and we have a waiting list for the next date, which will be announced this month. The material was well received and the feedback, though not yet properly evaluated, looks positive. We equipped 6 researchers with large capacity data sticks secured with TrueCrypt, and will evaluate their experience in February. We are encouraging the other attendees to try out the standalone encrypted container available via the toolkit blog (http://bit.ly/TKb8hG ).