Jun 05 2014
 

The Service Oriented Toolkit for Research Data Management project was co-funded by the JISC Managing Research Data Programme 2011-2013 and The University of Hertfordshire. The project focused on the realisation of practical benefits for operationalising an institutional approach to good practice in RDM. The objectives of the project were to audit current best practice, develop technology demonstrators with the assistance of leading UH research groups, and then reflect these developments back into the wider internal and external research community via a toolkit of services and guidance. The overall aim was to contribute to the efficacy and quality of research data plans, and establish and cement good data management practice in line with local and national policy.

The final report is available via http://hdl.handle.net/2299/13636

Blog Survey based on Digital Asset Framework http://bit.ly/18QUZR9
Survey Survey results http://bit.ly/1ao74vy
Report Survey analysis http://bit.ly/128uGMK
Blog UH Research Data Policy in a nutshell http://bit.ly/14cXC9w
Artefact Interview protocol, used by project analyst and RDM champions http://bit.ly/12Jr9KZ
Case studies 12 Case Studies http://bit.ly/19MjnD3
Review Review of cloud storage services: features, costs, issues for HE http://bit.ly/12Jn2yz
Blog Files in the cloud http://bit.ly/R583If
Test data Files transfer rate tests http://bit.ly/1266WsJ
Blog Analysis of barriers to use of local networked storage http://bit.ly/12Gleqg
Blog Hybrid-Cloud model: when the cloud works and the attraction of Dropbox et al. http://bit.ly/Xvmidr
Blog Hybrid-Cloud example: Zendto on Rackspace, integrated with local systems http://bit.ly/11In83q
Service UH file exchange https://www.exchangefile.herts.ac.uk/
Blog Cost of ad-hoc storage http://bit.ly/19ilycQ
Blog Cost of data loss event http://bit.ly/13RSckb
Blog Reflection on use of Rackspace CloudFiles
Blog Data Encryption http://bit.ly/XxDoEM
Training Data Encryption workshop http://bit.ly/11rwLXA
Training Data Encryption guide http://bit.ly/QHyN2y
Blog Document Management for Clinical Trials http://bit.ly/15cfT5K
Artefact eTMF – electronic Trial Master File, 1954 legacy documents scanned no public access
Artefact Research Project File Plan http://bit.ly/11InVkW
Workflow Post award storage allocation
Workflow Request ‘Research Storage’ Form http://bit.ly/17V7J8t
Workflow Research Grant and Storage Process http://bit.ly/14kvCB0
Workflow Request ‘Research Storage’ Workflow http://bit.ly/12d2aJP
Service R: (R drive), workgroup space with external access access by workgroups
Service DMS, workgroup space with external access access by workgroups
Dataset 4 Oral history datasets, ~300 interviews, 125GB http://bit.ly/uh-hhub
Dataset 1 Leisure studies dataset, SPSS survey, interviews, transcripts, 8GB in preparation
Blog Comparison of data licenses http://bit.ly/12DmXfR
Report Comparison of data licenses http://bit.ly/13NC7gA
Service UHRA repository improvements phase 1 http://uhra.herts.ac.uk/
Blog DOIs for datasets, includes mind map http://bit.ly/QonFoN
Workflow Deposit/access criteria for data with levels of openness http://bit.ly/12cUqrq
Service RDM micro site (aka Research Data Toolkit), 100+ pages and pdfs of RDM guidance http://bit.ly/uh-rdm
Report Register of Programme engagement at external events, estimated audience 480, ~300 individuals Appendix A
Blog Programme engagement: 38 Blog posts http://research-data-toolkit.herts.ac.uk/
Presentation Association of Research Managers and Administrators Conference 2013 http://bit.ly/ZXv8RK
Presentation UH RDM Stakeholder briefing June 2012 http://bit.ly/11KkJGo
Presentation UH Health and Human Sciences research forum July 2012 http://bit.ly/15cDUKb
Presentation JISCMRD progress workshop Nottingham 2012: storage http://bit.ly/10qpry3
Presentation JISCMRD progress workshop Nottingham 2012: repository http://bit.ly/126zjab
Presentation JISCMRD progress workshop Nottingham 2012: training http://bit.ly/15cH1lj
Presentation JANET/JISCMRD Storage Requirements workshop Paddington 2013 http://bit.ly/12QFu9S
Presentation JISCMRD benefits evidence workshop Bristol 2013 http://bit.ly/ZXE09Y
Presentation JISCMRD progress workshop Aston 2013: training http://bit.ly/11t3Lg0
Presentation JISCMRD progress workshop Aston 2013: agent of change http://bit.ly/13NVIgH
Presentation JISCMRD progress workshop Aston 2013: storage http://bit.ly/19Juixf
Report Register of programme engagement at UH events: interviews (~60), meetings, seminars, workshops. Total attendance 400, est. 200-300 individuals Appendix B
DMP 10 data management plans, facilitated by RDM champions and Research Grants Advisor limited public access
Report 6 project manager’s reports to Steering Group no public access
Report Benefits report http://bit.ly/19V1rWS
Report Final Report http://hdl.handle.net/2299/13636

Conclusions

There are many conclusions that could be drawn from the project. These are the headlines:

  • JISCMRD has been a success at UH.
  • The RDTK project has made an impact in awareness raising and service development, and made good inroads into professional development and training. There are good materials, a legacy of knowledge and a retained group of people to sustain and develop the learning.
  • We believe the service orientated approach shows that better technology can facilitate better RDM and the project has been an effective Agent for Change.
  • We also understand that advocacy and training are as important as technology to bring about cultural change.
  • Funding body policy and the implications of the ever increasing volume of data are understood. The business case is clear: the University cannot afford not to invest in RDM.
  • JISCMRD phase2 has been an effective vehicle for knowledge transfer and collaboration. It provided an environment in which a new and complex discipline, and the many, interacting, conflicting, seemingly endless issues therein, could be explored with common cause and mutual support.

Recommendations

JISCMRD activity should continue, and try to reach the part of the research community that is least able to adopt RDM best practice without assistance, and won’t do so as a matter of course. A profitable strand for JISCMRD3 would be Collaborative Services. Appropriate services would include joint RDM support services, or shared specific services, such as regional repositories (including DOI provision) or shared workgroup storage facilities. Institutions with advanced RDM capability could play a mentoring role. Another key strand would be Benefit of Data Re-use; to gather examples of innovative data use and academic merit and reward for individual data publishers.

The DCC should continue in its institutional support role. It should consolidate its DMPonline tool toward a cloud service, with features to allow organisational branding, and template merging. It should place new emphasis on the selection and publishing of data, with a signposting tool for Tier 1 and Tier 2 repositories for subject specific data, including selection criteria, metadata requirements, and citation rates.

Opportunities for organisations to learn from each other and establish collaborations, which have been effective at JISCMRD2 workshops, should continue to be facilitated in some way. In addition, more attempts should be made to reach researchers directly in order to demonstrate the potential personal benefit of good RDM.

The JISC should continue to pursue national agreements via the JANET brokerage. These negotiations should be widened beyond Infrastructure as a Service to include RDM Applications as a Service (RAaaS), for example, Backup as a Service, Workgroup Storage, and Repository as a Service. The goal should be to achieve terms of use which satisfy institutional purchasing, IP and governance requirements, whilst allowing for acquisition by smaller intra-institutional units, from faculty down to workgroup level. JISC GRAIL (Generic RDM Applications Independently Licenced) might be a suitable brand for this activity. In addition, JANET should press cloud vendors for an alternative to ‘pay-by-access’ charging for data, which is a barrier to uptake in fixed cost project work.

May 20 2014
 

Research Data Management Training for the whole project lifecycle in Physics & Astronomy was co-funded by the JISC Managing Research Data Programme 2011-2013 and the University of Hertfordshire. The project was carried out in parallel with other JISCMRD work at University of Hertfordshire and collaborated with researchers in Centre for Astrophysics Research (CAR) and the Centre for Atmospheric & Instrumentation Research (CAIR) to develop a short course in RDM for Post-Graduate and early career researchers in the physical sciences. It adopted a whole project lifecycle approach, covering issues from data management planning, through good data safekeeping, to curation options and arrangements for data reuse. The resultant short course is available via 4 modules at www.jorum.ac.uk.

The final report is available via http://hdl.handle.net/2299/13638

Output / Outcome Type Brief Description and URLs (where applicable)
UH DMPonline Template Progression from a RDM checklist within our UH Data Policy, to a DMPonline template that fulfils the UH data policy and stands alone as a record of the treatment and location of data.
Project Website Including guidance on best-practice RDM for topics related to the lifecycle of research projects and the following training materials http://bit.ly/uh-rdm
Training Slides Presentation slides covering 18 topics within four RDM sessions, available via JORUM

1 – Planning a project http://find.jorum.ac.uk/resources/18502

2 – Getting started http://find.jorum.ac.uk/resources/18503

3 – Safeguarding your data http://find.jorum.ac.uk/resources/18504

4 – Finishing touches http://find.jorum.ac.uk/resources/18505

Trainer Notes Aims and key points for each slide of the training.
Discipline Packages Examples to make the generic advice relevant in physical sciences; Physics and Astronomy. Also in Health sciences, History, and Business.   (Additional packages to follow in the coming months.)
How to choose training Advice on which training is suitable and how these materials can be used in training sessions for researchers, research students, support and technical staff within and without UH.
Case Studies Descriptions of 12 projects, highlighting RDM practices, and key issues and solutions that have affected researchers throughout the university, posted on our RDM website for the benefit of other researchers in the university. http://bit.ly/uh-rdmcs
Current and best-practice assessment Formal and informal interviews with researchers in Astronomy, Physics, Maths, Robotics, and Atmospherics to discuss the bespoke solutions they have adopted and the applicability of our RDM tools to the physical sciences.
Development Blogs Blog summaries on

  • the progression from RDM training sessions for astronomers to generic training sessions for researchers in all disciplines,
  • the development of the website
  • the development of the UH DMP Template
  • http://research-data-toolkit.herts.ac.uk/
Evaluation of Training Feedback evaluated after each training session used to improve training sessions in particular the content and duration.
Improved data management in astronomy research students Follow up interviews with research students demonstrated improved awareness of data management, preservation requirements and security of data.
Workshop presentations This work has been presented at JISC workshops and RDM training related meetings:

  • 24/10/12 – JISC Building Institutional RDM Meeting in Nottingham, “RDM Training for Physics and Astronomy”
  • 26/10/12 – RDM Training Strand workshop, “RDMTPA at UH”
  • 25/03/13 – JISC RDM Meeting in Birmingham, “RDM Training at UH”
Presentations for researchers “Introduction to RDM” presented to researchers, staff and students.

  • Staff development: 16/10/12, and 30/04/13
  • GTR:   13/05/13
  • For Astronomy PGRS: 23/10/12
  • For STRI new PGRS: 01/03/13

Research Group seminars are planned for the autumn term 2013.

“Preserving Digital Data at UH” presented at the National Astronomy Meeting in St Andrews, 1-5/07/13.

Dec 06 2012
 

It has been a while since I have committed one of these management reports to the blog, but this one seems to gel well and wrap around our recent articles, not to mention providing a linkfest opportunity, so it seems worth it this time around. Enjoy it if you can.

WP2 Cloud Services Pilot

The focus for this activity has been in looking at options for a collaborative workspace, in which researchers from within and without University of Hertfordshire can share data. Several threads have been pursued:

iFolder and SparkleShare (Dropbox alternatives). This work complements that done on OwnCloud by the Orbital project. iFolder originated from Novell and runs on SUSE Linux. We looked at this because its provenance and requirements seem to match our local infrastructure; however, the feature set is inferior to more modern alternatives and it looks like it may be moribund. SparkleShare is open source software which puts a team sharing layer on top of the Git version control system. It is a good candidate for a cloud hosting service and looks promising, but it would require considerable technical investment to operationalise at UH. Further investigation is required.

DataStage. DataStage offers WebDAV and a web browser GUI, independent user allocation (albeit via the command line), and a bridge to data repositories. We have conducted tests running on desktop machines and within our virtual server environment. The release candidates we have tested are not yet stable enough to support an operational service. Development of the v1.0 release of DataStage, which has been talked about for several months, seems to have stalled.

SharePoint has returned to our thinking, since it is widely available as a cloud service and offers WebDAV, a web browser GUI, and version control. When combined with Active Directory services, it seems to offer a cloud service to complement our existing networked service in a hybrid service model. Further investigation is required.

The utility of the main offerings in the cloud files market is being assessed. This has been less a technical appraisal and more a review of the Costs (over and above the initial free offer), Terms and Conditions and options for keeping data within specified jurisdictions. Further investigation continues.

RDTK is attempting to increase the usage of centrally managed network storage at UH. We continue to regularly encounter researchers who don’t know how to use their storage allocation, or that they may request shared storage at a research group or project level (R: drive). We are producing new advice about, and encouraging the use of, the secure access client for our virtual private network, which is not well understood, but gives much more effective access to the networked file store than the usually advertised method. We intend to offer a ‘request R: drive’ feature on the RDM web pages, and facilitate the adoption of this facility, which again, is not known to many research staff.

A lengthy technical blog Files in the cloud ( http://bit.ly/R583If ) and a presentation A view over cloud storage (http://bit.ly/SB2cK8 ) bring the issues encountered in this work together. This presentation gave vent to the widespread interest amongst many projects at the recent JISCMRD Workshop in the issues around the use of ‘dropbox like’ cloud services. I represented these voices in a session with the JANET brokerage, underlining the importance of nationally negotiated deals with Amazon, Microsoft, Google and particularly Dropbox, for cloud storage, in addition to the infrastructure agreements already in place. A reflection on the JISCMRD Programme Progress Workshop (http://bit.ly/SdTqCz) refers to this encounter.

WP3 Document Management Pilot

A part time member of staff has been recruited to scan legacy documents into an electronic Trial Master File (eTMF) for work carried out by the Centre for Lifespan and Chronic Illness Research (CLiCIR). The work is progressing extremely well under the direction of the Principal Investigator. A journal of activity, issues and time and motion is being kept. After 8 x 0.5FTE weeks the first phase of scanning, covering the trial documentation, is all but complete. Only the original anonymous patient surveys remain. There are ethical issues and a debate about whether this remaining material is useful, publishable data to consider at this point.

This is proving to be valuable collaboration between the researcher, the University Records Manager, and the EDRMS system consultant. The draft Research Project File Plan has been updated in the light of practical experience and the work has attracted two further potential ‘clients’ from within the School of Life and Medical Sciences.

WP6 Review data protection, IPR and licensing issues

CT and SJ have begun reviewing literature on licensing data, including that from the DCC.

WP8 Data repository

An instance of DSpace 1.8 has been installed on a desktop machine with the aim of testing data deposit via SWORDII protocols. Data deposit has been achieved using several mechanisms, including Linux shell commands, a simple shareware application, and the deposit component of DataStage. This latter piece was facilitated after generating interest at a programme workshop, whereupon a collaboration with other projects helped modify our instances of DSpace and DataStage, so as to allow the SWORD protocol to work.
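For readers unfamiliar with SWORD, the sketch below shows roughly what a SWORD v2 binary deposit request looks like when assembled in Python. It is an illustration only, not the shell commands or tools we actually used: the collection URL, file name and function names are hypothetical, SimpleZip is just one packaging format a repository might accept, and authentication is left out entirely.

```python
import hashlib
import urllib.request

# Packaging identifier defined by the SWORD v2 profile for a plain zip file.
SIMPLE_ZIP = "http://purl.org/net/sword/package/SimpleZip"


def sword_deposit_headers(filename, payload, in_progress=False):
    """Return the HTTP headers for a SWORD v2 binary deposit of `payload`."""
    return {
        "Content-Type": "application/zip",
        "Content-Disposition": f"attachment; filename={filename}",
        # The checksum lets the server verify the package arrived intact.
        "Content-MD5": hashlib.md5(payload).hexdigest(),
        "Packaging": SIMPLE_ZIP,
        # "true" would tell the repository that more content will follow.
        "In-Progress": "true" if in_progress else "false",
    }


def build_deposit_request(collection_url, filename, payload):
    """Build (but do not send) a POST request to a SWORD collection URL."""
    return urllib.request.Request(
        collection_url,
        data=payload,
        headers=sword_deposit_headers(filename, payload),
        method="POST",
    )


# Hypothetical collection URL and package contents, for illustration only.
request = build_deposit_request(
    "http://example.org/swordv2/collection/123", "dataset.zip", b"abc"
)
```

Sending the request (for example with `urllib.request.urlopen`) would additionally need credentials for the target repository, and the server's response would describe the newly created item.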

SWORD deposit has also been demonstrated into the development system for the University Research Archive (UHRA, the old one). This is not yet working with DataStage, as it needs further modification. Further progress awaits the roll out of the new system, which in turn has been delayed due to its dependencies on our newly re-engineered Atira Pure Research Information System (UH access only).

A presentation DataStage to DSpace, progress on a workflow for data deposit ( http://bit.ly/Xvmidr ) refers to this work.

Inter alia, there was considerable interest generated by a blog and submission to Jiscmail with regard to organising a practical workshop on how to acquire Digital Object Identifiers for Datasets. Since that time Data.Bris have minted their first DOI using the BL/Datacite API and service, and the Bodleian Library are expected to do the same shortly. The current position is that a workshop might be arranged as part of the JISC sponsored BL/Datacite series. Simon Hodson is facilitating. The article DOIs for Datasets (http://bit.ly/QonFoN) produced the largest spike in traffic seen by this blog so far (175 page views, 55 bit.ly clicks).

Expressions of interest to publish datasets are beginning to be cultivated, including, for example, datasets of oral histories and historical socio-economic data.

WP9 Research Data Toolkit

Content for the Research Data Toolkit is progressing well with all RDM team members contributing and refining the product. We have decided to restrict the ‘toolkit’ brand for use with the project and adopt a more generic RDM brand for the published material and activity, so as to create a foothold for a sustainable service. ToolKit resources will appear at herts.ac.uk/rdm which may re-direct to herts.ac.uk/research-data-management or herts.ac.uk/research/research-data-management. We are still in negotiation with the University web team over a content delivery platform.

The content is being developed in a platform agnostic way as a set of self-contained pages of advice, which could be delivered under different overarching models, and/or re-purposed for print. There is still some debate as to whether to arrange the pages in groups by activity or by research lifecycle stage. The draft table of contents and sample content are under wraps for now.

WP11 Programme Engagement

RDM team members have made 5 presentations and participated in 14 days of programme events including:

  • DCC London Data Management Roadshow, London
  • BL/Datacite: Working with DataCite: a technical introduction, London
  • BL/Datacite: Describe, disseminate, discover: metadata for effective data citation, London
  • JISC Managing Research Data Programme / DCC Institutional Engagements Workshop, Nottingham (3 presentations)
  • RDM Training Strand Launch Meeting, London (1 presentation)
  • JISC Managing Research Data Evidence Gathering Workshop, Bristol  (1 presentation)

The project blog (including 4 new articles) has received over 800 visits and nearly 1500 page views in the previous quarter.

Other Activity: recruitment

The project has been recruiting for 3 x RDM Champions to work in the University’s research institutes for six months at 0.4 FTE. We were looking for an established member of each institute, with sufficient experience to be able to quickly embed sustainable, good practice RDM among their peers. The vehicle for this will be the objective of assisting PIs to prepare or improve a significant number of Data Management Plans within each institute. Recruitment has been only partially successful. One person started on December 1 in the Health and Human Sciences Research Institute (HHSRI); another is due, subject to final agreement, to start on 28 January in the Social Sciences, Arts and Humanities Research Institute (SSAHRI). The remaining post, in the Science and Technology Research Institute (STRI), has not been filled.

Other Activity: data encryption workshop

We have produced a guide (http://bit.ly/QHyN2y), blog (http://bit.ly/XxDoEM) and workshop (http://bit.ly/11rwLXA) with regard to encryption of sensitive data for sharing, transport, and security on removable media. Over 20 university staff participated in the first workshop, and we have a waiting list for the next date, which will be announced this month. The material was well received and the feedback, though not yet properly evaluated, looks positive. We equipped 6 researchers with large capacity data sticks secured with TrueCrypt, and will evaluate their experience in February. We are encouraging the other attendees to try out the standalone encrypted container available via the toolkit blog (http://bit.ly/TKb8hG ).

Oct 22 2012
 

One year in! Time flies when you are having fun, or trying to pin the tail on a donkey, which at times is how it feels to be a JISCMRD project manager. This isn’t a complaint: it is a stimulating and worthwhile endeavour, and I think the programme is working well at UH. The Research Data ToolKit, even before it is properly manifest, is acting as an agent of change, and gaining momentum as the RDM team expands from 1 person, to 3, now 6, soon to be 9.

Most of JISCMRD 2011-2013 convened at NCSL in Nottingham Wed 24-Thu 25 October.  I was taken by the increased confidence and authority of my fellow travellers, compared to the prevailing feeling a year ago. In some senses, the horizon is no closer, indeed it may have receded further in the light of the knowledge we have all acquired; the difference is, perhaps, that the benefit of experience gives us conviction. The RDM problem won’t be fixed by JISCMRD, but those of us involved will be well placed to carry the effort forward beyond the life of the programme.

The progress workshop was packed with interesting sessions, touching all parts of the life cycle of research data. The only disappointment  I had was that I couldn’t divide myself in three to attend parallel sessions.

In my first presentation, A view over Cloud Storage, I sought to explore the circumstances under which cloud storage can and can’t be utilised. Part of the intent was to stimulate discussion, and in this it was successful, as I seemed to touch a nerve by naming the elephants in the room: Dropbox, Skydrive, Googledrive (D, S & G). The issues around using these applications seemed to resonate throughout both days of the workshop. Before I become identified as an advocate for Dropbox I would like, in the manner of a minister redressing a half baked policy, to ‘clarify’. It is not a specific incarnation of any of these cloud storage apps that I am advocating; it is their feature set. Unless you work with more than a few gigabytes of data, the ease of use of these public cloud services makes them irresistible to researchers. The implications of the terms and conditions of use, which fall foul of pretty much any institutional policy that you could find, have little impact: usability wins over regulation. During the workshop Marieke Guy tweeted a list of alternative applications, and we discussed some of these, but no one could wholeheartedly endorse any of the candidates for a robust, reliable service. D, S & G simply work better than our own networked storage offerings in many, many RDM scenarios. Like it or not, this is the case.

In the final workshop session, John Milner gave an account of the major cloud and data centre framework agreement already concluded and the negotiations that the JANET Brokerage is planning to undertake with Amazon, Microsoft, Google and Dropbox. An agreement with Microsoft on Office 365 has been reached, and it is hoped that favourable terms for services such as Amazon EC2 and Glacier and Microsoft Azure can be achieved in co-operation with Internet2 in the USA. Talks with Dropbox and Google have recently been initiated. John indicated that a ‘negotiation’ typically takes at least three to six months to see through. It was encouraging to hear that, despite their strong market positions, these companies are willing to discuss HE needs, and it is likely that education and research will attract favourable prices and terms and conditions of service, the latter of which (I suggest) is the higher hurdle to adoption. So perhaps JANET may yet find an answer to the search for an easy to use cloud storage application that can be brought within the constraints of our governance, use our authentication and work with our infrastructure. They are certainly working on it and keen to hear requirements from the sector!

I am seeing an app like D, S or G, sitting over hybrid storage in our own datacentres or within the European Economic Area public cloud, accessed using our own passwords, and governed by our own T and Cs. Maybe for Christmas? Unlikely, but worth the thought.

RDTK’s presentations are available below:

RDTK A view over Cloud Storage, in Parallel Session 1B: Managing Active Data: storage, access, academic ‘dropbox’ services, JISCMRD progress workshop, Nottingham, 2012 (PDF, 0.6 MB)

RDTK DataStage to DSpace, progress on a workflow for data deposit, in Parallel Session 2B: Data Repositories and Storage: options for repository service solutions, JISCMRD progress workshop, Nottingham, 2012 (PDF, 1.5 MB)

RDMTPA Research Data Management Training for Physics and Astronomy, in Parallel Session 3A: Training and Guidance, JISCMRD progress workshop, Nottingham, 2012  (PDF, 1.8 MB)

RDTK Poster, Service Oriented Toolkit for Research Data Management, in Poster Session, JISCMRD progress workshop, Nottingham, 2012, Poster (PDF, 1.9 MB)


Aug 20 2012
 

An audit of research data holdings within University of Hertfordshire was conducted in the period May to July 2012.

The online survey (described in more detail here) was circulated to around 600 research staff, first via their regular monthly newsletter, with follow up reminders sent by our information managers to schools and research centres, as well as via our continuing programme of RDTK awareness meetings and interviews. There were 67 responses, which represents 12% of those invited to take part. Most research active disciplines were represented in the respondents, albeit with a strong showing from the STEM subjects.

The survey has brought insight into the extent of our research data. It allows us to estimate that we hold approximately 2PB across the whole research landscape. This is a factor of ten larger than our current central provision. However, around 80-90% of this belongs to a very few research groups, who are relatively well organised and funded for RDM, and it tends to be working data for those that crunch numbers, so it may not necessarily be data that requires retention. The remaining 10-20% of research data, which belongs to the balance of 80% of researchers, looks like a manageable quantity. This suggests that cultural change rather than capacity may be the predominant issue when it comes to achieving a migration to a more robust infrastructure for working data for the majority of researchers. Likewise, we should expect to be able to manage the data that could be preserved, if we can build the culture and processes to make that possible.
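To make the "manageable quantity" claim concrete, the back-of-envelope arithmetic can be sketched as follows. This is a rough illustration under stated assumptions: decimal units (1 PB = 1000 TB), the "factor of ten" taken as exact, and variable and function names that are ours alone. The inputs are only the estimates quoted above, not new measurements.

```python
TB_PER_PB = 1000  # assumption: decimal units


def long_tail_tb(total_tb, big_group_percent):
    """Data (in TB) left once the few large research groups are set aside."""
    return total_tb * (100 - big_group_percent) / 100


total_tb = 2 * TB_PER_PB          # approx. 2 PB estimated across UH
central_tb = total_tb / 10        # holdings are ~10x current central provision

tail_low = long_tail_tb(total_tb, 90)   # big groups hold 90% of the data
tail_high = long_tail_tb(total_tb, 80)  # big groups hold 80% of the data

print(f"Current central provision: ~{central_tb:.0f} TB")
print(f"Long tail of research data: ~{tail_low:.0f}-{tail_high:.0f} TB")
print(f"Ratio to central provision: {tail_low / central_tb:.0f}x-{tail_high / central_tb:.0f}x")
```

On these assumptions the long tail comes to roughly one to two times the existing central provision, which is why capacity looks like a lesser problem than culture for the majority of researchers.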

In addition to requirements that have already been resolved (such as easy to use encryption and more flexible provision of storage for mixed staff/student/external research groups) the survey revealed some previously unvoiced requirements, such as centralised version control for source code, CAD and design files.

Perhaps influenced by the STEM respondents, the survey also showed that venerable FTP is alive and still working well in amongst the new (and rebranded) offerings of the cloud. This indicates there continues to be profit in exploring an FTP based cloud storage pilot.

The key messages from the survey support the anecdotal evidence acquired to date – in the main there was no big, new, news. However, the subtext obtained is valuable and it underlines that considerable help and resources are needed over the whole project lifecycle, from planning to preservation, if we are to satisfy the demands of a rapidly developing (some would say hardening) funder’s policy regime.

Download the survey results (PDF  400KB).
Download further discussion and analysis (PDF 1.7MB)

Jun 27 2012
 

Electronic Document and Records Management systems (EDRMS) have the potential to answer the needs of some Research Data Management scenarios.

EDRMS offer file sharing, file versioning, flexible access control, and retention, preservation and discovery services (albeit most often in a closed environment). In the case where a project’s data is bound up in everyday office formats, and does not need a database or other structured format, an EDRMS can be used to bring rigour and robustness to otherwise freeform file management. Although there will often be a reluctance on the part of  a researcher to ‘get organised’, there are often circumstances where there is no choice: the nature of the work means that a high standard of file management is more than a matter of efficiency or professionalism and it becomes a requirement of the funding body, subject to audit. This is the case, for example, with all clinical trials.

In workpackage WP3 we are looking to improve project management practice in general and promote more robust RDM of unstructured data by developing a standard ‘file plan’ for use in research. This will be backed up by policy changes which will encourage (and, in the fullness of time, mandate) the deposit of project documentation in our EDRMS. The file plan must be generic enough to be useful for researchers who just need to improve their document organisation, whilst also allowing for the more robust requirements of those needing to comply with external oversight. The starting point which informed this work was the JISC advice on Developing a File Plan in the JISC Business Classification Scheme (BCS) and Records Retention Schedule (RRS) for Higher Education Institutions infokit.

We are working with the Centre for Lifespan and Chronic Illness Research (CLiCIR) to develop an electronic Trial Master File (eTMF) for use in clinical trials. A Trial Master File contains essential documents: every document that is used to conduct and report on a clinical trial. Our eTMF will complement the locked filing cabinet which is currently used to satisfy the demands of audit. The Medicines and Healthcare products Regulatory Agency (MHRA), which is responsible for the regulation of medicines, requires each trial to be conducted in accordance with a Standard Operating Procedure (SOP).  The MHRA carries out unannounced inspections to audit SOP compliance including data management arrangements and storage procedures. We hope to demonstrate that an eTMF could be more efficient and at least as robust as the current paper based arrangements. In doing so, we are also developing our general purpose file plan.

We think the EDRMS will offer considerable benefit in respect of managing the large volume of documents and data produced in drug trials, whilst addressing the legal requirements on data retention and security. In addition, we expect it to make the sharing of data between chief investigator, co-ordinating centre and participating sites more effective.

The workpackage mini-plan is as follows:

  1. Develop the File Plan
    • Define folder structure
    • Define classification, retention and disposal requirements
  2. Define metadata requirements
  3. Build a Model Office
  4. Draw the life cycle of a clinical research project
  5. Develop roles, responsibilities and access
    • Define roles and access level (including: ownership, managed by, administrated by, visible to …)
    • Set security and access
  6. Create Data Management and Maintenance Policy
    • Develop guidelines for data maintenance and update
    • Develop guidelines on retention
  7. System Implementation
  8. Train users

After close consultation with our CLiCIR colleagues we have produced a draft folder structure which is ready to deploy to support the eTMF and to take to other research groups for comment. It includes many folders which are generally applicable and some which are peculiar to health-related research. It also highlights those folders which must be highly secure in the context of the eTMF.

Each item in the EDRMS has associated metadata. In addition to the basic file management metadata required by the system, it is possible to add further fields for project specific data. The work so far has focused on the metadata needed to describe a research project at UH. We will consider in due course how this maps to other schemas, such as those used by our CRIS, our Research Archive and the various minimum metadata sets circulating in JISCMRD.
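As a rough illustration of what such a mapping exercise might look like, the sketch below translates a project-level metadata record into another schema's field names and reports whatever has no equivalent. Every field name here is a hypothetical placeholder (the target side borrows Dublin Core-style element names); none of them is taken from the fields actually defined in our EDRMS, CRIS or Research Archive.

```python
# Hypothetical mapping from local project metadata fields to
# Dublin Core-style element names. All field names are assumptions
# made for illustration, not the real UH schema.
FIELD_MAP = {
    "project_title": "title",
    "principal_investigator": "creator",
    "funder": "contributor",
    "start_date": "date",
    "summary": "description",
}


def map_record(record, field_map=FIELD_MAP):
    """Translate a local record into the target schema.

    Returns a pair (mapped, unmapped): fields that have a target
    equivalent, and local fields that have none and need a decision.
    """
    mapped = {}
    unmapped = {}
    for key, value in record.items():
        target = field_map.get(key)
        if target is None:
            unmapped[key] = value
        else:
            mapped[target] = value
    return mapped, unmapped


# Placeholder record, not real project data.
example = {
    "project_title": "Example trial",
    "principal_investigator": "A. Researcher",
    "ethics_protocol_number": "placeholder",
}
mapped, unmapped = map_record(example)
```

Listing the unmapped fields separately is the useful part: those are the project specific fields that another schema cannot carry without extension.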

The EDRMS can be used as the primary store of actual data but also as a management tool for externally located data, be it electronic or hard copy. It can manage the retention and disposal of both. The various requirements for retention and disposal of different types of research data have been brought together in a draft retention policy. (It is interesting to note that although the EDRMS is in theory designed to deal with the time scales involved, most of the retention schedules we see extend beyond the expected lifetime of the EDRMS itself.)
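To make the idea of a file plan with retention rules a little more concrete, here is a minimal sketch in Python. It assumes each folder in the plan carries a retention period in years, a flag for highly secure content, and a flag for material held outside the EDRMS (such as paper in a filing cabinet). The folder names and retention periods are invented placeholders, not the contents of our draft folder structure or retention policy, and this is not how the EDRMS itself stores these rules.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class FilePlanFolder:
    path: str
    retention_years: Optional[int]  # None means keep indefinitely
    highly_secure: bool = False
    held_externally: bool = False   # e.g. hard copy managed via the EDRMS


def disposal_date(folder, project_end):
    """Date on which the folder becomes due for disposal review."""
    if folder.retention_years is None:
        return None
    target_year = project_end.year + folder.retention_years
    try:
        return project_end.replace(year=target_year)
    except ValueError:
        # 29 February in a non-leap target year: fall back to 28 February.
        return project_end.replace(year=target_year, day=28)


# Hypothetical plan entries; names and periods are placeholders only.
plan = [
    FilePlanFolder("Project Management/Plans", retention_years=5),
    FilePlanFolder("Trial Master File/Consent Forms", retention_years=15,
                   highly_secure=True, held_externally=True),
    FilePlanFolder("Publications", retention_years=None),
]

schedule = {f.path: disposal_date(f, date(2012, 6, 22)) for f in plan}
```

Keeping the schedule as plain data like this also speaks to the point above about lifetimes: a retention rule that outlives the system needs to be exportable in a form that does not depend on it.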

 

Jun 22 2012
 

RDM Audit

Work package WP1 – RDM Audit is about assessing RDM practices at UH in order to identify gaps and requirements, and to transfer knowledge from experienced RDM practitioners to all staff holding valuable data. We are employing two methods to carry out the audit: a DAF survey and interviews.

DAF survey: we have carried out a survey based on DAF methodology over the last month. The questions we asked were mostly faithful to the DAF online tool with some tweaks to accommodate local infrastructure. The result is more like that used by Orbital at Lincoln than Iridium at Newcastle. The survey was circulated to around 600 staff via our “Research Grants News and Funding Opportunities” newsletter, with follow-up reminders sent by our information managers to schools and research centres. We have had 60 responses so far from senior researchers, principal investigators, research students, lecturers and research fellows. We have extended the open period due to requests from a couple of research groups who want to consider it at their next regular group forums. The results already make interesting reading and will be published here soon.

Interviews: We have designed an interview protocol for carrying out semi-structured interviews with selected researchers across different disciplines in the University of Hertfordshire.  The protocol was designed using the following sources:

Starting with the established ‘friends’ of the project we will deploy this interview protocol across our research community, and aim to use it as a source of case studies, leveraging each opportunity we get to assist a researcher with a particular problem.

RDM benefits

The final section of the interview protocol has been designed to help us with the vexed problem of evaluating the benefits of the Research Data ToolKit.

When evaluating benefit the first port of call will often be a hard financial metric:

  • does RDM as a whole cost less now than before we started?
  • are we winning more research grants as a result of RDM good practice?

Given the relatively short timescale of the project and our complete lack of existing RDM accounting we cannot answer these questions. This leaves us considering a softer set of metrics:

  • has the usage of robust centralised storage increased during the life of the project?
  • has the use of Data Management Plans increased?
  • how many datasets have we published in support of our publications?

Even these questions will not be easy to answer because they have not previously been asked, but they probably are measurable over the period of the project.

There are less quantifiable but still tangible benefits to be recorded too. For example, RDTK has already led to a closer relationship between our information systems providers and research administrators, which had become distanced by organisational restructuring. For another example, as a result of a tangential intervention from RDTK, our largest ‘departmental’ facility (an 80 core HPC cluster and 200TB SAN) is about to move out of its less-than-ideal premises and into one of our purpose-built data centres, making it much less prone to downtime or disaster.

In order to capture this kind of collateral benefit and to try to get the individual researcher’s perspective we believe it is worth considering factors like ‘increasing awareness regarding RDM good practice’,  ‘improving staff confidence in developing a DMP’, as well as the ‘usage of resources and organisational capacities and infrastructure to support RDM activities’.   To this end we have added a section to our interview protocol which asks the respondents about their competencies in these areas. At the conclusion of the project we will return to our interviewees to see if their competencies have improved.
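A minimal sketch of how the before-and-after comparison could be tallied, assuming each interviewee rates themselves on a numeric scale in each area. The 1 to 5 scale, the area names and the scores below are all invented for illustration; the interview protocol does not necessarily record competencies this way, and these are not results.

```python
def competency_change(before, after):
    """Per-area change in self-rated score, for areas rated both times."""
    return {area: after[area] - before[area]
            for area in before if area in after}


def mean_change(changes):
    """Average change across areas; 0.0 when there is nothing to compare."""
    return sum(changes.values()) / len(changes) if changes else 0.0


# Placeholder ratings on an assumed 1-5 scale, not real interview data.
first_interview = {"rdm_awareness": 2, "dmp_confidence": 1, "use_of_infrastructure": 3}
final_interview = {"rdm_awareness": 4, "dmp_confidence": 3, "use_of_infrastructure": 3}

changes = competency_change(first_interview, final_interview)
average = mean_change(changes)
```

Only areas rated in both interviews are compared, so a question added or dropped between rounds does not distort the average.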

These measures of benefit may not show us an explicit return on investment, but to paraphrase ViDaaS’s James Wilson – it is better to measure what you can than what you can’t, and ‘soft’ benefits are known to yield hard results (see the latter part of JISCMRD launch event: Thematic session on the business case for RDM).

 

Mar 30 2012
 

This is a very lengthy blog to catch up on a couple of months of non-stop JISCMRDness and set out the position of RDTK at the end of month six. There will be some seriously dry reporting later but I am going to start with an emotive whiz through a very busy but rewarding period.

February was a head down month, in which we pressed on with actual hands on things, and I tried to balance technical work with project management, with the odd programme event thrown in for entertainment.


Feb 15 2012
 

It was back to serious business in January with lots of activity in engagement, service development, and recruitment. All these strands took off at once in mid-month leaving me struggling to keep up, but it was good to see progress across all active work packages.

Work package WP1 – Audit current RDM practice at University of Hertfordshire

Cathy Tong and I continue to work on the audit in the interim before our new project officer becomes established. A picture is emerging with regard to existing centralised facilities. This shows a gap in provision for research groups where there are non-staff members involved, but also that there is under-utilisation of existing resources because researchers do not understand how to use or access them. I have blogged on this issue and started on a FAQ to counter the knowledge gap.

Cathy established two really interesting engagements, both valuable, but in different ways. One, in the Humanities, was very much forward looking and about possibilities; whilst the other, in Health, focused on solving an urgent Research Data Management problem in the here and now.

Jan 06 2012
 

Progress seemed fitful in a month shortened by the MRD launch event and seasonal festivities. There was a lot of talking, and from a personal perspective it was interesting to have to curb my tendency to want to build something, and try instead to sift through the many voices and conversations to identify profitable ways forward for #rdtk_herts.

Nov 30 2011
 

Work package WP1 – Audit current RDM practice at University of Hertfordshire

Cardio and DAF: We have agreed a plan with Andrew McHugh of the Digital Curation Centre (DCC) to carry out a Cardio Assessment with key UH stakeholders, followed by a Data Asset Framework (DAF) exercise with one of our large research groups in Health and Human Sciences. Working with Andrew will hopefully allow us to equip our own staff to deploy DCC tools and methods in the future, whilst conducting an effective audit in the present. The DCC have match funding, so project money spent with them is doubled up and looks like good value. The micro plan for the Cardio deployment is this:

  • engage stakeholders (November, #rdtk_herts project team)
  • introduction workshop, in which DCC introduce the audit methodology (December, All stakeholders, DCC led, 1 hour)
  • individual RDM assessments using the DCC’s online Cardio assessment tool (December/January, All stakeholders, at their convenience)
  • aggregate and analyse results (January, DCC)
  • feedback workshop (January, All stakeholders, DCC led, 1-2 hours)


Nov 04 2011
 

Work package WP1 – Audit current RDM practice at University of Hertfordshire

We have had preliminary discussions with Principal Investigators and Research Group leaders in Health and Human Sciences; Physics, Astronomy and Mathematics; and History.

These discussions led to the Co-investigators for the project being identified and the first steps to collaboration being undertaken. Some early requirements for pilot services were revealed. Researchers from Health and Human Sciences have suggested three distinct research projects, covering a range of interventions at the beginning, middle, and end of the project lifecycle. This is encouraging for a good start to work packages WP2 & WP3.