Wednesday, 18 December 2013

Guest blog post: personal thoughts about depositing data and the importance of RDM

We are delighted to publish our first guest blog post on the importance of research data management from the perspective of a University of Sheffield staff member. Dr Julie Ellis is a Research Associate in the Department of Sociological Studies.

Julie Ellis, RA, Sociological Studies
Thoughts about depositing data

I was fortunate enough to secure a studentship from the Economic and Social Research Council (ESRC) to complete my PhD research at Sheffield (I graduated in 2011).  I imagine that due to demands from funders to consider RDM issues at the very inception of a project, there is now far more support and information available for PGRs starting their doctoral work.  

At the time I began mine, this support didn’t feel so present.  Nonetheless, it was still a contractual requirement that I should offer the data I generated from my study to be archived in what was then Qualidata – now UK Data Archive.  I discovered this during the course of my research, rather than at the start. Sure enough, there it was buried within a funding handbook – but somehow with so many other things to think about, I hadn’t taken in fully what it meant and what was required.  In the time that had elapsed between getting the award and completing my MA as the first part of my studentship, archiving the data had been lost in a myriad of other questions, issues and anxieties about doing a PhD. 

Midway through data collection I realised depositing data was a requirement.  I sought retrospective consent to do this and did the best job I could to find out what I needed to do and how the depositing would work.  Since submitting my thesis and moving immediately into a series of fixed-term contracts (working both inside and outside of academia) I have spent the best part of 2 years working on making sure my interview data is in good shape to be deposited.  This has involved re-reading 37 interview transcripts and carefully formatting them, ensuring that they are fully and consistently anonymised throughout and spotting spelling mistakes (lots in my case as I did my own transcription!)  I haven’t resented doing this – I understand why it is important in some cases to make our data available so others can work with it and new insights can be gained.  However, for me to be able to honour this obligation to deposit, I have had to use annual leave and work into evenings and weekends to do this work. 

It shouldn’t have taken you so long I hear you cry!  Maybe it shouldn’t have.  Maybe I worked too slowly, was too fastidious about checking.  But because I wanted to make absolutely sure that the anonymity of my participants was protected (I’d anonymised at point of transcription, but when you are putting the whole thing out there and not just bits of quotes… I had to check), I simply could not do this as a skim-read job.  It takes time to get peace of mind. 

So what am I saying?  Despite the fact that there’s probably more general awareness about RDM now, please don’t assume PGRs know all about this, how they can best plan to manage it and how to make most effective use of the advice and support. Make sure they are signposted at induction and that supervisors are supported to support their PGRs to manage RDM challenges.  Better still, introduce it – albeit light-touch - where undergraduate students are doing dissertations that generate data – just so they are aware of the landscape that is developing in relation to RDM (OK, good data management has always been crucial, but I refer to the formalisation of this practice).

As a result of my experiences and my recent engagement with the RDM team we managed to identify that there are aspects of University Policy which need to be honed to effectively deal with this evolving landscape.  It wasn’t clear who owned the copyright to my data – me solely or me and the University jointly?  It seems that PGR projects sat within a murky grey area between student and employee.  We went for joint ownership in the end. 
So with the copyright question sorted and my data beautifully prepared, I have finally ‘let go’.  The team at UK Data Archive have, in my experience, been very patient and supportive.  They agreed to embargo access to my data for a while to give me time to try and publish from my research.  I have to say, it was a nice feeling when the link to my data record landed in my inbox a couple of weeks ago. 

Dr Julie Ellis, December 2013

Thursday, 28 November 2013

Research Data Management - thinking ahead and DMPs

It's been a while since we blogged, mainly because we have been really busy with a substantial amount of research data management activity and meetings within our institution. Our research data management (RDM) service delivery group met for the first time this week, so suddenly everything is feeling very real!

The RDM service delivery group consists of senior stakeholders from the University Library, Research & Innovation Services (R&IS), the academic community, and Corporate Information and Computing Services (CiCS). We discussed how there is increasing responsibility for RDM being placed on universities and that we need to consider the reputational risks if research data is not handled properly. We identified some key priority areas for service delivery and allocated some discrete areas of work. More details in the new year.

Advocacy and awareness work is gaining momentum, and this week we delivered a one-hour RDM training workshop for ECRs in the Faculty of Medicine, Dentistry and Health as part of the excellent Think Ahead Programme. This programme aims to ensure that early career researchers within the Faculty of Medicine, Dentistry and Health have a defined career trajectory and are provided with the current training and the skills required by employers.

For many attending, this was their very first introduction to the concept of research data management, and a timely reminder to us that RDM is not just about secure storage and avoiding data loss; it is about supporting the research data lifecycle so that research data is properly curated and, where feasible, reused.

At the end of the session participants were asked to name one thing they now knew about that they hadn't known before the session. Responses included:

  • The availability at TUoS of the Web of Knowledge Data Citation Index
  • Planning and the importance of data preservation
  • What to include in a data management plan (DMP)
  • Datasets are already being published in the public domain
  • Funders having expectations regarding data management plans
  • What RDM actually means
  • The different types of research data
  • Data journals and Data papers
  • TUoS support for RDM
It was a really interesting event, and many thanks go to Dr Lucy Lee, Researcher Development Manager, for organising this session and to all the participants.

Monday, 28 October 2013

Introduction to Research Data Management (RDM)

This is an attempt at ‘RDM in a (Coco)nutshell’ primarily for researchers rather than Information management practitioners.

Key Concepts:

1. RDM is good research practice

2. RDM is concerned with looking after your data throughout the research project.

3. RDM involves long-term preservation of some of your data after the project.

4. Metadata is data documentation and is essential for RDM.

5. RDM makes it possible to share your data – if you want to or have to.

6. Data Management Plan

1. RDM is good research practice – it is mostly common sense. Most researchers will be aware of many of the issues involved and will already be practising some of the procedures. Data in this case refers to digital material collected during the research process, which will be analysed to test the research hypothesis – the same principles apply to non-digital data (see the earlier blog post about research data).

2. RDM is concerned with looking after your data throughout the research project from the project planning stage to the publication stage. Looking after your data involves making sure that the data are stored securely and backed up regularly. It involves ensuring only the right people have access to the data, during the research process and afterwards when the data are archived.
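One way to check that a backup really matches the working copy is to compare checksums of every file. The following is only a minimal sketch in Python, assuming the original data and the backup are both ordinary folders that can be read from the same machine; the function names (sha256_of, build_manifest, compare_backup) are invented for this illustration and are not part of any University service.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its checksum."""
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def compare_backup(original: Path, backup: Path) -> dict[str, list[str]]:
    """Report files that are missing from the backup or differ from the original."""
    src, dst = build_manifest(original), build_manifest(backup)
    return {
        "missing": sorted(set(src) - set(dst)),
        "changed": sorted(k for k in src.keys() & dst.keys() if src[k] != dst[k]),
    }
```

Calling compare_backup(Path("project_data"), Path("backup/project_data")) (both folder names are hypothetical) returns two lists. If both are empty, every file in the working copy has an identical counterpart in the backup.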

When planning research, risks to the data will need to be assessed and procedures agreed upon to minimise these risks. What would you do if you lost your research data tomorrow? Can your data be recreated or are they, for example, observational data that can’t be captured again?

At the planning stage, consideration needs to be given to choosing appropriate file formats for your digital data – will the software necessary for interacting with the data be available in the future? Consideration should also be given to the organisation of the data – the arrangement of folder structures and protocols for naming files and folders.
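A file naming protocol is easier to follow if the names are generated rather than typed by hand. The sketch below, in Python, assumes one possible convention (project, description, date as YYYYMMDD, two-digit version number); this convention and the helper names slug and data_filename are illustrative assumptions, not a University of Sheffield standard.

```python
import re
from datetime import date


def slug(text: str) -> str:
    """Lower-case the text and replace runs of other characters with hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def data_filename(project: str, description: str, created: date,
                  version: int, extension: str) -> str:
    """Build a name of the form project_description_YYYYMMDD_vNN.ext."""
    if version < 1:
        raise ValueError("version numbers start at 1")
    return (f"{slug(project)}_{slug(description)}_"
            f"{created:%Y%m%d}_v{version:02d}.{extension.lstrip('.').lower()}")


# Hypothetical example values:
print(data_filename("Care Homes Study", "Interview 07 transcript",
                    date(2013, 10, 28), 2, ".TXT"))
# prints care-homes-study_interview-07-transcript_20131028_v02.txt
```

Putting the date in year-month-day order and padding the version number means that an alphabetical file listing is also in chronological and version order.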

3. Long term preservation (or curation) of some of your data may be required by the funder.
At the end of the research project, decisions need to have been made about which data will be preserved, how they will be processed for preservation and which will be disposed of. This may well have been determined during the planning stage by agreement with the research funder. The UK research funders have different requirements for long-term curation of research data – details can be found at the DCC website.

Where should your research data be curated? Many research funders and researcher communities have established Data Centres and Discipline based Repositories; a list of these may be found at Datacite or at Databib. A number of HE institutions are developing their own Data Repositories, or have widened the capabilities of their Institutional Repositories to include research datasets as well as research papers.

4. Metadata is data documentation and is essential for RDM.
From the planning stage to the data preservation stage, metadata will need to be collected. Metadata identifies specific units of research data and their purpose in the research process.

When data are created, it is essential that details such as instrument settings, experimental protocols and conditions are recorded, for it may be difficult to recall these at a later stage. Many instruments will record such metadata during the process – the more automated the recording of metadata is, the better.

Many metadata elements will be recorded as attributes of the digital file when the data are created or collected – file format, size and creation date, for example. Data should be given an appropriate filename or title; different versions of files will need naming to distinguish them; files may be collected together in appropriately named folders. Some metadata elements will be common to all research data created by a research project – creator name, project name, institution and grant number, for example. 

Where necessary, metadata will need recording, perhaps in text files or Word documents, and placing in a folder with the data that it documents. This doesn't necessarily need to be structured or in a standard format, although that would be best. 
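As an illustration of keeping documentation in the same folder as the data, the Python sketch below writes a small structured file alongside the data files. The file name metadata.json, the field names and both helper functions are assumptions made for this example rather than a recognised metadata standard. It records the file's last-modified time, because a true creation date is not available in a portable way on every operating system.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def describe_file(path: Path) -> dict:
    """Collect attributes the file system already records for a data file."""
    stat = path.stat()
    return {
        "filename": path.name,
        "format": path.suffix.lstrip(".").lower(),
        "size_bytes": stat.st_size,
        "modified_utc": datetime.fromtimestamp(
            stat.st_mtime, tz=timezone.utc).isoformat(),
    }


def write_metadata(folder: Path, project_fields: dict, notes: str = "") -> Path:
    """Write metadata.json into the folder, next to the data it documents."""
    target = folder / "metadata.json"
    record = {
        "project": project_fields,  # elements shared by all data in the project
        "notes": notes,             # free text: settings, protocols, conditions
        "files": [describe_file(p) for p in sorted(folder.iterdir())
                  if p.is_file() and p != target],
    }
    target.write_text(json.dumps(record, indent=2), encoding="utf-8")
    return target
```

The project_fields argument would hold the elements common to the whole project, for example {"creator": "...", "project": "...", "institution": "...", "grant_number": "..."}, while the notes argument carries the details that are hard to recall later.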

5. Sharing your data – if you want to or have to – is made possible through RDM practice.
Perhaps the most contentious issue for researchers is the concept of making their research data openly accessible.

Good Research Data Management does not require your research data to be ‘Open Data’ or openly accessible! 
Although it is true that some of the drivers in developing RDM good practice are the concepts of ‘Data Sharing and Data Reuse’ and ‘Data driven Science’, making research data openly accessible is NOT a requirement of RDM, but it is made possible through RDM.

You may be required by your funder to make your long-term curated research data accessible to anyone or by registered access. In line with the RCUK principles on Data policy, which do advocate open access to data resulting from publicly funded research, most UK funders have published data sharing policies – details are available at the DCC website.

You may of course be willing to share research data, openly or on a restricted basis. Your research data may be cited in the same way as other research publications and is now accepted as a research output for the REF2014 assessment (output type S). There is a growing body of research that indicates the benefit of sharing research data; Piwowar et al. (2007) and Henneken & Accomazzi (2011) find increased citation rates for articles associated with accessible data. Data sharing has been common practice for many years in a number of disciplines – Molecular Biology for example.

Before publishing data, it is important to check with Research and Innovation Services – as The University of Sheffield  Research Data Management Policy states:

5. Unless the terms of research grants or contracts provide otherwise, data generated by research projects are the property of the University of Sheffield. Researchers should exercise care in assigning rights in data to publishers or other external agencies.
To publish and share research data, you will need to submit them to one of the Data Centres or Discipline based Repositories listed at Datacite or at Databib. These are, of course, the source of other projects’ data that you may wish to reuse.

6. A Data Management Plan (DMP) will need to be drawn up which will set out your arrangements for managing your research data. Most research funders require the submission of a DMP (also known as a Technical Appendix by the AHRC) during the grant application. Information about and resources for creating a DMP can be found at the DCC website.

Useful Resources:
Datacite - Repository list 
Henneken, E. & Accomazzi, A. (2011) Linking to Data - Effect on Citation Rates in Astronomy
Piwowar, H. et al. (2007) Sharing Detailed Research Data Is Associated with Increased Citation Rate. PLoS ONE 2(3): e308.
RCUK - Principles on Data policy
The University of Sheffield - Research Data Management Policy

Friday, 25 October 2013

Has International Open Access Week stimulated your interest in open access (OA)?

International Open Access Week is a great opportunity to celebrate and learn more about the benefits of Open Access (OA) and we hope it has inspired you to participate in OA activities which aim to ensure that research is made freely available to all. Enabling access to research is important not just for academia, but for society as a whole.

I have attended two OA events this week and it made me think about what resources and tools are useful when exploring the concept of OA for the first time. There are so many issues to think about - article-processing charges (APCs), green and gold, compliance, mandates, paywalls, toll-free, institutional repositories, archives, costs, copyright, open data, peer-review, impact, OA journals, data mining, text mining, OA monographs, licenses, waivers, embargo periods, OA platforms, CC-BY, sustainability, funding, etc. etc. The list is endless!

So, if you are new to OA, here are a couple of resources and tools that you may find useful and if you are a University of Sheffield author and want to make your research open access, remember that there is a dedicated Open Access team within the University Library who can provide advice and guidance on OA.

Peter Suber and Open Access
Peter Suber's 'Open Access Overview' is a really useful starting point for those new to the concept of OA. His book 'Open Access' was published by MIT Press in June 2012 and the OA version was launched in June 2013. Peter is the Director of the Harvard Office for Scholarly Communication and this week he has written in the Guardian, putting to rest six of the most common myths around open access to research.

Directory of Open Access Journals (DOAJ)
Contains records for more than 9,900 quality controlled Open Access Journals. There is the facility to browse the directory by journal, subject, country, license, and by publication charge.

A searchable database of publishers' copyright and self-archiving policies for pre-prints and post-prints.

Jisc Open Access
The Jisc Open Access website provides information on the work that Jisc does to help UK universities and publishers adopt open access policies. There is information on open access projects and a series of quick guides on OA topics.

Finally, one of my favourite blogs is Richard Poynder's 'Open and Shut?'. He writes on the evolution of the OA movement and publishes regular interviews with OA advocates.

So, our week of #OA blog posts comes to an end and I hope you have found something useful and informative to mull over. Next week it is back to all things RDM related.


Tuesday, 22 October 2013

Guest post: Dr Nancy Pontika and open access to peer-reviewed scholarly literature

Our second guest blog post celebrating International Open Access Week is from Dr Nancy Pontika, Information Consultant (Research), Royal Holloway, University of London. Nancy's PhD dissertation was entitled: "The Influence of the National Institutes of Health Public-Access Policy on the Publishing Habits of Principal Investigators" and is available on figshare. Further biographical information can be found here.

Open access to peer-reviewed scholarly literature

Long before computers became part of our everyday lives, a technology enthusiast, Vannevar Bush, envisioned in his Atlantic Monthly article an online library, called “memex”, where someone could effortlessly store and retrieve information.  Almost thirty years later, the idea of a virtual library had strengthened and discussions about paperless journals were becoming dominant [Sondak (1973), Senders (1977), Turoff (1978)][i]. Around that time it was becoming obvious that the future could be nothing but paperless.

More than ten years ago the Budapest Open Access Initiative established the open access movement. The term open access (OA) refers to online peer-reviewed scientific literature that anyone in the world can access free of charge, provided that they have access to the Internet.

The primary routes to open access are the open repositories (green open access) and the open access journals (gold open access).

OA infographic created by Nancy Pontika

The open repositories (green OA) can be divided into two categories, the institutional repositories and the subject repositories. The first are maintained by institutions and focus on collecting the institution’s outputs; their collections vary widely. For example, faculty members may self-archive post-print versions of an article in the repository, but they may also self-archive the syllabi of their courses. Institutional repositories may hold students’ theses, grey literature, databases, etc. An example of an institutional repository is Sheffield’s White Rose Research Online.

On the other hand, subject repositories are discipline-specific repositories that collect outputs in a specific subject field. arXiv, established in 1991, is both the first repository and the first subject repository in the world, and it collects outputs in physics and mathematics. Another example of a subject repository is the international PubMed Central, containing outputs related to medical research.

Another way to make your outputs open access is through the open access journals (gold OA). The open access journals follow the same quality procedures as the subscription journals; they have editors, editorial boards, conduct peer review and have a periodicity. Primarily, the open access journals have two advantages: first, they allow authors to retain some or all of the rights of their articles and second, because of their open content, they receive a high number of citations. A list of open access journals can be found at the Directory of Open Access Journals (DOAJ).

Because the open access journals do not charge for subscriptions, they have developed alternative business models to support their publications. Some charge a publication fee for every article they publish, which is called an article processing charge (APC), while others may be supported by the universities that run them, like First Monday, which is supported by the University of Illinois, Chicago. Because the open access journals use so many different business models, we have created a list.

In some subject fields the open access journals have gained high prestige. In medicine, for example, all the PLoS journals have a high impact factor and compete with the most prestigious traditional journals in the field. The open access journals in the Humanities and Social Sciences (HSS) are moving slowly, but some of them have started gaining momentum, such as the Open Library of Humanities and the Social Sciences Directory.

Although for the past ten years the focus has been mainly on journal articles, attention is now turning to both open access monographs and research data. Regarding the first, a great project is the OAPEN-UK project, which focuses on developing open access monographs in HSS in the UK. Regarding research data, there is the excellent Digital Curation Centre (DCC), which guides researchers on best practices for managing research data, and the UK Data Archive, which collects data in the humanities and social sciences.

[i] Sondak, N.E & R.J. Schwartz (1973). The paperless journal. Chemical Engineering Progress, 69(1), 82-83
Senders, J. (1977). An on-line scientific journal. The Information Scientist, 2(1), 3-9
Turoff, M. (1978). The EIES Experience: Electronic Information Exchange System. Bulletin of the American Society of Information Science, 4(5), 9-10
Roistacher, R. (1978). The Virtual Journal. Computer Networks, 2, 18-24.

Creative Commons Licence

This work is licensed under a Creative Commons Attribution 3.0 Unported License.

Monday, 21 October 2013

International Open Access Week: Guest blog postings from the #OA community

A very happy Open Access Week (OAW) to all our followers. For the duration of International Open Access Week this blog will be taken over by #OA advocates, at the request of the University of Sheffield Open Access Team.

Our first guest posting is from Carmen O’Dell, Open Access Coordinator at the University of Sheffield. 

In honour of International Open Access Week the RDM blog will this week host blogs from individuals involved in different ways with Open Access who will offer their own perspective on the challenges and opportunities that surround the movement to widen access to scholarly research. 

The underlying principle of Open Access is very simple – access to scholarly research, on the internet, free of charge for everyone.  However making that a reality is not simple at all - not least because not everyone agrees we should be going down that route in the first place and there is certainly much disagreement over how best to get there.  In the UK however, the publication of the Finch report and the subsequent release of new funding requirements by Research Councils UK, means that many researchers are now having to deal with the complexities of open access for themselves, often for the first time,  like it or not. 

Fortunately at the University of Sheffield help is at hand. A small, but very dedicated team are here to help you understand the options available to you so you can ensure your research both reaches the widest audience possible and also complies with any applicable funder directives.  Use our web pages as a starting point on your own open access journey, invite us into your department or research group to talk to you and your colleagues in depth about the issues which concern you the most, or contact us for individual help when you need it.   

Additionally, on Wednesday the 23rd of October we have arranged a series of short talks by experts from within the University on various aspects of open access.  For more information and to book a place, visit our event web page.

Wednesday, 11 September 2013

'RDM the movie' - University of Leicester

Dr Andrew Burham from the University of Leicester has just released 'RDM the Movie', a short 'Digital Story' about the RDM work and researcher issues they encountered during their research data management (RDM) project.

I really enjoyed this video, mainly because it was quick to view (3 minutes 55 seconds) and it includes four real-life data challenges encountered by researchers from different faculties.

Data challenges faced covered:

  1. Big data
  2. Data storage and retention
  3. Information Security
  4. Dealing with confidential and identifiable patient data
The solutions to all of these data challenges are provided.

Andrew is looking for feedback on this video as the plan is to use this sort of communications method for research projects generally. Further information can be found on the excellent University of Leicester Research Data Management Website.

Monday, 2 September 2013

What is research data?

Defining research data is something of a challenge and there appears to be no clear consensus on a definition. What we do know is that research data means different things to different people in different contexts and that the definition varies depending on your subject discipline and research funder.

However, it is worth exploring some of these different definitions of research data. The University of Sheffield Research Data Management Policy defines 'data', as:
"...observational data, experimental data and data derived from analysis, independent of format."
The Engineering and Physical Sciences Research Council (EPSRC) define research data as:
"...recorded factual material commonly retained by and accepted in the scientific community as necessary to validate research findings; although the majority of such data is created in digital format, all research data is included irrespective of the format in which it is created."
What about the visual arts? How do they define research data? The distinctive and varied nature of research data in the visual arts was explored in depth by KAPTUR, a Jisc Managing Research Data Project led by the Visual Arts Data Service.  They define research data as:
"Research data can be described as data which arises out of, and evidences, research... Examples of visual arts research data may include sketchbooks, log books, sets of images, video recordings, trials, prototypes, ceramic glaze recipes, found objects, and correspondence."
Earlier this year I attended an event where Laura Molloy from the University of Glasgow provided one of the more succinct definitions of research data I have come across. She defined it as:

“The material underpinning a research assertion”

Research Data types 

Can include all of the following:

  • sketchbooks 
  • video recordings 
  • correspondence 
  • log books
  • test responses 
  • slides, artefacts, specimens, samples 
  • audiotapes, photographs, films 
  • models, algorithms, scripts 
  • questionnaires, transcripts, codebooks 
  • methodologies and workflows 
  • standard operating procedures and protocols
In our next posting we will start to reflect on why research data management (RDM) is important and outline the benefits of good practice in research data management. 


Friday, 23 August 2013

Welcome to the University of Sheffield Research Data Management Blog

Welcome to the University of Sheffield Research Data Management (RDM) Blog. This blog will be the place we share the latest news, views, and developments in the exciting area of RDM.

Our main audience for this blog is the researchers of the University, and the professional services staff involved and engaged in RDM work - namely those that will be supporting the research community with their RDM requirements. As part of building institutional RDM capacity and capability we hope that you will find this blog a useful tool for raising RDM awareness. The University of Sheffield Research Data Management Policy was published in 2012.

During 2011/2012 the University conducted a 12 month internal research data management scoping project and one of the main objectives was to explore the University's current capabilities for meeting RDM needs and propose reasonable, viable and sustainable extensions to the University's support for RDM. As a result of the project several key areas requiring action were identified. They include:
  1. Advocacy, awareness-raising and training for our research community
  2. Support for costing RDM requirements in grant applications
  3. Support and guidance on data management plans (DMPs)
  4. Provide an authoritative and up-to-date UoS portal providing advice and support
  5. Guidance on storage, back-up, and data security
  6. Creating a network of RDM champions
  7. Provide advice and guidance on intellectual property (IP) and copyright in research data
  8. Develop a data repository infrastructure
So, we will be covering many different components of RDM and, where feasible, signposting to other sources of useful information. We welcome relevant guest posts from the TUoS community, so please do let us know if you are interested in sharing your views and thoughts on RDM.

Future posts will consider 'What is Research Data?' and what are the drivers and benefits of managing research data.

Useful resources:

Digital Curation Centre
RCUK Common Principles on Data Policy

Monday, 12 August 2013

Research data access and sharing - your views wanted

The Wellcome Trust, on behalf of the Expert Advisory Group on Data Access (EAGDA) are looking for the views of the research community on the incentives and culture change for data access and sharing.

The survey responses will be used to inform the development of recommendations and guidelines on data access and management for the EAGDA funders. The EAGDA funders include the Wellcome Trust, Cancer Research UK, the Economic and Social Research Council and the Medical Research Council.

The survey is available here.