We’re recognizing Love Data Week (February 11-15) and this year’s theme is ‘data in everyday life.’ We’ve asked several researchers who participated in our Better Science Through Better Data event to reflect on the importance of data sharing in their own lives. We’ll be sharing their stories all week so keep checking back!
In November 2018, I was invited to give a lightning talk at the #scidata18 conference, my first open data conference ever. The invitation made me start thinking about my personal perspective on data sharing as an early career researcher: what it means to me and to my research.
I am a PhD student at the ‘Coastal Risk and Sea Level Rise’ research group at the University of Kiel, Germany. My research focuses on the analysis of future coastal impacts of sea-level rise and the assessment of the benefits of using different adaptation strategies in the Mediterranean. One part of my PhD has been to develop a Coastal Mediterranean Database that enables us to assess potential future impacts around the basin. A prerequisite for developing such a database is the availability of consistent, multidisciplinary data on current and future conditions of the coast. Compiling the database has been a challenging task due to the lack of primary data and the difficulty of merging data of different formats, resolutions, quality and accuracy in a consistent manner. But we have managed.
Lack of data is one of the main barriers for regional coastal studies. To help other researchers overcome this problem, we decided to share our data and publish the database as a data descriptor in Scientific Data.
As none of my colleagues had ever published a data descriptor before, I was anxious that my lack of experience would stop me from contributing to open science. My programming skills are limited and I was unsure about the proper methods for documenting data and coding well. But at the same time, I had experienced how difficult research can be without having access to high-quality and well-documented data. We gave it a try and in the end we managed.
For me, science is by default objective, comprehensible and, therefore, reproducible. That is what I always thought, but the more I think about it, the more I realize that just publishing results, as most of us usually do, does not promote reproducible science.
After publishing the data and listening to many inspiring talks at the #scidata18 conference, I realized that I am a big supporter of open science. I strongly believe that we, as a scientific community, should foster this transition. The movement towards giving authors scientific credit for well-documented, openly shared datasets is an important component of the transition towards open science.
There are several reasons why sharing benefits all of us as a research community:
1) It forces us to document data work and code well, enabling researchers to replicate studies faster.
2) We can get scientific credit for the effort we put into the data work. This is often one of the most time-consuming and least rewarding tasks of research, even though it is a fundamental part of science.
3) We can make the life of the next researcher much easier, which will advance science even faster in the long run.
To speed up the transition towards open science, maybe we, the early career researchers, can play an important role by initiating it from the bottom up and working in a more open and transparent manner.
So, early career researchers, be brave and do not be afraid of failure or the possibility of others finding mistakes when you share your data or code. Instead, structure your code and open it to make mistakes discoverable. We all make mistakes, but as Kirstie Whitaker said in her #scidata17 keynote: “Let’s build a reward system in science that makes it okay to be human. That makes it ok to make honest mistakes.” In the end, we will all be faster and conduct better research in an open system. This can only bring science forward. Let’s give it a try. Let’s be open about it.