Q: How important is data sharing and open research in your community? Have you seen attitudes changing in recent years?
Kevin: A cornerstone of research is reproducibility, and this necessitates transparency and accessibility when it comes to research data. There is, and has long been, a consensus within life-science publishing that authors should provide all the data required to validate a claim and reproduce reported findings in order to secure publication. Editors' policing of this policy has been key to ensuring that research data sets are made freely available by authors to their communities.
Often, however, the data presented has already been compiled by the authors to facilitate interpretation by readers. What has changed in recent years is the expectation that not just compiled data but complete raw datasets should be provided, either co-published directly with manuscripts or made freely available through public databases in advance of publication.
Amar: All of us know that data sharing and open research lead to reliable, reproducible and impactful science, and to trustworthy scientists whose reputations benefit from more citations and stronger collaborations.
James: We have always advocated for Open Research & Open Data. If the original datasets aren’t available for review, readers have to assume that data collection & analysis are correct. Data sharing aids open peer review as these assumptions don’t have to be made! Open Data is now widely accepted in the Life Sciences and in many cases is also required by funders. We’re seeing data sharing spreading to other disciplines such as Social Sciences too.
May: IJGIS started data sharing requirements in August 2019 and now fully implements the policy. Some resistance arose at the beginning. Attitudes have changed in the last 12 months, and now all our manuscripts, except for review papers, share data and code.
Urska: I work in movement analytics and develop methods for movement ecology (which studies animal movement) and human mobility. In ecology, open data are a well-established tradition and it is now the default to place your data in open repositories, e.g. Movebank or Motus. Ecology journals and other journals where ecologists publish (general ones such as PloS One) require publication of data on submission & this is then published either in the journal or in the repository parts of the portals. Ecologists also develop their methods in Free and Open Source Software (mostly R) and publish them online.
On the human mobility side, data sharing is IME less developed, primarily because of problems with geoprivacy and the commercial nature of the data, which prohibits public publication (in contrast, ecology data are usually collected by academics themselves, who are the owners and can decide whether they want to publish them openly or not).
This has changed somewhat during the COVID-19 pandemic, when companies which collect human mobility data (e.g. mobile phone providers and big IT companies such as Facebook, Google, etc.) have offered them in various “Data for good” schemes (e.g. Google’s COVID-19 Community Mobility Reports). These, however, don’t come in raw form, but are aggregated to larger spatial/temporal scales, so they are not quite the same type of open data as the ones that are collected by researchers (not companies).
With respect to open code, there is also less of that in human mobility, since researchers often come from disciplines like physics and computer science, which often publish their methods as pseudocode in the papers, but not as open code. Some areas of human mobility are however much better at opening code, for example transportation (follow @robinlovelace for open R tools for transportation) and GIScience itself (@underdarkgis does open movement analytics in Python).