published quarterly by the university of borås, sweden

vol. 27 no. 2, June, 2022



Data sharing practices in open access mode: a study of the willingness to share data in different disciplines


Heidi Enwald, Vincas Grigas, Jurgita Rudžionienė, and Terttu Kortelainen.


Introduction. Considering the problems of making research data freely available despite many initiatives and legislation, this study explores the possible differences and similarities of research fields in understanding and enacting open access use and sharing of research data.
Method. An online survey was distributed, and 671 responses from researchers in Lithuania and Finland were analysed.
Analysis. Chi-squared tests were used to compare proportions across question groups. Bonferroni corrections applied to p-values were used to reduce the risk of type I errors in the post-hoc analysis of Chi-squared tests. Chi-squared post-hoc tests using adjusted residuals were calculated.
Results. Willingness to share research data in open access mode was higher in natural, biomedical and technological sciences than in humanities and social sciences. Although the principle of open research data is supported, there are considerable differences in the character of research data, and the ethical concerns and data management policies between different research fields.
Conclusions. Changing the culture of data sharing is much harder than publishing new initiatives and legislation. Comparing our results with data from the same survey in the UK, France, and Turkey, we find that researchers in Finland and Lithuania are more conservative regarding open research data.

DOI: https://doi.org/10.47989/irpaper932


Introduction

Over the past decade, the open research data movement has gained momentum among all participants in scholarly communication: publishers, funders, institutions, and researchers. We continue the research theme with a researcher-centred approach in outlining the benefits of open research practices. The goal of the open data approach is to have reproducible and transparent research data. In 2014, a set of principles was proposed to optimise the machine-based reusability of research data, named the FAIR Data Principles. The main aim is to ensure that data or any digital object are Findable, Accessible, Interoperable and Re-usable. (Wilkinson, et al., 2016.) The data should be sustainable, open and the research should be conducted responsibly.

In everyday research, research data management takes many forms and those forms are not always open. Some researchers keep their research data in their computers, on DVDs, or in private accounts on services such as Google Drive, and OneDrive. Others put their data into online databases or into journal repositories and let their data be used freely. Nowadays, it is becoming more common that, when publishing in scholarly journals, the researchers are required or strongly recommended to provide their data openly. The open research data movement has gained increased visibility and influence for several reasons. Most often it is because of the policy implemented by research funding institutions to renew public trust in science-based policies (Leonelli, 2013). Furthermore, there has been an increase in the number of so-called data papers, defined as scholarly journal publications, whose primary purpose is to describe research data (Schöpfel, et al., 2019).

Sharing data, or the release of research data for use by other researchers, is not a new topic of discussion among researchers and policymakers. Release may vary from private exchange upon request to deposit in an open data repository. Furthermore, providing datasets to a journal as supplementary materials also qualifies as sharing. (Borgman, 2012.) In fact, many funding agencies and international publishers have begun to require that the raw data and research documentation, which have been generated and used to report research findings, should also be submitted (Borgman, 2012; Tripathi, et al., 2017). In many cases they advocate that the data should be deposited in open data repositories for others to access, browse, use and validate (Tripathi, et al., 2017). Despite widespread recognition of the value of openness, there is still some opposition.

Researchers from different science fields see research data openness in different ways, because of different research and scholarly communication practices. Previously there has been little consensus on the openness of research data. For instance, the following reasons have been given: policies have different terms and requirements for researchers (Corrall and Pinfield, 2014); infrastructures, provided by research institutions, vary, and scholarly communities have different commitments and goals. This leads to different approaches on what is openness in research data and, in general, in open science. In their essay on data cultures Poirier and Costelloe-Kuehn (2019, p. 1) argue that ‘strengthening international data sharing networks will not only demand advancing technical, legal, and logistical infrastructure for publishing data in open, accessible formats; it will also require recognising, respecting, and learning to work across diverse data cultures’.

Although open research data has gained momentum in the scholarly community, it appears there are still some differences in regard to discipline and the country of a researcher (the effects of local policy, and community practices). The aim of this study is to understand how disciplines may affect the use and sharing of research data in open access mode. More specifically this study increases our knowledge of researchers’ views on data sharing and the idea of openness of research data, leading to our research questions:

  1. What kind of difference is there between different research fields in sharing data?
  2. What kind of concerns and barriers do the researchers face in data sharing?
  3. What differences can we see when comparing data from the same survey in small-language and big-language countries regarding open research data?

The survey was performed among Finnish and Lithuanian scientific researchers, and the combined analysis identifies core themes that characterise the understanding and practice of the open use and sharing of research data. Finland and Lithuania represent small European countries where most scientific communication takes place in a language other than the native one but where, on the other hand, the textual data collected are usually based on native languages. This most likely has an impact on attitudes towards data sharing and connects these two countries. It is therefore important to compare data from small-language countries with data from countries such as the UK, France, and Turkey.

Background

Research data in open access mode

Attitudes towards the importance of data management and data sharing can vary a great deal from researcher to researcher and especially between disciplines (Surkis and Read, 2015). Possible barriers may be deeply rooted in the practices, and culture of the research process, as well as the researchers themselves (Tenopir, et al., 2011; Tenopir, et al., 2015).

Changes from the DataONE project 2009/2010 baseline to the 2013/2014 follow-up study indicate that not only is data sharing behaviour increasing, but that researchers are also viewing the practice and the overall movement more favourably (Tenopir, et al., 2015). In contrast to scientists’ willingness to share data, results also show increases in scientists’ concern over data being misinterpreted because of the complexity of the data, its poor quality, or data being used in ways other than the intended purpose.

Openness in data sharing has a long history. One of the first mentions is found in such domains as oceanography and the biodiversity sciences (e.g., taxonomic data and museum specimens) (Michener, 2015). The idea of open data is especially supported in oceanography, ecology and genomics (Kim, 2019).

In one of the first studies on differences by research fields, Tenopir and colleagues (2015) found that the most distinct pattern reveals a division between those who work with the data of human subjects, including medicine and health science, business, psychology, and the social sciences, and those who do not.

In terms of perceptions about data sharing, some human subject disciplines felt less strongly that lack of access to others’ data is an impediment to science. They also expressed less willingness to engage in data sharing and reuse. They were more likely to think that their data shouldn’t be made available to others, and that they do not have the rights to make it available anyway. (Tenopir, et al., 2015, pp. 14-15.)

Furthermore, those who work with the data of human subjects were more likely not to use metadata to describe their data. Medicine and health science researchers were least likely to place their data, or part of it, into a central repository with no restrictions or to share data across a broad group of researchers. (Tenopir, et al., 2015, pp. 14-15.) In a scoping review by Ohmann et al. (2021) the willingness to share clinical data was high, but the actual data-sharing rates were suboptimal. In contrast to the findings of Tenopir and her colleagues (2015), Levin et al. (2016) examined through interviews how biomedical researchers understand and enact openness in their everyday working lives. Researchers emphasised collaboration and co-operation with peers and communities. Furthermore, they highlighted the importance of submitting data to established databases, and the importance of facilitating access to resources through the availability of fully open or managed-access databases. They saw the value of several standards, covering both the format and quality of data, enabling the use, reuse, and circulation of resources. Many mentioned the inadequacy of existing repositories and databases. Researchers said it is important to ‘give back to society’ and that there is a moral duty to provide access to data and publications in places with fewer resources or infrastructures, be they physical, economic, or intellectual, also meaning non-academic contexts. (Levin, et al., 2016.)

Furthermore, Abele-Brehm et al. (2019) conducted a survey among members of the German Psychological Society and found that respondents were hopeful, but also possessed some fears regarding sharing research data. Both hopes and fears were highest among early career researchers and lowest among professors.

The most recent research on the topic conducted by Tedersoo and colleagues (2021) shows that data sharing has improved in the last decade, but there are statistically significant differences between disciplines on availability and willingness to share data. They found that, when requested to share their research data, scholars of humanities, psychology and the social sciences were among those who declined more often in comparison with colleagues from ecology, forestry, material for energy and catalysts and microbiology.

However, Scaramozzino et al. (2012, p. 361) point out that ‘while the majority of researchers believe that colleagues should share their data, only a minority of respondents share their own data with individuals who did not help in gathering the data’. Several obstacles to sharing data have been presented in the literature and are common to all scholars regardless of discipline. These include the lack of appropriate technical and organizational infrastructures and support for storage and retrieval (e.g., Thessen and Patterson, 2011; Shearer and Furtado, 2017), ethical challenges (e.g., Australian National Data Service, 2017) and the need for incentives and policies for data sharing (e.g., Fecher, et al., 2016; Kowalczyk and Shankar, 2011; Thessen and Patterson, 2011; Rowhani-Farid, et al., 2017; Van den Eynden and Bishop, 2014). Additionally, documenting data is labour intensive and requires time and resources (Koltay, 2017). A systematic review of previous studies reveals that even the concept of data sharing needs more clarity and consistency (Thoegersen and Borlund, 2022).

Researchers should also overcome their fears of misuse or misinterpretation (Koltay, 2017). Aleixandre-Benavent and colleagues (2020) surveyed 1178 scholars in Spain in 2015 about their research data sharing practices. The greatest fears about sharing research data among scholars in physics and technology, arts and humanities, social sciences and health sciences concerned legal issues (confidentiality and intellectual property rights), misuse or misinterpretation of data, and loss of authorship. Hodonu-Wusu and colleagues (2020) surveyed 135 researchers in Malaysia in 2018. They found that, although researchers are aware of open data, they do not practise it; the reasons include the lack of clear information on data privacy policy, potential misuse of data, and fear of losing publication opportunities.

It is well known that in most research fields, the reward of research comes from publishing, not from data management. Incentives to share include the ethos of open science and peer review, the value of collaboration, the potential boost to the researcher’s reputation, and the dictates of reciprocity. (Borgman, 2012.) In their study Levin et al. (2016) reviewed literature demonstrating that open research is associated with media attention, potential collaborators, job opportunities and funding opportunities. There are signs that data sharing also confers a citation advantage. For instance, this has been seen in relation to studies of gene expression microarray data (Piwowar and Vision, 2013) and astrophysics articles (Dorch, et al., 2015).

In the study by Van den Eynden et al. (2016) researchers were asked in an online survey why they share their data. The most important reasons were that their funders require them to share their data, that it is good practice, that it enables collaboration and contribution by others, and that it enables validation and/or replication of their research. Finally, sharing data and materials shows that researchers value transparency and have confidence in their own research (Levin, et al., 2016). That said, it is important to remember that, to achieve the expected benefits of data sharing, data must be reused by other researchers. Data sharing practices, especially motivations and incentives, have received far more study than the real use of data. (Pasquetto, et al., 2017.)

Data sharing initiatives

The open science movement and data sharing are supported by several international initiatives. The Research Data Alliance was launched by the European Commission, the United States Government's National Science Foundation and National Institute of Standards and Technology, and the Australian Government’s Department of Innovation in 2013. The goal of this international community of researchers is to build and sustain the social and technical infrastructure needed to enable open research data sharing and reuse internationally and across disciplines. (Research Data Alliance, 2020.)

The Hague Declaration (2015) aims to foster agreement about how to best enable access to facts, data and ideas for knowledge discovery by removing barriers to accessing and analysing the wealth of data produced by society.

The Amsterdam Call for Action on Open Science (The Netherlands. Ministry of Education…, 2016, p. 3) highlights that 'data sharing and stewardship is the default approach for all publicly funded research'.

The RECODE (Policy RECommendations for Open Access to Research Data in Europe) project leveraged existing networks, communities and projects to address challenges in the open access, data dissemination and preservation sector, and produced policy recommendations for open access to research data based on existing good practice (OpenAIRE, 2019).

Recent work by the LIBER’s Research Data Management Working Group centred on answers given by managers, librarians and technical staff with regards to the FAIRness of repositories and their data, misconceptions related to FAIR principles definition and implementation and to the complexity of the implementation. Their report also summarises and suggests several best practices. (Ivanović et al., 2019).

Furthermore, Science Europe, the association representing major public organisations that fund or perform research in Europe, has released a practical guide to the international alignment of research data management (Science Europe, 2019) and has also collected experiences of member organizations (Science Europe, 2020).

Data sharing and openness in Finland and Lithuania

Finland and Lithuania share similarities: for example, most scientific publishing takes place in languages other than the countries’ first languages. The level of policies and the general discussion on open access in different countries guide the researchers. In Finland, the openness of research data was originally promoted in a large-scale open science and research project run in 2014-2017 by the Ministry of Education and Culture. The project was based on co-operation among many actors across many research fields. Since June 2018 open science activities have been coordinated by The Federation of Finnish Learned Societies (2020). The development of a national data management tool was organised as the Tuuli project (Ahokas, et al., 2017), funded by the Finnish Ministry of Education and Culture and coordinated by the Helsinki University Library. Approximately twenty Finnish research organizations were involved in the project, including more than forty experts in its working groups and subgroups.

In Finland, universities have published their data management guidelines, a data citation roadmap has been produced (Finnish Committee for Research Data, 2018) and, in 2018, a data action plan on Opening the data – training, skills, change in attitude, rewards, infrastructure was published (UNIFI Working Group, 2018). The Finnish research community has created a Declaration for Open Science and Research that was approved by the National Open Science and Research Steering Group in December 2019 (The Federation of Finnish Learned Societies, 2019). The policy on openness and sharing of research data is in the making.

The Research Council of Lithuania acts in accordance with several regulations. First, the Commission Recommendations of 17 July 2012 on access to and preservation of scientific information (European Commission, 2012). Second, the Draft Recommendation on Open Science (Unesco, 2021). Third, the Science Europe Principles on Open Access to Research Publications (2015). Fourth, the Regulations of the Research Council of Lithuania approved by the Resolution of the Seimas of the Republic of Lithuania (2009). Fifth, the provisions of the law on higher education and studies. The Research Council of Lithuania, acting as the institution responsible for coordinating open access policy at the national level, has approved the Resolution Regarding the Approval of the Guidelines on the open access to scientific publications and data (2016).

The purpose of the Guidelines is to ensure the dissemination of research results and circulation of scientific knowledge, promote broader co-operation between researchers, reduce the amount of potentially identical or uncoordinated research, and secure a better societal and economic return on research results.

While performing the role of science policy maker and implementer as prescribed in its Regulations, the Research Council of Lithuania, in co-operation with other Lithuanian institutions, such as the Lithuanian Academy of Sciences, the Ministry of Education and Science, the Ministry of Foreign Affairs, the Agency for Science, Innovation and Technology and others, seeks to expand international relations so that, as part of the international research community, it can jointly address issues of data sharing relevant for Lithuania, Europe and the world.

Guidelines on international co-operation for 2016-2020 were adopted by The Research Council of Lithuania. The purpose of those Guidelines is to present international co-operation as a research policy tool that enables attaining the strategic objectives of the Council (The Research Council of Lithuania, 2019). Furthermore, in 2014, the Research Council of Lithuania became a partner of the international project Open access policy alignment strategies for European Union researchers.

Methods

This study is part of an international collaboration project on Data literacy and research data management, performed by a group of researchers in more than twenty countries during 2017. The survey instrument, consisting of twenty-four questions, was created by researchers from England, Turkey and France. The survey included questions on data used by the researchers, willingness and attitudes towards sharing data, and data management. The LimeSurvey online platform was used for data collection.

While there are many studies on research data management issues in large countries, there is a lack of research on this matter in small-language countries such as Lithuania and Finland. As of 2020, the population of Finland was 5.5 million and that of Lithuania 2.8 million. Both countries have a well-developed network of universities, but only the University of Helsinki is among the top 100 universities in the world in the most central university rankings (QS Quacquarelli Symonds, 2021). Finland and Lithuania have a similar approach and history in adopting research data management policy.

The data used for analysis in this paper are available freely in a data repository:

Data collection in Finland

The survey data (in English, Finnish and Swedish) were collected by the researchers of the University of Oulu. The survey was distributed by personnel involved in the development of the national data management tool in the Tuuli project. These persons were contacted by e-mail and asked to distribute the survey in their organizations. Through this approach almost all of the research institutions of Finland were reached. The survey data were collected in June and July 2017 and the survey was targeted to all Finnish research organizations’ personnel, including also doctoral students: 469 responses were received.

The final respondents were from the University of Oulu, University of Eastern Finland, Tampere University, Åbo Akademi University, University of Turku, University of Lapland, University of Vaasa, University of Helsinki, Aalto University and University of Jyväskylä, but also from some other research institutes and organisations (VTT Technical Research Centre of Finland, Finnish Institute of Health and Welfare, Finnish Institute of Occupational Health and Finnish Environment Institute).

Universities were the target of the survey, but in some of them the distribution was not very successful, because only one or two researchers answered. A reason for this could be a misunderstanding or lack of time on the part of the person asked to distribute the survey, or uncertainty about the channels to use for reaching researchers inside their organizations, because in many cases this knowledge might be difficult to find and the channels are scattered.

As the position of a researcher is usually not very stable, e.g., because of different kinds of short-term research grants and changing projects in the universities, the total number of researchers in Finnish research institutes is difficult to evaluate. However, according to the university web pages, the number of researchers in the universities reached with this survey was about 19,000, which would make the response rate around 2%. The sample size is small, but we were able to reach at least individual researchers from several research organizations.

Data collection in Lithuania

The survey was translated into Lithuanian and data were collected by Vilnius University researchers in 2017. The survey was distributed using Vilnius University’s e-mail service, and surveys were completed by respondents using the online survey tool LimeSurvey. The survey was targeted to all academic personnel, researchers and doctoral students at Vilnius University with 2893 academic staff and 806 doctoral students: 202 responses were received giving a response rate of only 5.5%.
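The response rates quoted for the two countries follow directly from the reported counts. The short sketch below simply redoes that arithmetic; the function name `response_rate` is ours, and the Finnish population of 19,000 is the approximate figure the authors took from university web pages, not an exact count.

```python
def response_rate(responses: int, population: int) -> float:
    """Return the response rate as a percentage of the target population."""
    return 100 * responses / population


# Lithuania: 202 responses from 2893 academic staff plus 806 doctoral students.
lithuania_rate = response_rate(202, 2893 + 806)

# Finland: 469 responses from roughly 19,000 researchers (approximate figure).
finland_rate = response_rate(469, 19_000)

# The combined sample analysed in the paper.
combined_responses = 469 + 202

print(f"Lithuania: {lithuania_rate:.1f}%")
print(f"Finland:   {finland_rate:.1f}%")
print(f"Combined responses: {combined_responses}")
```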

The low response rate was an issue and affected which questions we addressed in the data analysis. We decided not to discuss demographic questions because, as Holbrook et al. (2007) found, lower response rates heavily affect demographic representativeness. The low response rate may indicate a high level of disinterest in the topic. This is a limitation because some disciplines are under-represented, but repeating requests in order to raise the response rate would not necessarily have brought the desired results: a study by Curtin et al. (2000) showed that the exclusion of hesitant respondents has a small impact on survey results. The data show that only social sciences and biomedical sciences are reasonably represented, and we kept this in mind while discussing the results.

The research, in both countries, was performed in accordance with the Helsinki Declaration (World Medical Associations, 2018) concerning the researcher’s responsibility to protect the life, health, dignity, integrity, right to self-determination, privacy, and confidentiality of personal information of research subjects.

Conceptual model applied in this study

We partly adopted the conceptual model designed by Zuiderwijk and Spiers, which suggests twelve categories that potentially influence academic researchers’ motivation to share research data openly or not and to re-use open research data or not (Zuiderwijk and Spiers, 2019):

  1. Researcher’s background
  2. Voluntariness
  3. Personal drivers and intrinsic motivations
  4. Facilitating conditions
  5. Trust
  6. Expected performance
  7. Social influence and affiliation
  8. Effort
  9. Researcher’s experience
  10. Legislation, regulation and policies
  11. Data characteristics
  12. Other

Our own experience and literature review led us to adopt a slightly narrower approach. We decided to look for what kind of differences there are among different research fields in sharing data and what kind of concerns and barriers the researchers meet. Following this approach, we employed a narrower version of the conceptual model involving the following categories:

  1. Researcher’s background (discipline, years involved in research)
  2. Voluntariness (research data openness, willingness to share research data)
  3. Facilitating conditions (ways of getting data)
  4. Trust (collaboration and data shared with other researchers)
  5. Researcher’s experience (foresee problems with data sharing, or not; ethical issues in sharing data with others; concerns for sharing data with others)
  6. Legislation, regulation and policies (familiarity with open access requirements)

Data analysis

In our analysis we use the combined responses from both Finland and Lithuania. The approach of combining data was chosen as it was considered more beneficial and inclusive than comparing these two small countries. The survey received a total of 671 fully completed responses from Finland and Lithuania combined. By combining the data from these two small countries the number of responses from each discipline was more representative (Adamek, 1994). We saw quite a high number of unfinished questionnaires. The non-response factors of the survey have been reported by Rudzioniene et al. (2018).

All calculations were performed using SPSS Statistics 23 for Windows. Cronbach’s alpha based on standardised items was 0.720 (N = 671), which indicates adequate internal consistency of the questionnaire (DeVellis, 2003). Removal of any single question would not result in a significantly lower or higher Cronbach’s alpha.
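For readers unfamiliar with the statistic, Cronbach’s alpha based on standardised items is commonly computed from the number of items k and the mean inter-item correlation r̄ as α = k·r̄ / (1 + (k − 1)·r̄). The sketch below illustrates that formula only; it is not the SPSS procedure used in the study, the helper names are ours, and `toy_items` is invented data, not the survey responses.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equally long lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def standardized_alpha_from_mean_r(k, mean_r):
    """Standardised alpha from item count k and mean inter-item correlation."""
    return k * mean_r / (1 + (k - 1) * mean_r)


def standardized_alpha(items):
    """items: one list of respondent scores per questionnaire item."""
    k = len(items)
    pairs = [(i, j) for i in range(k) for j in range(i + 1, k)]
    mean_r = sum(pearson(items[i], items[j]) for i, j in pairs) / len(pairs)
    return standardized_alpha_from_mean_r(k, mean_r)


# Hypothetical Likert scores: three items answered by five respondents.
toy_items = [
    [1, 2, 3, 4, 5],
    [2, 2, 3, 5, 5],
    [1, 3, 3, 4, 4],
]
print(f"alpha (toy data) = {standardized_alpha(toy_items):.3f}")
```

A value around 0.7, as reported for the questionnaire, is conventionally read as adequate internal consistency.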

In this paper, we focus on differences and similarities between research fields. In the questionnaire a list of subfields was given to the respondent, and for the analysis we combined the subfields into main classes. For this classification we used the Common European Research Classification Scheme (European Commission, 1991).

Chi-squared tests were used to compare proportions across question groups. Differences were considered statistically significant when p < 0.05. Bonferroni corrections applied to p-values were used to reduce the risk of type I errors in the post-hoc analysis of the Chi-squared tests (Laerd statistics, 2018). Chi-squared post-hoc tests using adjusted residuals were calculated.
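The procedure described above can be sketched in plain Python. The example uses the 'Create new data' counts and the per-discipline totals reported in this paper (Tables 1 and 3); everything else is an assumption made for illustration. In particular, the paper does not state how many comparisons the Bonferroni correction was based on, so the sketch divides alpha by the number of cells in the table, and the closed-form p-value helper only covers even degrees of freedom. The analysis in the paper itself was run in SPSS.

```python
import math

FIELDS = ["Humanities", "Social sciences", "Natural sciences and mathematics",
          "Biomedical sciences", "Technological sciences"]
CREATE_NEW = [59, 151, 44, 161, 79]      # respondents who create new data
FIELD_TOTALS = [93, 205, 64, 215, 94]    # respondents per discipline


def chi_squared_with_residuals(table):
    """Return (chi2, df, adjusted residuals) for a table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2, residuals = 0.0, []
    for i, row in enumerate(table):
        res_row = []
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
            # Adjusted (standardised) residual for this cell.
            spread = math.sqrt(expected * (1 - row_totals[i] / n)
                               * (1 - col_totals[j] / n))
            res_row.append((observed - expected) / spread)
        residuals.append(res_row)
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df, residuals


def chi2_sf_even_df(x, df):
    """Upper-tail chi-squared probability; closed form valid for even df only."""
    if df % 2:
        raise ValueError("this helper only handles even degrees of freedom")
    half = x / 2
    return math.exp(-half) * sum(half ** i / math.factorial(i)
                                 for i in range(df // 2))


def two_sided_p(z):
    """Two-sided normal p-value for an adjusted residual."""
    return math.erfc(abs(z) / math.sqrt(2))


# Row 0: creates new data; row 1: does not.
table = [CREATE_NEW, [t - y for t, y in zip(FIELD_TOTALS, CREATE_NEW)]]
chi2, df, residuals = chi_squared_with_residuals(table)
p_value = chi2_sf_even_df(chi2, df)

# Assumption: Bonferroni divides alpha by the number of cells tested.
bonferroni_alpha = 0.05 / (len(table) * len(table[0]))

print(f"chi2({df}, N = {sum(FIELD_TOTALS)}) = {chi2:.3f}, p = {p_value:.3f}")
for field, z in zip(FIELDS, residuals[0]):
    flagged = two_sided_p(z) < bonferroni_alpha
    print(f"{field}: adjusted residual {z:.1f}, below corrected alpha: {flagged}")
```

Because the corrected threshold depends on the number of comparisons chosen, which cells are flagged here may differ from the authors' own post-hoc reading.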

Data analysis and results

Background information and ways of getting data

Data from 671 subjects (380 female, 278 male, eleven undisclosed, and two other) were analysed. A little over half (363, 54.1%) were academic staff, 291 (43.4%) were students and 17 (2.5%) other (professors emeriti, administration). In this study, we focus on differences and similarities between research fields, with the result shown in Table 1 below:


Table 1: Respondents by discipline

| Discipline | Number | Percentage |
| --- | --- | --- |
| Humanities | 93 | 13.9 |
| Social sciences | 205 | 30.6 |
| Natural sciences and mathematics | 64 | 9.5 |
| Biomedical sciences | 215 | 32.0 |
| Technological sciences | 94 | 14.0 |
| Total | 671 | 100 |

The time involved in research is shown in Table 2:


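Table 2: Years involved in research

| Years | Number | Percentage |
| --- | --- | --- |
| < 5 years | 218 | 32.5 |
| 5-10 years | 176 | 26.2 |
| 11-15 years | 86 | 12.8 |
| 16-20 years | 63 | 9.4 |
| > 20 years | 126 | 18.8 |
| Never involved in research | 2 | 0.3 |
| Total | 671 | 100 |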

Overall, 73.6% (494) of respondents collected and created their own data. In fields other than the humanities, 45.7% (264) of respondents answered that their data came from their own research group, but only 28% (26) of respondents from the humanities answered this way. More than half of the respondents representing the humanities (52.7%, 49) always received their data from multiple known sources, while in other fields the share was at most about 40% (Table 3).


Table 3: Ways of getting data
How do you usually get the data for your research? *

| The way | Humanities | Social sciences | Natural sciences and mathematics | Biomedical sciences | Technological sciences |
| --- | --- | --- | --- | --- | --- |
| Create new data | 59 (63.4%); -2.4 | 151 (73.7%); 0.01 | 44 (68.8%); -0.9 | 161 (74.9%); 0.5 | 79 (84%); 2.5 |
| From own research team/group at the university | 26 (28%); -3.2 | 82 (40%); -1.1 | 27 (42.2%); -0.2 | 109 (50.7%); 2.7 | 46 (48.9%); 1.2 |
| From own research network (or personal/professional connections) | 37 (39.8%); 0.3 | 71 (34.6%); -1.4 | 30 (46.9%); -0.8 | 78 (36.3%); -1.1 | 43 (45.7%); 1.5 |
| Always from one known source | 11 (11.8%); 3.2 | 8 (3.9%); -0.9 | 3 (4.7%); -0.1 | 8 (3.7%); -1.1 | 4 (4.3%); -0.4 |
| Always from multiple known sources | 49 (52.7%); 3.6 | 76 (37.1%); 0.3 | 19 (29.7%); -1.1 | 61 (28.4%); -2.9 | 38 (40.4%); 0.9 |

* Multiple responses were possible. Each cell displays N, (percentage within the science field), and the adjusted residual.

According to the analysis, statistically significant differences were detected in the way scientists get the data for their research (Table 3). Humanists were more likely than representatives of other fields to get data from known sources [always from one known source: χ²(4, N = 671) = 18.618, p = 0.001; always from multiple known sources: χ²(4, N = 671) = 18.617, p = 0.001]. Moreover, humanists were least likely to create new data and to receive data from their own research team [create new data: χ²(4, N = 671) = 11.178, p = 0.025; from own research team/group at the university: χ²(4, N = 671) = 15.873, p = 0.003; from own research network (or personal/professional connections): χ²(4, N = 671) = 10.373, p = 0.035].

Only 10% (68) of respondents strongly agreed that they are familiar with open access requirements related to research data, while 8.9% (60) strongly disagreed with this claim (Table 4). Combining strongly agree and agree shows that representatives of the social sciences were the most familiar with open access requirements, at 52.7% (108). However, no statistically significant differences were detected between research fields.


Table 4: Familiarity with open access requirements
I am familiar with the open access requirements (of research data) *
Likert scale | Humanities | Social sciences | Natural sciences and mathematics | Biomedical sciences | Technological sciences
Strongly agree | 6 (6.5%) | 23 (11.2%) | 8 (12.5%) | 27 (12.6%) | 4 (4.3%)
Agree | 35 (37.6%) | 85 (41.5%) | 23 (35.9%) | 80 (37.2%) | 40 (42.6%)
Neither agrees nor disagrees | 27 (29%) | 50 (24.4%) | 21 (32.8%) | 56 (26%) | 21 (22.3%)
Disagree | 12 (12.9%) | 33 (16.1%) | 10 (15.6%) | 33 (15.3%) | 17 (18.1%)
Strongly disagree | 13 (14%) | 14 (6.8%) | 2 (3.1%) | 19 (8.8%) | 12 (12.8%)
* Each cell shows N, and (percentage within the science field)

Data openness, collaboration and sharing data with other researchers

More than half of the respondents shared their data with researchers in the same team (59%, 398); fewer shared with researchers in the same university (24.7%, 166) or in other institutions (32.48%, 218). However, about a quarter (26.82%, 180) of respondents did not share their research data with other researchers. Almost half of humanists (49.5%, 46) said they neither collaborated nor shared data, and only a third of them (33.3%, 31) shared data with their team. These figures differed markedly from those of other fields (Table 5), and the results were statistically significant [do not collaborate and do not share: χ²(4, N = 671) = 39.391, p < 0.001; yes, with researchers in the same team: χ²(4, N = 671) = 45.075, p < 0.001; yes, with researchers in other institutions: χ²(4, N = 671) = 15.272, p = 0.004].


Table 5. Collaboration and data share with other researchers
Do you collaborate with other researchers and share data? *
Way of collaboration | Humanities | Social sciences | Natural sciences and mathematics | Biomedical sciences | Technological sciences
Do not collaborate and do not share | 46 (49.5%); 5.3 | 63 (30.7%); 1.5 | 15 (23.4%); -0.6 | 35 (16.3%); -4.2 | 21 (22.3%); -1.1
Yes, with researchers in the same team | 31 (33.3%); -5.5 | 110 (53.7%); -2.0 | 39 (60.9%); 0.3 | 153 (71.2%); 4.3 | 65 (69.1%); 2.1
Yes, with researchers in the same university | 19 (20.4%); -1.0 | 49 (23.9%); -0.3 | 13 (20.3%); -0.9 | 60 (27.9%); 1.3 | 25 (26.6%); 0.4
Yes, with researchers in other institutions | 21 (22.6%); -2.2 | 53 (25.9%); -2.4 | 22 (34.4%); 0.3 | 85 (39.5%); 2.7 | 37 (39.4%); 1.5
* Each cell shows N, percentage within the science field, and adjusted residuals, which are divided by an estimate of the standard error. Multiple responses were possible.

The openness of participants' research data varied. In biomedical and technological sciences only 8.74% (27) of researchers said that their data were openly available to everyone (Table 6), compared with a third (33.3%, 31) of humanists, 25% (16) of representatives of natural sciences and mathematics and 17.6% (36) of social scientists. Depending on the field, 42-55% of respondents made their data available openly upon request. Restricted access was mentioned by 25.33% (170) of respondents, and in 18.93% (127) of cases the data were not available to anyone else.

Statistically significant differences were detected in answers on the openness of the data [my data are openly available to everyone: χ²(4, N = 671) = 36.703, p < 0.001; my data are openly available only to my research team: χ²(4, N = 671) = 18.252, p = 0.001; my data are not available to anyone else: χ²(4, N = 671) = 11.643, p = 0.02]. Humanists made their data openly available to everyone more often than other researchers, which is consistent with their being the least likely to report that their data were available only to their research team. Social scientists were most likely to state that their data were not available to anyone else.
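The post-hoc step, in which adjusted residuals are judged against a Bonferroni-corrected significance level, can be sketched as follows. The residuals are those reported in Table 6 for "My data are openly available to everyone". The number of comparisons (all ten cells of the 2 x 5 table) and the overall alpha of 0.05 are assumptions made for illustration; the number of comparisons used in the study is not stated in this section.

```python
import math

# Adjusted residuals reported in Table 6, row "My data are openly available to everyone".
residuals = {
    "Humanities": 4.8,
    "Social sciences": 0.5,
    "Natural sciences and mathematics": 2.0,
    "Biomedical sciences": -4.1,
    "Technological sciences": -1.6,
}

ALPHA = 0.05                  # assumption: conventional overall significance level
N_CELLS = 2 * len(residuals)  # assumption: every cell of the 2 x 5 table is one test


def two_sided_p(z):
    """Two-sided p-value for a standard normal deviate."""
    return math.erfc(abs(z) / math.sqrt(2))


threshold = ALPHA / N_CELLS   # Bonferroni-adjusted significance level
flagged = [field for field, z in residuals.items() if two_sided_p(z) < threshold]
print(f"adjusted alpha = {threshold:.4f}")
print("fields departing from expectation:", flagged)
```

Under these assumptions only humanities (more open than expected) and biomedical sciences (less open than expected) pass the corrected threshold. The residual of 2.0 for natural sciences and mathematics would be significant at an uncorrected 0.05 level but not after correction.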


Table 6: Research data openness
Which of the following applies to your research data? *
Way of openness | Humanities | Social sciences | Natural sciences and mathematics | Biomedical sciences | Technological sciences
My data are openly available to everyone | 31 (33.3%); 4.8 | 36 (17.6%); 0.5 | 16 (25%); 2.0 | 17 (7.9%); -4.1 | 10 (10.6%); -1.6
My data are openly available only to my research team | 22 (23.7%); -4.1 | 99 (48.3%); 1.7 | 26 (40.6%); -0.5 | 100 (46.5%); 1.1 | 44 (46.8%); 0.7
My data are available openly upon request | 44 (47.3%); 0.3 | 86 (42%); -1.4 | 35 (54.7%); 1.5 | 98 (45.6%); -0.1 | 45 (47.9%); 0.4
My data have restricted access (e.g., only some parts of the dataset are accessible) | 21 (22.6%); -0.7 | 48 (23.4%); -0.8 | 15 (23.4%); -0.4 | 58 (27%); 0.7 | 28 (29.8%); 1.1
My data are not available to anyone else | 17 (18.3%); -0.2 | 53 (25.9%); 3.0 | 7 (10.9%); -1.7 | 38 (17.7%); -0.6 | 12 (12.8%); -1.6
* Each cell shows N, percentage within the science field, and adjusted residuals. Multiple responses were possible.

Issues connected to sharing data with others

In total, 40.5% (272) of respondents agreed or strongly agreed with the statement “I am comfortable and willing to share my research data with others”, and 31.74% (213) neither agreed nor disagreed (Table 7). No statistically significant difference between science fields was found.
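The aggregate shares quoted above can be reproduced from the cell counts in Table 7. The sketch below collapses the five-point scale into three bands; the band names and the grouping are chosen here for illustration and are not part of the survey instrument.

```python
# Counts from Table 7, one list per Likert item, in the table's order of fields
# (humanities, social, natural sciences and mathematics, biomedical, technological).
table7 = {
    "Strongly agree": [15, 24, 12, 20, 8],
    "Agree": [33, 66, 20, 50, 24],
    "Neither agrees nor disagrees": [21, 66, 21, 67, 38],
    "Disagree": [17, 31, 5, 51, 20],
    "Strongly disagree": [7, 18, 6, 27, 4],
}

# Hypothetical grouping of the five-point scale into three bands.
bands = {
    "agree": ["Strongly agree", "Agree"],
    "neutral": ["Neither agrees nor disagrees"],
    "disagree": ["Disagree", "Strongly disagree"],
}

n_total = sum(sum(counts) for counts in table7.values())
collapsed = {
    band: sum(sum(table7[item]) for item in items)
    for band, items in bands.items()
}
for band, count in collapsed.items():
    print(f"{band}: {count} ({100 * count / n_total:.1f}%)")
```

The counts sum to the 671 respondents, with 272 in the agree band and 213 in the neutral band, matching the figures in the text.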


Table 7: Comfortable and willing to share research data
I am comfortable and willing to share my research data with others. *
Likert scale | Humanities | Social sciences | Natural sciences and mathematics | Biomedical sciences | Technological sciences
Strongly agree | 15 (16.1%); 1.4 | 24 (11.7%); 0.0 | 12 (18.8%); 1.8 | 20 (9.3%); -1.4 | 8 (8.5%); -1.1
Agree | 33 (35.5%); 1.5 | 66 (32.2%); 1.3 | 20 (31.3%); 0.5 | 50 (23.3%); -2.2 | 24 (25.5%); -0.7
Neither agrees nor disagrees | 21 (22.6%); -2.0 | 66 (32.2%); 0.2 | 21 (32.8%); 0.2 | 67 (31.2%); -0.2 | 38 (40.4%); 0.8
Disagree | 17 (18.3%); -0.1 | 31 (15.1%); -1.5 | 5 (7.8%); -2.3 | 51 (23.7%); 2.4 | 20 (21.3%); 0.8
Strongly disagree | 7 (7.5%); -0.6 | 18 (8.8%); -0.3 | 6 (9.4%); 0.0 | 27 (12.6%); 2.0 | 4 (4.3%); -1.8
* Each cell shows N, percentage within the science field, and adjusted residuals.

A total of 37.1% (249) of respondents foresaw problems with data sharing in the future: they disagreed or strongly disagreed with the statement “I foresee no problems with sharing my research data” (Table 8).

When examining all possible combinations of science fields and how comfortable and willing to share research data respondents are, no statistically significant differences were observed.


Table 8: Foresee no problems with data sharing
I foresee no problems with sharing my research data. *
Likert scale | Humanities | Social sciences | Natural sciences and mathematics | Biomedical sciences | Technological sciences
Strongly agree | 13 (14%) | 16 (7.8%) | 9 (14.1%) | 18 (8.4%) | 6 (6.4%)
Agree | 25 (26.9%) | 59 (28.8%) | 19 (29.7%) | 47 (21.9%) | 23 (24.5%)
Neither agrees nor disagrees | 27 (29%) | 58 (28.3%) | 20 (31.3%) | 52 (24.2%) | 30 (31.9%)
Disagree | 20 (21.5%) | 53 (25.9%) | 8 (12.5%) | 72 (33.5%) | 32 (34%)
Strongly disagree | 8 (8.6%) | 19 (9.3%) | 8 (12.5%) | 26 (12.1%) | 3 (3.2%)
* Each cell shows N, and (percentage within the science field).

A majority of respondents (79.3%, 532) agreed or strongly agreed with the statement “I perceive data ethics could be an issue when research data are shared with others” (Table 9). The distribution of answers across the Likert scale items differed significantly between fields: χ²(16, N = 671) = 59.848, p < 0.001.
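For a full Likert-by-field table the same Pearson statistic is computed over all 25 cells, with (5 - 1) x (5 - 1) = 16 degrees of freedom. The sketch below does this for the counts in Table 9. It is an illustration rather than the authors' script; the helper names are hypothetical, and the p-value routine uses a closed form that holds only for an even number of degrees of freedom.

```python
import math

# Counts from Table 9: rows are Likert items (strongly agree to strongly disagree),
# columns are the five fields in the table's order.
observed = [
    [34, 100, 16, 113, 21],
    [39, 69, 20, 72, 48],
    [16, 29, 19, 21, 14],
    [2, 5, 6, 5, 6],
    [2, 2, 3, 4, 5],
]


def pearson_chi2(table):
    """Pearson chi-squared statistic, degrees of freedom and N for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (obs - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df, n


def chi2_upper_tail_even_df(x, df):
    """Upper-tail probability of the chi-squared distribution (even df only)."""
    if df % 2 != 0:
        raise ValueError("closed form used here requires an even df")
    half = x / 2
    return math.exp(-half) * sum(half ** i / math.factorial(i) for i in range(df // 2))


chi2, df, n = pearson_chi2(observed)
p_value = chi2_upper_tail_even_df(chi2, df)
print(f"chi2({df}, N = {n}) = {chi2:.3f}, p = {p_value:.2e}")
```

The statistic comes out at about 59.85 on 16 degrees of freedom, and the p-value is far below 0.001, which is why it is reported as p < 0.001 rather than as an exact value.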

When all possible combinations of science fields and perceptions of data ethics as an issue in data sharing were examined, researchers in technological sciences strongly agreed that this was an issue for them less often than researchers in other fields, although no statistically significant differences were observed.


Table 9: Ethical issues to share data with others
I perceive data ethics could be an issue when research data are shared with others. *
Likert scale | Humanities | Social sciences | Natural sciences and mathematics | Biomedical sciences | Technological sciences
Strongly agree | 34 (36.6%); -1.2 | 100 (48.8%); 2.2 | 16 (25%); -2.9 | 113 (52.6%); 3.7 | 21 (22.3%); -4.2
Agree | 39 (41.9%); 1.1 | 69 (33.7%); -1.2 | 20 (31.3%); -1.0 | 72 (33.5%); -1.3 | 48 (51.1%); 3.1
Neither agrees nor disagrees | 16 (17.2%); 0.7 | 29 (14.1%); -0.3 | 19 (29.7%); 3.5 | 21 (9.8%); -2.5 | 14 (14.9%); 0.0
Disagree | 2 (2.2%); -0.8 | 5 (2.4%); -1.1 | 6 (9.4%); 2.6 | 5 (2.3%); -1.2 | 6 (6.4%); 1.6
Strongly disagree | 2 (2.2%); -0.2 | 2 (1%); -1.6 | 3 (4.7%); 1.3 | 4 (1.9%); -0.6 | 5 (5.3%); 2.0
* Each cell shows N, percentage within the science field, and adjusted residuals.

About half of the researchers in natural sciences and mathematics (33, 51.6%) and humanities (45, 48.4%) had no concerns about sharing data. In other science fields this share was clearly smaller. Concerns included fear of losing the scientific edge, mentioned by 18.33% (123) of respondents (Table 10), especially in biomedical (42, 29.5%) and technological (31, 33%) sciences. Legal and ethical issues were mentioned by 241 (35.91%) respondents, most often in the social sciences (90, 43.9%).


Table 10: Concerns for sharing data with others
Do you have any concerns for sharing data with others? *
Type of concern | Humanities | Social sciences | Natural sciences and mathematics | Biomedical sciences | Technological sciences
No concerns | 45 (48.4%); 2.4 | 79 (38.5%); 0.5 | 33 (51.6%); 2.5 | 66 (30.7%); -2.4 | 27 (28.7%); -1.8
Fear of losing the scientific edge | 16 (17.2%); 0.7 | 22 (15.6%); -1.8 | 12 (18.8%); -0.2 | 42 (29.5%); -0.1 | 31 (33%); 3.5
Legal and ethical issues | 35 (37.6%); 0.4 | 90 (43.9%); 2.9 | 8 (12.5%); -4.1 | 78 (36.3%); 0.1 | 30 (31.9%); -0.9
Misuse of data | 23 (24.7%); -0.5 | 58 (28.3%); 0.6 | 7 (10.9%); -3.0 | 68 (31.6%); 1.9 | 24 (25.5%); -0.3
Misinterpretation of data | 17 (18.3%); -2.7 | 61 (29.8%); -0.1 | 14 (21.9%); -1.5 | 76 (35.5%); 2.0 | 15 (36.2%); 1.4
Lack of resources (technical, financial, personnel, etc.) | 13 (14%); -0.5 | 30 (14.6%); -0.5 | 12 (18.8%); 0.7 | 35 (16.3%); 0.3 | 15 (16%); 0.1
Lack of appropriate policies and rights protection | 15 (16.1%); -0.9 | 34 (16.6%); -1.2 | 7 (10.9%); -1.8 | 48 (22.3%); 1.3 | 26 (27.7%); 2.2
* Each cell shows N, percentage within the science field, and adjusted residuals. Multiple responses were possible.

The fear of data misuse was strongest in the biomedical (68, 31.6%) and social (58, 28.3%) sciences. Misinterpretation of data was most feared in biomedical (76, 35.5%) and technological sciences (15, 36.2%). Lack of technical, financial or personnel resources and lack of appropriate policies and rights protection were mentioned less often than the other concerns about sharing data.

Statistically significant differences were detected in concerns about sharing data with others [no concerns: χ²(4, N = 671) = 17.560, p = 0.002; fear of losing the scientific edge: χ²(4, N = 671) = 12.986, p = 0.011; legal and ethical issues: χ²(4, N = 671) = 21.713, p < 0.001; misuse of data: χ²(4, N = 671) = 11.269, p = 0.024; misinterpretation of data: χ²(4, N = 671) = 12.706, p = 0.013; lack of appropriate policies and rights protection: χ²(4, N = 671) = 9.894, p = 0.042]. No statistically significant differences were observed between disciplines.

When all possible combinations of science fields and concerns about sharing data were examined, researchers in technological sciences feared losing the scientific edge more often than others (Table 10). Representatives of natural sciences and mathematics worried least about legal and ethical issues, whereas biomedical scientists worried about the misuse of data more than researchers in other fields. Compared to other science fields, humanists worried least about misinterpretation of data.

Sharing data requires appropriate databases, guidelines for those databases, and principles agreed within the research field and among research institutes. Lack of resources for sharing data and lack of appropriate policies were mentioned by 10-27% of respondents (Table 10).

Discussion

Sharing research data enables its reuse by others. Several international initiatives promote open science and data sharing. However, at the practical level, researchers encounter policies and practices that are still in the early stage of their development and this was also evident in the results of our study. Without established practices, the researchers of any field cannot be aware of them.

Most studies on data sharing and openness focus on larger countries in the Western world. Finland and Lithuania represent smaller European countries. Language may affect, for example, researchers' enthusiasm for sharing data with other researchers. Combining the data from these two small countries made the numbers of responses from different research fields more representative. Also, by comparing these data with the survey conducted in the UK, France and Turkey, we can form a broader picture of possible factors affecting open research data. We discuss the results following the theoretical framework of Zuiderwijk and Spiers (2019). Regarding voluntariness (research data openness, willingness to share research data), researchers are quite conservative. Only 16.39% (110) of researchers from Lithuania and Finland share their data with everyone, and the majority prefer more restricted access. For instance, almost half of respondents (43.36%, 291) say their data are available only to their team, and almost half (45.9%, 308) say their data are available openly upon request. Colleagues from bigger countries, such as the UK, France and Turkey, are rather more open. The results of a survey using the same questionnaire show that 27% of researchers share their data with everyone, 35% share open data only with their own research team, and 46% are willing to make data available only on request (Ünal, et al., 2019).

It is important to note that, despite more than half of respondents making their data open, less than half (40.5%, 272) agreed or strongly agreed that they felt comfortable and willing to share. Almost a third (213, 31.74%) of respondents neither agreed nor disagreed. In the larger countries (UK, France, and Turkey), by contrast, 56% of researchers strongly agree or agree that they share research data willingly and comfortably (Ünal, et al., 2019).

Facilitating conditions (way of getting data). The way of getting data showed signs of readiness to share data with others, and scientists were eager to employ inputs from their peers. Almost three quarters (73.6%, 494) of respondents collected and created their own data. In fields other than humanities, almost 46% (264) of respondents said that their data came from their own research group, but only 28% (26) of humanists answered this way. In humanities the data come more often from known, available sources. This confirms the findings of Tenopir and her colleagues (2015).

Trust (collaboration and sharing data with other researchers). Sharing data with collaborators is a different level of sharing from making the data publicly available, e.g., through an open repository (Thoergesen and Borlund, 2022).

Our results show that more than half of the respondents shared their data with researchers in the same team (59%, 398), while smaller shares did so with researchers in the same university (24.7%, 166) or in other institutions (32.48%, 218). By contrast, almost half of humanists did not share their data with others. This reflects the nature of how research data are used and collected, and the traditions of conducting research. Although 31 (33.3%) humanists said their data are openly available to everyone, there may be a lack of appropriate technical infrastructure and support for storage and retrieval (e.g., Thessen and Patterson, 2011; Shearer and Furtado, 2017). This is in line with the previous findings of Scaramozzino and colleagues (2012), who found that only a minority of researchers share their data. The contradiction between the humanists' answers may reflect the nature of their data, which might include, e.g., literature (books), artefacts, works of art, letters, or other kinds of documents available to all.

By comparing our results with those of the same survey in the larger countries, we saw that scholars from Lithuania and Finland are significantly less open and collaborative. For instance, 73% of researchers in the UK, France, and Turkey “collaborate and share data with researchers in the same team, 42% collaborate and share data with researchers in the same university, and 55% collaborate and share data with researchers in other institutions” (Ünal, et al., 2019, Collaboration...).

Researcher’s experience (foresee no problems with data sharing; ethical issues to share data with others; concerns for sharing data with others). A majority (79.3%) of respondents agreed that data ethics could be an issue when they share research data with others. Data ethics concern scientists especially in fields where the focus of research is a human being, that is, in social and medical sciences. A physician must take every precaution to protect the life, health, dignity, integrity, right to self-determination, privacy, and confidentiality of personal information of research subjects (see the Helsinki Declaration (World Medical Association, 2018)). This means that research data need to be anonymised, so that individuals cannot be linked to data concerning them, before the data can be saved in a repository or shared. The results suggest that the objective of data openness and sharing in some cases conflicts with the requirements of data protection. Problems with data sharing in the future were seen by 37% of respondents, who disagreed with the statement “I foresee no problems with sharing my research data”. In the UK, France, and Turkey, about 44% of researchers foresee no problems with sharing research data (Ünal, et al., 2019).

A majority of respondents (79.3%, 532) agreed or strongly agreed that data ethics can be an issue when research data are shared, whereas 64% of respondents from the UK, France, and Turkey strongly agree or agree that data ethics could be an issue for data sharing (Ünal, et al., 2019).

More than a quarter of respondents (27.27%, 183) are afraid of misinterpretation of data and 26.82% (180) are afraid of misuse of data. These concerns were mentioned especially by technological, medical and social scientists, which is in line with the findings of Tenopir and colleagues (Tenopir et al., 2015). In our study researchers from natural sciences and mathematics were least likely to fear misuse of their data. Their research is more basic than applied, which might explain this result. Interestingly, in the UK, France, and Turkey almost the same percentages of researchers are concerned with misinterpretation of data (29%) and misuse of data (25%) (Ünal, et al., 2019).

Humanities researchers differ from others in worrying least about misinterpretation of data. Legal and ethical issues also concern humanists and technological scientists, while representatives of natural sciences and mathematics worry least about them. Koltay (2017) reminds us that ethics and possible misuse or misconceptions are aspects that should not be forgotten. Many aspects relating to this are still unsolved. Furthermore, technological scientists more often than others mentioned fear of losing the scientific edge.

Legislation, regulation and policies (familiarity with open access requirements). Lack of appropriate policies and rights protection was mentioned by 10-27% of respondents, depending on the field they represent, and lack of resources by 14-18%. One reason why these concerns were mentioned relatively seldom may be that around half of the respondents are not familiar with the requirements for opening data for reuse. However, there were no statistically significant differences between research fields. If there is no familiarity with the topic and no experience in sharing data, there might not be knowledge of whether appropriate services and resources are available. Previous studies have also mentioned the time needed to prepare metadata and gaps in coding skills (Cooper, 2021).

Researchers seem to want to have control over who uses their data. Most often this was evident in technological and biomedical sciences. This contrasts with the study by Levin et al. (2016), where representatives from biomedical sciences outlined the importance of “giving back to society”. According to Scaramozzino et al. (2012), researchers might not be eager to share data with those who do not contribute to gathering the data, as the process can be time consuming and data are considered valuable. However, behind the need to control the use of data there may also be ethical questions concerning the privacy of research subjects (World Medical Association, 2018).

Conclusions

This study increases our knowledge on researchers’ attitudes towards openness and practices on data sharing. Our study provides insight on what should be considered while implementing research data openness policies. Furthermore, we support the idea that it is important to suggest different approaches regarding different research fields. Therefore, guidelines that are more precise are needed, and they should be developed in accordance with the principles of research ethics and consider the special requirements of different fields.

The results show a clear alignment with the existing literature on how researchers perceive the idea of sharing research data. By comparing our results with those of previous research in the field (two to three years earlier), we saw that researchers still give the same reasons for not sharing their data. Changing the culture of data sharing is much harder than publishing new initiatives and legislation.

It is important to study academic communities of small languages, as their approach to the topic may differ from that of academic communities of big languages. We compared survey data from Finland and Lithuania with data from the UK, France, and Turkey. We found that respondents from Finland and Lithuania are more conservative regarding data openness. Compared with colleagues from the UK, France, and Turkey, they are less collaborative and less eager to share with others, their data are less open, they are less comfortable and willing to share research data, and a larger share of them foresee problems with data sharing.

Limitations and future research

This paper has some limitations that need to be addressed. First, the data used in this paper are part of a larger international survey. The collected data might be skewed since the participants mostly came from Finland, though the survey was administered in Finland and Lithuania. However, we think that cultural and social differences between the countries are not so strong that they might have affected the research community's views on dealing with their research data. Moreover, Lithuanian and Finnish researchers are part of the international community and work at an international level. The study paves the way for future and more in-depth research. Second, the response rate was very low (up to 2% in Finland and up to 5% in Lithuania). Only social sciences and biomedical sciences are better represented. The results show some tendencies, but we cannot draw firm conclusions. In future research, the change in attitudes and practices could be examined through a longitudinal study. Furthermore, the differences between the professorial level and the academic mid-level (i.e., academic members and PhD candidates) could provide an interesting approach.

Acknowledgements

The authors thank Elena Macevičiūtė and Tom Wilson for their comments and suggestions. Also, many thanks for finding such professional reviewers. Dear anonymous reviewers, it was a pleasure to work with you. It is so sad that you have to stay anonymous and we cannot invite you for a drink.

About the authors

Heidi Enwald is a University Lecturer and head of the discipline at Information Studies, Faculty of Humanities, University of Oulu, Finland. She can be contacted at: heidi.enwald@oulu.fi
Vincas Grigas is an Associate Professor at the Faculty of Communication, Vilnius University, Republic of Lithuania. He can be contacted at: vincas.grigas@gmail.com
Jurgita Rudžionienė is an Assistant Professor in the Faculty of Communication at Vilnius University, Republic of Lithuania. She can be contacted at: jurgita.rudzioniene@kf.vu.lt
Terttu Kortelainen is an emerita, who worked as university lecturer in Information Studies at the Faculty of Humanities, University of Oulu, Finland. She can be contacted at: terttu.kortelainen@oulu.fi.

References



How to cite this paper

Enwald, H., Grigas, V., Rudžionienė, J. & Kortelainen, T. (2022). Data sharing practices in open access mode: a study of the willingness to share data in different disciplines. Information Research, 27(2), paper 932. Retrieved from http://InformationR.net/ir/27-2/paper932.html (Archived by the Internet Archive at https://bit.ly/3QkOu0I) https://doi.org/10.47989/irpaper932
