Minding the gap: investigating the alignment of information organization research and practice
Philip Hider, Hollie White, and Hamid R. Jamali.
Introduction. The issues that practitioners want researched and those that are studied by researchers are often considered not to align very well. This paper investigates the extent to which a gap between research and practice exists in the field of information or knowledge organization, using a novel index of topical overlap between research and practice discourses.
Method. The degree of alignment was measured by comparing samples of research-oriented and practice-oriented discourse published between 2013 and 2017. Information organization research was represented by scholarly articles, information organization practice by professional blogs, webinars and conferences.
Analysis. The texts were analysed using software which identified the most frequently used terms; the number of top terms, following deletion of generic terms and normalisation, that overlapped between corpora constituted the index of topical overlap.
Results. The number of overlapping terms between information organization research and practice corpora was about halfway between the number of overlapping terms between highly similar and unrelated corpora.
Conclusions. The results suggest a fair degree of alignment between information organization research and practice. The index used needs further testing, but appears to be a promising, unobtrusive tool for comparing the degree of alignment between research and practice in different fields.
'Gap n. -- Any opening or breach in an otherwise continuous object; a chasm or hiatus.' (OED Online)
The difference between what practitioners do and what researchers investigate is often called a gap (Booth, 2003; Davis et al., 2013; Gummesson, 2014; Haddow and Klobas, 2004; McFarlene, Kahili and Johnson, 2014; Ponti, 2008; Pullins et al., 2017; Rubins, 2014; Weston and Bain, 2015). The use of the term 'gap' implies the perception that research and practice within the same field should be like a 'continuous object', something united and connected. A gap suggests that they are not connected and that there is a divergence or divide. Previous discussions by McFarlene, Kahili and Johnson (2014), Yates (2015) and Nguyen and Hider (2018) have noted that applied fields are more likely to experience this perceived gap. Research investigating this phenomenon is found in social work (Davis et al., 2013; Rubins, 2014), business (Gummesson, 2014; Pullins, Timmonen, Kaski and Holopainen, 2017), health care (McFarlene, Kahili and Johnson, 2014; Yates, 2015), education (Weston and Bain, 2015) and library and information science (Haddow and Klobas, 2004; Nguyen and Hider, 2018).
While many researchers emphasize the gap between research and practice, others emphasize ways of bridging the gap and seek to identify those areas and researchers who are more successful in doing so. For instance, Bastow, Dunleavy, Tinkler and Aguilera (2014) discuss the connection between research and practice by examining the impact of research across the social sciences. Very little research, however, has focused on the relationship between research in smaller fields or subfields and their corresponding larger communities of practice. Existing approaches for measuring relationships tend either to be based on perception or assume that distinctions between the academy and practice are clear-cut, which is not the case for fields like librarianship and information science. Yet are these gaps between practice and research perception or reality? To what extent is there truly division or alignment between research and practice?
This paper examines the extent to which research and professional discourse are aligned within the subdomain of knowledge or information organization (hereafter information organization). We define information organization here as that part of librarianship and information science concerned with such activities as cataloguing, classification, controlled vocabulary, ontology, indexing and social tagging, though in the study we report the emphasis will be on cataloguing and metadata librarianship. The alignment, or overlap, between information organization research and practice is considered a precondition for research impact (and indeed, for practice to impact on research).
This paper begins by reviewing literature about the relationship between librarianship and information science research and practice, as well as the perception of that relationship. To expand on the discourse surrounding the topic, a keyword analysis study was conducted. Using purposive sampling techniques, information organization research article abstracts were compared with descriptions of information organization-focused practitioner continuing education and professional blogs from a five-year period (2013-2017). Results and discussion report normalized term frequencies as well as co-word analysis using VOSviewer software; contextualize findings that indicate the extent of alignment between information organization research and practice; and highlight the topics that most align. The potential of the methodology for future research is also outlined.
Many terms are used to describe the relationship between library and information science research and practice. These terms include gap, divide, alignment and overlap: all of these words sit on a terminological continuum, showing either a connection or a break between research and practice within the library and information science domain. It is unclear whether library and information science research and practice are aligned or unaligned. Previous studies have shown that the extent of the perceived divide or alignment varies by country and domain (Powell, Baker and Mika, 2002; Pymm and Hider, 2008; Schlögl and Stock, 2008).
Australian academics and practitioners in particular have been interested in evaluating this topic. Haddow and Klobas (2004) found eleven types of gap between research and practice in regard to communication: knowledge, culture, motivation, relevance, immediacy, publication, reading, terminology, activity, education and temporal. Pymm and Hider's (2008) research found that senior library staff saw value in consulting research articles, in contrast to Haddow's (2001) earlier research, which found that practitioners (both senior and less senior) preferred newsletter publications. In 2016 the Australian Library and Information Association (ALIA) and Charles Sturt University created the Relevance 2020 'series of research events' (Nguyen, 2017, p. 3) with 'the main purpose of connecting academics, researchers and practitioners in order to help align future research projects and activities in the Australian library and information science profession' (Nguyen, 2017, p. 4). Nguyen and Hider (2018) report more detailed findings from the focus groups conducted during the Relevance 2020 series. Commenting on the library and information science community, Nguyen and Hider (2018, p. 5) state that 'it would appear that little LIS research is used to address practical issues' and that,
research, which tends to be carried out in academia, does not always originate from practice, nor necessarily solve problems in, or even guide practice. A lack of relevance may be compounded by a perception amongst some practitioners that research is more the domain of the ivory tower and not something that could help them much in their professional activities. (p. 3)
Jamali's (2018) research also confirms the concerns highlighted in Relevance 2020, as well as Nguyen and Hider's (2018) evaluation from that project. Jamali (2018) interviewed seven practitioners and concluded (albeit from a small sample) that academic-led research was often problematic, stating that 'academic research lacks practical implications as their research problems do not originate from practice' (p. 8). His research found that Australian information professionals do not, on the whole, believe that research conducted by academics in Australian library and information science programmes is relevant to information practice.
Most of the research studies conducted about this topic use qualitative (interview or focus group) data collection methodologies, focused on practitioner perceptions of the relationship between research and practice. They thus tend to be obtrusive and contain the associated risk of self-reporting biases. Practitioners may be keen to report their view of the value of the research as much as their actual use of it. Conversely, academics may be keen to promote the impact credentials of their research. In order to test the reality of these perceptions, further research is needed using different methodologies.
The knowledge and information organization domain has a rich history of investigating research trends (Dahlberg, 1997; Hjørland and Albrechtsen, 1999; McIlwaine, 2003; López-Huertas, 2008; Smiraglia, 2012). Saumure and Shiri (2008) found that over the last half century knowledge organization research topics have shifted from an emphasis on indexing and abstracting to a focus on classification and cataloguing. Saumure and Shiri see this shift as having been motivated by the advent of the internet. While knowledge organization principles remain key in both the pre- and post-Web periods, metadata has become a major theme more recently.
Choi and Lee (2016) conducted a study of user-focused studies in the area of knowledge organization using metadata. Looking at a ten-year span of articles and dissertations published from 2005 to 2014, they used text analysis software (WordStat) to perform a quantitative co-word analysis in order to create topic clusters and identify major themes in recent information organization research. The research reported here extends this methodology in order to compare the themes and topics discussed in recent information organization research and practice.
Influenced by Choi and Lee's (2016) work, this study compared two samples of English-language content from the five-year period 2013-2017, one intended to reflect issues discussed by information organization researchers and the other issues discussed by information organization practitioners. For the purposes of this research, information organization was defined a little narrowly, in terms of cataloguing and metadata librarianship, as it was in this area that published content in both the research and practitioner spheres was relatively abundant. The sample of information organization research discourse was derived from two groups of articles published during the reference period. The first comprised all the peer-reviewed articles (excluding editorials, book reviews, reports, etc.) in the two main English-language research journals of the information organization field, Cataloging & Classification Quarterly and Journal of Library Metadata. The second comprised those peer-reviewed articles deemed by the authors to cover information organization topics in two other major English-language journals for information organization research, Library Resources and Technical Services and Technical Services Quarterly. These four journals were identified by Terrill (2016) as the top journals for information organization research by number of articles.
The numbers of articles identified for the sample from each journal are set out in Table 1. Bibliographic information about the articles was obtained from the Scopus database; the titles and abstracts of the articles were used for the analysis.
|Journal||Number of articles|
|Cataloging & Classification Quarterly||197|
|Journal of Library Metadata||74|
|Library Resources and Technical Services||26|
|Technical Services Quarterly||24|
The sample of information organization practitioner discourse comprised datasets from three different types of source: professional blogs, Webinars and conference sessions. The blogs were selected from those listed on the Planet Cataloging site, which has, for many years, been aggregating English-language blogs for professional audiences with an interest in library cataloguing and metadata. Those blogs still accessible in November 2018 and that were active during some or all of the period 2013-2017, and that were mainly about cataloguing and metadata, were chosen for data collection and are listed below.
- 025.431: The Dewey blog
- Coyle's InFormation
- ISKO UK
- Metadata Matters
- Universal Decimal Classification blog
All the posts published on these blogs in the reference period were extracted, except for some that were clearly off topic (i.e. not directly about cataloguing or metadata). Both the titles and the bodies of 352 posts were included in the dataset.
For the Webinar and conference session datasets, a list of organizations based in North America and known to be active in the provision of continuing education for information organization professionals (e.g. American Library Association groups, such as the Association for Library Collections and Technical Services; the Medical Libraries Association; and the Special Libraries Association) was compiled. Their Websites were then searched for details of Webinars and conference sessions on what were promoted as topics of interest to information organization professionals, held during the reference period. The titles and descriptions of forty-four Webinars and fifty-one conference sessions were extracted and collated to form, respectively, the second and third datasets representing practitioner discourse.
The samples of content (i.e. corpora) were subjected to co-word analysis using VOSviewer software, version 1.6.7, one of the leading applications for visualising the inter-relationships between texts, authors, journals, and so forth. A stop list was applied to remove generic terms (such as the names of months, countries and words such as article); terms were then identified and normalised using text analysis; the most frequent in each corpus were then listed and related to each other using clustering algorithms. The counting method was binary, i.e. the presence of a term in each record counted as 1 (regardless of frequency in each record). The same number of the most frequent terms in each corpus was analysed to cancel out differences in sample size. Comparison of the overlap among the corpora and comparison with baseline measures indicated the amount and nature of topic alignment between the corpora.
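The counting and comparison steps described above can be illustrated with a minimal Python sketch. It is not the VOSviewer procedure itself: term identification and normalisation are assumed to have been done already, and the stop list, the function names and the toy records are all hypothetical.

```python
from collections import Counter

# Hypothetical stop list; the study's actual list is not reproduced here.
STOP_TERMS = {"article", "january", "australia"}


def binary_term_counts(records):
    """Count, for each term, the number of records in which it appears."""
    counts = Counter()
    for terms in records:
        # set() makes the count binary: a term scores 1 per record,
        # however many times it occurs within that record.
        counts.update(set(terms) - STOP_TERMS)
    return counts


def top_terms(records, n):
    """Return the set of the n most frequent terms in a corpus."""
    return {term for term, _ in binary_term_counts(records).most_common(n)}


def overlap_index(corpus_a, corpus_b, n=100):
    """Number of terms shared by the top-n term lists of two corpora."""
    return len(top_terms(corpus_a, n) & top_terms(corpus_b, n))


# Toy corpora: each record is a list of already-normalised terms.
research = [["rda", "linked data", "rda"], ["frbr", "rda"], ["linked data", "article"]]
practice = [["linked data", "number"], ["linked data", "number", "ddc"], ["rda", "number"]]

print(binary_term_counts(research))
print(overlap_index(research, practice, n=2))
```

Taking the same number of top terms from each corpus, as in `overlap_index`, is what allows corpora of different sizes to be compared on the same footing.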
Results and discussion
Results and discussion from this study are presented in four sections: topics in practice; topics in research; corpora comparisons, and terminology comparisons.
Topics in practice
The three practice-oriented content sets were combined in two different ways. In the first, we combined the content of the three sets and analysed it as a single corpus. In the second, we analysed the terms in each set separately and then aggregated the top terms of the three corpora. Both methods produced the same result, in terms of both the top terms and the number of them that were in common with those of the research-oriented corpus. Figure 1 shows a map of the most frequent terms in the combined practice-oriented content (blog, Webinar and conference corpora). Colours show clusters of terms that have more association (i.e. more frequently co-occur). The farther two terms are from one another, the less likely they are to appear together in a unit of analysis (e.g. a blog post).
Figure 1's visualisation shows three distinct clusters (green, red and blue) of topics found in practice-oriented content. The green cluster focuses on general classification topics, with terms like note, class and a variety of terms with the word number. Red content covers a wide range of concepts related mainly to cataloguing and metadata, including terms such as cataloguing, data, RDA, vocabularies and linked data. The blue cluster focuses on Dewey Decimal Classification-specific terms, such as DDC, Dewey and Dewey number. (One of the more prolific blogs was the Dewey blog.) Links can be seen between all three main clusters, with some overlap between the blue and green clusters.
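The co-occurrence data that underlie a term map of this kind can be sketched as follows. This is only an illustration of counting how many units of analysis contain each pair of terms; it does not reproduce VOSviewer's own normalisation, layout or clustering, and the function name and toy posts are hypothetical.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(records):
    """Count the records in which each unordered pair of terms appears together."""
    pairs = Counter()
    for terms in records:
        # sorted(set(...)) gives each unordered pair one canonical key
        # and counts a pair at most once per record.
        for a, b in combinations(sorted(set(terms)), 2):
            pairs[(a, b)] += 1
    return pairs


# Toy units of analysis (e.g. blog posts), as lists of normalised terms.
posts = [
    ["ddc", "dewey number", "number"],
    ["ddc", "number"],
    ["linked data", "rda"],
]

pairs = cooccurrence_counts(posts)
print(pairs.most_common(3))
```

Pairs with higher counts, such as `("ddc", "number")` in the toy data, would be drawn closer together and are more likely to fall in the same cluster; pairs that never co-occur would sit farther apart.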
The list of the 20 most frequent terms from the practice-oriented content (combined and analysed as a single corpus) is presented in Table 2. The most frequent term is number by quite some distance.
Topics in research
The topic map for the research-oriented content shows a more diverse range of topics with a greater number of clusters. Five clusters (blue, purple, red, yellow and green) were identified from the research-oriented content, as seen in Figure 2. The blue cluster focuses on cataloguing topics related to bibliographic description, including terms like bibliographic description, ISBD and semantic web. The purple cluster represents cataloguing topics related to the Functional Requirements for Bibliographic Records and related bibliographic models, including terms like functional requirements, FRBR, user tasks and bibliographic records. The red cluster represents topics related to library practice, with terms such as academic libraries, collection, project and cataloguer. The yellow cluster covers the cataloguing code, Resource Description and Access, including terms like RDA, access and implementation. Finally, a small green cluster represents vocabulary control, with terms including vocabularies, subject headings and linked data.
Term frequency results show that Resource Description and Access-focused terms are most frequent; the terms RDA and resource description are the top two terms by some margin. Table 3 shows the top twenty terms based on term frequency from the research-oriented content.
Corpora comparisons
To evaluate the extent of overlap or alignment between research and practitioner-based outputs, a series of comparisons between pairs of corpora was conducted. The first comparison examined the overlap between the top 100 terms from the information organization research corpus and those from a journal assumed to have little subject commonality with information organization research, namely the Journal of Parasitology. Only three terms appeared in both sets of 100: combination, difference and evidence. None of these terms is inherently specific to information organization. Three was thus set as the lower limit for this index of topical inter-corpora alignment.
The second comparison set the top 100 terms from the information organization research corpus against those from another information science journal that occasionally covers some information organization topics, namely Information Processing and Management. This second measurement identified six common terms. None were inherently information organization-specific, however.
The third comparison gauged the number of common terms, out of the top 100 terms, that two corpora of very similar content might share. To conduct this measurement, the individual datasets from the two main information organization journals, Cataloging & Classification Quarterly and the Journal of Library Metadata, were used. This produced fourteen common terms. Due to the very similar scope of the journals, fourteen was considered the upper limit of this index, at least for information organization discourses.
The fourth comparison evaluated the overlap between the information organization research corpus and the combined practice corpus. The resulting overlap was nine common terms, as noted earlier, half-way between the lower (three) and upper (fourteen) limit benchmarks.
The fifth, sixth and seventh comparisons evaluated the overlap between the research corpus and each of the three practice corpora (blogs, Webinars and conference sessions). Overlap was greatest with the Webinar corpus: fourteen common terms, the same as the upper limit established in the third measurement explained earlier. This overlap suggests that information organization-related Webinars tend to be a little more aligned with information organization research than information organization-related conference sessions and blogs. All comparisons are shown in Table 4.
|Corpora||Number of common terms in top 100|
|Articles vs Webinars (F)||14|
|Cataloging & Classification Quarterly vs Journal of Library Metadata articles (C)||14|
|Articles vs conferences (G)||12|
|Articles vs practice sources combined (D)||9|
|Articles vs blogs (E)||8|
|Articles vs Information Processing and Management (B)||6|
|Articles vs Journal of Parasitology (A)||3|
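One way to read Table 4 is to place each overlap on a scale running from the lower benchmark (three) to the upper benchmark (fourteen). The sketch below does this with a simple linear rescaling; the rescaling and the name `relative_alignment` are illustrative assumptions rather than part of the index as reported.

```python
LOWER = 3   # articles vs Journal of Parasitology (A)
UPPER = 14  # Cataloging & Classification Quarterly vs Journal of Library Metadata (C)

# Numbers of common terms in the top 100, taken from Table 4.
overlaps = {
    "Articles vs Webinars (F)": 14,
    "Articles vs conferences (G)": 12,
    "Articles vs practice sources combined (D)": 9,
    "Articles vs blogs (E)": 8,
    "Articles vs Information Processing and Management (B)": 6,
}


def relative_alignment(common_terms, lower=LOWER, upper=UPPER):
    """Position of an overlap between the benchmarks: 0 = lower, 1 = upper."""
    return (common_terms - lower) / (upper - lower)


for label, n in overlaps.items():
    print(f"{label}: {relative_alignment(n):.2f}")
```

On this reading the combined practice corpus sits at (9 - 3) / (14 - 3) = 6/11, or roughly 0.55, which is consistent with describing it as about halfway between the two benchmarks.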
Terminology comparisons
Terminology comparisons were also conducted to analyse the nature of the overlap (and differences) between research and practitioner-based outputs. The nine top terms shared by the information organization research and combined practice corpora, as seen in Table 5, are clearly related to particular topic areas, such as linked data and FRBR. Anecdotal evidence would also suggest that linked open data and the Semantic Web are of particular interest to both academics and practitioners in the field.
|Term||Frequency in practice||Frequency in research|
Table 6 presents the top twenty most frequent terms in each of the four information organization corpora (articles, blogs, conference sessions and Webinars). The data shows the relative significance of different topics in each set. For instance, while RDA is the top term in research articles, it is not present in the top twenty terms in any of the three practice sets.
|Articles||Frequency||Blogs||Frequency||Conference sessions||Frequency||Webinars||Frequency|
|functional requirement||17||webdewey||57||alcts camms||5||element||5|
|semantic web||12||blog post||29||institution||4||implementation||4|
Term comparisons show similar results to those found by Saumure and Shiri (2008). The top terms from the various corpora likewise suggest that both researchers and practitioners share a considerable interest in aspects of descriptive cataloguing, though whether aspects of the topic are the same is not so clear.
The overlap of top terms between the information organization research and practice corpora indicates that there is a fair amount of alignment between research reported in key information organization research journals and major forums for professional information organization discussion. Whether this alignment is due to information organization research specifically engaging with important issues in information organization practice and, conversely, information organization practitioners noting and acting upon information organization research, is another matter. The next step in this investigation would be to drill down into those broad areas that have been identified as aligned, to gauge levels of engagement and impact. In other words, we have identified certain overlaps and associations, but not yet the nature of these research-practice relationships: to what extent are they causal and in which direction(s)? Does research lead practice or vice versa? It may depend on the particular topic area. It may also change over time. On the other hand, we might discover that while researchers and practitioners are both preoccupied with certain subjects, they are not concerned with the same aspects of these subjects. For example, RDA may be discussed in terms of its usability by practitioners, but in terms of its theoretical merit, say, in the research literature.
The perception of a gap, at least in this subdomain of librarianship and information science, is not substantiated by the findings of this study. In some ways, it is not surprising that there is a fair degree of overlap between research and practice in a field such as information organization. After all, practitioners often contribute to journals such as Cataloging & Classification Quarterly and Journal of Library Metadata, while information organization academics often have a background in practice. It would be interesting, however, to see if other fields where this is also the case, including other subdomains of library and information science, produce similar levels of alignment using this study's index. It would likewise be interesting to ascertain the actual level of practitioner contribution to research in information organization and other fields, for example by examining author affiliations.
The methodology we employed for this study is, we believe, a new approach to measuring alignment between research and practice. The unobtrusiveness of the data collection has clear advantages, but the approach is not without its limitations. Automatic text analysis can be hampered by the vagaries of language, with words de-contextualised; it should also be borne in mind that journal articles and blog posts, for example, tend to be written in different styles. An alternative approach would be to manually index the content of the texts, though of course this would require considerable amounts of time and resources; any controlled vocabulary used might also be biased toward research or practice. Larger corpora would have enabled longer lists of frequently used terms to be analysed, generating a more calibrated measure of overlap. The corpora themselves may not be fully representative of the discourses as a whole, particularly in the case of practitioner discourse: in this study, only a small number of blogs was used. The reference period studied was reasonably lengthy in the context of developments in the information organization field, but it should also be noted that the alignment between research and practice is likely to be somewhat dialectical and not totally synchronous.
Automatic text analysis produces an overview of textual and conceptual overlap, rather than a detailed picture: as such it is a good starting point for a more nuanced, manual analysis, perhaps triangulated with interpretations from authors and readers. The application of such methods may reveal a complex interplay between research and practice discourse, as well as the extent to which researchers and practitioners are aware of each other's preoccupations and helping to address them.
About the authors
Philip Hider is professor and head of the School of Information Studies at Charles Sturt University (CSU), Australia. He received his PhD from City University, London and has worked at CSU since 2003. His research interests centre mainly around information organisation and librarianship education. The second edition of his text, Information Resource Description, was published in 2018. He can be contacted at email@example.com.
Hollie White is lecturer in Libraries, Archives, Records and Information Science (LARIS) in the School of Media, Creative Arts and Social Inquiry (MCASI) at Curtin University in Perth, Australia. She received her PhD in Library and Information Science at the University of North Carolina at Chapel Hill. Her research and teaching interests include metadata, information organization, institutional repositories, library assessment and cross-cultural knowledge organization. She can be contacted at firstname.lastname@example.org.
Hamid R. Jamali is a senior lecturer at the School of Information Studies at Charles Sturt University, Australia. He received his PhD in information science from University College London in 2008 and his research interests are in the broad areas of scholarly communication and bibliometrics. He can be contacted at email@example.com.
- Bastow, S., Dunleavy, P., Tinkler, J. & Aguilera, N. (2014). The impact of the social sciences: how academics and their research make a difference. Los Angeles, CA: Sage Publications.
- Booth, A. (2003). Bridging the research-practice gap? The role of evidence based librarianship. New Review of Information and Library Research, 9(1), 3-23.
- Choi, I. & Lee, H. (2016). A keyword analysis of user studies in knowledge organization: the emerging framework. In Proceedings of the 14th International ISKO Conference (pp. 116-124). Würzburg, Germany: Ergon.
- Dahlberg, I. (1997). Current trends in knowledge organization. In Organización del conocimiento en sistemas de información y documentación (pp. 7-25). Zaragoza, Spain: Universidad de Zaragoza.
- Davis, S., Gervin, D., White, G., Williams, A., Taylor, A. & McGriff, E. (2013). Bridging the gap between research, evaluation and evidence-based practice. Journal of Social Work Education, 49(1), 16-29.
- Eldredge, J. D. (2000). Evidence-based librarianship: an overview. Library Hi Tech, 24(3), 341-354.
- Gap. OED online. Retrieved from: http://www.oed.com.
- Gummesson, E. (2014). The theory/practice gap in B2B marketing: Reflections and search for solutions. Journal of Business and Industrial Marketing, 29(7/8), 619-625.
- Haddow, G. (2001). The diffusion of information retrieval research within librarianship: a communications framework. (Unpublished doctoral dissertation). University of Western Australia, Perth, Australia.
- Haddow, G. & Klobas, J. E. (2004). Communication of research to practice in library and information science: closing the gap. Library and Information Science Research, 26(1), 29-43.
- Hjørland, B. & Albrechtsen, H. (1999). An analysis of some trends in classification research. Knowledge Organization, 26(3), 131-139.
- Jamali, H. (2018). Use of research by librarians and information professionals. Library Philosophy and Practice. Retrieved from https://digitalcommons.unl.edu/cgi/viewcontent.cgi?article=4921&context=libphilprac
- López-Huertas, M. J. (2008). Some current research questions in the field of knowledge organization. Knowledge Organization, 35(2/3), 113-136.
- McFarlene, E., Kahili, A. & Johnson, J. A. (2014). Insights in public health: Bridging the research to practice gap to prevent maternal stress and depression. Hawai'i Journal of Medicine and Public Health, 73(6), 195-196.
- McIlwaine, I. C. (2003). Trends in knowledge organization research. Knowledge Organization, 30(2), 75-86.
- Miller, F., Partridge, H., Bruce, C., Yates, C. & Howlett, A. (2017). How academic librarians experience evidence-based practice: A grounded theory model. Library and Information Science Research, 39(2), 124-130.
- Nguyen, L. (2017). Relevance 2020: LIS research in Australia. Canberra, Australia: The Australian Library and Information Association. Retrieved from https://read.alia.org.au/sites/default/files/documents/alia-relevance-2020-lis-research-in-australia-online.pdf (Archived by WebCite® at http://www.webcitation.org/76U31oGTF)
- Nguyen, L. & Hider, P. (2018). Narrowing the gap between LIS research and practice in Australia. Journal of the Australian Library and Information Association, 67(1), 3-19.
- Partridge, H. L., Thorpe, C. E. & Edwards, S. L. (2007). The practitioner's experience and conception of evidence based library and information practice: an exploratory analysis. In Proceedings 4th International Evidence Based Library and Information Practice Conference. Chapel Hill, NC: University of North Carolina.
- Ponti, M. (2008). A LIS collaboratory to bridge the research-practice gap. Library Management, 29(4/5), 265-277.
- Powell, R. R., Baker, L. M. & Mika, J. J. (2002). Library and information science practitioners and research. Library and Information Science Research, 24(1), 49-77.
- Pullins, E. B., Timmonen, H., Kaski, T. & Holopainen, M. (2017). An investigation of the theory practice gap in professional sales. Journal of Marketing Theory and Practice, 25(1), 17-38.
- Pymm, B. & Hider, P. (2008). Research literature and its perceived relevance to university librarians. Australian Academic and Research Libraries, 39(2), 92-105.
- Rubins, A. (2014). Bridging the gap between research supported interventions and everyday social work practice: a new approach. Social Work, 59(3), 223-230.
- Saumure, K. & Shiri, A. (2008). Knowledge organization trends in library and information studies: preliminary comparison of the pre- and post-web eras. Journal of Information Science, 34(5), 651-666.
- Schlögl, C. & Stock, W. G. (2008). Practitioners and academics as authors and readers: the case of LIS journals. Journal of Documentation, 64(5), 643-666.
- Smiraglia, R. P. (2012). Knowledge organization: some trends in an emergent domain. El Profesional de la Información, 21(3), 225-227.
- Terrill, L. J. (2016). The state of cataloging research: an analysis of peer-reviewed journal literature, 2010–2014. Cataloging & Classification Quarterly, 54(8), 593-611.
- Weston, M. E. & Bain, A. (2015). Bridging the research-to-practice gap in education: a software-mediated approach for improving classroom instruction: bridging the research-to practice gap. British Journal of Educational Technology, 46(3), 608-618.
- Yates, M. (2015). Research in nursing practice: bridging the gap between clinicians and the studies they depend on. AJN, American Journal of Nursing, 115(5), 11.