Book review: Bibliometrics and research evaluation

Gingras, Yves. Bibliometrics and research evaluation: uses and abuses. Cambridge, MA: MIT Press, 2016. xii, 119 p. ISBN 978-0-262-03512-5. £20.95/$26.00

This admirably brief book (just 91 pages of text, plus another 26 of notes and index) is a revised and updated translation of the French original, published in 2014. The translator is the author himself.

The abuse of bibliometric measures under the guise of 'research evaluation' by governments and university administrators has been addressed in a number of recent works, some of which have been reviewed here (for example, the collection by Cronin and Sugimoto). As Gingras says:

Since the first decade of the new millennium, the words ranking, evaluation, metrics, h-index and impact factors have wreaked havoc in the world of higher education and research. (p. vii)

and his object:

is to present in a succinct manner the basic concepts and methods of bibliometrics in order to demonstrate that they have a much broader scope than just research evaluation, which is a relatively recent (and unwise) application of its methods. (p. ix)

The book has only four chapters and a 'Conclusion'. Chapter 1 deals mainly with the history of bibliometrics (pointing out that the practice is older than the name) and the gradual transfer of its methods from the management of journal collections in libraries to science policy in the 1970s and research evaluation in the 1990s.

Chapter 2 deals with the valid uses of bibliometrics, showing how the Science Citation Index (which ultimately became the Web of Science) and now Scopus enabled scholars to map the bibliographic characteristics of their fields, providing tools for historians and sociologists of science, and revealing the structure of disciplines, the changes over time (e.g., the increase in multi-authored papers, and the internationalisation of scholarship), and the network relationships among journals, disciplines and countries. In other words, bibliometrics can be a valuable tool when used appropriately.

In Chapter 3, the author moves on to research evaluation, noting that such evaluation has long been used in areas such as the peer review of journal papers, and promotion and appointment decisions - again, usually by peer review. With the emergence of the Science Citation Index, however, it seemed that the numbers associated with citations offered a more 'objective' means of evaluation, and so the misuse of bibliometrics began. In the chapter, Gingras demolishes the notion that the h-index does anything more than arbitrarily combine the number of papers published with the number of citations they receive, pointing out that a researcher who published three papers that received 60 citations each would have an h-index of only 3, whereas one who, over the same period, produced ten papers, each of which was cited eleven times, would have an index of 10. How this nonsense comes to be accepted as providing a judgement of 'quality' is difficult to understand.
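The arithmetic behind that comparison can be checked with a short sketch. It assumes the usual definition of the h-index (the largest h such that h of a researcher's papers have at least h citations each); the function and variable names are illustrative only and are not taken from the book.

```python
def h_index(citations):
    """Return the largest h such that h papers have at least h citations each."""
    # Rank papers from most cited to least cited.
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank   # the paper at this rank still has enough citations
        else:
            break      # every later paper has fewer citations than its rank
    return h


# The two hypothetical researchers from the example in the text.
few_highly_cited = [60, 60, 60]      # three papers, 60 citations each
many_modestly_cited = [11] * 10      # ten papers, 11 citations each

print(h_index(few_highly_cited), sum(few_highly_cited))        # 3 180
print(h_index(many_modestly_cited), sum(many_modestly_cited))  # 10 110
```

On these figures the first researcher has 180 citations in total and the second only 110, yet the index ranks the second far higher, because the index can never exceed the number of papers published.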

The misuse of the journal 'impact factor' is also discussed in this chapter, covering all of the flaws that make it manipulable and invalid as any kind of assessment of the quality of individual papers. And, yet, we have university departments demanding that their faculty publish in 'high impact factor' journals, on the assumption that, by association, accepted papers will somehow acquire the 'quality' that is attributed to the journal. We know this to be spurious nonsense and yet business schools in the UK require the publication of papers only in journals that appear in an 'approved' list and, if young academics, needing to publish, are unable to get their papers into journals on that list, their promotion or appointment prospects will be jeopardised.

In Chapter 4 the analysis of the misuse of bibliometric and other ranking measures is taken further, the author pointing out, for example, that the Shanghai index, which proposes to identify the 'top' universities in the world, is completely inadequate for the purpose. He notes that the Free University of Berlin and Humboldt University could move up or down 100 places in the ranking on the basis of whether or not the institution is deemed to have been associated with Albert Einstein's 1922 Nobel prize. Gingras argues that any valid indicator must satisfy three criteria: first, it must be adequate for the purpose intended. To take an example of my own, many universities require their academic staff to satisfy a number of criteria for evaluation for promotion or tenure, typically, research, teaching, administration, and contribution to the community. These criteria are often unweighted; the candidate is expected to satisfy all of these criteria to a specific degree. And, yet, how many submissions for tenure have been rejected when all of the criteria have been met to some degree, but the academic's research publications have not been in 'high impact factor' journals? Clearly, no single number could represent 'performance' in all of the required criteria, and, more to the point, no easily derived numbers exist at all for some of the criteria.

Secondly, a measure must be sensitive to the pace of change in the measured entity. Thus, if an institution moves up 100 places in a single year in the Shanghai index, something must be wrong with the index, simply because, as Gingras says, universities are like supertankers; they find it difficult to change course quickly. A measure should also have sensitivity of another kind, i.e., it should reflect the diversity of the institution and, clearly, when one of the measures employed is the number of publications in Nature and Science (as it is in the Shanghai index), this completely ignores strengths that might exist in the social sciences and the humanities.

Finally, a measure should be 'homogeneous in its composition' and, again, the Shanghai index is composed of several heterogeneous measures, meaning that the composite is measuring nothing. The author asks the sensible question, 'Why are invalid indicators used?' One interesting answer proposed is:

As more and more nonacademics take the reins of higher education institutions, they seem to be more responsive to market forces, "branding", and the search for money, than to academic principles and values. And it is certainly significant that managers are particularly fond of gurus selling their changing buzzwords. (p. 79)

In the final two and a half pages of the Conclusion, Gingras draws a parallel between the current situation of research evaluation in universities and Hans Christian Andersen's tale of The Emperor's New Clothes. The comparison is telling, and the author is to be congratulated on taking the part of the little boy who points the finger and declares the Emperor to be naked.

Professor T.D. Wilson
Editor-in-Chief
February, 2017

How to cite this review

Wilson, T.D. (2017). Review of: Gingras, Yves. Bibliometrics and research evaluation: uses and abuses. Cambridge, MA: MIT Press, 2016. Information Research, 22(1), review no. R591 [Retrieved from http://informationr.net/ir/reviews/revs591.html]

Information Research is published four times a year by the University of Borås, Allégatan 1, 501 90 Borås, Sweden.
