Salganik, Matthew J. Bit by bit. Social research in the digital age. Princeton, NJ: Princeton University Press, 2018. xxii, 423 p. ISBN 978-0-691-15864-8. $35.00/£27.95
The invention of the Internet, the World Wide Web, e-mail discussion groups, and social media has opened up huge opportunities for social researchers to collect data in new ways. This major work is perhaps the most up-to-date, and certainly one of the best-written, works on social research methods in the digital context. The author is a Professor of Sociology at Princeton University, with previous experience in government and at Microsoft Research. He notes in the Preface that his PhD was based on an online experiment in which 100 people participated while he slept - the kind of experiment that many PhD students would love to enjoy! And, of course, the fact that people can participate in your research while you sleep is one of the advantages of digital social research.
The book has seven chapters: Introduction, Observing behaviour, Asking questions, Running experiments, Creating mass collaboration, Ethics, and The future. In spite of the complexity of some of the issues, the writing is clear and accessible, which will make this text a very suitable basis for developing a course on the subject.
The author introduces the book by reference to a telephone survey on wealth and poverty carried out in Rwanda on a random sample taken from the 1.5 million customers of the country's biggest mobile phone service. The results tallied with those of the Demographic and Health Survey, but the study 'was about ten times faster and fifty times cheaper'. This instance encapsulates the advantages of digital social research: access to potentially massive amounts of data, speed, and value for money.
Two themes run through the book, which the author calls 'readymades' and 'custom mades'. Readymades are existing sources of digitally accessible data, created, for example, by governments, and capable of being sampled online and analysed in ways that differ from the original purpose of the data set. Custom mades are, perhaps, what we normally think of in social research: the invention of research questions that require the custom collection of new data. Matthew Salganik intends the book to appeal to both types of researcher and also to two other types: specialists in big data who have not previously been involved in social research, and social researchers who have not previously used digital data collection methods.
Chapter 2, on observing behaviour, deals with the nature of big data and the characteristics that make it either good or bad for social research:
- generally helpful for research: big, always-on, and non-reactive
- generally problematic for research: incomplete, inaccessible, non-representative, drifting, algorithmically confounded, dirty, and sensitive (p. 41)
Following an analysis of these ten characteristics, three research strategies are proposed: counting things, forecasting things, and approximating experiments. These strategies, of course, do not always pay off, and Salganik shows how Google's Flu Trends application 'did not outperform a simple and easier-to-understand heuristic' (p. 48). The author rightly suggests that, when such methods are employed, it is useful to check the results against some known baseline.
In Chapter 3, Asking questions, the author focuses upon surveys, rightly noting that the survey method is more affected by the digital environment than is qualitative research employing in-depth interviewing. In this chapter, the author deals with sources of error in surveys, the difficulties and potential bias of the questions that are asked, the problems of non-probability sampling, and the diversity of approaches offered by the digital age.
Chapter 4, on experiments, draws upon a wide range of literature and would serve as an excellent introduction to experiments in general. However, it is the instruction on how to design and implement one's own digital experiments that will be of great practical use in this context. As the author notes:
Not only can researchers run massive experiments, they can also take advantage of the specific nature of digital experiments to improve validity, estimate heterogeneity of treatment effects, and isolate mechanisms. These experiments can be done in fully digital environments or using digital devices in the physical world. (p. 202)
Collaborative research across institutions, countries and disciplines is highly prized by funding agencies these days, so the discussion of collaboration in Chapter 5 is appropriate. The author categorises digital collaboration, or 'mass collaboration', as human computation, open call, and distributed data collection. Human computation involves such things as using hundreds or even thousands of collaborators to achieve some research task: the author cites the Galaxy Zoo, which involved 100,000 participants in classifying galaxies.
An interesting example of the open call mode of collaboration is the Netflix Prize, awarded for the best algorithm to predict user ratings for films: more than 20,000 teams registered for the prize in 2007, and the prize (of one million dollars) was awarded to a joint Austrian/USA team in 2009.
eBird is a collaborative data collection project involving professional ornithologists and bird-watchers: by 2015 there were 250,000 participants who had reported 260 million sightings. One can only imagine how much it would cost to achieve this through a funded research project which had to hire people to go out and do the observing!
The accessibility of Internet-based data raises the key question of research ethics, which is the subject of Chapter 6. One suspects that this may turn out to be one of the most frequently cited chapters in the book. The author sets out the basis of a principles-based approach:
That is, researchers should evaluate their research through existing rules—which I will take as given and assume should be followed— and through more general ethical principles. (p. 282)
and believes that 'shared ethical norms and standards' can be developed, which will be shared by researchers and the public. If this can be achieved, then responsible and beneficial social research can be undertaken in the digital environment.
In Chapter 7, the author restricts his views on the future (very wisely, since no one can predict it!) to the blending of readymades and custom-mades, participant-centred data collection, and, not surprisingly, ethics.
Each chapter has a guide to additional reading and an 'Activities' section, i.e., exercises designed for use if the book is adopted as a course text. I have no doubt that the book, deservedly, will find many users developing new courses in digital social research methods, and, in conclusion, I can only say that, if you have any intention of carrying out digitally-based social research, including information research, this book will be absolutely essential reading.
Professor T.D. Wilson
Editor-in-Chief
February, 2018
How to cite this review
Wilson, T.D. (2018). Review of: Salganik, Matthew J. Bit by bit. Social research in the digital age. Princeton, NJ: Princeton University Press, 2018. Information Research, 23(1), review no. R624 [Retrieved from http://informationr.net/ir/reviews/revs624.html]
Information Research is published four times a year by the University of Borås, Allégatan 1, 501 90 Borås, Sweden.