vol. 21 no. 4, December, 2016


Proceedings of ISIC: the information behaviour conference, Zadar, Croatia, 20-23 September, 2016: Part 1.

Combining contextual interviews and participative design to define requirements for text analysis of historical media


Ben Heuwing, Thomas Mandl and Christa Womser-Hacker


Abstract
Introduction. The information behaviour of digital historians when analysing corpora of text is largely unexplored. However, understanding the motivations and the processes of research is essential when designing system support for text analysis. In a project in the area of research on historical educational media, we therefore conduct a contextual user study, exploring current processes of text analysis in five projects in the field. Representatives of the project analyse the results in participative workshops for requirements analysis.
Method. The study combines contextual interviews and a participative analysis of the results.
Analysis. Contextual interviews have been analysed regarding existing categories: retrieval and access, use of tools, process of analysis, and the contextualisation of sources. Participants analysed requirements for tool support in the project in a workshop setting.
Results. Users expect digital tools for automated analysis to provide quantitative overviews of contents early in the research process, but also to support existing, interpretative processes of analysis. For a more detailed quantitative analysis, means of comparison within and across several layers of context have to be provided. Based on the findings from the interviews and workshops, guidelines for the design of information systems for text analysis in digital history studies with large corpora are defined.
Conclusion. The process presented in this study can be applied to similar projects in the field of digital humanities. In addition, we propose guidelines that may be useful in similar contexts within the discipline.


Introduction

Recent years have seen a rising interest in the application of digital tools to research problems in the humanities. In this context, the analysis and interpretation of texts is certainly the most important research activity (Toms and O’Brien, 2008). Digital approaches to research in the humanities are highly diverse. However, many tools that have been developed have failed to attract the attention of their intended audience, most probably because they do not support the processes and mental models of their intended users (Warwick, 2012). In general, humanists expect tools to primarily respect and enhance their existing research practices (Gibbs and Owens, 2012). Creators of tools therefore have to carefully study the motivations and processes in a discipline as well as the specific research interests of a project. This helps to avoid the effect that research questions are adapted to the tools, instead of permitting them to evolve jointly in an iterative way. In a project, a high level of collaboration between humanists and technologists is necessary to create tools which can inspire new and innovative research questions, e.g. when existing resources can be visualised or interlinked in new ways (Collins and Jubb, 2012).

This study takes the position that digital tools will be accepted and adopted in the humanities if they support and progressively enhance current research practices as opposed to providing means for questions that the discipline itself does not consider to be relevant. Because of this, we present a contextual, user centred design process and combine it with participative methods. The goal of this process is to define requirements for tools that enable multi-dimensional text analysis in the context of the study of historical media.

Context of research: Corpus for textbook research

The overall motivation to use tools for text analysis in our project consists of the goal of understanding how textbooks represent the world during the period of the German Empire from 1871 to 1914. During this period, the textbooks that children used in school were their most important, and in many cases their only source of knowledge about the world outside of their direct experience. The project examines in which ways these books represent views of other countries and the role of Germany in the world. For the first time, researchers will be able to pursue related research questions based on a representative full-text corpus. The corpus includes textbooks produced for a variety of age groups, types of schools, regions, and religious denominations. This provides new opportunities for comparisons between different contexts. The version of the corpus used in the project currently contains 2,935 textbooks with 605,830 pages, mostly for the subjects of history and geography. The books are available both as scanned images and in full text format created through optical character recognition. The corpus is publicly accessible (cf. http://gei-digital.gei.de).

Information behaviour in the humanities

The common research interest of disciplines such as literature studies, archaeology or historical research that constitute the field of humanities is the study of artefacts produced by the human mind. A rich variety of information behaviours can be expected because of the diversity of research goals in these disciplines (Palmer and Cragin, 2008). Moreover, Rieger (2010) argues that use of technology in the humanities should be analysed at the more granular level of different specialities within a discipline. In addition, the respective level of technology adoption may have an influence on information behaviour. Because of this, recent studies have focused on digital humanities projects, i.e. projects where researchers primarily analyse digital resources or try to find new ways of working with their data (Toms and O’Brien, 2008; Collins and Jubb, 2012; Given and Willson, 2015).

An important development for the humanities is the massive increase in the availability of digitized texts, and the use of search engines has become the most common means of finding relevant collections and primary as well as secondary sources (Toms and O’Brien, 2008). Ease of access to digitized resources may have an impact on the type of research conducted. However, important contextual cues provided by the physical artefact may be lost when only accessing digital versions of texts (Rimmer, Warwick, Blandford, Gow, and Buchanan, 2008; Toms and O’Brien, 2008). In contrast, scans of high quality and the possibility to create links to other resources may also provide new possibilities to contextualise sources (Collins and Jubb, 2012). Regarding the analysis of sources, researchers may choose to select, to interlink, and to combine contents for further analysis (Collins and Jubb, 2012). In a survey on the use of tools, researchers in the area of digital humanities rate as most useful to their research practices those tools that enable analysis by comparing texts, creating concordances (textual context), and analysing distributions of words or phrases (Toms and O’Brien, 2008, p. 118).

The process of historical research has been described as a succession of the following stages: (1.) problem selection, (2.) planning of data collection, (3.) actual data collection, (4.) analysis and interpretation, and (5.) presentation of findings (Uva, 1977; adapted from Sinn and Soares, 2014). In many cases, however, this process may be more iterative than linear (Sinn and Soares, 2014). A literature review of studies on historians’ information behaviour emphasizes the activity of orientation within collections and archives, the construction of contextual knowledge about sources, and, based on this, the assessment of sources regarding the current research interest (Rhee, 2012). Encountering useful but unexpected information while browsing and searching is perceived to be more likely during visits to physical libraries and archives. Hassan (2013) therefore states the need to improve support for these ‘creative’ aspects of historical research in search applications in digital libraries and archival collections. These aspects might also be of importance for the analysis of large corpora.

Regarding the automated support for text analysis, a general tendency to distrust results of text-mining techniques has been noted in the discipline (Gibbs and Cohen, 2011). Historians expect digital tools to mainly improve their existing research processes and to support established methods (Gibbs and Owens, 2012). If new tools require the introduction of new methods and procedures, these should be consistent with established approaches. Therefore, it is important to consider the research contexts of historians when creating specialised tools for information access and analysis. For example, a user centred process has already been applied to the definition of requirements for the collection and analysis of archival materials (Boukhelifa, Giannisakis, Dimara, Willett and Fekete, 2015), emphasizing the need for personal collections of materials, and the possibility of annotating texts with private notes.

Research goals and methods

In our research, we target the support that automated text analysis can bring to the analysis of large collections of historical publications. This study focuses on the observation and discussion of the information behaviour of historians while retrieving, accessing, and analysing published educational media as primary sources. Existing studies on the information behaviour of historians (e.g. Hassan, 2013; Rhee, 2012; Sinn and Soares, 2014) concentrate on retrieval and access in archives and libraries. From these studies, we know about the importance of different levels of contextualisation (Rhee, 2012), and the role of serendipity and creativity when searching for sources (e.g. Hassan, 2013). We investigate the role of these principles for the process of analysing sources and propose guidelines on how these might be incorporated into tools for the automated analysis of large corpora of texts. However, the area of research, the sources that are being analysed, and the methods used in a research project have to be considered when drawing conclusions regarding the requirements in other projects.

The study presented combines observational methods with participative design workshops for a collaborative analysis of results. The findings provide the foundation of an analysis of requirements for tools that support the research goals. Contextual studies form an essential part of user centred design (Holtzblatt, Wendell and Wood, 2004). While the methods described here may lack some of the rigour of the ethnographic methodologies they have been derived from, they provide means to collect insights into specific research processes and help to enable interdisciplinary communication.

Contextual interviews

Interviews in the context of the workplace of participants provide initial insights into current processes of the analysis of historical educational media. Observations at the workplace may help designers and developers to learn about the nature of work tasks, about advantages and disadvantages of available tools and existing information structures, and about the use of improvised tools for interpretation and analysis. However, the process of historical research takes too much time to be easily observable. We therefore conduct retrospective, contextual interviews. In the interviews, historians recount strategies and behaviours during a recent or ongoing research project. Being at their workplace provides them with the option to refer to the documents they have created and to the tools used in the process. These artefacts facilitate recollection of the process and help to make the process transparent to the interviewer. An important advantage of this approach is the possibility to collect examples of documents and tools in the form of digital copies or screenshots.

During the interview, the interviewer first asks questions about the background and research experiences. After this, the participant chooses an ongoing or recently completed project to present in more detail. The interviewer here takes the role of an apprentice, e.g. as a research assistant, and probes for explanations of specific concepts and processes. After questions about the research interests of the project and the most important types of primary and secondary sources, the participant recounts the process of the research project.

Participants

Five researchers from our partner institution participated in the study. Four of the participants have a background in history and one in political science (Table 1). The project with a political science background shares important goals with those with a historical background. However, it provides a contrastive example regarding the methods that have been employed.

The research projects presented by the participants are part of their doctoral dissertations, with the exception of one participant (P1_H), a PhD student who presented the completed research project for his master’s thesis. Details about the research focus cannot be disclosed because participants were promised an adequate level of anonymity. Instead, Table 1 presents the types of sources and a summary of the research process for each case. In two cases, the participants are also members of the project (P1_H, P2_H).

Table 1: Participants of the interviews and their projects
P# | Background | Sources & process of analysis
P1_H | History | Quantitative analysis of sources (historical atlases from the 20th century). Analysis of example sources and definition of sub-topics. Contextualisation based on recensions of publications. Classification of maps and extraction of facts for comparison. Comparative, qualitative analysis of sources within categories.
P2_H | History & sociology | Scanning complete issues of relevant scientific journals (19th century) as primary sources online to find relevant topics. Access to paper versions and analysis of table of contents. Collection of background knowledge about authors. Following citations to find additional material.
P3_H | History | Examination of sources (classroom films, early 20th century) in archives and creation of protocols with linear, detailed summarizations of content. Identification of recurring narratives. Planned: Detailed analysis of single sequences and of recurring narratives.
P4_P | Political science | Primary sources: textbooks from the 20th century. Discussion of categorization scheme and coding guidelines with other project members. Coding of a sub-sample of sources and revision of codebook. Planned: Quantitative analysis of codings and qualitative analysis of important aspects.
P5_H | History & political science | Analysis of archival sources about textbook production regarding the discussion of topics of interest to the research process. Analysis of sources on the public reception of textbooks. Planned: Analysis of textbooks from 20th century based on issues identified in the previous steps.

Analysis

The analysis of the results focuses on existing categories (e.g. process of retrieval and analysis, tools and information structures in use, contextualisation and assessment of sources) instead of a more extensive open coding process. This procedure corresponds to the goals of the research project: the analysis and definition of requirements for tool support. Notes taken during the interviews provide the basis for the analysis and make it possible to quickly converge on the most important and salient aspects. The recordings provide the option to validate the notes taken during the interviews and to add more details. Collected samples of documents and tools help to understand and to contextualize the descriptions of the research process. The most important results of this analysis are the main informational concepts mentioned and the activities of analysis during the research process. Activity models (Holtzblatt et al., 2004) are used to describe the process of analysis, presenting each step in the context of the other steps identified for an activity, with goals, descriptions and examples for each step. The models help to identify common patterns across projects. In addition, the definition of goals for each step helps to focus later design activities on the motivations of the participants to perform certain activities, instead of fixating on the actions themselves, which might be influenced mainly by the skills and tools that are available. Workshops with the project team, which includes two of the historians who are also participants of the interview study, later provided the possibility to discuss and to validate the results of the interviews based on the concepts and activities that have been identified.

Results

The following sections present the results grouped according to the categories of analysis: retrieval and access of documents, tools used for analysis, processes of analysis, and contextualisation of sources.

Retrieval and access

During the initial phases of a project, researchers collect references to interesting sources and consider to what extent they are accessible to them. Personal inquiries with colleagues who are interested in similar topics help to find additional relevant collections and publications: ‘At the beginning, I consulted bibliographies, as I was planning to analyse scientific literature regarding its usage of maps and visualisations […]’ (P1_H). In some cases, such references have been shared in the form of thematic bibliographies (P1_H, P2_H). Researchers collect available sources from libraries and create their own extensive thematic bibliographies at the beginning of a project. In archives, researchers create copies or take pictures of sources if they are permitted to do so. Gaining access to primary or secondary sources in archives or in remote libraries may require considerable effort. A range of archival finding aids, e.g. local online library catalogues and full-text search on digitized contents, helps to locate sources.

Even though participants value access to archives and libraries highly, they also perceive a selection bias towards sources that can be found online and in digital formats. In addition, all participants try to collect every item of interest as a digital copy, to make them accessible whenever they need them during an analysis.

Tools for analysis

Researchers mostly use digital tools to collect and in some cases to quantify metadata about sources. Creating and maintaining information structures for the sources appear to be important activities during both the collection and the analysis of sources. Examples include researchers managing sources in reference management software or in custom spreadsheets (P1_H, P5_H), organizing and browsing pictures of maps in folders and subfolders (P1_H), and extracting facts and statistics about historical events into a spreadsheet to enable comparisons between publications (P1_H). Participants use these structures primarily for linking and re-finding resources, and in one case for an early quantitative analysis (P1_H). Researchers apply text-recognition (OCR) to scanned documents, which, while still perceived as inaccurate, is valued as a possibility to provide full-text search within the documents.

Some researchers (P4_P, P5_H) also employ specialised software packages that support qualitative research methods through the manual annotation of texts (i.e. ‘coding’ in the terminology of social sciences). In the political science project, the researcher used specific functionality, such as defining and revising codes or formulating coding guidelines. In contrast, one historian stated that he does not use the program as it is intended to be used: ‘I am a historian and I do not code like social scientists would.’ (P5_H). Instead, this participant perceives the program as a way to create an overview of all references to specific concepts identified in the texts, which enables him to return easily to relevant parts of a text. The participant also complained that existing software often does not support chronological views of references, which are specifically useful for historical research: ‘I cannot display the historical course of events and in my opinion there is no program which can do this in an adequate way.’ (P5_H). However, users may find creative ways to circumvent restrictions of the programs: in this case, the field for the page number in a reference management software holds the date of creation to keep the attachments of an entry, such as scanned documents or excerpts, in chronological order.

Process of analysis

The process of information analysis consists of two major activities (Table 2): the definition of the focus of research, and the iterative, more detailed analysis of sub-topics.

Define focus of research: At the beginning of a project, participants structure information about relevant primary and secondary sources as bibliographies. A project-specific categorization appears to help to narrow down or to broaden research questions early in a project. The accessibility of representative sources determines whether a researcher can investigate a topic further. Criteria for the inclusion of sources in the analysis are developed, but may be iteratively adapted during the course of a project. A sub-sample of the sources can be analysed in order to derive first criteria for analysis (P2_H, P3_H, P4_P). For example, P4_P first accessed books by picking works from a relevant shelf in the library, and later selected nine books for a pilot analysis. During this activity, sub-topics usually emerge, which are later analysed in a more structured way. At this point, researchers often define a selection of textbooks or relevant parts of textbooks, which then remains relatively stable during the main analysis.

Iterative definition of criteria & analysis: The main analysis focuses on sub-topics. Analysis and writing may be closely related (P2_H). Researchers assess sources and contextualize them on several levels (cf. next section). In addition, researchers may derive dimensions for the analysis of textbooks from other types of sources, e.g. based on existing discourses in society (from newspapers and similar publications), from sources about textbook production (P5_H), and from theory or existing research (P4_P). Most participants do not necessarily define the dimensions of analysis in a formal way, e.g. in a codebook. In contrast, the researcher with a political science background keeps a codebook with notes and discusses these categories with other researchers in the project. Historians may instead summarize sources according to categories relevant for the argumentation (P3_H) and collect excerpts in the structure of the intended document outline (P2_H). References to secondary as well as to relevant primary sources provide a connection to the wider discourse of interest. It may be difficult to summarize results that are based on different levels of analysis but are interlinked, because they have to be presented as linear texts (P5_H).

Table 2: Activity models with most important activities - some activities may be optional and can occur in different order
Activity model 1: Define focus of research
· Define first criteria for the selection of sources
· Create overview of available sources
· Access samples of relevant sources
· Define a collection for further analysis

Activity model 2: Iterative definition of criteria & analysis
· Browse documents to find interesting characteristics
· Contextualize results using secondary literature and additional sources
· Analyse documents regarding criteria
· Collect excerpts and summarize findings

Levels of contextualisation

The analysis of the research process underlines that both the publications that have been analysed and the contents of interest need to be contextualised at different levels. Researchers extensively apply their own background knowledge and consult additional sources to enable this multi-faceted and contextual approach to analysis. Reasoning relative to the historical context needs to draw on this background knowledge in order to make inferences, e.g. a reader may only understand whether a text depicts an event as positive or negative based on the likely evaluation of its consequences from the point of view of the author and at the time of writing. Participants mentioned different examples of contextualisation, which can be summarized as follows:

Contextualisation does not appear to be supported well in existing tools, e.g. the ability to present diachronic displays of all information items and at the same time to filter by available metadata. In many cases, this would require at least two timelines, representing both the date of publication and the temporal focus of the content. In addition, it is important to be able to manually highlight subtle contextual cues within the publications and to integrate or to refer to external knowledge and sources.
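The requirement of two timelines combined with metadata filters can be illustrated with a minimal sketch. The record layout, the field names and the sample values below are hypothetical and are not taken from the project corpus; the sketch only assumes that each passage carries a year of publication, a period that its content refers to, and at least one metadata field.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Passage:
    """A text passage with two temporal dimensions (hypothetical schema)."""
    book_id: str
    published: int      # year of publication of the textbook
    focus_start: int    # first year of the period the passage describes
    focus_end: int      # last year of the period the passage describes
    school_type: str    # example metadata field used for filtering


def select(passages, *, published=None, focus=None, **metadata):
    """Filter by publication range, by overlap with a focus range, and by metadata."""
    result = []
    for p in passages:
        if published and not (published[0] <= p.published <= published[1]):
            continue
        # A passage matches when its described period overlaps the requested one.
        if focus and (p.focus_end < focus[0] or p.focus_start > focus[1]):
            continue
        if any(getattr(p, key) != value for key, value in metadata.items()):
            continue
        result.append(p)
    # Diachronic display: order by publication date, then by described period.
    return sorted(result, key=lambda p: (p.published, p.focus_start))


# Invented sample records for illustration only.
passages = [
    Passage("b1", 1885, 1618, 1648, "Volksschule"),
    Passage("b2", 1902, 1870, 1871, "Gymnasium"),
    Passage("b3", 1875, 1870, 1871, "Volksschule"),
    Passage("b4", 1910, 1789, 1815, "Volksschule"),
]

hits = select(passages, published=(1871, 1914), focus=(1860, 1880))
print([p.book_id for p in hits])
```

Keeping the date of publication and the temporal focus as separate fields allows an interface to sort or plot along either axis without re-reading the texts.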

Participative requirements analysis

Discussing user needs based on experiences in other projects helped to define and to prioritize requirements in the project. An early analysis of the notes and recordings of the interviews results in 139 concepts that potentially have an impact on text analysis. In an affinity diagramming session (Holtzblatt et al., 2004), six members of the project team work in pairs to organize a subset of these concepts into groups, using prints of concept cards and sticky notes (Figure 1). Three groups of concepts emerge:

In addition, one column has been left unnamed. It includes concepts such as political opinions, detection of omissions, diachronic changes, and the relevant textual contexts of terms and might therefore be called ‘contextual analysis’.

The process of discussing these concepts from several perspectives helps to establish a common vocabulary within the project team and to define criteria for additional metadata. Important facts emerged which may have appeared obvious to historians but not to the information scientists in the project, e.g. the existence of distinct types of textbooks targeted at students or at teachers. A first discussion of different types of expressed political opinions helped to distinguish simpler, explicit forms of opinionated statements from implicit ones.

Figure 1: Examples of the affinity diagramming activity (left) and the collaborative annotation of activity models (right)

For the discussion of the process of information analysis, important activities are extracted from the results of the interviews to create activity models (Holtzblatt et al., 2004). In a subsequent workshop, the activity models are presented to the project members as posters (Figure 1). In the workshop, project members annotate activities with questions and ideas regarding their applicability, their relevance, and possible problems of their realisation in the current research project. For example, the missing documentation of the collection emerged as a problem. As a result, it might be necessary to control results of analysis for biases within the collection. For analysis, it has to be taken into account that texts may be included in the collection multiple times, due to reissues of books, parts taken from other authors (both with and without reference), and the inclusion of external source materials.
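One possible way to flag texts that occur several times in a collection is to compare overlapping word sequences (shingles). The following sketch is an illustration and not the method used in the project; the function names, the threshold and the sample sentences are assumptions. Noisy OCR output would probably require additional normalisation, and the pairwise comparison shown here would not scale to a large corpus without an indexing step.

```python
import re


def shingles(text, size=3):
    """Return the set of overlapping word n-grams of a text."""
    words = re.findall(r"\w+", text.lower())
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard(a, b):
    """Share of shingles that two sets have in common (0.0 to 1.0)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def likely_duplicates(texts, threshold=0.5):
    """Return pairs of ids whose shingle overlap reaches the threshold."""
    sets = {key: shingles(value) for key, value in texts.items()}
    keys = sorted(sets)
    return [
        (x, y)
        for i, x in enumerate(keys)
        for y in keys[i + 1:]
        if jaccard(sets[x], sets[y]) >= threshold
    ]


# Invented sample passages: two editions of one sentence and an unrelated text.
texts = {
    "ed1": "The river Rhine flows through the German Empire to the North Sea",
    "ed2": "The river Rhine flows through the German Empire towards the North Sea",
    "other": "Africa is a large continent south of Europe",
}
print(likely_duplicates(texts))
```

Pairs flagged in this way would still need manual inspection, since a reissue, a quotation and a shared external source can produce similar overlap values.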

Guidelines for automated support of text analysis of historical media

The analysis of requirements primarily targeted user needs for text analysis within the ongoing project. Nevertheless, many of these requirements can be summarized as more general guidelines to make them transferable to other projects. In the interviews, researchers presented quantitative approaches to analysis as a means to provide an overview of available sources at the beginning of the project. For the analysis of larger corpora, it remains an open question whether quantitative analysis becomes more important in the discipline, paving the way for new research questions. Alternatively, researchers may use it primarily for the identification of interesting sources, which they then interpret manually. Based on the discussions in the workshops, we expect a combination of both: An increased reliance on quantitative measures, but also the need to support identification of relevant sources for interpretative analysis. The presentation of the guidelines therefore follows three higher-level goals: (1.) identify dimensions that are interesting for analysis, (2.) support intellectual analysis of sources in more detail, and (3.) enable quantitative comparisons in context.

Goal 1: Identify dimensions for analysis

Tools should provide an overview of the contents of a corpus to help to identify relevant documents, but also to find topics and dimensions that might be of interest for further investigations.

  1. Use of overview visualisations to identify trends and outliers in different contexts.
  2. Relative metrics help to identify interesting themes within publications or across the groups of one contextual dimension.
  3. Definition of sub-collections for further analysis.
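As an illustration of a relative metric, the sketch below normalises the number of occurrences of a term by the size of each group of one contextual dimension, so that groups of different size become comparable. The document structure, the field names and the sample data are hypothetical; tokenisation by whitespace is a simplification.

```python
from collections import Counter


def relative_frequencies(docs, term, group_by):
    """Occurrences of `term` per 1,000 tokens for each value of a metadata field."""
    hits, totals = Counter(), Counter()
    for doc in docs:
        tokens = doc["text"].lower().split()
        group = doc[group_by]
        hits[group] += tokens.count(term)
        totals[group] += len(tokens)
    return {g: 1000 * hits[g] / totals[g] for g in totals if totals[g]}


# Invented sample documents: 100 tokens each, grouped by decade of publication.
docs = [
    {"decade": 1880, "text": "colony " * 2 + "land " * 98},
    {"decade": 1900, "text": "colony " * 5 + "land " * 95},
]
rates = relative_frequencies(docs, "colony", "decade")
print(rates)
```

The same function can be called with another metadata field as `group_by`, e.g. a type of school, to compare the groups of a different contextual dimension.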

Goal 2: Support intellectual interpretation

For the interpretation and publication of results, tools should offer support for intellectual analysis, e.g. notes and links at the level of contents, but also at the level of aggregated results of analysis.

  4. Manual annotation of documents and of entities within documents helps to integrate background knowledge of researchers.
  5. Integration of manual annotations into search and analysis (e.g. as filters or as comments in visualizations).

Goal 3: Quantitative comparisons in context

For further automated analysis, it will be of importance to compare or to correlate several contextual dimensions. Researchers will probably find it more difficult to understand and to explain more complex setups of text-mining that provide insights at a semantic level (Chuang, Ramage, Manning, and Heer, 2012). Therefore, it should be possible to validate the results with additional methods which involve fewer steps of text processing, e.g. on the level of term-frequencies, as well as with interpretative methods which focus on example documents that have been identified in the process.

  6. Enable comparisons and analysis of correlations between document sets within and across sub-collections.
  7. Provide identification and disambiguation of named entities and links to external sources of knowledge, e.g. existing ontologies.
  8. Offer different methods of text analysis to validate results.
  9. Enable retrieval of relevant example documents during text analysis for manual validation.
  10. Enable the analysis of correlations of variables to control for biases within a corpus.
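A simple way to check for biases in the composition of a corpus is to cross-tabulate two metadata variables. The sketch below is a hypothetical example with invented records and field names: if the share of subjects differs strongly between decades, an apparent change over time in the contents may partly reflect the composition of the corpus.

```python
from collections import Counter, defaultdict


def crosstab_shares(records, row, col):
    """For each value of `row`, return the share of records per value of `col`."""
    counts = defaultdict(Counter)
    for record in records:
        counts[record[row]][record[col]] += 1
    return {
        r: {c: n / sum(cols.values()) for c, n in cols.items()}
        for r, cols in counts.items()
    }


# Invented metadata records for illustration only.
books = [
    {"decade": 1880, "subject": "history"},
    {"decade": 1880, "subject": "history"},
    {"decade": 1880, "subject": "geography"},
    {"decade": 1880, "subject": "geography"},
    {"decade": 1900, "subject": "history"},
    {"decade": 1900, "subject": "geography"},
    {"decade": 1900, "subject": "geography"},
    {"decade": 1900, "subject": "geography"},
]
shares = crosstab_shares(books, "decade", "subject")
print(shares)
```

In this invented sample, geography books make up half of the earlier decade but three quarters of the later one, so a comparison between the decades should be made within each subject or weighted accordingly.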

In the project, these guidelines drive the design of prototypes (non-functional mock-ups as well as a data-driven interface for search and analysis). During the design process, participants of the project have discussed and prioritized the guidelines. As a result, the need to identify new dimensions for analysis (guidelines 1 & 2) and increased support for quantitative, multi-dimensional analysis while controlling for biases in the corpus (guidelines 6 & 10) currently take highest priority. In addition, the guidelines will guide the evaluation of the set of tools created in the project, e.g. to help to define tasks for user testing.

Discussion

The results of the studies presented are primarily useful for this project and for other projects within the specialisation of historical textbook research. In this context, a comparatively small sample of participants is sufficient for a contextual, qualitative user study. In addition, the findings are validated in an ongoing process of user centred design and evaluation. For example, the discussion of results in participative workshops can be seen as a form of respondent validation (Blandford, 2013) and helps to improve the credibility of the results and to confirm their relevance for historical research. However, the starting phase of research projects may have been over-represented in the interviews because in three cases the project referred to as an example was ongoing and not yet in the phase of presenting findings.

We expect central findings from this study to be applicable to similar areas within digital history. Most notably, the area of textbook research shares important characteristics with the analysis of newspaper articles, which also involves the analysis of large, published corpora (e.g. Allen, 2011; Blevins, 2014). Both areas focus on the analysis of publications regarding their representation of knowledge and sentiments about the world, while in other specialities, historians often concentrate on the identification and assessment of facts represented in the sources (Rhee, 2012). Because of this, the guidelines present requirements at a more general level, which helps to transfer findings from this study to other projects.

Additionally, the methodology used in this study is applicable to the design of information systems in similar projects within the digital humanities. Collaboration across disciplines has been identified both as a challenge (Lin, 2012) and as a crucial factor of success (Collins and Jubb, 2012) for projects in the digital humanities. However, the qualitative analysis of results would have profited from a wider approach to analysis, considering more contextual aspects of information behaviour as proposed in the framework of cognitive work analysis (Fidel and Pejtersen, 2004).

The results strongly indicate that both contextualisation (Rhee, 2012) and support for serendipity (Hassan, 2013), which have been presented as important for information retrieval for historical research, also need to be represented during the analysis of a historical corpus with text-mining tools. In addition, the contextual interviews demonstrated the importance of a quantitative overview of available sources in existing processes, which helps to converge on a more specific topic for analysis. The iterative process of analysis and the need for comparative analysis suggest similarities to analysis in media research (cf. e.g. Bron et al., 2012). Further research needs to concentrate on the role of different forms of representation, e.g. combinations of sequential, faceted and hierarchical representations (Blandford, Faisal, and Attfield, 2014), to support processes of sensemaking based on large historical corpora.

Conclusions

Analysis of historical media appears to be an almost completely digitized process, although manual access to sources is still valued highly. Many creative methodologies and tools have been and are being created to improve support for text analysis in the humanities. Therefore, instead of arguing for a new, universal workbench, we prefer to present a collection of guidelines that describes means of support for the study of historical media in large corpora based on automated text analysis. From the user studies and the priorities given, a likely workflow for the analysis of historical media presents itself: during an initial exploration of the corpus, researchers select interesting document groups and variables (e.g. term counts, document clusters or identified semantic topics) for a more detailed analysis. Recommendations of interesting variables in sub-groups can support this process. For closer analysis, measures can be compared across contexts. However, the potential influence of biases in the distribution of the corpus within these contexts has to be controlled. At the same time, researchers collect interesting documents for a more detailed, interpretative analysis.
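
The last step of this workflow, collecting documents for interpretative analysis while working with quantitative measures, can be sketched as a simple keyword-in-context lookup. This is only a minimal illustration under stated assumptions: the document identifiers, the sample texts and the function name `keyword_in_context` are hypothetical, and a real tool would need proper tokenisation and handling of historical spelling variants.

```python
def keyword_in_context(documents, term, window=3):
    """Return (doc_id, snippet) pairs for every occurrence of `term`.

    `documents` maps a document identifier to its text. Each snippet holds
    the matching token plus up to `window` tokens on either side, so that a
    researcher can judge the passage before opening the full source.
    """
    hits = []
    for doc_id, text in documents.items():
        tokens = text.split()
        for i, token in enumerate(tokens):
            # Ignore case and trailing punctuation when matching.
            if token.lower().strip(".,;:!?") == term:
                start = max(0, i - window)
                snippet = " ".join(tokens[start:i + window + 1])
                hits.append((doc_id, snippet))
    return hits


# Hypothetical sample documents, invented for illustration.
SAMPLE = {
    "doc1": "The young nation expanded its trade.",
    "doc2": "Rivers and mountains.",
}

for doc_id, snippet in keyword_in_context(SAMPLE, "nation", window=1):
    print(doc_id, snippet)
```

Keeping the document identifier with each snippet is the design choice that matters here: it lets a quantitative finding be traced back to individual sources for close reading.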

In this study, contextual methods helped to discover and discuss important user requirements for the analysis of historical texts. During the process, it is important to present the results to the project team so that the study supplements rather than replaces direct communication between team members. Despite the successful application of these methods to the behaviours and processes of analysis, it proved to be difficult to reach a common understanding of the historical research questions in the project. One reason can probably be found in the open-ended and exploratory nature of historical research. Another factor may be that researchers trust the tools to provide starting points for analysis based only on the data. To confront this problem, researchers from both disciplines will conduct further cooperative experiments to explore specific sub-topics of interest. This will not only help to improve the tools created in the process, but also foster the discussion about which methodologies are accepted in the discipline. Interactive prototypes have been developed and enable first analytical explorations of the corpus. These tools will be evaluated in context towards the end of the project. The evaluation can refer to the criteria identified in this study.

Acknowledgements

We want to express our gratitude to the participants of the study for their time, their openness, and their enthusiasm. This study was made possible by funding from the Leibniz Association under grant number SAW-2014-GEl-2.

About the authors

Ben Heuwing works as a postdoctoral researcher in information science at the University of Hildesheim, Universitätsplatz 1, 31141 Hildesheim, Germany. He received his Ph.D. in 2015 from the University of Hildesheim. His research interests include information seeking and use, information interaction, and user centered design.

Thomas Mandl is Professor in Information Science at the University of Hildesheim, Universitätsplatz 1, 31141 Hildesheim, Germany. He received his Ph.D. from the University of Hildesheim and his research interests are information retrieval and human-computer interaction. He can be contacted at mandl@uni-hildesheim.de.

Christa Womser-Hacker is full professor of Information Science at the University of Hildesheim, Germany. She received her PhD and Venia Legendi from Regensburg University. Her research interests are in the area of cross-language information retrieval, information behaviour, and human-computer interaction. Her e-mail address is: womser@uni-hildesheim.de.

References
  • Allen, R. B. (2011). Visualization, causation, and history. Proceedings of the 2011 iConference, 11, (pp. 538-545). New York: ACM.
  • Blandford, A. (2013). Semi-structured qualitative studies. In The encyclopedia of human-computer interaction (2nd ed.). Aarhus, Denmark: The Interaction Design Foundation. Retrieved from https://www.interaction-design.org/literature/book/the-encyclopedia-of-human-computer-interaction-2nd-ed/semi-structured-qualitative-studies (Archived by WebCite® at http://www.webcitation.org/6hieyfFvc)
  • Blandford, A., Faisal, S. & Attfield, S. (2014). Conceptual design for sensemaking. In W. Huang (Ed.). Handbook of human centric visualization (pp. 253-283). New York: Springer.
  • Blevins, C. (2014). Space, nation, and the triumph of region: A view of the world from Houston. Journal of American History, 101(1), 122-147.
  • Boukhelifa, N., Giannisakis, E., Dimara, E., Willett, W. & Fekete, J.D. (2015). Supporting historical research through user-centered visual analytics. In E. Bertini & J. C. Roberts (Eds.). Proceedings of the EuroVis workshop on visual analytics. Cagliari, Italy: The Eurographics Association. Retrieved from https://hal.inria.fr/hal-01156527/document (Archived by WebCite® at http://www.webcitation.org/6iV6tbpO2)
  • Bron, M., van Gorp, J., Nack, F., de Rijke, M., Vishneuski, A. & de Leeuw, S. (2012). A subjunctive exploratory search interface to support media studies researchers. In Proceedings of the 35th international ACM SIGIR conference on Research and development in information retrieval (pp. 425-434). New York: ACM.
  • Chuang, J., Ramage, D., Manning, C. & Heer, J. (2012). Interpretation and trust: Designing model-driven visualizations for text analysis. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, 31, 443-452.
  • Collins, E., & Jubb, M. (2012). How do researchers in the humanities use information resources? LIBER Quarterly, 21(2), 176-187.
  • Fidel, R. & Pejtersen, A. M. (2004). From information behaviour research to the design of information systems: the Cognitive Work Analysis framework. Information Research, 10(1). Retrieved from http://informationr.net/ir/10-1/paper210.html (Archived by WebCite® at http://www.webcitation.org/6hietvl2O)
  • Gibbs, F. W. & Cohen, D. J. (2011). A conversation with data: prospecting victorian words and ideas. Victorian Studies, 54(1), 69-77.
  • Gibbs, F. W. & Owens, T. (2012). Building better digital humanities tools: toward broader audiences and user-centered designs. Digital Humanities Quarterly, 6(2). Retrieved from http://www.digitalhumanities.org/dhq/vol/6/2/000136/000136.html (Archived by WebCite® at http://www.webcitation.org/6hielmlgL)
  • Given, L. M., & Willson, R. (2015). Collaboration, information seeking, and technology use: A critical examination of humanities scholars’ research practices. In Hansen, P., Shah, C. & Klas, C.-P.(Eds.). Collaborative information seeking (pp. 139-164). Cham, Switzerland: Springer.
  • Hassan, L. (2013). Assessing the information needs of historians working with digitised primary sources in the UK: a sequential mixed methods study. Unpublished doctoral dissertation, University of Huddersfield, Huddersfield, U.K. Retrieved from http://eprints.hud.ac.uk/19321/ (Archived by WebCite® at http://www.webcitation.org/6hiefvfT9)
  • Holtzblatt, K., Wendell, J. B. & Wood, S. (2004). Rapid contextual design: A how-to guide to key techniques for user-centered design. San Francisco, CA: Morgan Kaufmann.
  • Lin, Y. (2012). Transdisciplinarity and digital humanities: Lessons learned from developing text-mining tools for textual analysis. In Berry, D. M. (Ed.). Understanding digital humanities (pp. 295-314). Basingstoke, UK: Palgrave Macmillan.
  • Palmer, C. L. & Cragin, M. H. (2008). Scholarship and disciplinary practices. Annual Review of Information Science and Technology, 42(1), 163-212.
  • Rhee, H. L. (2012). Modelling historians’ information-seeking behaviour with an interdisciplinary and comparative approach. Information Research, 17(4), paper 544. Retrieved from http://InformationR.net/ir/17-4/paper544.html (Archived by WebCite® at http://www.webcitation.org/6hiebr1MJ)
  • Rieger, O. Y. (2010). Humanities scholarship in the digital age: the role and influence of information and communication technologies. Unpublished doctoral dissertation, Cornell University, Ithaca NY, U.S.A. Retrieved from http://core.kmi.open.ac.uk/download/pdf/4914923.pdf (Archived by WebCite® at http://www.webcitation.org/6hieVnoKx)
  • Rimmer, J., Warwick, C., Blandford, A., Gow, J. & Buchanan, G. (2008). An examination of the physical and the digital qualities of humanities research. Information Processing & Management, 44(3), 1374-1392.
  • Sinn, D. & Soares, N. (2014). Historians’ use of digital archival collections: The web, historical scholarship, and archival research. Journal of the Association for Information Science and Technology, 65(9), 1794-1809.
  • Toms, E. G. & O’Brien, H. L. (2008). Understanding the information and communication technology needs of the e-humanist. Journal of Documentation, 64(1), 102-130.
  • Uva, P. A. (1977). Information-gathering habits of academic historians: report of the pilot study. Upstate Medical Center, State University of New York, Syracuse (ED148423). Retrieved from http://eric.ed.gov/?id=ED142483 (Archived by WebCite® at http://www.webcitation.org/6hie8eh5V)
  • Warwick, C. (2012). Studying users in digital humanities. In Warwick, C. & Terras, M. (Eds.). Digital humanities in practice (pp. 1-21). London: Facet Publishing.
How to cite this paper

Heuwing, B., Mandl, T. & Womser-Hacker, C. (2016). Combining contextual interviews and participative design to define requirements for text analysis of historical media. In Proceedings of ISIC, the Information Behaviour Conference, Zadar, Croatia, 20-23 September, 2016: Part 1. Information Research, 21(4), paper isic1606. Retrieved from http://InformationR.net/ir/21-4/isic/isic1606.html (Archived by WebCite® at http://www.webcitation.org/6mHhosF7b)
