Appendix 7. Papers submitted to journals

Appendix 7.1 - Information seeking and mediated searching. Part 1. Theoretical framework and research design

A. Spink
School of Information Science and Technology
Pennsylvania State University, USA

T.D. Wilson, N.J. Ford, D. Ellis*, A.E. Foster
Department of Information Studies
University of Sheffield, UK

ABSTRACT

Our project has investigated the processes of mediated information retrieval (IR) searching during human information-seeking processes in order to characterize the progressive changes and shifts that occur during an information-seeking process. These include information seekers' situational contexts; information problems; uncertainty reduction; cognitive styles; and cognitive and affective states. We have also sought to characterize related changes over time, to examine changes in information seekers' relevance judgments and criteria, and to characterize their differences. Few studies have investigated these issues. The research has involved observational, longitudinal data collection in the U.S. and U.K. Three questionnaires were used for pre- and post-search interviews: reference interview, information seeker post-search and search intermediary post-search questionnaires. In addition, the Sheffield team employed a fourth set of instruments in a follow-up interview some two months after the search. Related search episodes, with a professional search intermediary using the Dialog Information Service, were audiotaped and search transaction logs recorded. The findings are presented in four parts. Part I presents the background, theoretical framework, models, and research design used during the research. Part II is devoted to results related to uncertainty. Part III provides results related to successive searching. Part IV reports findings related to cognitive styles, individual differences, age and gender. Further papers will discuss findings from this complex research project.

 

INTRODUCTION

The research reported in this series of papers, of which this is the first, resulted from a marriage of two separately funded investigations. The first, in funding and in time, was a project supported by the National Science Foundation and undertaken at the University of North Texas by Dr. Amanda Spink (now at The Pennsylvania State University). The aim of this project was to study the process of successive searching in information retrieval, investigating the nature, manifestations, and behavior of successive searching in IR, in order to derive criteria for use in the design of IR interfaces and systems supporting successive searching behavior. In addition, the project was to study the nature and extent of successive IR search episodes by a set of users over time, characterizing the changes that occur in the user's situational context, information problem, and cognitive and affective states, and consequently in the queries, as well as changes in users' relevance judgments and criteria.

The second project was supported by the British Library Research and Innovation Centre and was proposed by Dr. T.D. Wilson in collaboration with his colleagues, Dr. David Ellis and Nigel Ford. The aim of this project was to investigate several aspects of information seeking and searching in the context of a theoretical model of the problem-solving process, incorporating existing models of successive searching and information-seeking behavior, notably the theoretical perspectives of Kuhlthau (1993) and Ellis (1989). Within an overall model of the problem-solving process, this project addressed research goals relating to the following:

  • Evaluate models of information searching in information retrieval (IR) systems;
  • Test whether the proposed model of information-searching as related to problem solving is valid for the population in question;
  • Establish whether the use of Kuhlthau's (1993) model of information searching as a stage process fits the suggested model of multiple searches in a problem solving strategy;
  • Examine whether Ellis's (1993) behavioral model of the search process is a more appropriate model in the problem-solving context, and
  • Explore whether the concept of individual differences (e.g., cognitive styles) is valuable in explaining differences in problem-solving and searching behavior.

Both projects adopted a common theoretical framework so that jointly developed research instruments could be employed in the two projects. In evolving the theoretical framework we had in mind that information seekers with a broad problem (as distinct from the search for a specific fact) often seek information in stages over extended periods and use a variety of information resources. As time progresses, information seekers may search the same or different digital information environments, such as IR systems (on-line databases, CD-ROM databases, on-line public access catalogs (OPACs), the Web or digital libraries), as well as document collections of various kinds, for answers to the same or an evolving problem. The process of searching information environments, possibly over time, in relation to an evolving information problem is the major focus of our research, presented in a series of related papers of which this is the first. The phenomena we examined were search episodes within an information-seeking process. Our goal was to make a contribution to the modeling of information-seeking behavior, i.e., user modeling. Key variables in the analysis are changes or shifts in behavior, such as search strategy or language use, during search episodes over time, and a key constant is the same or evolving information problem. The evolution, if any, of an information problem and other cognitive, affective and situational variables may be mapped and the interactions among variables explored.

A growing body of research is beginning to explore information-seeking processes over time, especially since the development of Web-based services and the migration of 'traditional' on-line search services to the Web. IR and Web interfaces are designed to help users in various ways in their searches of digital environments, but IR systems generally follow a single search paradigm. That is, they are designed and operate on the assumption that an information seeker's search episode is an end in itself, unrelated to other searches or the information seeking process more generally. Research in this area is growing in significance as the size and variety of information resources in IR systems, digital libraries and the Web grow exponentially; the problem of searching becomes critical. Our research project presented in this paper is oriented toward exploring the human dimensions in the design of IR interfaces and search engines as well as to a more general understanding of the processes of information seeking and their relationship to the nature of problems and the stages people go through in reaching solutions.

Our approach to this research is predicated on a conceptual and theoretical framework drawn from previous studies that are outlined in the next section of the paper.

CONCEPTUAL AND THEORETICAL FRAMEWORK

Our research is embedded in a theoretical framework that draws on previous studies in the fields of both IR and human information behavior (HIB). The field of IR has developed as two related but largely unconnected sub-fields, one that focuses on systems aspects, and the other that focuses on the human, cognitive and interactive aspects. The field of HIB is related to the human, cognitive and interactive sub-field of IR and seeks to investigate the broader issues related to human processes for seeking and using information. However, interactive IR research and HIB research have been largely unconnected despite their mutual interest in areas of human information-related behavior. The appropriate integration of elements of both fields is growing in importance, particularly to further the development of more effective theoretical models, and of Web and IR systems design and evaluation (Spink, 1999; Vakkari, 1999; Wilson, 1999). Researchers have called for the gap to be bridged as both fields have matured theoretically and intellectually with the emergence of models and research findings from empirical studies of human information behavior, searching and seeking. Our research seeks to contribute to the further integration of interactive IR and HIB research. Our theoretical framework extends that for relevance developed by Spink, Greisdorf and Bateman (1998) and a model of human information seeking developed by Wilson (1999).

Interactive search episodes are represented by interactive IR models, including Ingwersen (1992, 1996), Belkin, Cool, Stein and Thiel (1995), and Saracevic (1996a, 1997). Over time, movements or shifts may take place during interactive search episodes and between searches, including changes in tactics, the definition of the information problem, strategies, terms, feedback, goal states, or uncertainty.

Time as a factor may be represented by:

  1. Human problem solving processes, represented in Wilson's (1999) problem-solving model of information-seeking behavior in which interactive search episodes provide the informational framework to the problem-solving process through which the user's uncertainty level is reduced,
  2. Human information seeking stages, represented by Kuhlthau's Information Search Process Model (1993),
  3. Information seekers' successive searches over time related to the same or evolving information problem, as developed by Spink (1996, 1999).

We suggest that our theoretical framework is a basis for the development of our theoretical and empirical research towards integration of interactive IR within information-seeking contexts, and for exploring information seekers' interactive search episodes within their changing information-seeking behavior.

Time

In our theoretical framework a set of situated actions occurs during IR interactions over time. Mizzaro (1998) also includes time as a key element in IR. An information seeker makes judgments during an evolving information-seeking process or during successive search episodes. We suggest that sets of problem-solving activities engaged in by individuals may be used as the primary means of examining the way time is used and that the problem-solving process also serves as the framework within which other aspects of information seeking and searching may be explored. Thus, problem solving may involve information-seeking activities, within which search episodes take place. Within those episodes interactions of various kinds occur, e.g., the relatively mechanical tasks of entering search terms and pressing keys as well as the mental tasks of selection and relevance judgment formation in response to retrieved items.

In our theoretical framework, the top-level process is the human problem solving process that is discussed in the next section of the paper.

Problem-Solving Process

Problem solving is used as the top-level device to explain why people engage in information-seeking behavior. Wilson's Problem-Solving Model (1999) presents information-seeking behavior as goal-directed behavior, with the resolution of the problem and/or the presentation of the solution as the goal. In moving through each of the stages of problem identification, problem definition, problem resolution, and solution presentation, 'uncertainty' must be resolved, and individuals are seen as engaging in interaction episodes with information sources (including people and other sources as well as IR systems) to resolve uncertainty. Of course, the attempt to resolve uncertainty may actually increase it and, therefore, the model provides for feedback at each stage. Within the problem-solving process, information seeking is related to Kuhlthau's Search Process Model (1993) as a highly developed model of the information-seeking process. Ellis's (1989) behavioral characteristics are seen as applying to search activities at any stage of the problem-solving process or at any of Kuhlthau's search stages.

To extend the theoretical framework, the problem-solving process is seen as including one or more information-seeking episodes. In addressing the nature of searches during an information-seeking process, a major research effort has to be directed toward the development or adaptation of an appropriate model or models of information seeking and interactive IR. We start with the information-seeking models summarized below. The objectives in using these models are to: (i) identify a useful subset of the features or variables in search episodes, and (ii) define the content of questionnaires to enable collection of data related to these variables and to shifts and transitions between episodes.

Information-Seeking Episodes

Results from information-seeking studies support the notion of successive searches of digital information environments over time by showing that humans progress through a series of stages, adopt different strategies and exhibit different information behaviors at different stages of their information-seeking process (Ellis, 1989; Kuhlthau, 1993). Kuhlthau's Search Process Model (1993) is currently the most developed stage model of the information-seeking process. Kuhlthau (1993) found that the information-seeking process of library patrons occurred in six clearly defined stages related to the cognitive and affective states and search activities of the users: task initiation, topic selection, prefocus exploration, focus formulation, information collection, and search closure. Although Kuhlthau did not investigate the use of IR systems by library patrons, her findings suggest that IR system users continue to collect and seek information throughout their information-seeking process, using or requiring different types of information, conducting different types of searches, and using different search terms and strategies at different stages of an information-seeking process (Kuhlthau, Spink & Cool, 1992). Ellis (1989) and Ellis et al. (1993) define the following characteristics of information-seeking behavior, without typifying these as stages: Starting, Chaining, Browsing, Differentiating, Monitoring, Extracting, Verifying, and Ending.

Wilson (1998) suggests that the Ellis and Kuhlthau models may be viewed as closely related, if a stage process is imposed on Ellis's characteristics. Under this revision, the activities of Chaining and Monitoring are seen as a deeper specification of Kuhlthau's Collection stage. In Wilson's problem-solving model, the Kuhlthau and Ellis models describe behavior within one loop of a problem-solving stage.

The next section of the paper discusses the relationship between uncertainty and information seeking in our theoretical framework.

Uncertainty

The concept of uncertainty has had little treatment in information science, except in defining information as that which reduces uncertainty. Under this definition, data that, for example, increases uncertainty is not information. Ingwersen (1992) has defined the relationship between uncertainty and information seeking as action undertaken to resolve doubts that cannot be resolved by thinking alone; this is broadly equivalent to Belkin's (1985) concept of the anomalous state of knowledge. Kuhlthau and Ledet (1996) suggest the need for further research on uncertainty from the user's perspective to understand the full range of impact of uncertainty in all of its manifestations on the information-seeking behavior of human beings in the conduct of their daily lives. This project is intended to contribute to that research agenda.

Cognitive Styles

Cognitive styles represent a class of variables that may influence information seeking at all levels across our model. Witkin has noted how cognitive style would appear to exercise pervasive effects across intellectual and social functioning, extending even to basic perception (Witkin, Moore, Goodenough & Cox, 1977). Theoretically it would seem possible that cognitive style differences may influence problem perception and uncertainty tolerance as well as strategic and tactical approaches to problem solution (Ford, 1999). Empirical work within information science suggests that cognitive style differences can influence information-seeking dialogues with databases (Ford & Ford, 1993) and search tactics at the level of differential use of Boolean operators (Ford, Wood & Walsh, 1994).

In our theoretical framework, the next level down is the interactive search of a digital information environment within an information-seeking process. This is discussed in the next section of the paper.

Interactive Search Sessions

Interactive search sessions, depicted in interactive IR models, take place within human information-seeking processes. Research into the human or cognitive (user modeling) aspects of IR is also in its infancy, with a growing body of research on users' interactivity and measures for observing user interactivity (Saracevic, Mokros, Su & Spink, 1991). Major theorized interactive IR models have recently emerged, but have yet to be empirically tested. Researchers are also beginning to investigate the context of users' searches and evaluation (Ellis, 1997) and to identify key elements in a user's single search of an IR system. IR interactions related to the single search episode can be represented by different theoretical interactive IR models: Ingwersen's Cognitive Model of IR Interaction (1992, 1996), Belkin, Cool, Stein and Thiel's (1995) Episodic Interaction Model, and Saracevic's Stratified Model of IR Interaction (1996a, 1997).

Our research initially uses the Saracevic Stratified Model of IR interaction (1996a) within our integrated model of information seeking and searching. The model views the interaction as a dialogue between participants, user and computer (system) through an interface at a surface level; furthermore, each of the participants is depicted as having different levels or strata. Interaction is the interplay between various levels. On the user side elements involve at least these levels: cognitive, affective, and situational. We include in our theoretical framework elements from information seeking models and interactive IR models that describe the phenomena of successive and related searches of digital environments by humans during an information seeking process.

Successive Searching Behavior

Much IR research has overlooked any successive searches related to the same information problem. Important elements within single searches have been identified, including feedback types and effective search term selection strategies (Spink & Saracevic, 1997). The differences between end-user and intermediary searching behaviors have also been investigated (Hsieh-Yee, 1993). Research has also shown that end-users perform different search sessions over time (related to a different information problem), including searches of successive databases or IR systems (Saracevic, Mokros, Su & Spink, 1991). That study found that 18 (45%) of the academic users had had a previous mediated on-line search on the same topic, frequently with the same search intermediary. Studies by Huang (1992), and Robertson and Hancock-Beaulieu (1992) have also identified successive searches by users.

Recent studies show that users also conduct successive searches over time when seeking information to solve an information problem. Spink (1996) showed that successive IR searches are a fundamental aspect of users' behavior when seeking information related to an information problem. Data from survey interviews with 200 academic users show that for the same information problem: (1) 113 (56.5%) users had conducted more than one IR search, (2) 43 (21.5%) users had conducted five or more IR searches, and (3) many users had conducted successive searches at different stages of their information-seeking process. A recent study showed that Web users also perform successive searches of the same Web search engine when seeking information on a particular topic over time (Spink, Bateman & Greisdorf, 1998).

A growing body of studies is beginning to examine the patterns of users' successive search behavior. Data from several recent studies highlight the weakness of research based on the single search approach and the need for studies that classify and categorize users' successive search behavior. Recent studies show that users conduct successive IR searches when seeking information related to a particular information problem. Our research highlights the need for longitudinal research at a problem level of analysis as opposed to a single-search level of analysis of searching behavior.

Our research encompasses specific processes or phenomena that play a crucial role in IR interaction: the notion of relevance, user modeling, selection of search terms, and feedback types (Spink & Saracevic, 1997). Queries, commands and responses are surface phenomena that can be observed and analyzed as to nature, content, and shifts on the surface level, as planned here. But a more significant and more difficult connection has to be made to the other levels on both the user and computer sides, and to the feedback loops among all levels, that affect shifts. These shifts take place within a set of situated actions.

Set of Situated Actions

Shifts

Finally, we have to deal with the difficult concept of shifts, a widely discussed concept that has not been specifically elaborated in IR. A shift signifies a change from one state to another of any variable involved, or from one phase or action to another, based on a reason and geared toward a result. Previous studies and models suggest some initial types of shifts for analysis, including shifts in feedback behavior (Spink, 1997), types of search terms (Spink & Saracevic, 1997), interactive search focus (Robins, 2000) and interactive intentions (Xie, 2000). Research is needed to identify a taxonomy of shifts related to interactive searching. However, methodologically it is not clear how to record all given shifts and how to categorize and describe given changes and their outcomes. Thus, an important part of the research has been methodological in nature: establishing and testing various methods for categorizing and measuring shifts related to interactive IR from the data obtained.

Similar methodological problems are encountered in the analysis of phase transitions in communication discourse, particularly in negotiations (Holmes, 1992). There are significant similarities between such phase transitions and the type of activities users engage in while negotiating with IR systems for an effective search, hence the parallel. In general, we can think of a shift space within an episode and between episodes in which changes can be mapped. Shifts will be characterized, depending on the variable, as to changes in focus, content, direction, substitutions, trade-offs, and other characteristics. Shifts may also be viewed as an effort in managing change toward a result and, as such, as a trial-and-error procedure, as often mentioned in describing the IR process. We also study changes or shifts in relevance judgments.
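
As an illustration of the definition of a shift given above (a change of a variable from one state to another), the following minimal Python sketch treats each search episode as a snapshot of coded variables and reports a shift wherever a variable changes state between consecutive episodes. The variable names and state labels (search_focus, problem_stage, and so on) are hypothetical and are not the coding scheme used in the project.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Shift:
    """A change of one variable from one state to another between episodes."""
    variable: str
    from_episode: int
    to_episode: int
    old_state: str
    new_state: str


def find_shifts(episodes: List[Dict[str, str]]) -> List[Shift]:
    """Compare consecutive episodes and record every variable whose state changed."""
    shifts = []
    for i in range(1, len(episodes)):
        before, after = episodes[i - 1], episodes[i]
        # Only variables observed in both episodes can be compared.
        for variable in sorted(before.keys() & after.keys()):
            if before[variable] != after[variable]:
                shifts.append(
                    Shift(variable, i - 1, i, before[variable], after[variable])
                )
    return shifts


# Hypothetical coded observations for three successive search episodes.
episodes = [
    {"search_focus": "broad topic", "problem_stage": "problem definition"},
    {"search_focus": "narrow topic", "problem_stage": "problem definition"},
    {"search_focus": "narrow topic", "problem_stage": "problem resolution"},
]

for shift in find_shifts(episodes):
    print(shift)
```

A representation of this kind records only that a state changed; the reason for a shift and its intended result, which are part of the definition above, would still have to be coded from interview and transcript data.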

Relevance Judgments

An important variable examined in our research is relevance. Relevance in IR is an attribute or criterion reflecting the effectiveness of the interactive exchange of information between users and IR systems in a communicative contact. The interaction involves different levels or strata at which interactions are inferred, producing an interdependent system of relevances. Thus, there is a distinction between system or algorithmic relevance; topical or subject relevance; cognitive relevance or pertinence; motivational or affective relevance; and situational relevance or utility (Saracevic, 1997).

Recent relevance research also highlights the need to investigate how users' relevance judgments and relevance criteria change over successive searches. Relevance studies have focused on the nature of users' relevance judgments during a single search. Automatic relevance feedback techniques, based on users' judgments of highly relevant items, are more effective than traditional approaches to IR systems (Spink & Losee, 1996). Recent research by Bateman (1998) is also beginning to explore how users' relevance judgments and relevance criteria change over successive searches. Bateman's study focuses on the changes in users' criteria for documents judged highly relevant by a user over successive searches.

Recent research on levels and regions of relevance judgments suggests that users' partial relevance judgments also play an important role in the search process and are linked to shifts in the user's information problem during single and successive searches (Spink, Greisdorf & Bateman, 1998; Spink & Greisdorf, 2001). In our theoretical framework, building on Spink, Greisdorf and Bateman (1998), relevance judgments as situated actions consist of relevance levels based on five manifestations: system or algorithmic relevance, topical or subject relevance, cognitive relevance or pertinence, situational relevance or utility, and motivational or affective relevance. Degrees of relevance are situated within one of four relevance regions: highly relevant, partially relevant, partially not relevant, and not relevant. Thus, a user's relevance judgment can be situated as to relevance level and relevance degree. For a finer-grained analysis, many more regions of relevance can be delineated as the granularity of relevance regions is sharpened. An overlay of the two dimensions (level and region) of a relevance judgment is represented on the set of situated actions. A user also makes a relevance decision at a specific point in time during or after the IR interaction, and a graphical representation of such decisions related to retrieved texts could also be plotted.
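
The two dimensions of a relevance judgment described above (level and region), together with the time at which the judgment is made, can be sketched as a small data structure. This is a minimal Python illustration only: the five levels and four regions are taken from the text, while the field names item_id and minutes_into_search and the sample judgments are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, Iterable, Tuple


class RelevanceLevel(Enum):
    SYSTEM = "system or algorithmic relevance"
    TOPICAL = "topical or subject relevance"
    COGNITIVE = "cognitive relevance or pertinence"
    SITUATIONAL = "situational relevance or utility"
    MOTIVATIONAL = "motivational or affective relevance"


class RelevanceRegion(Enum):
    HIGHLY_RELEVANT = "highly relevant"
    PARTIALLY_RELEVANT = "partially relevant"
    PARTIALLY_NOT_RELEVANT = "partially not relevant"
    NOT_RELEVANT = "not relevant"


@dataclass(frozen=True)
class RelevanceJudgment:
    item_id: str                 # hypothetical identifier of a retrieved text
    level: RelevanceLevel
    region: RelevanceRegion
    minutes_into_search: float   # hypothetical time stamp of the judgment


def overlay(
    judgments: Iterable[RelevanceJudgment],
) -> Dict[Tuple[RelevanceLevel, RelevanceRegion], int]:
    """Count judgments in every cell of the level-by-region grid."""
    grid = {(lvl, reg): 0 for lvl in RelevanceLevel for reg in RelevanceRegion}
    for judgment in judgments:
        grid[(judgment.level, judgment.region)] += 1
    return grid


# Hypothetical judgments made by one information seeker.
judgments = [
    RelevanceJudgment("doc-1", RelevanceLevel.TOPICAL, RelevanceRegion.HIGHLY_RELEVANT, 4.5),
    RelevanceJudgment("doc-2", RelevanceLevel.TOPICAL, RelevanceRegion.PARTIALLY_RELEVANT, 6.0),
    RelevanceJudgment("doc-3", RelevanceLevel.SITUATIONAL, RelevanceRegion.NOT_RELEVANT, 9.2),
]

grid = overlay(judgments)
print(len(grid))  # number of cells in the level-by-region grid
```

Sharpening the granularity of the regions, as the text suggests, would only require adding members to RelevanceRegion; the overlay function is unchanged.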

RESEARCH OBJECTIVES

The objectives of our research are to investigate human information seeking and searching processes in the context of mediated on-line searching in order to derive models and criteria of use in systems design. In particular, our research seeks to characterize progressive changes and shifts that occur in users' information seeking and searching processes, including changes in: user situational context; user information problem; uncertainty reduction; cognitive and affective states of users, and consequently their queries; users' cognitive styles; and user relevance criteria and judgments.

RESEARCH DESIGN

Data Collection

This empirical research is based on observation of real-life situations, as opposed to laboratory situations. The information problems driving information seekers in the research are user-initiated, not imposed on our study participants as in a laboratory study. A weakness of such empirical studies is that the results should be treated as hypotheses for generalization and further confirmation. Data were collected from a set of 198 information seekers engaged in tasks or problems that produce real information needs and who, consequently, search operational IR systems with or without assistance by a professional search intermediary. Every effort has been made to preserve the reality of situations, observations, and recordings.

The data collected during our research included: (i) search transaction logs, (ii) numerical data and responses to given questionnaires, (iii) texts retrieved and the relevance judgments made on them, and (iv) in Sheffield, responses to a standard test of cognitive styles. The research was conducted for eighteen months in the U.S. and two years at the University of Sheffield. Table 1 lists the basic data from our study.

TABLE 1

Data were collected on a total of 198 cases: 87 at UNT and 111 at Sheffield. The US sample had a much higher proportion of female clients than the Sheffield sample, but the difference is not statistically significant (chi-squared = 3.08, p > .05). The age ranges of the US and UK participants were very similar, as shown in Table 2.

TABLE 2
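
The chi-squared value of 3.08 reported above for the gender comparison can be checked against the .05 level with a short calculation. The sketch below assumes one degree of freedom (a 2 x 2 table of site by gender); the paper does not state the degrees of freedom, so this is an assumption. For one degree of freedom the critical value at the .05 level is 3.841, and the upper-tail probability can be computed with the complementary error function from the Python standard library.

```python
import math


def chi2_p_value_df1(statistic: float) -> float:
    """Upper-tail p-value of a chi-squared statistic with ONE degree of freedom.

    With df = 1 the statistic is the square of a standard normal variable,
    so P(X > x) = erfc(sqrt(x / 2)).
    """
    if statistic < 0:
        raise ValueError("a chi-squared statistic cannot be negative")
    return math.erfc(math.sqrt(statistic / 2.0))


CRITICAL_VALUE_DF1_05 = 3.841  # chi-squared critical value for df = 1, alpha = .05

reported_statistic = 3.08      # value reported in the text
p_value = chi2_p_value_df1(reported_statistic)

print(f"p = {p_value:.3f}")
print("significant at .05:", reported_statistic > CRITICAL_VALUE_DF1_05)
```

Under the one-degree-of-freedom assumption the p-value is roughly 0.08, which is consistent with the statement that the difference is not significant at the .05 level.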

The US sample had a slightly higher proportion of clients aged under 30, and a slightly higher proportion of clients in the 40 to 49 group, while at Sheffield a higher proportion was in the 50 to 59 age group. However, the differences are very small and, overall, therefore, we can say that there was no statistically significant difference.

Clients were classified by broad discipline, i.e., humanities; 'pure' social sciences, such as economics, political science, sociology, etc.; applied social sciences, such as social welfare and social administration; pure science; medicine; and engineering. The numbers of humanities and medical clients were rather small and the former were incorporated into the pure social sciences group, while the latter were included in the pure science group. This gave four discipline categories. The UK and US clients were distributed over these four categories as shown in Table 3.

TABLE 3

Institutions

Information seekers participating in the research were primarily from the University of North Texas (UNT) [http://www.unt.edu] and the University of Sheffield [http://www.shef.ac.uk].

Search Intermediaries

Three trained search intermediaries performed mediated on-line searches at the University of North Texas and one search intermediary searched at the University of Sheffield.

Information Seekers

A total of 198 information seekers participated in the project. As an inducement, free mediated DIALOG on-line searches were offered. E-mail calls for participation were issued to recruit information seekers and to specify what was offered and required. In Sheffield, respondents to a regular call for research participants in the teaching of information searching to students received an explanation of the research undertaken and details of their engagement. Information seekers were generally engaged in research, development, planning, or a similar project that created information problems and required information support from IR systems, including searching of networked information resources, such as the Web.

Procedures

Pre-Search Interview: In this first interview, a detailed description of the participant's problem was obtained, together with responses to interview questions and responses to a questionnaire, which covered, for example, problem stage, Kuhlthau's (1993) stages, feelings about the progress of the work, other information seeking activities, and uncertainty.

On-line Search and Post-Search Interview: Immediately before the search, Sheffield participants completed a test mounted on the PC, which automatically recorded various dimensions of cognitive style (Riding, 1991). During the search, computer logs were kept, together with audiotapes of the interaction between information seeker and the search intermediary. After the search, the participants completed another questionnaire on aspects of the search and, again, on their certainty/uncertainty with regard to different stages of problem resolution. The search intermediary also completed a search assessment instrument.

Follow-Up Interview (Sheffield Only): This interview was conducted a minimum of two months after the search and sought an evaluation of the retrieved material, together with a repetition of the previously used instruments.

Questionnaires

Three questionnaires were used to record various aspects of context that are not recordable in transaction logs: an information seeker pre-search (reference interview) questionnaire, an information seeker post-search questionnaire, and a search intermediary post-search questionnaire. The aim of the pre- and post-search questionnaires was to capture the information seeker's state in a number of areas before and after their search. This allowed the measurement of changes or shifts by information seekers resulting from their search. The questionnaires were based on those used in major studies of on-line searching by Saracevic, Kantor, Chamis and Trivison (1988) and Saracevic (1989), with the addition of items relating to uncertainty, problem-solving state, Kuhlthau's (1993) stages and affective states, and Ellis's (1989) behavioral characteristics, and with changes to wording and scale type. In most cases a 'visual analog' scale was adopted (DeVellis, 1991), where the respondent marks a point on a line connecting the polar points of an item. One advantage of this type of scale, which makes it appropriate to a longitudinal study, is that it is potentially sensitive, making it, '…especially useful for measuring phenomena before and after some intervening event…' (DeVellis, 1991). Again, appropriately for this investigation, when a measure is used more than once, it is difficult for respondents to remember exactly how they marked an item on the earlier occasion, so that a respondent bias towards consistency in responding is avoided.
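
To show how a visual analog item can yield a pre- to post-search shift, the following minimal Python sketch converts the position of a respondent's mark on the line into a score between 0 and 1 and takes the difference between the two occasions. The 100 mm line length, the function names and the sample marks are assumptions made for illustration; the paper does not specify how the marks were measured or scored.

```python
def vas_score(mark_mm: float, line_length_mm: float = 100.0) -> float:
    """Convert a mark on a visual analog line to a score between 0 and 1.

    The 100 mm default line length is an assumption, not a detail from the study.
    """
    if line_length_mm <= 0:
        raise ValueError("line length must be positive")
    if not 0.0 <= mark_mm <= line_length_mm:
        raise ValueError("the mark must lie on the line")
    return mark_mm / line_length_mm


def vas_shift(pre_mark_mm: float, post_mark_mm: float,
              line_length_mm: float = 100.0) -> float:
    """Shift on one item: post-search score minus pre-search score."""
    return vas_score(post_mark_mm, line_length_mm) - vas_score(pre_mark_mm, line_length_mm)


# Hypothetical marks on an uncertainty item, measured from the 'certain' pole.
pre_search_mark = 72.0
post_search_mark = 40.0

shift = vas_shift(pre_search_mark, post_search_mark)
print(f"shift in uncertainty: {shift:+.2f}")
```

With the line oriented in this way, a negative value indicates a reduction in uncertainty after the search and a positive value an increase.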

[Copies of the questionnaires are available at http://www.shef.ac.uk/~is/publications/unis/app1.html].

The Sheffield team also used a questionnaire to collect information on changes in relevance judgments on retrieved items some two months after the initial on-line search, after the information seeker had acquired the documents. These questionnaires capture answers to cognitive, affective, situational, relevance, and process variables suggested by the reviewed models, and were pre-tested and revised during pilot applications. Before the mediated search each information seeker completed a consent form (U.S. only), a demographic form and a pre-search questionnaire. After the mediated search each information seeker completed a post-search questionnaire.

Cognitive Styles Test

Holist/serialist differences: A shortened version of Ford's Study Processes Questionnaire (Ford, 1985) was used to assess the holist and serialist information processing differences identified by Pask (1979). Measures consisted of (a) holist scores; (b) serialist scores; and (c) a measure of bias to holist or serialist (computed by subtracting the serialist score from the holist score).

Field-dependent/-independent differences: Riding's (1991) Cognitive Styles Analysis (CSA) is a measure that offers computerized administration and scoring. The CSA measures Witkin's field-dependence/-independence (Riding & Sadler-Smith, 1992).
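The holist/serialist bias measure described above is a simple difference, which the following minimal sketch makes explicit. Only the subtraction (holist minus serialist) comes from the text; the assumption that each subscale score is the sum of its item responses, and all names used, are hypothetical.

```python
from typing import NamedTuple, Sequence


class StyleScores(NamedTuple):
    holist: int
    serialist: int
    bias: int  # positive = bias to holist, negative = bias to serialist


def score_styles(
    holist_items: Sequence[int], serialist_items: Sequence[int]
) -> StyleScores:
    """Compute the three measures from item responses.

    Assumption: each subscale score is the plain sum of its item
    responses. The bias follows the text: holist minus serialist.
    """
    holist = sum(holist_items)
    serialist = sum(serialist_items)
    return StyleScores(holist, serialist, holist - serialist)
```

For example, `score_styles([4, 3, 5], [2, 2, 3])` gives a holist score of 12, a serialist score of 7, and a bias of 5 (towards holist).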

Information Seeking and Searching Context Variables

Saracevic, Kantor, Chamis and Trivison (1988) provide a general model of information seeking and retrieving. In Figure 1 we present an extension to this general model that is used as a basis for our data collection.

FIGURE 1

The model provides a general overview of the variables selected for our study. Information seekers were asked to give their perceptions on a number of issues discussed below:

Information Problems

Human information problems were a key element in our study. In order to analyze the participants' information problems we sought to capture the context of each problem by asking the clients to describe it. At UNT this was done by asking the clients to write a description of their information problem and the search terms they envisaged, while at Sheffield the whole context was obtained through the detailed pre-search interview with the client. The context of the information problem included the pre-search, problem-solving stage of the information seeker.

Data Analysis

At this point, the analysis of data is still in progress. Most of the analysis to date has been quantitative in character, employing standard statistical tools from the SPSS package. The focus has been on identifying relationships in the data and on testing hypotheses. In addition to this introductory paper, three more are in preparation: one deals with correlations between uncertainty and other aspects of the client's search for information such as the Kuhlthau stages and affective variables; the second deals with the relationships between personal characteristics, including cognitive style, and information-seeking behavior; and the third explores the limited amount of data available on clients who carried out successive searches over the duration of the project.
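The project's analysis was carried out in SPSS. Purely as an illustration of the kind of relationship being examined, a Pearson correlation between two sets of scores can be sketched in Python as below. The choice of Pearson's coefficient is an assumption made for this sketch, since the specific statistics used are reported in the later papers, and the function name is hypothetical.

```python
import math
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson product-moment correlation between two paired samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two paired samples of equal length >= 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations about the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return sxy / math.sqrt(sxx * syy)
```

In use, `x` might hold uncertainty scores and `y` scores on an affective variable for the same clients, with one pair of values per client.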

Qualitative data were also collected, in the form of the pre-search interviews, the tape-recorded interactions between client and intermediary, and the search logs. These data will be used in further papers and will also be drawn upon, as appropriate, to illuminate aspects of the clients' information-seeking behavior.

Possible qualitative methods include: content analysis, structuring of taxonomies depicting structure and relations of various types of actions and specific variables, derivation of various diagrams and structures to describe shifts, derivation of semantic roadmap diagrams, and principles and criteria derived from grounded theory research. Further papers on specific issues will describe the methodologies employed, as appropriate.

CONCLUSION

In this paper we have presented a theoretical framework for understanding the actions taken by information-seekers in their search for information, including the actions involved in the interactive search process, where an intermediary was employed. We have explored the changes that take place over time in the problem stage in which they are engaged and in their level of uncertainty about the enterprise, and we have related these variables to other variables derived from the work of other researchers such as Kuhlthau and Ellis. We have a limited number of cases of clients who returned for more than one search and, in these cases, have data on the changes that took place in the variables of interest.

Our theoretical framework consists of a set of situated actions, including levels and regions of relevance judgments and other human judgments, within interactive search episodes over a period of time. The period of time consists of interaction time (i.e., that involved in the on-line search process) and information-seeking time (i.e., the total time spent in seeking information from any source), and can be represented as human information-seeking stages and successive searches over time related to the same or an evolving information problem. Acts, decisions and judgments take up more time than the interactions with IR or other systems. For example, judgments about the relevance of documents may continue beyond the interaction period, and decisions to use IR systems clearly must pre-date interaction with them. Furthermore, the interactive search episodes are not instants but periods of time, of varying duration.

The integration of our theoretical framework with that of Wilson (1999) links interactive search episodes to his uncertainty reduction episodes and the problem-solving process over time, and provides one explanation of the need for successive interactions. The implication of the proposed model is the potential ability to isolate a user's situated actions at particular points in time, assessing levels of relevance, regions of relevance, and problem process (represented by information-seeking stage and successive searches). This could have major implications for system design and design criteria. In particular, a user's successive interactions could be analyzed in terms of relevance judgments, uncertainty reduction, and problem-solving stage. The resulting complex picture, if derived through interaction with the user and displayed graphically as an aid to searching, could improve the interactivity of IR systems and, when logged, could provide a rich source of data for research.

Further development of the theoretical framework is needed for research to: (1) integrate interactive IR research within information-seeking research, (2) explore users' interactive search episodes within their changing information-seeking contexts, (3) examine relevance judgments within users' information-seeking processes, (4) broaden relevance research to include the concurrent exploration of relevance judgment level, region and information-seeking or problem-solving phase, and (5) conceptualize and explore interactive IR evaluation within an information-seeking context. Such research would allow the development of an integrated view of a user's interactive IR processes within their changing information-seeking context.

Our theoretical framework has strengths and weaknesses. A key strength is its focus on the larger picture that embraces information seeking and information searching, and the drawing together of major concepts - situated actions, relevance, IR interaction and time. Other concepts, such as feedback, representation, information problem, and context, are also incorporated into the framework. The theoretical framework can integrate existing and future research and models from IR and information seeking. A further strength is the framework provided for gathering, plotting and testing data from users. The framework also has weaknesses. It tends to focus on major dimensions and not on specific differences in information-seeking contexts, and it is specifically related to the IR context, not to information seeking in general. Despite these limitations, the authors believe that their framework, derived from previous studies, provides a reasonable and heuristic approach from which to build further theoretical and empirical research.

Our approach should be considered exploratory. Obviously, there are many kinds of shifts on a number of levels. For instance, on the surface level there are semantic, syntactic, and logic shifts in selected search terms and statements; on the situational level there are shifts in problem definition; there are shifts in focus, and so on. It is not even clear how various shifts should be classified. In other words, by necessity, shifts and transitions will be a major focus of the proposed research. Currently, IR systems and interfaces provide limited assistance to information-seekers over related search episodes. This research will not specifically design new interfaces or improve old ones to accommodate users' successive search episodes. However, design criteria derived here will be significant in the design of human-oriented systems.

ACKNOWLEDGMENTS

This paper resulted from studies funded by a National Science Foundation POWRE Grant 1998-99 and a British Library Research and Innovation Centre Grant. The authors acknowledge the valuable comments of Tefko Saracevic of Rutgers University on the development of this paper.

This is a draft of a paper published in Journal of the American Society for Information Science and Technology, Volume 53, No. 9, 2002, 695-703

REFERENCES

  • Bateman, J. (1998). Changes in users' relevance criteria: An information seeking study. Unpublished Doctoral Dissertation. School of Library and Information Sciences, University of North Texas.
  • Belkin, N. J., Brooks, H. M., & Oddy, R. N. (1982). ASK for information retrieval: Part I: Background and theory. Journal of Documentation, 38(2), 61-71.
  • Belkin, N. J., Cool, C., Stein, A., & Theil, S. (1995). Cases, scripts, and information-seeking strategies: On the design of interactive information retrieval systems. Expert Systems With Applications, 9(3), 379-395.
  • Ellis, D. (1989). Behavioural approach to information retrieval system design. Journal of Documentation, 46, 191-213.
  • Ellis, D. (1993). Modeling the information seeking patterns of academic users: A grounded theory approach. Library Quarterly, 63(4), 69-86.
  • Ellis, D. (1997). The dilemma of measurement in information retrieval. Journal of the American Society for Information Science, 47(1), 23-36.
  • Ford, N. (1985). Styles and strategies for processing information: Implications for professional education. Education for Information, 3(2), 115-132.
  • Ford, N. (1995). Levels and types of mediation in instructional systems: An individual differences approach. International Journal of Human-Computer Studies, 43, 241-259.
  • Ford, N. (1999). IR and creativity: Towards support for the original thinker. Journal of Documentation, 55(5), 528-542.
  • Ford, N., & Ford, R. (1993). Toward a cognitive theory of information accessing: An empirical study. Information Processing and Management, 29(5), 569-585.
  • Ford, N., Wood, F., & Walsh, C. (1994). Cognitive styles and searching. On-line & CDROM Review, 18(2), 79-86.
  • Holmes, M. E. (1992). Phase structures in negotiation. In L. L. Putnam & M. E. Roloff, eds. Communication and negotiation. Newbury Park, CA: Sage (pp. 83-105).
  • Hsieh-Yee, I. (1993). Effects of search experience and subject knowledge on the search tactics of novice and experienced searchers. Journal of the American Society for Information Science, 44(3), 161-174.
  • Huang, M. H. (1992). Pausing behavior of end-users in on-line searching. Unpublished Doctoral Dissertation. University of Maryland.
  • Ingwersen, P. (1992). Information retrieval interaction. London: Taylor Graham.
  • Ingwersen, P. (1996). Cognitive perspectives of information retrieval interaction: Elements of a cognitive IR theory. Journal of Documentation, 52(1), 3-50.
  • Kuhlthau, C. C. (1993). Seeking meaning: A process approach to library and information services. Norwood, NJ: Ablex Publishing.
  • Kuhlthau, C. C., & Ledet, N. (1996). The relationship of information and uncertainty in information seeking. Proceedings of the 2nd International Conference on Library and Information Science: Integration in Perspectives, October 13-16, 1996. Edited by Peter Ingwersen and Niels Ole Pors: Royal School of Librarianship.
  • Kuhlthau, C. C., Spink, A., & Cool, C. (1992). Exploration into stages in the information search process in on-line information retrieval: Communication between users and intermediaries. Proceedings of the 55th Annual Meeting of the American Society for Information Science, 29, 67-71.
  • Mizzaro, S. (1998). How many relevances in information retrieval? Interacting with Computers, 10(5), 303-320.
  • Pask, G. (1979). Final report of SSRC Research Programme HR 2708. Richmond (Surrey): System Research Ltd.
  • Riding, R.J. (1991). Cognitive Styles Analysis. Birmingham: Learning and Training Technology.
  • Riding, R.J., & Sadler-Smith, E. (1992). Type of instructional material, cognitive style and learning performance. Educational Studies, 18, 323-340.
  • Robertson, S. E., & Hancock-Beaulieu, M. M. (1992). On the evaluation of IR systems. Information Processing and Management, 28(4), 457-466.
  • Robins, D. (2000). Shifts in focus on various aspects of user information problems during interactive information retrieval. Journal of the American Society for Information Science, 51(10), 913-928.
  • Saracevic, T. (1989). Modeling and measuring user-intermediary-computer interaction in online searching: Design of a study. Proceedings of the 52nd Annual Meeting of the American Society for Information Science, 26, 75-80.
  • Saracevic, T. (1996a). Modeling interaction in information retrieval (IR): A review and proposal. Proceedings of the Annual Meeting of the American Society for Information Science, 33, 3-9.
  • Saracevic, T. (1996b). Relevance reconsidered. Information science: Integration in perspectives. Proceedings of the Second Conference on Conceptions of Library and Information Science. Copenhagen, Denmark (pp. 201-218).
  • Saracevic, T. (1997). Extension and application of the stratified model of information retrieval interaction. Proceedings of the Annual Meeting of the American Society for Information Science, 34, 3-9.
  • Saracevic, T., Kantor, P., Chamis, A. Y., & Trivison, D. (1988). A study of information seeking and retrieving: Background and methodology. Journal of the American Society for Information Science, 39(3), 161-176.
  • Saracevic, T., Mokros, H., Su, L., & Spink, A. (1991). Interaction between users and intermediaries during online searching. Proceedings of the 12th Annual National Online Meeting, 12, 329-341.
  • Spink, A. (1996). A multiple search session model of end-user behavior: An exploratory study. Journal of the American Society for Information Science, 46(8), 603-609.
  • Spink, A. (1997). Interaction in information retrieval (IR): successive searches by users over time. National Science Foundation 1997 POWRE Grant proposal.
  • Spink, A. (1999). Toward a theoretical framework for information retrieval within an information-seeking context. In: T.D. Wilson, ed. Proceedings of the 2nd International Conference on Information Seeking in Context, August 12-15, 1998. Sheffield, U.K. London: Taylor Graham, 1999.
  • Spink, A., Bateman, J., & Greisdorf, H. (1998). Successive searching behavior during information seeking: An exploratory study. Journal of Information Science, 25(6), 439-449.
  • Spink, A., & Greisdorf, H. (2001). Regions and levels: Mapping and measuring users' relevance judgments. Journal of the American Society for Information Science, 52(3), 226-234.
  • Spink, A., Greisdorf, H., & Bateman, J. (1998). From highly relevant to not relevant: Examining different regions of relevance. Information Processing and Management, 34(2/3), 257-274.
  • Spink, A., & Losee, R. M. (1996). Feedback in information retrieval. Annual Review of Information Science and Technology, 31, 33-78.
  • Spink, A., & Saracevic, T. (1997). Interaction in information retrieval: Selection and effectiveness of search terms. Journal of the American Society for Information Science, 48(5), 382-394.
  • Strauss, A., & Corbin, J. (1990). Basics of qualitative research: Grounded theory procedures and techniques. Newbury Park, CA: Sage Publications.
  • Vakkari, P. (1999). Task complexity, information types, search strategies and relevance: integrating studies on information seeking and retrieval. In: T.D. Wilson, ed. Proceedings of the 2nd International Conference on Information Seeking in Context, August 12-15, 1998. Sheffield, UK. London: Taylor Graham (pp. 35-54).
  • Wilson, T. D. (1997). Information behavior: An interdisciplinary perspective. Information Processing and Management, 33(4), 551-572.
  • Wilson, T.D. (1999). Models of information behavior research. Journal of Documentation, 55(3), 249-270.
  • Wilson, T. D. (1999). Exploring models of information behaviour: the 'Uncertainty' Project. In: T.D. Wilson, ed. Proceedings of the 2nd International Conference on Information Seeking in Context, August 12-15, 1998. Sheffield, UK. London: Taylor Graham (pp. 55-66).
  • Wilson, T. D., Ellis, D., & Ford, N. (1997). Uncertainty in information seeking. A Research Proposal to the British Library Research and Innovation Centre.
  • Witkin, H.A., Moore, C.A., Goodenough, D.R., & Cox, P.W. (1977). Field-dependent and field-independent cognitive styles and their educational implications. Review of Educational Research, 47, 1-64.
  • Xie, H. (2000). Shifts in interactive intentions and information seeking strategies in interactive information retrieval. Journal of the American Society for Information Science, 51(9), 841-857.

Uncertainty in information seeking, by Professor Tom Wilson, Dr. David Ellis, Nigel Ford, and Allen Foster
Library and Information Commission Research Report 59
ISBN 1 902394 31 3      ISSN 1466-2949
Grant number LIC/RE/019