Search behaviour in electronic document and records management systems: an exploratory investigation and model
Department of Information Studies, School of Media, Culture and Creative Arts, Curtin University, GPO Box U 1987, Perth, Western Australia, 6845
Deputy Vice Chancellor and Head of Campus, The University of Notre Dame Australia, 19 Mouat Street, Fremantle, Western Australia, 6959
Business School, University of Western Australia, 35 Stirling Highway, Crawley, Perth, Western Australia, 6009
Electronic document and records management systems are organisational repositories that enable knowledge workers to register, update and access corporate documents and records (hereafter referred to as corporate information) they either create or employ in the course of their work in the organisation. In this article information comprises documents and records. A record is a finalised version of a document that cannot be modified. Documents and records stored in corporate electronic systems are assigned a unique file number. As such, these systems have functionalities that automatically track documents as they are checked in and checked out, manage version control, and maintain audit trails of corporate information. These systems are primarily designed and managed by records managers, and consequently, are developed and benchmarked according to the voluntary international best practice standard for records management principles: ISO 15489 Parts 1 and 2: Information and Documentation - Records Management (International Organisation for Standardisation 2001a, 2001b), hereafter referred to as ISO 15489.
ISO 15489 offers eight records management principles that guide professional practice: 1) policies; 2) procedures; 3) metadata standards; 4) classification schemes and thesauri; 5) retention and disposition schedules; 6) security permissions; 7) training; and 8) monitoring and auditing (Joseph 2012). These eight principles provide a benchmark of excellence for the implementation of records management programmes. The key ISO 15489 records management principles implemented in the design of electronic document and records management systems are metadata, classification schemes, retention and disposition schedules, and security and access controls.
In short, these systems provide the infrastructure for records management throughout a record's lifecycle from creation or capture to destruction or permanent storage of paper and electronic documents and records (see Joseph 2008). There are essentially two different types of design views (tree or virtual database) possible in these systems. A tree view design resembles the hierarchical folder structure view reflected in the network drive depiction of Microsoft's Windows Explorer. In contrast, a virtual database design offers no structural view of the classification folders. Instead, users must rely upon the system's search and registration windows to perform those activities. Hence, in the virtual database view, users are unable to visualise where their information is virtually filed or how they might navigate to it. The basic structure and functions of such systems, including a detailed description of functionalities and how document and records management systems differ, are described in Joseph (2008).
The research problem defined
While electronic document and records management systems are increasingly part of the technological infrastructure in many corporations, their application is less well understood, partly because of the dearth of research on this issue. We have little understanding about how users search for information stored in such systems, or the relationship between their use of these systems and other corporate knowledge sources. Further, there is little documentation about the process of search that is undertaken, and the different ways in which a user might operate if a search proves complex. These concerns gave rise to the research question investigated in this paper: How do users search for information from electronic document and records management systems in order to perform their work tasks?
Here we report on an exploratory study in which forty users were tracked as they undertook several different searches in the electronic document and records management system. The resultant analysis of the different search processes contributed to the development of a search behaviour model for this system. Early search behaviour models published by the author (Singh et al. 2007a, 2007b, 2008c and 2008d) did not address how users' tasks start their search and what happens when users are confronted by difficult searches. The revised model assists in describing how users search and the processes that they follow in making particular decisions. We also illustrate the differences between simple and difficult searches and the interactions between these systems and other corporate information sources.
While there is little research on search behaviour using electronic document and records management systems, the broader domains of information search theory, information seeking behaviour and information retrieval offer some cues as to how users might interact with such systems. There are various definitions of information seeking behaviour in the literature, generally related to 'how people need, seek, give and use information in different contexts' (Pettigrew and McKechnie 2001: 44). Fisher et al. (2005: xix) suggest manage could be added to this definition. Thus, information seeking relates to the process of identifying an information need, then sourcing and accessing the necessary information avenues to address that need. Bates (2005: 58) points out that the 'act of searching for information is, in itself, a very important part of the general behaviour of information seeking'.
Information retrieval is a significant activity in the information seeking process. Its origins are from database searching, where the information seeker wished to identify a range of resources from a wider pool of options. Meadow et al. (2007: 2), for example, suggest information retrieval 'involves finding some desired information in a store of information or a database'. This aligns with Marchionini's (1995: 6) definition that information retrieval 'implies that the object must have been "known" at some point; most often, those people who "knew" it organized it for later "knowing" by themselves or someone else'. This explains why Meadow et al. (2007) described information retrieval as a communication process, as in a sense it is a means by which authors, creators or registrants of the information in the computer system communicate with the users of the system. The assumption that the object is knowable and therefore retrievable underpins much of the theory base in the information science discipline.
Over time, usage of these terms has varied, and they have often been used interchangeably. For example, the terms information behaviour (Belkin 1980, 2000; Belkin et al. 1982; Wilson 1984, 2005), information seeking behaviour (Branch 2002; Ellis 1989; Kuhlthau 1988, 1993, 1999, 2005; Leckie 2005; Leckie et al. 1996; Meho and Tibbo 2003; Wilson 1984, 2005), information seeking processes (Branch 2002; Marchionini 1995), information seeking activities (Ellis 1989; Meho and Tibbo 2003) and information retrieval (Belkin 1980, 2000; Belkin et al. 1982; Ingwersen 2005; Saracevic 1997) have all been employed to describe the process of addressing the information need.
In this paper we will demonstrate that search behaviour in electronic document and records management systems comprises two related elements. First, it operates as information seeking processes, a series of cognitive choices that draw on a range of decision making strategies. In turn, these are enacted as a series of information seeking activities, that is, information search and retrieval tasks that are undertaken within those processes to address the information need. Our definition is in line with Wilson's (2000: 49) definition that described information seeking behaviour as 'the purposive seeking for information as a consequence of a need to satisfy some goal. In the course of seeking, the individual may interact with manual information systems (such as a newspaper or a library), or with computer-based systems (such as the World Wide Web)'.
Review of search behaviour theory
It is first important to understand why users seek information. For the most part, information seekers will have identified their information need (Wilson 2005), or they may have a perceived information need (Krikelas 1983). In some instances, the seeker may also recognise a gap in their existing knowledge (Dervin 1992; Dervin and Foreman-Wernet 2003), or know that their knowledge is insufficient: an 'anomalous state of knowledge' (Belkin 1980; Belkin et al. 1982).
To fulfil their information need, users engage in information search activities: across a number of information sources, including people sources (Wilson 2005); from their preferred information sources (Krikelas 1983); and generally from information sources they perceive to require the least amount of their effort (Zipf 1949).
Researchers have developed a number of information seeking behaviour theories and models to describe users' information search processes and/or activities, primarily in the library and information studies discipline and, more recently, in the Internet or online searching environment. However, there is a gap in the literature on how knowledge workers search for corporate information registered in their organisation's information repositories and related to their work task needs.
There is considerable research exploring the information search processes that users demonstrate. Kuhlthau's (1988, 1993, 1999, 2005) longitudinal study of students' information seeking behaviour concluded that information search in libraries is a process of construction, and that there are common patterns and processes evident in users' search experiences (Kuhlthau 2005: 230). These search processes were also reported in the Leckie model (Leckie et al. 1996) which focused on the search behaviour of professionals. Marchionini's (1995) information seeking process model captured the processes and sub-processes users apply when searching in electronic information repositories. These offered a framework of the search processes that are typically demonstrated by searchers as they address their information need.
An alternative approach to exploring search is to examine the search activities that are integrated into the search process. To assist with the design of online library catalogue systems, Ellis (1989) studied the search behaviour of social science academics. He observed their engagement with the following six information search processes and activities: starting, browsing, differentiating, chaining, monitoring and extracting. Expanding on Ellis' (1989) model, Meho and Tibbo (2003) studied the search behaviour of social scientists searching on the Internet. Their study affirmed Ellis' model in the Web environment, but it distinguished information search processes (searching, processing, accessing and ending) from search activities (starting, chaining, browsing, etc.). These studies have offered an insight into the processes and search activities that assist with resolving an information need.
However, apart from the Leckie et al. (1996) model, these studies frequently drew on university or school student samples and an information need that was largely exploratory in nature, based in a library setting: the searchers were generally open to finding any suitable items, rather than a highly specific outcome. These examples show the intense focus that researchers have employed in mapping how users search for information. However, researchers have not examined these search processes or activities in the context of electronic document and records management systems in an applied work setting. In a workplace context, the information need is quite different, operating as a targeted and very purposeful desire to find a specific record in these systems to address a specific need. Thus, the current research on information search has drawn on a different search context, in an empirical business setting.
Research has also explored the different factors that affect users' search behaviour. The users' task has been identified as a primary factor affecting subsequent search behaviour (Byström 1999, 2002, 2005; Byström and Hansen 2005; Byström and Järvelin 1995; Hackos and Redish 1998; Hansen 2005; Leckie et al. 1996; Saracevic 1997; Vakkari 1999, 2003). Many other factors are reported to affect users' search behaviour including self-efficacy (Debowski 2001); affective behaviour (Nahl 2005; Saracevic 1997; Savolainen 2011); and the search strategy training users have received (Branch 2002; D'Alessandro et al. 2004; Debowski 2001; Debowski et al. 2001; Lucas and Topi 2004). Further, Saracevic (1997) identified that the users' interaction with the computer system at the interface and surface levels (cognitive, affective and situational factors) affected their information retrieval behaviour. Likewise, Ingwersen's (2005) 'integrative framework for information seeking and interactive information retrieval' focused on users' interaction with the information retrieval system, emphasising how their cognitive state influences their search behaviour.
The complex motivations and influences that individuals bring to a search task are still the subject of considerable research. However, this emergent field highlights the importance of exploring the ways users interact with their search environment, particularly if the task increases in difficulty. For example, Debowski et al. (2001) identified task complexity as an influence on the ways in which people approached a search task. There has been little investigation as to how the level of difficulty of a search task influences the search processes that a user follows. The increased difficulty can require careful structuring of the investigative processes or (probably) the application of different approaches. However, there is little empirical research on what happens to the search process as the task moves from a simple requirement to one that is less easily resolved. Further, the information need can be highly influential in driving a person's commitment. A student participating in a research study, for example, may be less motivated than a knowledge worker with an urgent information requirement.
There is also considerable research on several other aspects: the decision processes Web users employ in deciding when to stop their search (Mansourian 2007), Web users' search patterns in terms of frequency and duration of each website visit (Bhatnagar and Ghose 2004), and Cothey's (2002) longitudinal study of college students' transformation from browsing to eclectic Web information searching behaviour. These findings highlight the cognitive processes linked to information seeking behaviour and how with experience user search behaviour may improve.
The past research into search behaviour has offered a useful guide as to how users search for information, what information sources they turn to, the search processes and activities they engage in and how their search behaviour is affected by various factors. However, there is no specific theory or model relating to the search behaviour of electronic document and records management system users, and little evidence as to how users search for work-related information within a specific organisational setting.
Our study therefore aimed to build on the existing theory base and to examine these common frameworks in a new information seeking context: that of searching within an electronic document and records management system.
Search behaviour in electronic document and records management systems defined
Wilson (2000: 49) describes information searching behaviour as the 'micro-level of behaviour employed by the searcher in interacting with information systems of all kinds'. Wilson's (2000) definition is in line with our definition of the search behaviour of electronic document and records management system users, which is the information seeking process and activities that these users employ to identify or access corporate information held in these systems. Search behaviour is deemed to start from the time a user commences their search to when they decide to stop the retrieval process (Ellis 1989; Kuhlthau 2005; Marchionini 1995; Meho and Tibbo 2003).
The two aspects involved in understanding the search behaviour of these users are: 1) search processes and 2) search activities. Search processes in these systems comprise a number of sequential but iterative stages of judgement, options selection and decision-making (Ellis 1989, 2005; Henefer and Fulton 2005; Kuhlthau 1988; Leckie 2005; Leckie et al. 1996; Marchionini 1995; Marchionini and White 2007; Meho and Tibbo 2003). The searcher draws on a number of sources of information and feedback cues to determine their next response. These search processes are selective and influenced by the cues from the previous stage. They may also be influenced by the searcher's knowledge of the task and the information context.
Search activities refer to the actions users enact during the iterative process of moving the information search from start to closure. These actions include browsing, navigating or extracting information. Both information search processes and search activities comprise the search behaviour of users of these systems.
The aim of this study was therefore to build a model that outlined the common processes that users employed and to test their usability in explaining how people interacted with such systems. This paper supersedes earlier models previously reported in articles by the author (Singh et al. 2008b, 2008c), offering a more detailed exploration of the decisions taken by users within different stages of the search process.
An empirical research method was used in this research to investigate the search behaviour of electronic document and records management system users. The search behaviour patterns of forty users from four different organisations were mapped. Interviews with each user were conducted and sample searches were reviewed using a protocol analysis (Ericsson and Simon 1993), and 107 flowcharts were developed from the self-reported and observed searches. Thus, the search behaviour was triangulated using a range of investigative techniques. This strategy adhered to Yin's (1984) recommendation that construct validity be supported by the use of multiple data collection methods. In contrast, previous information seeking studies by Ellis (1989), Meho and Tibbo (2003), Branch (2002: 14) and Ingwersen (1982: 173) used only a single research method.
Four Australian government institutions employing three different electronic document and records management system architectures (HP TRIM, e-Docs and Objective) participated in the research between September 2005 and February 2006. A range of systems was selected to ensure the results were sufficiently robust across different platforms. The choice of organisations was predicated on their systems meeting the ISO 15489 standards and principles. Each operated with a different user interface based around a tree and/or virtual database view. The Objective and e-Docs systems were designed with both a tree view of the classification folder structure and virtual database interface view. Of the two HP TRIM systems, one user interface was designed with the virtual database view only, while the other was designed with both the classification folder structure and virtual database view.
The study participants in each organisation comprised one records manager and ten system users. In total, four records managers (three males and one female) and forty users (fifteen males and twenty-five females) participated in the research. Table 1 shows the age distribution of participants.
|20 - 29||30 - 39||40 - 49||50 - 59||60 - 69|
Interviews with the four records managers were conducted to explore how the systems had been implemented and the ways in which the records management standards and national practice principles were reflected (see Singh et al. 2008a). The managers then identified ten experienced users of their electronic document and records management system who could be interviewed about their search behaviour.
The search processes employed by users were explored from three perspectives during the interviews. Firstly, each participant was asked to describe broadly the process and the range of strategies they generally employed when searching for a document or record. Secondly, they were asked to describe their preferred search style and the different options they generally employed. This offered a useful comparator to the system design and training foci that were in place within the organisation. Their preferred style also provided a scaffold for the normal protocol they tended to follow when searching. Thirdly, users were then asked to recall their last simple and difficult searches, and to demonstrate those two retrospective search processes, using the protocol analysis method. These searches were observed and mapped as a series of steps and decision processes. The observation also noted the point at which the search was stopped. The protocol analysis tool tested whether users actually exhibited the behaviour they had described in their preferred search descriptions.
Thirty semi-structured interview questions (Appendix 1) were grouped into six broad topical segments (Table 2, Column 1) to measure each user's preferred search behaviour when undertaking a typical search. Overall, the shaded questions in Appendix 1 were designed to investigate each individual's search behaviour.
|Semi-structured topical segments||Purpose and description|
|(a) Usage||Why use the system and what types of information were sought?|
|(b) Searching patterns in the system||What search methods were used, what were the preferred search methods, how were search results followed, how and when a decision to stop search is made, what are the difficulties when searching?|
|(c) Classification scheme||Is the user familiar with, and do they understand, the scheme used? How do they find the scheme?|
|(d) Situational and time factors||Does time affect searching and is a time limit applied when searching?|
|(e) Training||What training has the user received and how did they find it?|
|(f) Design of the system||What is the user's view of the current design and what changes would they like made?|
A component of searching includes using the classification schemes employed by the particular system. In this paper, ISO 15489's definition of classification is adopted: the 'systematic identification and arrangement of business activities and/or records into categories according to logically structured conventions, methods, and procedural rules represented in a classification system' (International Organisation for Standardisation 2001a: 2). Classification schemes are designed to facilitate the creation and retrieval of records, including electronic records, particularly where large amounts of information are involved. Three questions (21 to 23, Appendix 1) were designed to clarify the user's familiarity with and application of the classification scheme. Further, participants were asked to describe what information they sought and what activities they performed to gather information in recent searches. Participants were also asked what training they had received working with the system (questions 26 to 29, Appendix 1).
Site visits were arranged with each of the four organisations. A researcher spent a total of four days at each site, engaged in data collection activities. On day one, the researcher reviewed internal records management documentation, observed a demonstration of the electronic document and records management system to determine how the organisation's records management programme was implemented, and then conducted an interview session with the records manager using semi-structured interview questions.
For the next three days, hour-long interview sessions were held with each of the ten users of the system in the organisation. First, each user was asked to complete a short questionnaire that provided background information about them. Then they described their preferred search style when using the system and followed this with a demonstration of how they conducted their last simple and difficult searches. Generally these searches were conducted at their desks. A protocol analysis was employed to capture the different search behaviour that users recounted and demonstrated during these interviews.
The interviews and protocol analyses were transcribed. Table 3 summarises the steps described to measure the search behaviour of each individual user.
|Measurement of||Research method used||Information sources||Resources developed|
|Individual search behaviour of 40 users||Semi-structured interview questions for users||Transcribed semi-structured interview data from users||Forty self-reported flowcharts|
|Protocol analysis||Transcribed protocol analysis data from users||Thirty-eight observed flowcharts for simple searches|
|Twenty-nine observed flowcharts for difficult searches|
First, a flowchart was plotted for each of the forty users showing their description of their preferred individual search behaviour, using a similar approach to that employed by earlier researchers (Krikelas 1983; Kuhlthau 1988, 1993, 1999, 2005; Leckie et al. 1996; Wilson 1984, 2005). These flowcharts are referred to as self-reported flowcharts whilst the subsequent flowcharts are referred to as observed flowcharts. The flowcharts documented the different stages of search that were executed and mapped the decision paths taken by each individual.
Secondly, for thirty-eight of the forty users, their observed search behaviour when they conducted their last simple search during the protocol analysis was plotted. (Two participants reported on difficult searches rather than simple.) Twenty-nine users were able to identify difficult searches during the interviews. Eleven users stated that they had not or did not encounter difficult searches, or that it had been so long since their last difficult search that they were unable to recall the details.
In total, 107 flowcharts were drawn and analysed to determine the individual search behaviour for each user including (in most instances) their observed simple and difficult search behaviour.
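The tally above can be checked with a few lines of arithmetic. The following is a minimal sketch only: the dictionary keys and variable names are illustrative labels introduced here, not terms used in the study, and the counts are those reported in the text and in Table 3.

```python
# Flowchart counts as reported in the text and in Table 3.
# The key names are illustrative labels, not terms from the study.
flowcharts = {
    "self_reported": 40,       # one preferred-search flowchart per user
    "observed_simple": 38,     # users who demonstrated a simple search
    "observed_difficult": 29,  # users who could identify a difficult search
}

total_users = 40
total_flowcharts = sum(flowcharts.values())

# Users who could not recall or had not encountered a difficult search.
no_difficult_search = total_users - flowcharts["observed_difficult"]

print(total_flowcharts)     # 107
print(no_difficult_search)  # 11
```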
Once the flowcharts for each user were plotted, an analysis of the flowcharts revealed that the self-reported searches offered a comprehensive view of each participant's individual search behaviour: identifying all the different approaches that they employed when seeking information using the system. The search behaviour observed in specific searches matched the preferred styles, but using a sub-set of the options. The self-reported search flows were taken as the most comprehensive and definitive source when capturing the final aggregated search behaviour of users. The observed searches reflected more targeted approaches to address a specific search problem. Therefore, it was decided to aggregate the 40 individuals' self-reported preferred search processes and search activities and then overlay the responses for each to answer the research question: what is the search behaviour of electronic document and records management system users? The steps taken to aggregate the forty individuals' search behaviour are summarised in Figure 1.
Aggregation of forty self-reported flowcharts
The self-reported flowcharts for all ten users in the same organisation were mapped onto a single flowchart. The aggregated measure was determined by visual comparison of each of the ten self-reported flowcharts, observing similarities and differences in search behaviour from one flowchart to the other. These steps were repeated to aggregate the search behaviour model for each of the four organisations. The final aggregated measurement of the search behaviour flowchart drawn for each organisation incorporated all the search processes and activities, collated by visual observation of all ten self-reported flowcharts.
Search behaviour model of users of the electronic document and records management system
Figure 2 offers an aggregated view of the preferred search behaviour of the participants.
All forty users reported performing a linear sequence of search processes from the time they started a search to when they ended it (Figure 2). These conformed to seven stages:
- Stage 1: Start Search
- Stage 2: Formulate Search Strategy
- Stage 3: Execute Search
- Stage 4: Process and Evaluate Search Results
- Stage 5: Access Search Results
- Stage 6: Decision Making about Search Results
- Stage 7: End Search.
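For readers who wish to encode the seven stages, for example when coding interview transcripts, the sequence can be represented as an ordered enumeration. This is a hedged sketch rather than part of the study's method: the class name `Stage`, the member names and the helper `next_stage` are hypothetical, and the sketch models only the linear sequence reported by users, not the iterative refinements within a stage or the Stage 6 before Stage 5 ordering reported in Organisation C.

```python
from enum import IntEnum
from typing import Optional


class Stage(IntEnum):
    """The seven reported search stages (member names are illustrative)."""
    START_SEARCH = 1
    FORMULATE_SEARCH_STRATEGY = 2
    EXECUTE_SEARCH = 3
    PROCESS_AND_EVALUATE_RESULTS = 4
    ACCESS_RESULTS = 5
    DECISION_MAKING = 6
    END_SEARCH = 7


def next_stage(current: Stage) -> Optional[Stage]:
    """Return the stage that follows `current`, or None after Stage 7."""
    if current is Stage.END_SEARCH:
        return None
    return Stage(current + 1)


# Walk the linear sequence from Stage 1 to Stage 7.
path = [Stage.START_SEARCH]
while True:
    following = next_stage(path[-1])
    if following is None:
        break
    path.append(following)

print([int(stage) for stage in path])  # [1, 2, 3, 4, 5, 6, 7]
```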
The following sections describe users' reported search behaviour from the interviews in each stage of their information search process.
Stage 1: Start search
Stage 1 is a crucial stage in the search model as it determines the search approach and thus the subsequent behaviour or strategies the user may engage in. All users reported they searched the system because they had a work task that required information from the system to complete it. The users started their search by conducting a task assessment activity, where they clarified the work task, identified their search task and confirmed their task knowledge, as presented in Table 4.
|Work Task||Search Task||Task Knowledge|
|Request from P12's boss to find a specific invoice.||Find invoice requested by boss for the organisation whose company name included the word Flood.||P12 knew that the invoice was registered in the system, and was aware that she could search using the Contact metadata field as she knew the name of the company included the word Flood.|
Stage 2: Formulate search strategy
As Figure 2 illustrates, users could apply a range of approaches to formulate their search strategy in Stage 2. Sixty per cent (24 users) had awareness of one search formulation strategy, 35% (14 users) had awareness of two different strategies and 5% (2 users) had awareness of all three strategies.
When search results were displayed, users frequently browsed the following metadata elements: Title (98%, 39 users), Date (33%, 13 users) and Author (10%, 4 users). Given that Title metadata are a key element in the search and retrieval of records, it is essential that entries into this field are as accurate and meaningful as possible. None of the users reported using the free text search function to formulate searches; they noted that this approach produced too many results not specific to the search query, which necessitated trawling through the results unproductively.
Stage 3: Execute search
Stage 3 (Figure 2) presents the act of executing the search formulated in Stage 2 by hitting the enter button on the keyboard or using the mouse to navigate the tree folder structure. Each of the users reported they executed their search in Stage 3 using these options.
Stage 4: Process and evaluate search results
In Stage 4 (Figure 2), all users processed and evaluated their search results by browsing the document title/parent folders of documents, date, and document/file numbers to make their selection. That is, users reported they browsed through the search results to evaluate and ascertain if they had found the information they sought. Subsequently, they refined their search criteria either to reduce the number of search results to a manageable few or to focus on finding the sought records. Common sub-stage information seeking activities reported by the users included the following.
- Changing (98%, 39 users) the selection of metadata fields (by document title, date created/registered, author) and varying the search terms in the metadata fields.
- Sorting (23%, 9 users) search results to display information in a preferred order. Most frequently, users sorted metadata by date created, author or document title, displayed chronologically or alphabetically.
- Filtering (50%, 20 users) search results (in the systems of Organisations B and D only). These systems were designed to enable users to filter their search results by particular record types created by their departments or groups only. Although users could filter their search results to display only records relevant to their department each time they executed a search, automatic filtering to match the search criteria was set as the default in each user's system, and all 20 users in Organisations B and D reported filtering as the automatic default.
- Navigating and browsing (30%, 12 users) the classification scheme folder structure using a hierarchical (tree) view to locate the document or record.
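The refinement activities listed above (varying search terms, sorting and filtering) can be illustrated with a minimal sketch. The record fields (`title`, `author`, `created`, `department`) and the sample records are hypothetical, chosen only to mirror the metadata elements users reported browsing; they do not reflect the schema of any of the systems studied.

```python
from datetime import date

# Hypothetical search results; field names and values are illustrative assumptions.
results = [
    {"title": "Invoice 1042 - Office Supplies", "author": "Lee",
     "created": date(2009, 3, 2), "department": "Finance"},
    {"title": "Annual Report Draft", "author": "Chen",
     "created": date(2009, 1, 15), "department": "Corporate"},
    {"title": "Invoice 0987 - Catering", "author": "Adams",
     "created": date(2008, 11, 20), "department": "Finance"},
]

def title_contains(records, term):
    """Varying search terms: match a term against the Title field, ignoring case."""
    return [r for r in records if term.lower() in r["title"].lower()]

def sort_by(records, field):
    """Sorting: order records by one metadata field (chronological or alphabetical)."""
    return sorted(records, key=lambda r: r[field])

def filter_by_department(records, department):
    """Filtering: keep only records created by one department."""
    return [r for r in records if r["department"] == department]

invoices = title_contains(results, "invoice")
finance_by_date = sort_by(filter_by_department(results, "Finance"), "created")
```

Each function returns a new list, so the refinements can be combined in any order, much as users reported applying them one after another to narrow a result set.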
Stage 5: Access search results
In Stage 5, users reported they accessed and launched (opened) the documents or records matching their search criteria. Depending on the design of the system, it was sometimes possible to scan and verify items before launching them.
Stage 6: Decision making about search results
In Stage 6, if users were successful in launching their document or record, they scanned it and verified its contents. Users reported that the activities of launching, scanning and verifying a document enabled them to confirm that the contents matched the search criteria. In Organisation C, where the system was designed with a viewer tab at the bottom of the search results window, users reported they performed Stage 6 before Stage 5.
Stage 7: End search
Stage 7 (Figure 2) presents how users reported they decided to end their searching. When users extracted the required document or record, they closed the search, reflecting a successful outcome. Closed searches were generally identified by participants as 'easy'. Otherwise, users decided to stop their search, citing a number of causes that could trigger the decision. In this case, further action was required to resolve the task. Generally, users regarded stopped searches as being 'difficult'.
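The seven stages can be read as a small state machine. The sketch below is one possible interpretation rather than the model itself: the stage names are abbreviated, and the transition table is an assumption assembled from the reported behaviour (the loop from Stage 4 back to Stage 2 for reformulation, the reversed order of Stages 5 and 6 in Organisation C, and the option of ending a search from Stage 4 when nothing suitable is found).

```python
from enum import Enum

class Stage(Enum):
    START = 1      # Stage 1: Start search
    FORMULATE = 2  # Stage 2: Formulate search strategy
    EXECUTE = 3    # Stage 3: Execute search
    EVALUATE = 4   # Stage 4: Process and evaluate search results
    ACCESS = 5     # Stage 5: Access search results
    DECIDE = 6     # Stage 6: Decision making about search results
    END = 7        # Stage 7: End search (closed or stopped)

# Assumed transitions; an interpretation of the reported behaviour, not a published table.
TRANSITIONS = {
    Stage.START: {Stage.FORMULATE},
    Stage.FORMULATE: {Stage.EXECUTE},
    Stage.EXECUTE: {Stage.EVALUATE},
    Stage.EVALUATE: {Stage.ACCESS, Stage.DECIDE, Stage.FORMULATE, Stage.END},
    Stage.ACCESS: {Stage.DECIDE, Stage.END},
    Stage.DECIDE: {Stage.ACCESS, Stage.END},
    Stage.END: set(),
}

def is_valid_path(path):
    """Return True if the path starts at Stage 1, ends at Stage 7 and uses only allowed steps."""
    if not path or path[0] is not Stage.START or path[-1] is not Stage.END:
        return False
    return all(nxt in TRANSITIONS[cur] for cur, nxt in zip(path, path[1:]))

typical = [Stage.START, Stage.FORMULATE, Stage.EXECUTE, Stage.EVALUATE,
           Stage.ACCESS, Stage.DECIDE, Stage.END]
organisation_c = [Stage.START, Stage.FORMULATE, Stage.EXECUTE, Stage.EVALUATE,
                  Stage.DECIDE, Stage.ACCESS, Stage.END]
```

Encoding the transitions as data rather than as control flow makes it straightforward to adjust the table if a particular system supports a different ordering of stages.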
Observations of search behaviour when simple and difficult searches were performed
The previous section explored how the participants generally conducted a typical or model search. However, preferred styles may alter when a real search is undertaken. This section reviews the users' actual search strategies as they demonstrated their last simple and difficult searches. A detailed description of users' search behaviour when they stopped their difficult search is also explained in this section.
The forty sampled users perceived a simple search in the system as one that required minimum effort to search and retrieve the sought information (in the form of document or record) to complete their work or search tasks. In all the observed flowcharts for simple searches, users were successful in finding and retrieving the required information and were able to close their searches. This was because there was a match between their search terms and the data entered into the metadata fields. For instance, if they entered the invoice number into the invoice metadata field, the system immediately retrieved the relevant invoice. Likewise, when users formulated their search strategy by navigation using the tree view structure, they found their sought information filed or classified into the folder where they thought it should be. Searches were considered simple because they matched the users' cognitive thinking in the way information was registered and classified in the system.
Users perceived a search to be difficult when they had to spend more than five minutes and considerable effort to retrieve the sought information. Users reported the key reasons for their search difficulty were that the sought information: was not meaningfully titled; had inadequate or inaccurate metadata entered for it; was classified into folders users would not consider searching; was not accessible in the system by them; and/or was not registered into the system and therefore did not exist in the system (Joseph, 2010: 39).
In difficult searches, users were not able to successfully close the search. Instead, they had to stop and decide how best to acquire the information. Only 27 of the 40 users (67.5%) reported a difficult search experience. The remaining 13 (32.5%) could not identify difficult searches.
In the event of search difficulty, 38% (8 users) retried the search if required. During the performance of the simple searches in Stage 4, no user was observed returning to Stage 2 to reformulate a search strategy, but 37% (10 users) exhibited this behaviour during a difficult search. Sixty-seven percent (18 users) who engaged in a difficult search were able to find the required information in a second attempt, and then closed the search. However, 33% (9 users) could not find the required information and had to stop their difficult searches. Figure 3 describes users' search behaviour once they decided to stop their difficult searches.
Figure 3 shows that when users stopped their difficult search, they verified their current task knowledge by checking other information sources and/or by seeking help. They checked other information repositories in the organisation to verify the information was not stored elsewhere, or sought help from people resources such as their colleagues, the Records Section or the HelpDesk for help with searching for the information. Users then determined if the sought information had indeed been found. If so, they closed the search. If not, they assessed whether their updated task knowledge would enable them to retry their search in the system. In that case, they retried their search formulation strategy by returning to Stage 2 of the information search process. Otherwise, they stopped the search. These search activities, where users check other information sources, are also similar to Bates' berrypicking information search model, where library users were reported to berrypick and gather information from different information sources (Bates 1989: 410).
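The decision flow described for Figure 3 can be sketched as a small function. The function name, parameter names and outcome labels are illustrative assumptions; the branching follows the order given in the text (check whether the information was found through other sources or help, then assess whether updated task knowledge supports a retry).

```python
def after_stopping(found_via_other_sources: bool, can_retry_with_new_knowledge: bool) -> str:
    """Outcome once a user has stopped a difficult search and consulted
    other information sources and/or people resources."""
    if found_via_other_sources:
        # The sought information was located outside the original search.
        return "close search"
    if can_retry_with_new_knowledge:
        # Updated task knowledge justifies reformulating the search strategy.
        return "return to Stage 2"
    # Neither the information nor a better search strategy is available.
    return "stop search"

outcome = after_stopping(found_via_other_sources=False, can_retry_with_new_knowledge=True)
```

Finding the information takes precedence over retrying: once it has been found, the question of retrying in the system does not arise.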
The simple and difficult searches performed by users both validated the reported search behaviour users depicted in Figure 2 and enabled other insightful observations of their search behaviour. For instance, the interviews with users did not make it possible to verify search behaviour once a search was stopped. However, when users demonstrated difficult searches it was possible to identify additional search activities such as how they sought help from people resources and/or checked other information sources or persevered to complete their search or to retry it in the system if they felt confident about finding the information the second time around. These additional search behaviours, observed after users stopped their search, are incorporated into the search model presented in Figure 4. Thus, Figure 4 is an enhanced model building on Figure 2 and Figure 3; it illustrates users' behaviour in Stage 7 once they decided to stop their search and is derived from observations of their difficult searches. Figure 4 is the final search behaviour model derived from this research.
Search behaviour in Stages 1, 2, 4 and 7
Search behaviour in Stages 1, 2, 4 and 7 of the search model (Figure 4) is particularly worth discussing and comparing with the literature reviewed.
Stage 1: Start search
In Stage 1 (Figure 2), it was observed that a user's work task, search task and task knowledge together triggered the start of a search process (Byström 1999, 2002, 2005; Byström and Hansen 2005; Byström and Järvelin 1995; Hackos and Redish 1998; Hansen 2005; Vakkari 2003). Users' task assessment resonates with Wood's (1986) component task complexity as users juggled the different cues from their work task, search task and task knowledge to start their search. If their subsequent task assessment led them to conclude that their information need (Wilson 2005) could not be satisfied with their existing state of knowledge, this initiated a search to fill the information gap (Dervin, 1992) for their anomalous state of knowledge (Belkin, 1980). The terms information need, information gap and anomalous state of knowledge are defined as follows:
- Information Need is defined by Krikelas as the 'state of uncertainty recognised by the individual user' (Henefer and Fulton, 2005: 226).
- Information Gap refers to a state in which a person perceives a gap in their existing knowledge structure in order to make sense of the situation, problem or task at hand. This concept is derived from Dervin's 'sense-making' theory (Dervin, 1992; Dervin and Foreman-Wernet, 2003).
- Anomalous state of knowledge (ASK) is defined by Belkin as an inadequacy in the user's state of knowledge with respect to a problem or task that prevents them from resolving the problem or task at hand (Belkin, 1980; Belkin et al., 1982).
Each user's work task generated their search task. Based on their knowledge of the work task and search task, users reported a clear understanding and awareness of what information was sought: that is, their task knowledge (Ellis, 1989: 179; Wildemuth 2004: 247). Typical task knowledge included:
- who authored the information and whether the information was authored by the user;
- some words in the title of the document or record that they recalled or were referred to by colleagues;
- possible date ranges when the information was created or registered into the system; or
- an invoice number, contact details of the organisation, document number.
Having confirmed their existing task knowledge, users reported that they then started to formulate a search strategy. Based on their task knowledge, they determined whether they authored or filed the sought information, or knew where it was stored in the system. Seventy-five percent (30 users) with access to a tree view reported they navigated to search for the items if they had authored or filed them, or knew where they were filed. Eighteen percent (7 users) reported that they recalled the search conducted previously and whether it had been saved into their favourites shortcuts, or if it was possible to access the information from their recent items folder. Shortcuts included saved searches, recent edits and items stored using the favourites function in the system. They then moved to Stage 2, where they formulated their search strategy. This series of cognitive coordination acts engaged in by users as they decided on the best search formulation strategy for their task reflects Saracevic's (1997) and Ingwersen's (2005) cognitive approaches to information search, and Wood's (1986) coordinative complexity of task.
Stage 2: Formulate search strategy
In Stage 2 users who had awareness of more than one search formulation strategy selected their preferred strategy based on their task knowledge and the nature of their search task. An impressive 98% (39 users) reported that they formulated their search strategies using metadata fields. Eighteen percent (7 users) reported they formulated their search strategy by retrieving from pre-existing shortcuts. Users reported that if they had previously conducted the search, they used the shortcut function to retrieve their search results, either from their recent search history or from their stored saved searches. This confirms findings on successive searching by Spink (1996) and Spink et al. (1999: 478), where users of information retrieval systems were observed engaging in a 'process of repeated searches over time in relation to a given, possibly evolving, information problem'. This capacity to tap into previous searches confirms that current information retrieval systems and interfaces assist users in successive search episodes, in contrast to the lack of system functionalities reported in earlier research by Spink et al. (2002: 726). This could be because such functionality may not have been available at the time Spink et al. (2002) conducted their research.
Thirty percent (12 users) reported that they navigated or browsed through the classification scheme presented via the tree view if they remembered where the record was filed or if they had filed the record themselves using the folder structure. In Organisation B, navigation was not possible given the virtual database design. In Organisation D both navigation and browsing were possible, but users who were not trained in them were not aware of these search strategies.
In their interviews, users were asked: 'What is your preferred way of searching for information in the system?' The three most preferred metadata fields for searching were Title (68%, 27 users), Document or Application Type (30%, 12 users), and Author (18%, 7 users). Application Type refers to the application used to create the document or record; for example, MS Word, Excel, PowerPoint or MS Outlook for emails. These findings differ from Gunnlaugsdottir's (2006: 205) PhD research, which reported that the metadata fields most commonly searched by users were the name of the sender or receiver, date (received or created) and the free text search option.
The self-reports show that Stage 2 is another crucial stage in the search model. This is the point at which users determine which search strategy to take, based on their interpretation of the search task and their task knowledge. Stage 2 reveals the effect of the variable factors, search task, task knowledge and training, on users' subsequent behaviour patterns as they refine their search.
Stage 4: Process and evaluate search results
Whilst browsing, users indicated that they evaluated the search results against their search criteria in Stage 2 to see if the outcome matched their search requirements. If there were no matches, they continued browsing through the remaining search results. If users did not find what they were searching for, or if there were too many search results, they either reformulated their search strategy in Stage 2 or used a refined search criterion. Thirty-nine users (98%) refined their search criterion by filtering (50%, 20 users) and/or sorting (23%, 9 users) their search results. Users were also observed refining their search by changing their selection of metadata fields and varying the search criteria terms in the metadata fields. These latter search activities exhibited the trace and vary search tactics reported by Bates (1979: 208). Bates (1979) defines the:
- trace search tactic as 'to examine information already found in the search in order to find additional terms to be used in furthering the search' (p. 208); and
- vary search tactic as 'to alter or substitute one's search terms in any of the several ways' (p. 208).
Stage 7: End search
The research indicated that users either closed or stopped their search. The aim is for users to be able to close their search as it indicates their success in finding and retrieving their sought information. Hence, it is important to understand why users stop their search.
All forty users stated that a common cause for stopping their search was that possible options at their disposal had been exhausted without finding the required information. These exhaustive search options included using preferred search methods to seek information from the system or accessing their task knowledge to refine their searches by varying the words in the document title or other metadata fields. After this initial phase of searching, users reported they were usually satisfied with their attempt to find the information even though they were unsuccessful. Their confidence in their searching skills enabled them to conclude that if they were unable to find what they were seeking it was time to stop the search. Their view was that the information was most likely poorly titled, misclassified or not registered at all. A similar reason for deciding to stop a search was given by the social scientists studied in Ellis's (1989) research.
Users reported that past information search experiences in these systems contributed to their decision to stop searching further. Previous experience suggested that the information could not be found because there were spelling errors, abbreviations or acronyms used when titling the documents or records during registration. They also decided to stop their searching if they realised the information sought was not filed into a folder they would logically file into or seek information from. Users reported the logic used for selecting folders for filing information differed from that used by others or by the Records Section, implying that the classification schemes were ambiguous and/or subjective. Users were aware that if the document was not found using their preferred search methods, then it was likely that the information was not registered but was stored elsewhere.
Thirty-nine (98%) of the forty users mentioned that the time available to search did not affect their searching. Likewise, they confirmed that they did not apply a time limit when deciding when to stop searching further. Rather, the importance of the work task and/or information need determined whether they continued searching or stopped the search in or out of the system. When presented with a work task and search task, they generally used their preferred search methods based on their task knowledge. If these preferred search methods did not find the information they were after then they stopped their search.
Users who performed the role of personal assistant or Record Focal Point (trained power users, usually secretaries or administrative staff) often conducted searches on behalf of their managers. This group stated that the time available for searching could affect the way they searched, and this was especially so if their managers had imposed time-sensitive deadlines for them to find the information. However, they also mentioned that time did not affect their personal searching. One participant (Administrative Personnel, P13) stated strongly that he had very little time for searching and hence applied a time limit for all his searching.
Users reported a decision to stop searching was also influenced by the following:
- they had exhausted all possible search options known to them and were confident they had been sufficiently thorough with their search methods, but the item still could not be found (100%). Often this occurred when users realised their task knowledge was either inaccurate or insufficient for searching;
- they simply could not find the information sought in the search results displayed (100%);
- they had spent between two and thirty minutes searching (30%, 12 users);
- they suspected the information could be stored elsewhere in other information repositories such as network drives, email systems or other business applications (8%).
Users were observed using two of the five cognitive stopping rules (mental list and difference threshold) observed in Internet users when they decided to stop a search (Browne et al. 2005). These users were aware of the search task and the information they needed to find. They monitored their outcomes (mental list) in order to satisfy their search task before deciding to stop the search (Browne et al. 2005: 92). In the same way, when users had exhausted all their search options and were not learning anything new from their search experience, they decided to stop the search, reflecting the difference threshold stopping rule (Browne et al. 2005: 92). Interestingly, users did not use the satisficing or sufficing rules for their decision to stop their search (Glesinger 2008; Mansourian 2007). Satisficing is defined by Simon (1971: 71) as a decision making process 'through which an individual decides when an alternative approach or solution is sufficient to meet the individuals' desired goals rather than pursue the perfect approach'. Sufficing refers to when individual users decide the information they have gathered for their search is 'good enough' or is 'as good as it gets' (Mansourian 2007; Prabha et al. 2007).
Observations from the research
The research resulted in a number of important outcomes that further enhance our understanding of information search behaviour. These can be broadly grouped into four key issues: the common search processes that operate across different information search contexts; the differences evident in the electronic document and records management system context; the shifts in strategy that operate in a work-related setting; and the influence of system design and support on information search and decision-making. These are each discussed in turn.
This study aimed to identify the specific ways in which individuals were likely to search an electronic document and records management system. While existing models of information seeking were initially used as a reference point, they were not taken as definitive. Instead, the research built a composite model based on the reported search principles employed by the forty participants. Of interest, then, was the similarity between the resultant model and the earlier library and information search models (e.g., Ellis 1989; Marchionini 1995; Meho and Tibbo 2003). The participants had clearly built a procedural strategy that was regularly employed to achieve their search goal, with this strategy operating on similar principles to those employed by library or Internet searchers. They were confident in navigating their particular organisational system and in making the necessary judgements as to what the next step should be. They followed a regular protocol and were readily able to determine success.
A similarity observed between library, Web and electronic document and records management system searchers was the goal of identifying a single item to meet the search criteria. However, the electronic document and records management system user has a specific need and is seeking a particular record to address that need, rather than any suitable item. This tailored outcome creates a very similar search context to those encountered in libraries and the Internet where users focus on known items such as authors or titles of specific resources. While this limits the complexity of the search domain, it also places a stronger expectation for accuracy on the search outcomes.
The prescribed nature of the electronic document and records management system generated differences in the search outcomes. Firstly, the search barriers and challenges likely to be encountered during their search were more related to human input errors than the complexity of the search process typical of more open information search architectures (Debowski et al. 2001). Secondly, their levels of task knowledge and success expectations were more stable and predictable, given their work-related knowledge. This contextual experience helped to guide them toward a quick decision as to the viability of the search strategy. Thirdly, the users saw the electronic document and records management system as one source of guidance and showed a willingness to engage with colleagues, other expert sources or other organisational channels should the search prove difficult. Fourthly, of interest was the impact of the organisational context on search practices. The delegation of searching to a personal assistant, for example, distances the searcher from the quest, pushing the search process into a more strongly mediated setting, as used to be common in library settings. The limitations of time were more strongly apparent. The participants had strong expectations as to how much time they would be willing to expend on the task before they were likely to shift their strategy. Their high levels of work knowledge also informed those judgements, in contrast to library and information searchers who are more likely to seek guidance as to the pool of knowledge from the search repository.
A fifth area of interest from the research is that system design and user education practices were highly influential on the search behaviour demonstrated. Users modelled consistent search tendencies within different organisational settings. In Organisation D, for example, a high proportion of users relied on metadata searching, despite the availability of searching by navigating the tree view of the classification scheme. In discussions with the records managers, it was evident that different emphases in the training of end-users generated different knowledge about the system and its architecture. Where training did not explore a particular search method, users generally ignored the potential of that option. This highlights the criticality of mapping the practices that would best suit users and ensuring they are properly supported in their subsequent exploration of the system.
Significance of the research
This research aimed to address some significant gaps in the theoretical and practical knowledge relating to the implementation and application of electronic document and records management systems. It also explored the ways in which users operate in a real world setting, as opposed to a theoretical laboratory experiment, and mapped the different ways they optimise their search processes to meet their information requirements. As such, it offers a different perspective on how people search for information in applied organisational contexts, specifically, a corporate records management system.
The first theoretical contribution, therefore, is a search behaviour model to explain user search strategies in electronic document and records management systems. This search model fills a gap in the literature. It provides an understanding of seven search stages and varied search activities users engage in and exhibit when they search in the system. The second contribution is the provision of empirical evidence on the effectiveness of using records management principles in system design to enable users to search and retrieve information from the system.
There are three practical contributions to the records management discipline. The first is an understanding and a description of the information search behaviour of electronic document and records management system users when presented with an information need to discharge their business tasks. The search model provides records managers with an understanding of what knowledge workers perceive as a simple versus a difficult search, and importantly what makes a search difficult for them. Records managers can use the search model to find out about users' search behaviour in order to provide specific training, or to work out strategies to improve the delivery of their records management services to users.
The second contribution is to provide records management professionals with different suggestions to better manage the delivery of records management services to users, with the aim of improving users' information search and retrieval experience working with the system. The importance of records management and system training, and how these improve knowledge workers' search and retrieval experience, is highlighted. These strategies have been outlined in detail in an earlier paper (Singh et al. 2008a).
The third contribution is an awareness of how to design systems that are in line with the search behaviour characteristics of users. The findings in this research indicate that it is best to design systems with both a tree view and a virtual database view. This ensures an efficient and successful information search and retrieval experience for users who prefer to search using metadata fields and those with a preference to search by navigation.
Finally, the fact that this is empirical research conducted in a real business context, unlike most laboratory-based research, provides valuable and accurate insights into knowledge workers' search behaviour in electronic document and records management systems.
Limitations of the research
Only four Australian government organisations were sampled regarding their implementation of the eight records management principles stated in ISO 15489. Perhaps a richer data set could have been obtained if more organisations from diverse industries, countries and the private sector were studied. Government organisations are generally subjected to a higher level of legislative compliance and accountability compared to private enterprises; as such they are more likely to have comprehensive records management practices. The findings from this study therefore might not be reflective of records management practices and system usage in private organisations. However, given the research method, it was necessary for the data collection to be conducted at the premises of the organisations, and a rigorous selection process was employed. These design constraints do not restrict the generalisability of the research findings to Australian organisations alone. The systems studied (HP TRIM, e-Docs and Objective) are marketed and implemented internationally: customer listings on the vendors' Websites testify to their international usage. The same applies to the records management principles used by the participating organisations, which are benchmarked against ISO 15489.
Likewise, the sample size of the users may be perceived as a limitation. However, Ellis (1989) had a sample size of forty-seven users, and Meho and Tibbo (2003) had sixty: the sample size of forty users for the current research is justified. Moreover, other scholars investigating search behaviour cited in this research report using sample sizes of twelve (Branch 2002: 14), thirteen (Ingwersen 1982: 173), thirty-six (Debowski 1997) and thirty-nine users (Byström 2002: 583).
This research investigated the search behaviour patterns of forty users, but 104 flowcharts in total were developed from the self-reported and observed simple and difficult search behaviour. This reflected a similar sample size to other studies. A sample size of forty users justifies the derivation of the search model, but it is acknowledged that an increased sample size would enhance the credibility of some data, such as the percentages cited of users' reported responses about the factors that caused search difficulties.
This research focused only on the last simple and difficult searches each user could recall. It would have been beneficial to extend the observations to include perhaps the last three simple and difficult searches of each user, or to have asked users to keep a journal of their simple and difficult searches for a period of two weeks. This would have provided a much richer data set and in-depth insight into each user's search behaviour over a range of tasks. However, it would have required great commitment from users already facing busy work schedules and obligations. There are also possible limitations in the use of recall to capture typical searches.
Search logging software could have been installed on users' desktops to record their search history. This would have assisted in monitoring the search terms users entered and in recording their search activities, and would have provided richer data. However, this was not done as the four organisations had stringent policies about non-approved software.
There is considerable scope for additional research in this area. This first investigation would benefit from further research on how training, tasks and preferred search styles affect the search behaviour of users. Further research is also recommended on the affective behaviour of users, that is, their feelings, emotions and responses while conducting a search, particularly where the difficulty of locating suitable resources is high (Debowski et al. 2001).
Influence of task and training on search behaviour
An important issue arising from the research related to the extent to which users' specific work and search tasks and task knowledge influenced their search behaviour. The preconditions relating to a search in terms of prior knowledge, experience and the work environment in which the search is operating have been little explored in the literature but appear to be fruitful areas for further investigation. Additionally, there is considerable need to further investigate the modes and outcomes of user training on their success in searching.
The variances in search approach that emerge as tasks become more difficult still remain largely unexplored and warrant further investigation, particularly with respect to the support and training effort.
The preferred search style of users is a combination of the personal search styles that an individual user either already possesses for information search or adopts following training. Future research is recommended on how users develop their preferred search styles. An individual's style may be developed through routine and repetitive performance of search tasks in the current work role, while working with other information sources, after exposure to records systems in previous jobs, or while using search engines such as Google to search the Internet or Intranet. It is unclear whether preferred search styles are developed from users' training or in other ways. More in-depth research to explore the core models that are employed by users across different search environments would be timely, given the widening sphere of search activity that operates in work, personal and social settings. The interviews with the users indicated they had individual search preferences using shortcuts, navigation to folders, or metadata search options. Hence, future research on whether users' preferred search methods influence their search behaviour is recommended.
Research on users' experience working with records classification schemes
It was not the aim of this research to focus on the effectiveness of thesauri like the Keyword AAA (Accuracy, Accessibility, and Accountability) (New South Wales 2000) or the Keyword for Councils (New South Wales 2001) widely used for classifying records in Australia, but the findings reveal that users in the studied organisations had difficulties working with these tools. It is recommended that future research be conducted on whether and how users search and retrieve records using these tools. In general, further research on the value of classification schemes and thesauri seems warranted, particularly given the predominance of metadata searching among electronic document and records management system users.
Hierarchy of the electronic document and records management system as the information source selected
Organisations usually have different information source options available to staff to search for required information. Given that the users in this research selected the electronic document and records management system to conduct their information search, it was not possible to determine if it was their preferred information source in cases where the information was also stored elsewhere (e.g., network drives), or where in the hierarchy of users' preferred information sources the electronic document and records management system fitted.
It would be useful to investigate the decision strategies users undertake as they choose their information sources (Newell and Simon 1972). Linked to this, it would be interesting to find out how federated searches or enterprise search engines which enable one-stop searching across different business systems in the organisation would affect users' search strategies and behaviour (Broder and Ciccolo 2004; Hawking 2004; Mukherjee and Mao 2004).
It is also worth investigating the effectiveness of enterprise search mechanisms. Will they add value to knowledge workers' search experience or will the situation be similar to search experiences on the Internet that result in information overload (Broder and Ciccolo 2004; Hawking 2004; Mukherjee and Mao 2004)?
The results reported in this article offer a detailed and explicit model of how users search for information from an electronic document and records management system in order to address their work related information needs. These systems are centred in the records management discipline and are designed to implement records management principles from the ISO 15489 standard (International Organisation for Standardisation 2001a, 2001b). The records management domain in which search is done differs from library domains and the Internet. As such, the search behaviour model for electronic document and records management systems captures a different context from library-based or Web-based information seeking behaviour models. While this search model identifies seven stages through which users progress, it also illustrates commonalities with several previous information seeking models (Ellis 1989; Marchionini 1995; Meho and Tibbo 2003).
The development of this model and exploration of an applied search context offers a new view of the ways in which earlier research can be applied and used to better understand the ways users operate in a variety of organisational settings. It highlights the need to be sensitive to the particular search context and its implications for system design and user education.
We thank the referees, editor (Alastair Smith) and copy-editor (Amanda Cossham) for assisting with improvements to the article. Special thanks to Mr. Thomas Benson-Lidholm and Alastair Smith for assistance with adapting the paper to the Information Research HTML template.
About the authors
Pauline Joseph (PhD) is a Lecturer in Records and Archives Management at the Department of Information Studies at Curtin University. Pauline completed her PhD at the University of Western Australia in 2011. Her PhD research is titled “EDRMS search behaviour: Implications for records management practices”. This study investigates the efficacy of electronic document and records management systems (EDRMS) in enabling effective capture and dissemination of corporate information. The thesis examines the degree to which these systems, designed in accordance with the records management principles outlined in ISO 15489, support the effective retrieval of records by knowledge workers. Pauline's research interests are in the areas of design and implementation of EDRMS, information-seeking behaviour of knowledge workers, and training and education of RIM services and programs for both knowledge workers and the RIM profession. Recently, SharePoint 2010 has been added to the list, too. Pauline Joseph is the corresponding author and can be contacted at: email@example.com. Note that Pauline's surname was previously Singh.
Shelda Debowski (PhD) is Deputy Vice Chancellor, The University of Notre Dame, and was previously Winthrop Professor of Higher Education Development at the University of Western Australia. She has published extensively in the field of knowledge management and user behaviour in information search contexts. She has a longstanding interest in the information sector, having initially worked as a librarian and information science educator for many years. Her recent roles included oversight of organisational creative, service delivery and capacity building initiatives.
Peter Goldschmidt (PhD) is a Professor in Information Management at the University of Western Australia Business School. He is currently working in the areas of Knowledge Management, Decision Support, Agent Technology, and Artificial Intelligence Applied to Business, Compliance Infrastructure and Asset Management work flow decision support. Since completing his PhD, Peter has investigated and proposed new approaches to support Compliance Monitoring for Anomaly detection (CMAD) in Complex Environments. Since then, Peter has extended this research from compliance monitoring of stock exchange transactions (Stock Market Surveillance for compliance) to areas such as the energy and petroleum industry; asset management in engineering; aerospace and defence.