METHODOLOGICAL ISSUES IN USING GROUNDED THEORY TO INVESTIGATE INTERNET SEARCHING

By Greg Hale and Nicola Moss

Department of Information Studies, University of Sheffield, Western Bank, Sheffield, S10 2TN, UK.

Paper presented at the European Conference on Educational Research, Lahti, Finland, 22-25 September 1999

ABSTRACT

This paper explores methodological issues in the use of grounded theory to investigate Internet searching [1]. The project is in the first phase and has two strands. The quantitative strand is investigating links between cognitive style and choice of Internet strategies in the context of a novel methodology that has been developed (Moss and Hale, 1999). The qualitative strand is investigating personal meanings and strategy-tactics in Internet searching in the context of a newly developed model of Internet searching (Hale, 1999; Hale and Moss, 1999). The project is funded by the Arts and Humanities Research Board, United Kingdom and runs from May 1999 to the end of April 2000.

This research is founded on the belief that language has a multiplicity of internal meanings and gradations, with zones of indeterminacy as well as shared meaning. The paper uses findings from the research to illustrate issues of practice and methodology in the context of the qualitative and quantitative investigation of Internet searching.

The quantitative and qualitative strands have independently converged in their emphases, both drawing on a grounded theory paradigm. The upshot has been the first stage in the development of a new model of Internet searching and the first stage in developing a new methodology drawing on both quantitative and qualitative methods: a bi-modal approach. This paper focuses particularly on these developments.

INTRODUCTION

The Internet Searching Project is based at the Department of Information Studies, Sheffield University. The project is funded by the Arts and Humanities Research Board of the United Kingdom and runs from May 1999 to the end of April 2000. Further details can be found at the project website (open from 1st October 1999). A major aim of this research is to explore novel relationships in Internet searching, leading to new grounded theoretical understandings.

The project has both quantitative and qualitative strands. The quantitative strand is investigating links between cognitive style, individual differences and choice of Internet strategy. The qualitative strand is using a grounded theory methodology to investigate emergent issues related to Internet search strategies.

There are three approaches to language that could have informed the research. The first is a positivist approach, in which language is regarded as fixed and can be checked against canons of meaning and usage. This position seems too static to reflect accurately what we know about the flexibilities of language. The second is a post-modern approach, in which language is such a ‘wild profusion’ that little can be said about it (Scheurich, 1995). Here language is seen as entirely local in meaning. Given that human beings spend a lot of time communicating successfully, this position seems difficult to substantiate. The third position recognises that language has both fixed and flexible dimensions. There are core meanings that are widely understood and other meanings understood only by groups or individuals. The perceptions of subjects undertaking Internet searches are likely to share both of these aspects of language. The Sheffield University Internet Searching Project explicitly recognises that language has both regularities and inconsistencies and that conventional signs have a multiplicity of internal meanings and gradations, with zones of indeterminacy as well as shared meaning, i.e. that concepts are hazyspaces (Hale, 1998).

The emphasis of the project is constructivist and based on individuals. A grounded theory methodology (Glaser and Strauss, 1967) was therefore appropriate: "...a qualitative research method that uses a systematic set of procedures to develop an inductively derived grounded theory about a phenomenon." (Strauss and Corbin, 1990, p. 24). The choice of this methodology ties in with the commitment to the process of developing emergent theory, even if the theory so generated ‘conflicts’ with findings from the other strand of the project. In the event, both strands have independently drawn on methods of the other paradigm, and the focus on emergent theory and methodologies has informed both strands. It should be noted that grounded theory is itself developing and fracturing into different varieties (Parker and Roffey, 1996) from its original formulation in 1967.

Qualitative investigations are increasingly being used to investigate Internet searching and related behaviours, for instance in business people (Correia and Wilson, 1997), in computer scientists in British and Greek universities (Siatri, 1998), in the learning opportunities of young people (Pickard, 1998) and in modelling users' successive searches in digital environments (Spink et al., 1998). The project uses a grounded theory approach to explore users' strategies in Internet searching, which has, through a developing emphasis (see below), informed both strands of the research.

Grounded theory has clear ramifications in the data collection and data analysis/theorisation phases of any research project. The issues of practice and theory are now highlighted for each of these phases, for the qualitative and quantitative strands of the Internet searching project respectively.

ISSUES OF IMPORTANCE IN THE QUALITATIVE STRAND

This section examines issues of importance in the qualitative strand of the investigation of Internet searching, including a personal reflection on the data theorising (principal investigator, Greg Hale).

Issues of importance in the data collection phase

The qualitative strand used iterative informal interviews to explore the individual meanings of five participants in two settings. In the in-system interviews, participants undertook Internet searches while using the 'think aloud' method (Ericsson and Simon, 1984; see also Berg, 1998). In the out-of-system interviews, participants were not using the computer but reflected back on recent searches and on their search strategies and tactics generally. The out-of-system interviews avoided system constraints or cues and provided a wide, coarse-grained investigation of Internet searching in the context of participants' individual lives, information needs and information seeking behaviour. In the in-system interviews participants undertook searches on information problems they were currently facing, using a search engine of their choice. Both sets of interviews were tape recorded, supplemented as necessary by interview notes. In line with the emphasis on theory development through iterative work on the data, the temptation to complete many interviews in as short a time as possible, with insufficient regard to iterative theorising, was resisted. This was important because this phase was explicitly exploratory, intended to inform the main phase of the research and to ensure that new theory was purposively developed. A crucial outcome of this phase was to narrow the focus of the investigation so as to facilitate the development of influential theory, since the field for possible theory development is very wide and it was important to focus on areas that are currently under-theorised.

Iterative theme development takes place within the interview. It is facilitated by reflection back to the participant for elaboration and by an interview stance of dispassionate involvement (drawing partly on feminist insights into the collaborative nature of the interview process, Olesen, 1998; see also Puwar, 1997 on feminist interviewing in action), whereby the interviewer is explicitly active in the interview without compromising the meanings and structures of what the participant is saying. The nature of this active involvement is therefore at the emotional, non-verbal level and within the reflection for elaboration. It is not the kind of engagement in the interview as a dyadic interaction that would sacrifice the focus on themes emerging from the participant alone.

Notes made during and after the interview (post-interview reflection) were used to identify developing themes, facilitating the development of theory after the first cycle of interviews and informing the next phase of interviews. However, notes during the interview were made sparingly because writing them is intrusive. Because issues arising from the interviews are developed iteratively, the interview data becomes denser as the interviewing cycles proceed.

Personal reflections on the data theorising phase

Theoretical annotation and coding, together with the choice of analytic tools (such as diagramming) as appropriate (researcher as bricoleur), are important and lead to iterative and explicit theory development. The work of theorising was helped by exploring theoretical possibilities through writing theoretical notes, diagrams and flow charts. These identified and developed ‘proto-theories’ in the data that may be developed into theories as the data is configured into a dense theorised format. The process of compiling these documents further developed theoretical sensitivity, feeding back into the process at each subsequent phase.

As I've interacted with the data, so I've struggled to articulate and re-formulate the data in ways that have heuristic value and interact with existing theory as well ('theoretical sensitivity', Glaser and Strauss, 1967). The first attempt was a process model that sought to define everything relevant in the transcripts, displaying it in a diagrammatic flow chart. This was a useful starting point, because search sequences do have in some measure a logical and temporal structure. However, as a means of gaining theoretical and heuristic insight the value of the flow chart may be limited because of the unwieldiness of the final diagram, the atheoretical diagramming of elements (i.e. the representation may be too strongly structured in logical and temporal terms) and the necessity of expanded and highly complex diagrams of actual searches. This process description probably has value at the 'Event' level of the Context-Action-Event model (see below) that is currently informing the project. An intermediate step was to draw on a systems analysis approach. The problems with this were twofold. First, it was unclear what level should form the top level. Second, once this was decided, it had clear ramifications for the interrelationship of elements and the decomposition down levels, leading to artificial or inappropriate representations of elements. Finally, after much thought, incubation of ideas and reading (particularly around searching in business people), a model was developed which seemed to mesh with some of the theoretical work, whilst bringing in the neglected issue of the wider contexts of searching. As I made reference to pre-existing experimental and theoretical work certain themes became salient, allowing the development of the Context-Action-Event model, with the possibility of the process diagram feeding in at the Event level. This model should explicitly be regarded as provisional, exploratory and available for challenge. It is intended to develop this model further in the main phase of the research.

I started off wanting to find out more about searching, viewing it as an un-obvious activity. I was interested in the notion that adaptive strategies will be those that can be re-used or called upon with minimum effort and maximum utility: pragmatic laziness, or maybe cognitive effectiveness. Maybe there are strong contextual cues in the computer interface that prompt a human being to select and use certain behavioural scripts, for instance related to understanding and recognition. Cognitive-behavioural sequences can of course be highly useful tools, allowing users to manage large amounts of information and complex behaviour very readily, as long as the behavioural sequence fits. But users may have idiosyncratic methods of judging this. The user has information needs which he or she needs to fulfil. Based on interior past experience or exterior recommendation, together with effort minimisation, the user invokes these cognitive and behavioural sequences. If the user finds the information that he or she wants, then all is well. The behavioural sequence is a success and may be the first choice for use in a recurring similar situation. If the cognitive-behavioural sequence fails, then the user has to engage more closely with the problem and start to hypothesise about the system operating parameters and the problem space (see Hale and Moss, 1999).

When searches do not immediately bring up a clear 'right' answer they throw up a hazyspace: a conceptual space in which a few items may be recognisably relevant, while many others are not clearly relevant or irrelevant. The user now has a choice: view those items that look more relevant, or reformulate the query. It might be speculated that the user will check the results (though there are system constraints as to how much is shown, and user issues to do with the choice of going into screens beyond the first few, speed of access etc.): even if there are thousands of items the user will probably check the first couple of screens until relevance sharply declines. At this stage the user has to reformulate the query, and the level of engagement with the problem space may determine whether they choose a radically different set of search terms or use a variant or addition (agglomeration) of the existing search.
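The distinction drawn above between a radically different query, a variant, and an addition (agglomeration) can be illustrated with a minimal sketch. This is not the project's coding scheme: the function name, the category labels and the simple term-overlap rule are all assumptions introduced here for illustration only.

```python
def classify_reformulation(previous: str, revised: str) -> str:
    """Label a reformulated query relative to the previous one.

    Hypothetical rule (an assumption, not taken from the project):
    compare the sets of lower-cased search terms in the two queries.
    """
    prev_terms = set(previous.lower().split())
    new_terms = set(revised.lower().split())

    if prev_terms == new_terms:
        return "unchanged"
    if prev_terms < new_terms:
        # All earlier terms are kept and at least one new term is added.
        return "agglomeration"
    if prev_terms & new_terms:
        # Some, but not all, earlier terms survive.
        return "variant"
    # No terms in common with the previous query.
    return "radical"


example = classify_reformulation("jazz festival", "jazz festival sheffield")
```

A rule of this kind captures only surface overlap between queries; whether a reformulation reflects deeper engagement with the problem space would still need the interpretive work described in this section.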

In all this theorising I wanted to try to preserve the sense that searchers' conceptions of the search and system are hazyspaces, with all sorts of 'messy' and/or indeterminate behaviour related to themes such as 'superstition' (new users particularly), exploration and familiarity, without losing sight of the fact that Internet searching takes place within a context. The most recent formulation seeks to take account of this (Hale, 1999; Hale and Moss, 1999).

ISSUES OF IMPORTANCE IN THE QUANTITATIVE STRAND

This section examines issues of importance in the quantitative strand of the investigation of Internet searching, including a personal reflection on the data theorising (principal investigator, Nicola Moss).

Personal reflection on the data collection phase

Fifteen people agreed to participate in the study. They were asked to carry out three different activities. The first was an extended questionnaire which appraised study skills, problem solving style, Internet use and experience. The second activity was the computer-based Cognitive Styles Analysis program (Riding 1991; 1994) [2]. The third consisted of three pre-determined Internet search tasks [3] using AltaVista. One hundred and fifty-nine episodes were collected across the three tasks, and a JavaScript-based HTML form was used to collect the data. The data indicated the presence or absence of Boolean operators; the use of keyword or combined terms; dates (if used in the query); and feedback from participants on a number of issues [4]. All this information was e-mailed to the principal investigator after each search episode.
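The presence/absence indicators described above can be sketched as follows. The project collected these data with a JavaScript-based HTML form; the Python below is only an illustrative reconstruction, and the operator list, the year pattern and the function name `code_query` are assumptions rather than details of the original form.

```python
import re

# Assumed operator set; the form's actual rules are not given in the paper.
BOOLEAN_OPERATORS = ("AND", "OR", "NOT", "NEAR")


def code_query(query: str) -> dict:
    """Return presence/absence indicators for a single search query."""
    tokens = query.split()
    # Operators are assumed to be typed as upper-case words.
    has_boolean = any(tok in BOOLEAN_OPERATORS for tok in tokens)
    # Search terms are the tokens that are not operators.
    terms = [tok for tok in tokens if tok not in BOOLEAN_OPERATORS]
    # A date is assumed to be a four-digit year from 1800 to 2099.
    has_date = re.search(r"\b(1[89]\d{2}|20\d{2})\b", query) is not None
    return {
        "boolean": has_boolean,
        "combined_terms": len(terms) > 1,  # False means a single keyword
        "date": has_date,
    }


record = code_query("jazz AND festival 1999")
```

Each search episode would yield one such record, which could then be stored alongside the participant's feedback for that episode.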

The team thought it most appropriate to collect the data in sessions with groups of participants. However, I found that a problem was created by the fact that participants arrived piecemeal, so this method was not as successful as anticipated. In addition, some participants were very quick to complete the activities, so in the end I introduced the activities individually to each student.

My original intention was to run a Word-based version of the questionnaire, so that all the data was collected electronically. Unfortunately, this was too time-consuming for the participants, and therefore only the first four participants used the Word-based format; the rest used pen and paper.

Personal reflections on the data theorising phase

The original aim of this strand of the project was to focus on testing a series of hypotheses deduced from existing theory, using a quantitative approach (correlation and factor analysis). However, the open, exploratory nature of this initial phase of the study allowed for closer links between the two strands than had been anticipated (at least in terms of the methodology). In part, the quantitative strand has borrowed from the brief of the qualitative strand (generate new theory using a grounded theory-based analysis), because of the advantages a synthesis of the two approaches brings, resulting in bi-modal analysis.

The brief of the quantitative strand was also to illuminate phenomena (themes) within the data, the 'why' component of which could then be examined in the qualitative strand. I hypothesised that a grounded approach, which enables themes to emerge from the data, was the most appropriate methodology to adopt in order to fulfil the brief. This approach also allowed for a more in-depth analysis, which led to the identification of themes not previously identified in the literature on Internet searching behaviour.

The first stage of the analytic coding procedure was to identify the themes on which the data was to be coded. This required me to become immersed in the data, thus borrowing from more qualitative approaches, such as IPA (see note [5]). Following in-depth reading of the data, and also based generally on theoretical sensitivity related to Internet searching and information retrieval, quantitative codes were determined. These codes were numerically based, in that the coding involved a simple frequency count of the data. In coding the data I was not therefore required to interpret the data at too deep a level - i.e., all the codes were straightforward and data could be extracted quickly and unambiguously. Whilst the qualitative codes were determined in the same fashion (although involving a longer and more detailed process of development), coding the data using the themes required me to engage more with the data and to interpret the data in light of the codes. This therefore required the definitions of the themes to be much clearer and more in-depth.
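The simple frequency count that underlies the quantitative codes can be sketched as below. The code names and the three example episodes are hypothetical and are not data from the study; the sketch assumes each episode has already been reduced to a list of the codes that apply to it.

```python
from collections import Counter


def code_frequencies(episodes):
    """Count the number of episodes in which each code is present."""
    counts = Counter()
    for codes in episodes:
        # Use a set so a code is counted at most once per episode.
        counts.update(set(codes))
    return counts


# Hypothetical coded episodes, for illustration only.
episodes = [
    ["boolean", "combined_terms"],
    ["combined_terms"],
    ["boolean", "date", "combined_terms"],
]
freq = code_frequencies(episodes)
```

Counting presence per episode, rather than every occurrence, matches the presence/absence character of the quantitative codes; the thematic codes would still require interpretation before any such tally could be made.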