Individual paper submission for the British Educational Research Association (BERA) Conference at Lancaster, September 1996

Computer-aided qualitative analysis of interview data: some recommendations for collaborative working

Steve Higgins, Kate Ford, Iddo Oberski

University of Newcastle upon Tyne

Abstract

In a research project on the concerns and achievements of newly qualified teachers we used a qualitative data analysis software package for the Macintosh computer. This package allows the storage of documents such as interview transcripts and the coding and indexing of text-units, and provides a tool for establishing and refining categories within data. However, although computer-aided analysis dramatically decreases the time conventionally needed for the cutting, sorting and pasting of interview data, it poses several challenges when used in a collaborative context. In this paper we discuss some of the practical, methodological and ethical considerations involved in using the package collaboratively and provide some basic recommendations for successful implementation.

Introduction

In a research project on the concerns and achievements of newly qualified teachers we used a qualitative data analysis software package for the Macintosh computer. The process of working as a small collaborative team on a research project which used computer-aided data analysis was new to all of us, and we thought it might be helpful to record our experiences in three main areas: practical, methodological and ethical. As the project unfolded, little emerged under ethical considerations that would not have arisen in any similar research. Some ethical problems of confidentiality and access to the data did arise, but as a consequence of using a word processor rather than the data analysis package. In the other two areas, the constraints of working collaboratively on the one hand, and of analysing qualitative data with a computer program on the other, made the project particularly challenging. In this paper we hope to explore where these two sets of constraints intersect and to elaborate on the issues we found to be particular to collaborative computer-aided qualitative data analysis. What follows is a brief outline of the research project and its methodology, then an account of the particular practical and methodological issues we encountered. Next there is a presentation of some particular issues relating to collaboration and finally a brief conclusion. In the appendices we offer our own working protocols and agreement, some references to further on-line resources and a more general bibliography on computer-aided qualitative data analysis.

Project description

The research described in this paper forms part of a broader focus of evaluation of the 'New Teacher in School' course first run at the Department of Education at Newcastle University in 1994/5. It was evaluated through a combination of quantitative and qualitative methodologies which also considered wider aspects of the support experienced by newly qualified teachers (Ford et al. 1996). Some of the findings of this wider research are presented in another BERA paper (Ford et al. 1996b).

The research process described here derives mainly from the analysis of semi-structured interviews with the recently qualified teachers who were course participants. It is an analysis of the qualitative data about their school-based and Local Education Authority support, the course and other support, and about their concerns and achievements after one month and one year of teaching. Initial quantitative data was also analysed from a survey and questionnaires with participants. The interviews after the end of their first year provided validation of these data and a more detailed picture of the teachers' experiences.

The tapes with the recorded interviews were transcribed and the transcripts were coded into categories which were descriptive and interpretative (Miles and Huberman 1994) by a combination of manual and computer-aided methods. The particular computer program we used was called NUD•IST (which stands for Non-numerical Unstructured Data: Indexing Searching Theorising; Richards and Richards 1995) and it supports the development of hierarchical categories of coding. Although NUD•IST also supports memoing, we decided to keep a separate on-line logbook in a standard word processor for the purpose of developing concepts and methods collaboratively. As all three researchers were involved in the analysis of the data, we agreed that data analysis would proceed in pairs, and that no-one would work on the data singly without explicit prior agreement. Part of the reason for this was the physical difficulty of enabling three people to use a computer at once. We also wanted to ensure in the early stages that we each developed our skills with this particular software package, and that we coded sections of the interview transcripts in the same way. This became especially important while we were using the software, and effectively prevented the development of three partially independent projects, which might have arisen if each researcher had worked individually. This agreement developed into a set of protocols (Appendix 1) which evolved with the project and represent an attempt to keep a record of how we decided to work. Working in pairs also proved to be valuable for other reasons which are explained below.

We used a methodology of grounded theory (Glaser and Strauss 1967) and progressive focusing. We started the analysis with the longest and most complex interview, following the advice to "categorise richly and to code liberally" (Richards and Richards 1995). As anticipated, a large number of categories resulted, as we literally indexed everything at this stage. After this, we reviewed our procedures and objectives and agreed what aspects of the data should be looked at in more detail. We then used the existing categories to code two further interviews manually and individually in a much more focused way, coding only what we thought was relevant to our now much more clearly defined research objectives. At our meetings we then compared our individual coding of the same scripts and entered agreed coding into NUD•IST. Informal assessment of reliability suggested that there was substantial agreement on what needed to be coded, and almost as much consensus about how it needed to be coded. We then reviewed our analysis methods and objectives again and decided it was justified to allow some individual working at this stage of the project.
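Our own assessment of agreement was informal, but the comparison of two researchers' individual coding of the same script could be expressed numerically. The following is a minimal sketch in Python, not something we used in the project: it assumes each text-unit receives exactly one category from each coder, and the category labels and the two lists of codes are invented for illustration. It reports simple percent agreement and Cohen's kappa, which corrects for the agreement expected by chance.

```python
from collections import Counter


def percent_agreement(codes_a, codes_b):
    """Proportion of text-units given the same category by both coders."""
    if len(codes_a) != len(codes_b) or not codes_a:
        raise ValueError("both coders must code the same non-empty set of units")
    matches = sum(a == b for a, b in zip(codes_a, codes_b))
    return matches / len(codes_a)


def cohens_kappa(codes_a, codes_b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(codes_a)
    observed = percent_agreement(codes_a, codes_b)
    counts_a, counts_b = Counter(codes_a), Counter(codes_b)
    # Chance agreement: probability that both coders pick the same category
    # if each assigned categories independently at their own observed rates.
    expected = sum(
        counts_a[c] * counts_b[c] for c in set(counts_a) | set(counts_b)
    ) / (n * n)
    if expected == 1.0:
        return 1.0  # both coders used a single identical category throughout
    return (observed - expected) / (1 - expected)


# Hypothetical codes for six text-units from one transcript (invented data).
coder_one = ["concern", "concern", "achievement", "support", "support", "concern"]
coder_two = ["concern", "achievement", "achievement", "support", "support", "concern"]

if __name__ == "__main__":
    print("percent agreement:", round(percent_agreement(coder_one, coder_two), 3))
    print("Cohen's kappa:", round(cohens_kappa(coder_one, coder_two), 3))
```

A measure of this kind only addresses how units were coded once both coders have selected them. Agreement on what needed to be coded at all would require a separate comparison of which text-units each coder chose to index.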

Practical and methodological issues

Using the computer

Initially many of the problems we encountered reflected our unfamiliarity with the software, such as ensuring transcripts had a format acceptable to NUD•IST. We had to work out a style of working which suited each of us as well as a manner which fitted the program. Many early problems and difficulties were ironed out by setting down detailed descriptions (Appendix 1) of how documents should be formatted and introduced to NUD•IST, and by more rigorous checking of documents before using the computer for analysis. Our log entries show some of the reasons for this.

18.12.95

Iddo and Steve

Introduced 5 staff interviews. Indexed as a node (Raw data files /Staff).

Discussed working protocols and making back ups.

Agreed that documents should be saved as text-only line-break ONLY AT THE VERY END!!

31.1.96

Steve and Iddo discovered that the documents introduced on 30-1 were formatted incorrectly, and had to delete and reintroduce reformatted documents. Another mistake was found in the Caroline doc (* missing) which had resulted in a section being skipped. We had to mend the original doc, delete it from the project, reintroduce it. Then delete the Node with Steve's questions and rerun parts of the command file to recreate that node. Then we realised we had only partly recreated that node, namely only for the Teacher docs, and thus had to repeat the exercise for the Staff docs. We are now confident that everything is as we thought it was yesterday (2.75 hours later)! (Argh).

An important part of this working agreement (Appendix 1) was establishing guidelines for where the data was kept, who was responsible for backing up the project and who had the most recent version. We were fortunate in being part of an Apple network so that it was possible to designate shared folders that we could all access. Even this did not solve all of the problems we encountered, and persuading the University computing services that we needed a joint account with a shared password that we could each connect to from our machines was a hurdle we eventually passed. A minor inconvenience was not having a copy of the software on all of our machines; however, as we had decided to work in pairs for methodological reasons, we were largely able to bypass this problem. Working in pairs meant that one of the pair had always been at the last computer session and was there to explain what the log meant. Re-reading the log in this way at the beginning of the session was also valuable as a kind of stimulated recall. We tended to work in a rolling pattern (Kate and Iddo, Iddo and Steve, Steve and Kate), though our schedules meant that an individual might occasionally miss a session. When this happened the other pair kept working. At the beginning of the project working in pairs was a valuable way of ensuring that our skills in using the software developed steadily, as well as a way of developing a shared understanding of the categories we were developing in our analysis of the data. Early on, however, we did note that it was time consuming and that it was hard to read, discuss and code the transcripts all at once.

13.2.96

Kate and Steve

Read through Brian's transcript on screen - did not index anything but raised several issues. Decided that we need to examine transcripts individually and bring some first thoughts about categories even at this initial stage. Two reasons 1. we found it hard to generate categories on screen when 'new' to the data 2. Also helpful not to have ideas influenced by others as they develop or to be overwhelmed by the group dynamic.

Despite reaching this conclusion in February we did not actually follow this suggestion until late April when we began printing off the next sections of transcripts for coding with a list of all the current categories in preparation for the next session as part of our working agreement.

Using paper

As the project developed we found we were increasingly using hard copy printouts rather than relying on the material on screen. This was partly for practical reasons, in that we did not all have a computer capable of running the software. Also, NUD•IST does not allow you to see or print out the whole "tree" of categories at once, and as we developed and refined categories we found we continually needed to print out a list of all of the categories in the project so that we had a list to refer to when coding. Our working agreement also changed, in that after the first few transcripts we prepared sections individually on paper, then discussed our ideas and coded in pairs at the machine. At the start of the project we had anticipated we would not need to rely on so much paper. However, as the project evolved we used an increasing amount, as the extract from the log below shows.

22.4.96

Kate and Steve. Getting to grips with the research again. Printed and re read log. Tried to print out sections of the tree to familiarise ourselves with the categories and had the machine crash TWICE when attempting to print (2 2) "Affective". Drafted the categories onto the board so that we could print out a copy and review them. Decided that we should agree a (short) section of text to index by hand then discuss at the computer when working in pairs. We suggest that the person working in the next pair is to e-mail relevant section details for discussion and copy memo to third person as part of working protocol. Steve offered to design a sheet with the full tree on to keep beside the computer when working. DID SOME INDEXING!!! ... Indexed up to 989 which is where it gets exciting.....!

A peculiarity of NUD•IST, mentioned above, is that it is not possible to see or print out the 'tree' showing the hierarchy of categories easily. As our tree grew we found it was necessary to print out a list of the relevant categories both to facilitate the process of coding and to remember the range of categories we had developed. This reliance on paper has had an interesting echo in a recent mailbase discussion about the advantages of paper, in which the main arguments centred on the attraction of remaining with the more physically appealing and aesthetic "interface" of paper (Qual-soft mailbase posting, 1996: for details of how to get access to these e-mail discussions and other electronic sources of information about computer-aided qualitative data analysis please see Appendix 2).
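The kind of reference sheet we kept beside the computer, listing every category with its position in the hierarchy, can be sketched in a few lines of Python. This is an illustration only and not a feature of the package: the nested-list representation is our assumption, and the tree contents are hypothetical, loosely borrowing category names from our log rather than reproducing the project's actual index. Each node is printed with a numeric address formed from its position at each level, in the style of the (2 2) "Affective" reference in the log entry above.

```python
def format_tree(tree, address=()):
    """Return one indented line per category, each with its numeric address.

    `tree` is a list of (name, children) pairs, where `children` is a list
    of the same shape. This structure is assumed for illustration only.
    """
    lines = []
    for position, (name, children) in enumerate(tree, start=1):
        node_address = address + (position,)
        label = " ".join(str(part) for part in node_address)
        indent = "    " * len(address)  # one indent step per level of depth
        lines.append(f'{indent}({label}) "{name}"')
        lines.extend(format_tree(children, node_address))
    return lines


# Hypothetical category tree (not the project's real index).
CATEGORIES = [
    ("Base data", [("Staff", []), ("Teachers", [])]),
    ("Concerns", [("Conflict", []), ("Affective", [])]),
]

if __name__ == "__main__":
    print("\n".join(format_tree(CATEGORIES)))
```

Regenerating a complete list of this kind after every session in which categories are added or moved would give each pair an up-to-date sheet to code from.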

More efficient?

The change in our working pattern just mentioned was largely due to the time it took to read, discuss and agree coding at the computer. In the beginning this was inevitable as we needed to agree on our categories and we were determined to code everything. We had also decided to work in alternate pairings, so some time was spent each session in bringing the person who had not been at the previous session up to date. However, in order to work through all of the interviews we needed to speed up the process, and so began preparing sections of scripts in advance, then coding them at the computer in pairs and resolving any differences. We found then that there was substantial agreement in assigning sections of the transcripts to categories. Even so, the whole process was very time consuming. In the first six sessions we had averaged less than half a line a minute. Part of this was due to using the computer and working through the agreed procedures for making back ups and keeping the log up to date. Certainly the flexibility to index on screen, then collate and cross-reference indexing, was less time consuming than going through the same process on paper with pen, scissors and glue. When looked at in the short term, the time saved on this aspect of the research process might not be sufficient to make up for the extra time spent in getting to grips with the software. However, the flexibility in working with the data and the skills gained, which can be used in the future, suggest that this is an approach which will have long term rather than short term benefits.

Reliability and validity

Early log entries also show a developing focus on methodological issues and a clear separation from practical or technical problems.

30.1.96

Iddo Kate and Steve, joint session. Introduced first batch of NQT interviews indexed at AllData node. Questions and responses for K and S added to existing Questions node and then new node created for Teacher's answers. Discussed hierarchy of tree and usefulness (or otherwise) of having questions and answers separately available. Started to consider some methodological issues around interview length and percentage of talk by interviewer and respondent. Gone for shallow tree (no depth!) for ease of introduction at this stage and perceived limited use of the data in this form. Dated back up made to Iddo's machine.

The period of time we spent working in pairs early on in the project gave us an opportunity to assess inter-rater reliability for coding the transcripts. However, the collaborative nature of the working arrangement also meant that, to some extent at least, this reliability was supported and perhaps constructed by our regular discussions. By this we mean that any increased inter-rater reliability was an unsurprising development: we spent several months discussing the definition of new and emerging categories, so our accuracy in assigning sections of transcripts to those and subsequent categories was to be expected. However, we would also argue that this process of discussion is evidence of greater inter-observer reliability (following LeCompte and Goetz's (1982) distinction), in that our definitions and their assumptions have been discussed and tested in discussion. Differences in interpretation had to be negotiated (Crow et al. 1992) and the paired sessions in front of the computer regularly provided an opportunity for that negotiation.

1.2.1996

Iddo and Kate - discussion re non verbal behaviour in interviews and the possible impact upon 'translating/understanding' meaning in scripts.

Started to index documents for 'motivation' aspect/reasons given for coming to the NTIS course. Needs to be completed....

13.2.96

Kate and Steve

Read through Brian's transcript on screen - did not index anything but raised several issues....

15.2.96

Iddo and Steve

Continued indexing Brian and generating new categories. "Concerns" at first level with "Conflict" underneath. Discussed the difference between conflict within the role of the teacher e.g. between discipline and rapport and between being a teacher and one's personal life...