
Rigor and flexibility in computer-based qualitative research: Introducing the Coding Analysis Toolkit

CHI-JUNG LU
Department of Library Information Sciences, School of Information Sciences, University of Pittsburgh, Pittsburgh PA, USA

STUART W SHULMAN
Director, Qualitative Data Analysis Program, University Center for Social and Urban Research, University of Pittsburgh, Pittsburgh PA, USA

ABSTRACT
Software to support qualitative research is both revered and reviled. Over the last few decades, users and skeptics have made competing claims about the utility, usability and ultimate impact of commercial-off-the-shelf (COTS) software packages. This paper provides an overview of the debate and introduces a new web-based Coding Analysis Toolkit (CAT). It argues that knowledgeable, well-designed research using qualitative software is a pathway to increasing rigor and flexibility in research.

Keywords: qualitative research; data analysis software; content analysis; multiple coders; annotation; adjudication; inter-rater reliability; tools

Copyright © eContent Management Pty Ltd. International Journal of Multiple Research Approaches (2008) 2: 105–117. Volume 2, Issue 1, June 2008.

INTRODUCTION

Effective qualitative data analysis plays a critical role in research across a wide range of disciplines. From biomedical, public health, and educational research, to the behavioral, computational, and social sciences, text data analysis requires a set of roughly similar mechanical operations. Many of the mundane tasks, traditionally annotations performed by hand, are tedious and time-consuming, especially with large datasets. Researchers across disciplines with qualitative data analysis problems increasingly turn to commercial-off-the-shelf (COTS) software for solutions. As a result, there is a growing literature on computer-assisted qualitative data analysis software (CAQDAS), with Fielding and Lee's study (1998) serving as an enduring landmark in the canon.

These various COTS packages (eg NVivo, ATLAS.ti, or MAXQDA, to name just a few) evolved from pioneering software development during the 1980s, with the first programs available for widespread public use in the early 1990s (Bassett 2004). Software tools provide a measure of convenience and efficiency, increasing the overall level of organisation of a qualitative project. Researchers enhance their ability to sort, sift, search, and think through the identifiable patterns as well as idiosyncrasies in large datasets (Richards & Richards 1987; Tallerico 1992; Tesch 1989). Similarities, differences and relationships between passages can be identified, coded and effectively retrieved (Kelle 1997a; Wolfe, Gephart & Johnson 1993). The research routine is substantially altered. In the best cases, more intellectual energy is directed towards the analysis rather than the mechanical tasks of the research process (Conrad & Reinharz 1984; St John & Johnson 2000; Tesch 1989).

The first part of this article summarises the claims associated with using software for qualitative data analysis. Next, we acknowledge the oft-mentioned concerns about the role of software in the qualitative research process. We then introduce a new web-based suite of tools, the Coding Analysis Toolkit (CAT), designed to make mastery of core mechanical functions (eg managing and coding data) as well as advanced project tasks (eg measuring inter-rater reliability and adjudication) easier.1 Finally, we assert that CAQDAS tools are useful for increasing rigor and flexibility in qualitative research. Assuming a user (or team) running a project has a basic level of technical fluency, the requisite research design skills, and domain expertise, coding software opens up important possibilities for implementing and transparently reporting large-scale, systematic studies of qualitative data.

Claims for beneficial effects of software

The ease and efficiency of the computerised 'code-and-retrieve' technique undoubtedly enables researchers to tackle much larger quantities of data (Dolan & Ayland 2001; St John & Johnson 2000). In many projects, a code-and-retrieve technique provides a straightforward data reduction method to cope with data overload (Kelle 2004). For some researchers, simply handling smaller amounts of paper makes the analytical process significantly less cumbersome and tedious (Lee & Fielding 1991; Richards & Richards 1991a). No longer constrained by the capacity of the researcher to retain or find information and ideas, CAQDAS creates possibilities for previously unimaginable analytic scope (Richards & Richards 1991). Furthermore, a large sample size in qualitative research allows for better testing of observable implications of researcher hypotheses, paving the way for more rigorous methods and valid inferences (King, Keohane & Verba 2004; Kelle & Laurie 1995).
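The 'code-and-retrieve' model at the heart of these claims can be pictured as a small inverted index from codes to text segments. The sketch below is illustrative only, not the implementation of any particular COTS package; the class and method names are our own invention.

```python
from collections import defaultdict

class CodeAndRetrieve:
    """Toy model of code-and-retrieve: attach codes to text segments,
    then pull back every segment coded a particular way."""

    def __init__(self):
        self._index = defaultdict(list)  # code -> coded segments, in order

    def code(self, segment, *codes):
        """Attach one or more codes to a segment of text."""
        for c in codes:
            self._index[c].append(segment)

    def retrieve(self, code):
        """Return all segments tagged with the given code."""
        return list(self._index[code])

corpus = CodeAndRetrieve()
corpus.code("Jobs are scarce in the region.", "economic issues")
corpus.code("The new clinic reduced wait times.", "health care")
corpus.code("Layoffs hit the plant again.", "economic issues", "employment")

print(corpus.retrieve("economic issues"))
# -> ['Jobs are scarce in the region.', 'Layoffs hit the plant again.']
```

Retrieving only the segments tagged with one code is precisely the data reduction step described above: the full dataset recedes, and the analyst works with the coded subset.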

CAQDAS preserves and arguably enhances avenues for flexibility in the coding process. Code schemes can be easily and quickly altered. New ideas and concepts can be inserted as codes or memos when and where they occur. The raw data remains close by, immediately available for deeper, contextualised investigation (Bassett 2004; Drass 1980; Lee & Fielding 1991; Richards & Richards 1991; St John & Johnson 2000; Tallerico 1992; Tesch 1991a). Exploratory coding schemes can evolve as soon as the first data is collected, providing an opportunity for a more thorough reflection during the data collection process, with consequent redirection if necessary. Coding using CAQDAS, therefore, is a continuous process throughout a project, rather than one stage in a linear research method. It becomes much easier to multi-task, entering new data, coding, and testing assumptions (Bassett 2004; Richards & Richards 1991; Tesch 1989).

Interactivity between different tools and functions within the various software packages provides another key aspect of its flexibility (Lewins & Silver 2007: 10). These features open up new ways of analysing data. Qualitative researchers can apply more sophisticated analytical techniques than previously possible (Pfaffenberger 1988), routinely revealing patterns or counterfactual observations not noticed in earlier analyses. Many software packages enable examination of relationships or links in the data in ways that would not otherwise be possible, using hypertext links, Boolean searches, automatic coding, cross-matching coding with demographic data, or frequency counting. Some products go beyond textual data, allowing users to link to pictures, include sounds, and develop graphical representations of models and ideas (St John & Johnson 2000). Brown (2002, Paragraph 1) sees these benefits as the result of 'digital convergence' that is widely noted in research, media and pop culture domains. Researchers also find CAQDAS uniquely suited to facilitate collaborative projects and other forms of team-based research (Bourdon 2002; Ford, Oberski & Higgins 2000; Lewins & Silver 2007; MacQueen, McLellan, Kay & Milstein 1998; Manderson, Kelaher & Woelz-Stirling 2001; Sin 2007a; St John & Johnson 2000).

1 The second author is the sole inventor of the Coding Analysis Toolkit and has a financial interest in it should it ever be commercialised.

Perhaps the most important contribution of CAQDAS is to the quality and transparency of qualitative research. First, CAQDAS helps clarify analytic strategies that formerly represented an implicit folklore of research (Kelle 2004). This in turn increases the 'confirmability' of the results. The analysis is structured and its progress can be recorded as it develops. In making analysis processes more explicit and easier to report, CAQDAS provides a basis for establishing credibility and validity by making available an audit trail that can be scrutinized by external reviewers and consumers of the final product (St John & Johnson 2000). Welsh (2002: Paragraph 12) argues that using software can be an important part of increasing the rigor of the analysis process by 'validating (or not) some of the researcher's own impressions of the data.'

In addition to the benefits of transparency via the audit trail just mentioned, the computerised code-and-retrieve model can make the examination of qualitative data more complete and rigorous, enabling researchers and their critics to examine their own assumptions and biases. Segments of data are less likely to be lost or overlooked, and all material coded in a particular way can be retrieved. 'The small but insignificant pieces of information buried within the larger mass of material are more effortlessly located – the deviant case more easily found' (Bassett 2004: 34). Since the data is methodically coded, CAQDAS helps with the systematic use of all the available evidence. Researchers are better able to locate evidence and counter-evidence, in contrast to the level of difficulty performing similar tasks when using a paper-based system of data organisation. This can, in turn, reduce the temptation to propose far-reaching theoretical assumptions based on quickly and arbitrarily collected quotations from the materials (Kelle 2004).

CAQDAS can also be utilised to produce quantitative data and to leverage statistical analysis programs such as SPSS. For many researchers new to qualitative work or for those seeking to publish in journals typically averse to qualitative studies, there is significant value in the ability to add quantitative parameters to generalisations made in analysis (Gibbs, Friese & Mangabeira 2002). For other researchers – especially those working in applied, mixed method, scientific and evaluation settings – the ability to link qualitative analysis with quantitative and statistical results is important (Gibbs et al 2002). In these instances, CAQDAS enables the use of multiple coders and periodic checks of inter-coder reliability, validity, confirmability and typicality of text (Berends & Johnston 2005; Bourdon 2000; Kelle & Laurie 1995; Northey 1997).

Another important contribution of CAQDAS results from its capacity to support rigorous analysis. Computerised qualitative data analysis, when grounded in the data itself, good theory, and appropriate research design, is often more readily accepted as 'legitimate' research (Richards & Richards 1991: 39). Qualitative research is routinely defined negatively in relation to quantitative research as the 'non-quantitative' handling of 'unstructured' data (Bassett 2004). This negative definition conveys a sense of lower or inferior status, with a specific reputation for untrustworthy results supported by cherry-picked anecdotes. Computerised qualitative data analysis provides a clear pathway to rigorous, defensible, scientific and externally legitimised qualitative research via transparency that the qualitative method traditionally lacks (Kelle 1997a; Richards & Richards 1991; Tesch 1989; Webb 1999; Weitzman 1999; Welsh 2002).

Some cautionary notes

The impact of CAQDAS on research is extensive and diverse. For some, however, such effects are not always desirable. The enhanced capability of dealing with a large sample may mislead researchers to focus on raw quantity (a kind of frequency bias) instead of meaning, whether frequent or not (Mangabeira 1995; Seidel 1991; St John & Johnson 2000). There exist very legitimate fears of being distant or de-contextualised from the data (Barry 1998; Fielding & Lee 1998; Sandelowski 1995; Seidel 1991).

Mastering CAQDAS itself can be a distraction. Many researchers complain they have to invest too much time (and often money) learning to use and conform to the design choices of a particular software package. It has been noted that new users encounter usability frustration, even despair and hopelessness, along the way to CAQDAS fluency or a completed project, if they ever get there (St John & Johnson 2000). Closing the quantitative–qualitative gap via computer use threatens the sense shared by many about the distinctive nature of qualitative research (Gibbs et al 2002; Richards & Richards 1991a; Bassett 2004).

Of particular concern is a widespread notion that grounded theory is literally embedded in the architecture of qualitative data analysis software (Bong 2002; Coffey, Holbrook & Atkinson 1996; Hinchliffe et al 1997; Lonkila 1995; MacMillan & Koenig 2004; Ragin & Becker 1989). This is linked conceptually to an over-emphasis on the code-and-retrieve approach (Richards & Richards 1991a; Tallerico 1992). At issue is the homogenising of qualitative methodology. In this sense, CAQDAS may come to define and structure the methodologies it should merely support, presupposing a particular way of doing research (Agar 1991; Hinchliffe et al 1997; MacMillan & Koenig 2004; Ragin & Becker 1989; Richards 1996; St John & Johnson 2000; Webb 1999).

The debates are ongoing. Previous studies suggest that the structure of individual software programs can influence research results (Mangabeira 1995; Pfaffenberger 1988; Richards & Richards 1991c; Seidel 1991). We join others who acknowledge the paramount role held by the free will of the human as computer-user and active, indeed dominant, agent in the analytic process (Roberts & Wilson 2002; Thompson 2002; Welsh 2002). Researchers can and should be vigilant, recognising that they have control over their particular use of the software (Bassett 2004).

The remainder of this paper introduces a new system, the Coding Analysis Toolkit, which is part of an effort at the University of Pittsburgh to enhance both the rigor and flexibility of qualitative research. The CAT system aims to make simple, usable, online coding tools easily accessible to students, researchers and others with large text datasets.

THE CODING ANALYSIS TOOLKIT

The origin of the Coding Analysis Toolkit

Accurate and reliable coding using computers is central to the content analysis process. For new and experienced researchers, learning advanced project management skills adds hours of struggle during trial-and-error sessions. Our approach is premised on the idea that experienced coders and project managers can deliver coded data in a timely and expert manner. For this reason, the Qualitative Data Analysis Program (QDAP) was created as a program initiative at the University of Pittsburgh's University Center for Social and Urban Research (UCSUR). QDAP offers a rigorous, university-based approach to analysing text and managing coding projects.

QDAP adopted the analytic tool ATLAS.ti (http://www.atlasti.com), one of the leading commercial CAQDAS packages for analysing textual, graphical, audio and video data. It is a powerful and widely-used program that supports project management, enables multiple coders to collaborate on a single project, and generates output that facilitates effective analysis. However, for a frequent and high-volume user of ATLAS.ti (eg more than 5,000 billable hours of coding took place in the QDAP lab this year), certain limitations stand out.

First, ATLAS.ti users perform many repetitive tasks using a mouse. The click-drag-release sequence can be physically numbing and is prone to error. It decreases the efficiency of our large-dataset annotation projects and represents a distraction from the core code-and-retrieve method. Most COTS packages in fact share this awkward design principle. Coders must first manually retrieve a span of text (click-drag-release) and then code it (another click-drag-release) for later retrieval; hence a more accurate description of much of the COTS software is retrieve-code-retrieve, not code-and-retrieve.

Second, ATLAS.ti lacks an easy mechanism to measure and report inter-coder reliability and validity. Clients of QDAP point to their need to make these measures operational in their grant proposals and peer-reviewed studies. ATLAS.ti provides limited technical support for the adjudication procedures that often follow a multi-coder pre-test. There are other software applications for calculating inter-coder reliability based on merged and exported ATLAS-coded data; however, each of these programs rather inconveniently has its own requirements regarding data formatting (Lombard, Snyder-Duch & Bracken 2005), raising the concern that researchers may be forced into some fixed procedure simply to accommodate the particular tool.

Key functionality of the Coding Analysis Toolkit

In order to address these problems, QDAP developed the Coding Analysis Toolkit (CAT) (http://cat.ucsur.pitt.edu) as a complement to ATLAS.ti. This web-based system consists of a suite of tools custom built from the ground up to facilitate efficient and effective analysis of raw text data or text datasets that have been coded using ATLAS.ti. The website and database are hosted on a Windows 2003 UCSUR server; the programming for the interface is done using HTML, ASP.net 2.0 and JavaScript. The aims of CAT are to serve as a functional complement and extension to the existing functionality in ATLAS.ti, to open new avenues for researchers interested in measuring and accurately reporting coder validity and reliability, to streamline consensus-based adjudication procedures, and to encourage new strategies for data analysis in ways that prove difficult when performed using the traditional mechanical method of doing qualitative research. The following sub-sections summarize the primary functionality and the breakthroughs made within the CAT system.

Data format requirement

The CAT system provides two types of work modes. The first mode uploads exported coded results from ATLAS.ti into the system for use with the Comparison Tools and the Validation and Adjudication Module. In this case, the CAT system functions as a complement or extension to ATLAS.ti. The second mode allows users to upload a raw, uncoded data file and an optional code file as a .zip file, an XML-based file or a plain-text file. Then, the assigned sub-account holder(s) can directly utilise the Coding Module to perform annotation tasks. In this situation, CAT replaces the coding function in ATLAS.ti. With CAT, users have more flexible options when adopting analytic strategies than they do when using ATLAS.ti alone.

Dynamic coder management

Users need to register accounts first and then log on to access CAT. The system also allows users to assign additional sub-accounts to their coders for practicing coding and adjudication procedures. Due to its web-based nature, project managers can quickly and readily trace the coding progress and monitor its quality. It thus offers a collaborative workbench and facilitates efficient multiple-coder management.

Coding module

In CAT's internal Coding Module, a project manager, who registers as the primary account holder, has the ability to upload raw text datasets with an optional code file and to have coders with sub-accounts code these datasets directly through the CAT interface. This Coding Module features automated loading of discrete quotations (ie one quotation at a time in a view) and requires only keystrokes on the number pad to apply the codes to the text. It eliminates the role of the computer mouse, thereby streamlining the physical and mental tasks in the coding analysis process. We estimate coding tasks using CAT are completed two to three times faster than identical coding tasks conducted using ATLAS.ti. It also allows coders to add memos during coding. While this high-speed 'mouse-less' coding module would poorly serve many traditional qualitative research approaches (where the features of ATLAS.ti are very well-suited), it is ideally suited to annotation tasks routinely generated by computer scientists and a range of other health and social science researchers using large-scale content analysis as part of a mixed methods approach. Figure 1 shows the Coding Module interface.

Comparison tools

The Comparison Tools comprise two components: Standard Comparison and Code-by-Code Comparison. For the Standard Comparison, the CAT system has taken the 'Kappa Tool' developed by Namhee Kwon at the University of Southern California – Information Sciences Institute (USC/ISI) and significantly expanded its functionality. Users can run comparisons of inter-coder reliability measures using Fleiss' Kappa or Krippendorff's Alpha (shown in Figure 2). A user can also choose to perform a code-by-code comparison of the data, revealing tables of quotations where coders agree, disagree, or (in the case of ATLAS.ti users) overlap their text span selections. Users can choose to view the data reports on the screen or, alternatively, can download the data file as a rich-text file (.rtf).
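For readers unfamiliar with these statistics, Fleiss' Kappa compares the agreement coders actually achieve against the agreement expected by chance, given how often each code is used overall. The sketch below is a textbook implementation of the standard formula, not the code of CAT or of the USC/ISI Kappa Tool.

```python
def fleiss_kappa(ratings):
    """Fleiss' kappa. ratings[i][j] is the number of coders who assigned
    item i to category j; every item must be rated by the same number
    of coders."""
    N = len(ratings)        # number of items
    n = sum(ratings[0])     # coders per item
    k = len(ratings[0])     # number of categories

    # Mean observed agreement across items.
    P_bar = sum(
        (sum(c * c for c in row) - n) / (n * (n - 1)) for row in ratings
    ) / N

    # Chance agreement from the marginal proportion of each category.
    p = [sum(row[j] for row in ratings) / (N * n) for j in range(k)]
    P_e = sum(x * x for x in p)

    return (P_bar - P_e) / (1 - P_e)

# Three quotations, four coders, two codes: unanimous agreement on the
# first two items, a 1-3 split on the third.
print(round(fleiss_kappa([[4, 0], [4, 0], [1, 3]]), 3))
# -> 0.556
```

Note that kappa can be far below the raw percentage agreement when one code dominates, which is exactly why chance-corrected measures are preferred in reliability reporting.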

FIGURE 1: THE CODING MODULE INTERFACE

Validation and adjudication module

CAT's other core functionality allows for the adjudication of coded items by an 'expert' user who is a primary account holder or by a sub-account holder attached to the primary account holder. An expert user (or a consensus adjudication team) can log onto the system to validate the codes assigned to items in a dataset. Figure 3 shows the interface of the validation function, which, just like the Coding Module, was also designed to use keystrokes and automation to clarify and speed up the validation or consensus adjudication process. Using the keystrokes 1 or 3 (or Y or N), the adjudicator or consensus team reviews, one at a time, every quotation in a coded dataset. While the expert user validates codes, the system keeps track of valid instances of a particular code and notes which coders assigned those codes. This information provides a track record of coders, assessing coder validity over time. If two or more coders applied the same code to the same quotation, the database records the valid/invalid decision for all the coders. It also allows the account holder(s) to see a rank-order list of the coders most likely to produce valid observations and a report of the overall validity scores by code, coder, or entire project (shown in Figure 4), resulting in a 'clean' dataset consisting of only valid observations (shown in Figure 5).
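The bookkeeping behind such a rank-order list can be sketched as a simple tally over adjudication decisions. This is an illustrative reconstruction under our own naming; it is not CAT's database schema.

```python
from collections import defaultdict

def validity_ranking(decisions):
    """decisions: (coder, was_valid) pairs, one per adjudicated code
    application. Returns coders with their share of valid applications,
    best first."""
    valid = defaultdict(int)
    total = defaultdict(int)
    for coder, ok in decisions:
        total[coder] += 1
        valid[coder] += 1 if ok else 0
    scores = {c: valid[c] / total[c] for c in total}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# A tiny adjudication log: coder_a is validated 2 of 3 times,
# coder_b 1 of 2 times.
log = [
    ("coder_a", True), ("coder_a", True), ("coder_a", False),
    ("coder_b", True), ("coder_b", False),
]
print(validity_ranking(log))
# -> [('coder_a', 0.6666666666666666), ('coder_b', 0.5)]
```

Accumulating such a log across projects is what turns one-off adjudication into the longitudinal track record of coder validity described above.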


FIGURE 2: STANDARD COMPARISON ON A SAMPLE DATASET

FIGURE 3: THE VALIDATION AND ADJUDICATION MODULE INTERFACE

DISCUSSION

This section further discusses how users can apply inter-coder agreement outputs and adjudication procedures to reinforce the rigor of research, and how they can utilise the multiple approaches offered by CAT to enhance the flexibility of coding and analysing procedures.

Inter-coder agreement measures support rather than define rigor

Criticism about the rigor of qualitative research has a long history. Ryan (1999) points out that skeptics tend to ask two questions of qualitative inquiries:

‘To what degree do the examples represent the data the investigator collected?’ and

‘To what degree do they represent the construct(s) that the investigator is trying to describe?’

Both questions focus specifically on investigators' synthesis and presentation of qualitative data, and both relate to issues of reliability and validity in research.

For many years, some investigators have advocated 'the use of multiple coders to better link abstract concepts with empirical data' (Ryan 1999: 313) and thus to be able to effectively address such suspicion (Carmines and Zeller 1982; Miles and Huberman 1994). Multi-coder agreement is also beneficial for text retrieval tasks, especially when the dataset is large. An investigator who uses a single coder to mark themes relies on the coder's ability to accurately and consistently identify examples. 'Having multiple coders mark a text increases the likelihood of finding all the examples in a text that pertain to a given theme' (Ryan 1999: 319). It helps build up the reliability of the analysis process. Recognising the benefits of multiple coders as a means to increase rigor and to enhance the quality of qualitative research, QDAP currently adopts this approach.

Ryan (1999) summarises that multi-coder agreement serves three basic functions in the analysis of text. First, it helps researchers verify reliability. Agreement between coders tells investigators the degree to which coders can be treated as 'reliable' measurement instruments. High degrees of multi-coder reliability mean that multiple coders apply the codes in the same manner, thus acting as reliable measurement instruments. Multi-coder reliability proves particularly important if the coded data will be analyzed statistically. If coders disagree with each other on coding decisions, then the coded data are inaccurate. Discrepancies between coders are considered error and affect analysis calculations (Ryan 1999: 319).

FIGURE 4: ADJUDICATION STATISTICS REPORT

Second, multi-coder agreement can also function as a validity measure for the coded data, demonstrating that multiple coders can pick the same text as pertaining to a theme. This provides evidence that 'a theme has external validity and is not just a figment of the primary investigator's imagination' (Ryan 1999: 320). CAT users can obtain reliability measures to serve the two previously discussed functions through the Comparison Tools (shown in Figure 2).

The last function of multi-coder agreement provides evidence of typicality. It can help researchers identify the themes within data. Ryan (1999) notes 'with only two coders the typicality of responses pertaining to any theme is difficult to assess'. A better way to check the typicality of responses is to use more than two coders; in this case, we assume that the more coders who identify words and phrases as pertaining to a given theme, the more representative the text. Therefore, agreement among coders demonstrates the core features of a theme. Some may argue that these core features represent the central tendency or typical examples of abstract constructs.

Disagreement, by contrast, highlights the theme's peripheral features, representing the 'edges' of the theme (Ryan 1999). These are seen as marginal examples of the construct. They may be less typical, but are not necessarily unimportant. Within CAT, users can acquire both central-tendency and marginal information on the coded data through Dataset Reports, which list the quotations by each code, tagged with the coder ID. Figure 5 shows the report of a sample dataset. In this case, all four coders annotated the second quotation with the code 'economic issues,' which indicates this quotation is commonly identified with the same theme or key concept, while the first and the fifth quotations were coded by only one coder; the investigator may need to consider how to handle these peripheral features in his or her analyses.
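The core/peripheral distinction reduces to counting how many coders applied a theme to each quotation. The sketch below is illustrative only; the threshold, data shapes and names are our own assumptions, not CAT's report format.

```python
from collections import Counter

def split_by_typicality(assignments, theme, core_threshold):
    """assignments: (coder, quotation_id, code) tuples. Quotations tagged
    with `theme` by at least `core_threshold` coders count as core
    examples; the rest are peripheral ('edge') examples."""
    counts = Counter(q for _, q, code in assignments if code == theme)
    core = sorted(q for q, n in counts.items() if n >= core_threshold)
    peripheral = sorted(q for q, n in counts.items() if n < core_threshold)
    return core, peripheral

# Mirrors the Figure 5 example: all four coders tag quotation 2 with
# 'economic issues', while quotations 1 and 5 get the code only once.
data = [
    ("a", 2, "economic issues"), ("b", 2, "economic issues"),
    ("c", 2, "economic issues"), ("d", 2, "economic issues"),
    ("a", 1, "economic issues"), ("b", 5, "economic issues"),
]
print(split_by_typicality(data, "economic issues", core_threshold=3))
# -> ([2], [1, 5])
```

Raising or lowering the threshold shifts quotations between the core and the edges, which is one concrete way to operationalise Ryan's point that more coders make typicality easier to assess.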

FIGURE 5: CODED DATASET REPORT

There are at least three moments at which Lombard, Snyder-Duch & Bracken (2005) suggest assessing inter-coder reliability:
1. During coder training: Following instrument design and preliminary coder training, project managers can initially assess reliability informally with a small number of units, which ideally are not part of the full sample of units to be coded, refining the coding scheme or coding instructions until the informal assessment reveals an adequate level of agreement.
2. During a pilot test: Investigators can use a random or other justifiable procedure to select a representative sample of units for a pilot test of inter/multi-coder reliability. If reliability levels in the pilot test are adequate, proceed to the full sample; if they are not adequate, conduct additional training, refine the code book, and, only in extreme cases, replace one or more coders.
3. During coding of the full sample: When confident that reliability levels will be adequate (based on the results of the pilot test of reliability), investigators can assess reliability for the full sample to be coded by using another representative sample. The reliability levels obtained in this test can be the ones presented in all reports of the project.

Traditionally, these assessments might be carried out once at each moment in a linear sequence, or only once for the whole coding process, most likely during coder training or in a pilot study before coding of the full sample begins. With CAT, a reliability assessment can be obtained easily and quickly at any time. Coding thus becomes a dynamic and cyclical process, giving investigators intimate control over the course of the analysis.
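As a rough illustration of the kind of quick check that can be run at any point in such a cycle, the sketch below computes simple percent agreement between two coders on a hypothetical pilot sample. The function and data are assumptions for illustration, not CAT's implementation; chance-corrected statistics such as Cohen's kappa or Krippendorff's alpha are generally preferred for formally reported reliability (Lombard, Snyder-Duch & Bracken 2005).

```python
def percent_agreement(coder_a, coder_b):
    """Share of units on which two coders made the same decision.

    A quick screening statistic only; chance-corrected measures such
    as Cohen's kappa are preferred for formally reported reliability.
    """
    if len(coder_a) != len(coder_b):
        raise ValueError("both coders must rate the same units")
    matches = sum(a == b for a, b in zip(coder_a, coder_b))
    return matches / len(coder_a)

# Hypothetical pilot sample: ten units coded by two coders.
pilot_a = ["econ", "econ", "health", "econ", "other",
           "health", "econ", "other", "econ", "health"]
pilot_b = ["econ", "health", "health", "econ", "other",
           "health", "econ", "econ", "econ", "health"]
print(percent_agreement(pilot_a, pilot_b))  # 0.8
```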

Adjudication procedures for supporting rigor

The CAT system provides a flexible framework for adjudicating coded data. Because sub-account holders can be assigned flexibly in CAT, a user can arrange the adjudication strategy to suit his or her own human resources. At least three arrangements can be served:

1. a sole investigator can assign the coding tasks to coders and carry out the adjudication of the coded data him- or herself;

2. in a collaborative project involving one primary investigator and several co-investigators, part of the research team can carry out the coding tasks while the rest take responsibility for adjudication;

3. the investigator(s) can participate in coding and have an external party serve as the expert adjudicator(s). This third arrangement can also be seen as part of the auditing process, which can add to the 'confirmability' (or 'objectivity', in conventional criteria) of qualitative research.

Multiple approaches for supporting flexibility

Rather than imposing a fixed working procedure on users, we expect researchers using CAT to exercise their own judgment. The decision to use all or only some of the available functionality depends on the researchers' assessment of the nature of their studies. We encourage qualitative researchers to keep a critical perspective on CAQDAS use and, more importantly, to use it in a flexible and creative way. CAT, as a flexible and dynamic workbench, supports multiple approaches to qualitative research:

• It can be used as a complementary tool to ATLAS.ti, or as a tool supporting the whole coding, adjudication, and output-analysis process.

• It supports a solo-coder work mode or a multiple-coder teamwork mode.

• It allows adjudication procedures to be carried out by one expert adjudicator or by a consensus team.

• It helps to blend quantitative contributions into qualitative analysis: researchers can report texts with qualitative interpretation alone, or report high- or low-frequency statistics of coded data as evidence of a theme's central tendency and/or peripheral features.
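The last point, reporting high- or low-frequency statistics of coded data, can be sketched in a few lines. The data layout and names below are hypothetical, purely to illustrate how coded segments might be tallied into a frequency report:

```python
from collections import Counter

# Hypothetical (code, quotation_id) pairs from a coded dataset; the
# layout is illustrative, not CAT's actual export format.
coded_segments = [
    ("economic issues", "q2"), ("economic issues", "q2"),
    ("economic issues", "q1"), ("health policy", "q3"),
    ("health policy", "q4"), ("economic issues", "q5"),
]

# Tally how often each code was applied, as quantitative evidence to
# report alongside the qualitative interpretation.
frequencies = Counter(code for code, _ in coded_segments)
for code, count in frequencies.most_common():
    print(f"{code}: {count}")
```

High counts here point to a theme's central tendency; low counts flag its peripheral features.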

Based on this section's discussion of CAT as a tool to enhance rigor and flexibility in qualitative research, we argue that the CAT system in fact increases 'closeness to data'. Countering concerns that researchers will become distanced from their data through computer use, CAT's convenience and flexibility encourage users to 'play' with their data and to examine it from different angles.

Next steps

Drawing on rich experience implementing qualitative projects and collaborating with clients across disciplines, QDAP has developed a substantial resource base for determining the real needs of analysts. The features within CAT reflect a user-centered design. During CAT's development, special attention has been paid to the effectiveness of its functionality and to the need for an easy-to-use, intuitive interface, which directly influences the efficiency of the workflow and the user experience. We expect that the CAT system will not only extend but effectively alter existing practice. To verify CAT's potential positive impacts and to detect problems in its functionality and interface, evaluations are included in our research plan: the next step will be a usability and impact study. Systematic user feedback will be gathered via a beta-tester web survey in April 2008 and will shape the future development of CAT. At the same time, QDAP continues to extend the capabilities of CAT. Its reliability as software may prove sufficiently robust to merit commercial licensing before the close of 2008.

References

Agar M (1991) The right brain strikes back. In NG Fielding and RM Lee (Eds.), Using Computers in Qualitative Research (pp. 181–193). London: Sage.

Barry CA (1998) Choosing qualitative data analysis software: Atlas.ti and Nudist compared. Sociological Research Online, 3. Retrieved 30 January 2008, from http://www.socresonline.org.uk/socresonline/3/3/4.html.

Bassett R (2004) Qualitative data analysis software: Addressing the debates. Journal of Management Systems, XVI: 33–39.

Berends L and Johnston J (2005) Using multiple coders to enhance qualitative analysis: The case of interviews with consumers of drug treatment. Addiction Research & Theory, 13: 373–382.

Bong SA (2002) Debunking myths in qualitative data analysis. Forum: Qualitative Social Research, 3. Retrieved 30 January 2008, from http://www.qualitative-research.net/fqs-texte/2-02/2-02bong-e.htm.

Bourdon S (2002) The integration of qualitative data analysis software in research strategies: Resistances and possibilities. Forum Qualitative Sozialforschung / Forum: Qualitative Social Research, 3(2). Retrieved 30 January 2008, from www.qualitative-research.net/fqs-texte/2-02/2-02bourdon-e.htm.

Brown D (2002) Going digital and staying qualitative: Some alternative strategies for digitizing the qualitative research process. Forum Qualitative Sozialforschung / Forum: Qualitative Social Research, 3(2). Retrieved 30 January 2008, from http://www.qualitative-research.net/fqs-texte/2-02/2-02brown-e.htm.

Coffey A, Holbrook B and Atkinson P (1996) Qualitative data analysis: Technologies and representations. Sociological Research Online, 1(1). Retrieved 30 January 2008, from http://www.soc.surrey.ac.uk/socresonline/1/1/4.html.

Conrad P and Reinharz S (1984) Computers and qualitative data: Editor's introductory essay. Qualitative Sociology, 7: 3–13.

Dolan A and Ayland C (2001) Analysis on trial. International Journal of Market Research, 43: 377–389.

Drass KA (1980) The Analysis of Qualitative Data: A Computer Program. Urban Life, 9: 332–353.

Fielding NG and Lee RM (1998) Computer Analysis and Qualitative Research: New technology for social research. London: Sage.

Ford K, Oberski I and Higgins S (2000) Computer-aided qualitative analysis of interview data: Some recommendations for collaborative working. The Qualitative Report, 4(3, 4).

Gibbs GR, Friese S and Mangabeira WC (2002) The use of new technology in qualitative research. Introduction to Issue 3(2) of FQS. Forum Qualitative Sozialforschung / Forum: Qualitative Social Research, 3(2). Retrieved 30 January 2008, from http://www.qualitative-research.net/fqs-texte/2-02/2-02hrfsg-e.htm.

Hinchliffe SJ, Crang MA, Reimer SM and Hudson AC (1997) Software for qualitative research: 2. Some thoughts on 'aiding' analysis. Environment and Planning A, 29: 1109–1124.


Kelle U and Laurie H (1995) Computer use in qualitative research and issues of validity. In U Kelle (Ed.), Computer-aided Qualitative Data Analysis: Theory, methods and practice (pp. 19–28). London/Thousand Oaks/New Delhi: Sage.

Kelle U (1997a) Theory building in qualitative research and computer programs for the management of textual data. Sociological Research Online, 2(2).

Kelle U (2004) Computer-assisted qualitative data analysis. In C Seale, G Gobo, JF Gubrium and D Silverman (Eds.), Qualitative Research Practice (pp. 473–489). London: Sage.

King G, Keohane R and Verba S (1994) Designing Social Inquiry: Scientific inference in qualitative research. Princeton, NJ: Princeton University Press.

Lee RM and Fielding NG (1991) Computing for qualitative research: Options, problems and potential. In NG Fielding and RM Lee (Eds.), Using Computers in Qualitative Research (pp. 1–13). London: Sage.

Lewins A and Silver C (2007) Using Software in Qualitative Research: A step-by-step guide. London: Sage.

Lombard M, Snyder-Duch J and Bracken CC (2005) Practical resources for assessing and reporting intercoder reliability in content analysis research projects. Retrieved 17 March 2008, from http://www.temple.edu/sct/mmc/reliability/.

Lonkila M (1995) Grounded theory as an emerging paradigm for computer-assisted qualitative data analysis. In U Kelle (Ed.), Computer-aided Qualitative Data Analysis: Theory, methods and practice (pp. 41–51). London: Sage.

MacMillan K and Koenig T (2004) The wow factor: Preconceptions and expectations for data analysis software in qualitative research. Social Science Computer Review, 22: 179–186.

MacQueen KM, McLellan E, Kay K and Milstein B (1998) Codebook development for team-based qualitative analysis. Cultural Anthropology Methods, 10: 31–36.

Manderson L, Kelaher M and Woelz-Stirling N (2001) Developing qualitative databases for multiple users. Qualitative Health Research, 11: 149–160.

Mangabeira W (1995) Computer assistance, qualitative analysis and model building. In RM Lee (Ed.), Information Technology for the Social Scientist (pp. 129–146). London: UCL Press.

Northey Jr WF (1997) Using QSR NUD*IST to demonstrate confirmability in qualitative research. Family Science Review, 10: 170–179.

Pfaffenberger B (1988) Microcomputer Applications in Qualitative Research. Newbury Park, CA: Sage.

Ragin CC and Becker HS (1989) How the microcomputer is changing our analytical habits. In G Blank, JL McCartney and E Brent (Eds.), New Technology in Sociology (pp. 47–55). New Brunswick, NJ: Transaction Publishers.

Richards L and Richards T (1991a) The transformation of qualitative method: Computational paradigms and research processes. In NG Fielding and RM Lee (Eds.), Using Computers in Qualitative Research (pp. 38–53). London: Sage.

Richards L and Richards TJ (1987) Qualitative data analysis: Can computers do it? Australian and New Zealand Journal of Sociology, 23: 23–35.

Roberts KA and Wilson RW (2002) ICT and the research process: Issues around the compatibility of technology with qualitative data analysis. Forum Qualitative Sozialforschung / Forum: Qualitative Social Research, 3(2). Retrieved 30 January 2008, from http://www.qualitative-research.net/fqs-texte/2-02/2-02robertswilson-e.htm.

Ryan G (1999) Measuring the typicality of text: Using multiple coders for more than just reliability and validity checks. Human Organization, 58: 313–322.

Sandelowski M (1995) On the aesthetics of qualitative research. Image: Journal of Nursing Scholarship, 27: 205–209.

Seidel J (1991) Method and madness in the application of computer technology to qualitative data analysis. In NG Fielding and RM Lee (Eds.), Using Computers in Qualitative Research (pp. 107–116). London: Sage.

Sin CH (2007a) Teamwork involving qualitative data analysis software: Striking a balance between research ideals and pragmatics. Social Science Computer Review. Accessed 13 May 2008 from http://ssc.sagepub.com/cgi/rapidpdf/0894439306289033v1.pdf.

St John W and Johnson P (2000) The pros and cons of data analysis software for qualitative research. Journal of Nursing Scholarship, 32: 393–397.

Tallerico M (1992) Computer technology for qualitative research: Hope and humbug. Journal of Educational Administration, 30: 32–40.

Tesch R (1989) Computer software and qualitative analysis: A reassessment. In G Blank, JL McCartney and E Brent (Eds.), New Technology in Sociology (pp. 141–154). New Brunswick, NJ: Transaction Publishers.

Tesch R (1991a) Introduction. Qualitative Sociology, 14: 225–243.


Thompson R (2002) Reporting the results of computer-assisted analysis of qualitative research data. Forum Qualitative Sozialforschung / Forum: Qualitative Social Research, 3(2). Retrieved 30 January 2008, from http://www.qualitative-research.net/fqs-texte/2-02/2-02thompson-e.htm.

Webb C (1999) Analyzing qualitative data: Computerised and other approaches. Journal of Advanced Nursing, 29: 323–330.

Weitzman EA (1999) Analyzing qualitative data with computer software. Health Services Research: Part 2, 34: 1241–1263.

Welsh E (2002) Dealing with data: Using NVivo in the qualitative data analysis process. Forum Qualitative Sozialforschung / Forum: Qualitative Social Research, 3(2). Retrieved 30 January 2008, from http://www.qualitative-research.net/fqs-texte/2-02/2-02welsh-e.htm.

Wolfe RA, Gephart RP and Johnson TE (1993) Computer-facilitated qualitative data analysis: Potential contributions to management research. Journal of Management, 19: 637–660.

Received 13 March 2008; accepted 14 April 2008.

