Patient-informant concordance of the Structured Clinical

Interview for DSM-IV Axis II Personality Disorders (SCID-II)

and diagnostic abilities of the Personality Diagnostic

Questionnaire-4+ (PDQ-4+) in a non-forensic population of

aggressive men

Name: W.J. Pertijs (Tom)

ANR: 138960

Research supervisor: Dr. P.M.C. Mommersteeg (Paula)

Second assessor: Dr. S.Y.M. Thong (Melissa)

Geestelijke Gezondheidszorg Westelijk Noord-Brabant (GGZ-WNB)

Supervisor: Drs. C.A. van Tilburg (Carola)

Universiteit van Tilburg

Faculteit Sociale Wetenschappen

Departement Medische en Klinische Psychologie

June 2013

Abstract

Concordance of subject’s interview information and informant’s interview information from

partners obtained by the SCID-II and agreement of self-report information obtained by the

PDQ-4+ were examined in aggressive men participating in a non-forensic ambulant group-based

treatment program. It was expected that SCID-II patient-informant concordance would

be particularly low, that PDQ-4+ diagnostic agreement with SCID-II for dimensional trait

scores would be moderate and that PDQ-4+ diagnostic agreement as well as efficiency for

categorical personality disorder would be poor. Pearson correlation coefficients and Kappa

values reflecting dimensional and categorical SCID-II patient-informant concordance

respectively, were generally poor, especially for antisocial and borderline personality

disorder, which were the most common in the sample. Moreover, informants underreported

personality disorder traits of their partners on the SCID-II interview. Intraclass

correlation coefficients and Kappa values reflecting dimensional and categorical PDQ-4+

diagnostic agreement respectively, were also generally poor, again especially for antisocial

and borderline personality disorder. PDQ-4+ yielded many false positive diagnoses

compared to SCID-II, except for antisocial personality disorder diagnoses. Altogether, it can

be concluded that one might as well administer only the SCID-II to the patients in the first

place, although the SCID-II itself turned out to have some notable shortcomings too.

Keywords: SCID-II, PDQ-4+, patient-informant concordance, diagnostic agreement,

diagnostic efficiency, non-forensic population

Introduction

Personality disorders affect treatment outcome (Skodol et al., 2005; Tyrer & Simmons,

2003). However, a systematic review by Mulder (2002) emphasizes that the effect of

personality pathology on treatment outcome appears to depend on study design, since the rate

of personality pathology varies markedly depending on how it is measured. This underlines

the importance of reliable and valid personality disorder assessment in future treatment

outcome research.

Currently, a treatment outcome study is being conducted on an ambulant group-based

treatment program for aggressive men. Research on non-forensic populations of aggressive

men is scarce. Noting the importance of reliable and valid personality disorder assessment in

future treatment outcome research, this preliminary study will examine personality disorder

assessment in this kind of population. Because it is expected that antisocial

personality disorder will be one of the most frequently occurring personality disorders in

populations like ours, the truthfulness of the answers and the willingness to cooperate in this

study are questioned in particular, given that deception is one of the DSM-IV-TR criteria for

antisocial personality disorder (APA, 2000). A lack of truthfulness, resulting in distorted self-

descriptions, may be revealed by informant reports from close relatives (i.e. partners).

Unwillingness to cooperate may be addressed by using a self-report instrument instead of a

clinical interview in order to reduce the amount of time and effort asked from patients.

Patient-informant concordance of SCID-II

In the first part of this study, concordance of subject’s interview information and informant’s

interview information from partners of the patients obtained by the Structured Clinical

Interview for DSM-IV Axis II Personality Disorders (SCID-II) will be examined. The

Structured Clinical Interview for DSM-IV Axis II Personality Disorders (SCID-II) is a widely

used semi-structured diagnostic instrument for assessing all ten DSM-IV personality

disorders, as well as two other personality disorders mentioned in Appendix B (APA, 2000;

First, Spitzer, Gibbon, Williams & Benjamin, 1997). Semi-structured interviews like SCID-II

can be particularly helpful in situations in which the credibility or validity of the assessment

might be questioned, such as in forensic populations, because they provide exhaustive

evaluations (Widiger & Boyd, 2009). For this reason, SCID-II might also be helpful in our

sample. The SCID-II is considered to be the gold standard semi-structured assessment

instrument for personality disorders. Because of its generally good psychometric properties

and because it is the most commonly used in clinical research (Dreessen & Arntz, 1998;

Lobbestael, Leurgans & Arntz, 2011; Weertman, Arntz, Dreessen, Van Velzen &

Vertommen, 2003), SCID-II was chosen for use as the gold standard in this study as well.

No studies to date on patient-informant concordance using SCID-II for DSM-IV could

be found, but some research using SCID-II for DSM-III-R or its precursor, the Structured

Interview for DSM-III Personality (SIDP-III), is available. In psychiatric patients, agreement

of subject-based and informant-based diagnoses was high for any personality disorder or

individual personality disorder clusters, as well as for individual personality disorders

(Schneider et al., 2004). However, only poor to moderate correlations between subject-based

and informant-based diagnoses could be found in other studies, although some of the

correlations were still significant (Bernstein et al., 1997; Dreessen, Hildebrand & Arntz, 1998;

Zimmerman, Pfohl, Coryell, Stangl & Corenthal, 1988). Research using the SCID-II

Questionnaire for DSM-III-R (SCID-II-P) also found few meaningful correlations between

self-ratings and informant ratings in college samples (Ouimette & Klein, 1995; McKeeman &

Erickson, 1997) and psychiatric patients (Modestin & Puhan, 2000). In sum, it can be

concluded that patient-informant concordance of SCID-II diagnoses is generally poor.

However, patient and informant evaluations represent two different assessment approaches,

each contributing some unique information, so complete agreement is not to be expected

(Klonsky, Oltmanns & Turkheimer, 2002; Modestin & Puhan, 2000).

Diagnostic agreement of PDQ-4+

In the second part of this study, agreement of subject’s interview information obtained by the

SCID-II and self-report information obtained by the Personality Diagnostic Questionnaire-4+

(PDQ-4+) will be examined. The PDQ-4+ is a relatively brief true-false self-report inventory

for assessing all ten DSM-IV personality disorders, as well as two other personality disorders

mentioned in Appendix B (APA, 2000; Bagby & Farvolden, 2004). Self-report instruments

like PDQ-4+ can be particularly helpful in situations in which maladaptive personality

functioning might be missed due to false expectations or assumptions (Widiger & Boyd,

2009). For this reason, PDQ-4+ might also be helpful in our sample. Although psychometric

properties are generally poor (Bagby & Farvolden, 2004; Bos et al., 2005; Widiger & Boyd,

2009; Wilberg et al., 2000), PDQ-4+ was chosen for use in this study because it is

one of the most commonly used self-report instruments for personality disorder assessment in

clinical research. In addition, the inclusion of two validity scales makes PDQ-4+

particularly suitable for use in our sample.

Criterion validity research on PDQ-4+ with SCID-II showed that diagnostic agreement

was generally poor (Bos, Van Velzen & Meesters, 2005; Fossati et al., 1998; De Reus, Van

den Berg & Emmelkamp, 2011; Wilberg, Dammen & Friis, 2000). However, agreement was

slight to moderate for some personality disorders when using the Clinical Significance Scale, a

mini-interview belonging to PDQ-4+ and assessing the clinical significance for any of the

personality disorders (Bouvard, Vuachet & Marchant, 2011). With regard to our sample, some

studies among prison populations with high prevalence rates of antisocial personality disorder

must be noted. Agreement was moderate for most personality disorder diagnoses (Abdin et

al., 2011; Davison, Leese & Taylor, 2001). According to Davison et al. (2001), antisocial

personality disorder and borderline personality disorder showed even better agreement than

the others. In another study focused exclusively on antisocial personality disorder in an

offender population, a strong dimensional association was found between the antisocial

personality disorder scale of PDQ-4+ and the antisocial personality disorder module of SCID-

II, while agreement of categorical diagnoses was limited (Guy, Poythress, Douglas, Skeem &

Edens, 2008).

Diagnostic efficiency of PDQ-4+

In the third part of this study, efficiency of categorical personality disorder diagnoses

derived from self-report information obtained by the PDQ-4+ will be examined.

Criterion validity research on PDQ-4+ with SCID-II showed high false-positive rates

and low false-negative rates (Bos et al., 2005; Fossati et al., 1998; De Reus et al., 2011;

Wilberg et al., 2000). Although PDQ-4+ tended to overdiagnose in some studies among

prison populations with high prevalence rates of antisocial personality disorder, the absence

of any personality disorder could be predicted with moderate agreement (Abdin et al., 2011;

Davison et al., 2001). Because PDQ-4+ has poor psychometric properties and tends to

overdiagnose personality disorders, some researchers conclude that PDQ-4+ is unsuitable

even as a screening instrument (Bouvard et al., 2011; Fossati et al., 1998; De Reus et al.,

2011). However, as PDQ-4+ tends to adequately predict the absence of any personality

disorder, others conclude that PDQ-4+ can certainly be used as a screening instrument, at least

to predict the presence or absence of any personality disorder (Abdin et al., 2011; Bos et al.,

2005; Davison et al., 2001; Wilberg et al., 2000).

Hypotheses

The aim of this study is to investigate two personality disorder assessment instruments in a

non-forensic group of aggressive men in order to develop a mode of adequate personality

disorder assessment in such groups. First, it is expected that SCID-II concordance of subject’s

interview information and informant’s interview information from partners of the patients will

be particularly low in this sample, possibly indicating distorted self-descriptions. Second, it is

expected that agreement of subject’s SCID-II interview information and PDQ-4+ self-report

information will be moderate for dimensional trait scores, but poor for categorical diagnoses.

Third, it is expected that efficiency of categorical personality disorder diagnoses derived from

PDQ-4+ self-report information will be poor with high rates of false positives when compared

to categorical personality disorder diagnoses derived from subject’s SCID-II interview

information, indicating that PDQ-4+ may not be suitable as a substitute for SCID-II in

personality disorder assessment.

Method

Participants and procedure

In total, 34 patients and, where applicable, their partners were approached for research participation and

25 patients agreed. Nineteen of these patients had a partner and 12 of those partners

agreed to be approached as an informant for administration of the SCID-II interview, yielding

12 patient-informant couples participating in the study. Two interviewed patients declined to

fill out the PDQ-4+ after the interview, yielding 23 cases whose interview data and self-report

data were both available. All patients were consecutive patients referred to ‘Geestelijke

Gezondheidszorg Westelijk Noord-Brabant’ (GGZ-WNB) outpatient department

‘Klachtgerichte Behandelingen’, located at Roosendaal and Bergen op Zoom, the

Netherlands, between September 2012 and March 2013. This is a generic outpatient clinic

covering a major area of the southern part of the Netherlands. The sample was drawn from

patients participating in an ambulant group-based cognitive behavioral treatment program for

aggressive men. All patients participating in this program had good knowledge of the Dutch

language, presented no clinically important cognitive impairment and did not suffer from any

acute psychotic disorder. The same was true for all informants participating in this study.

Patients and, where applicable, their partners were invited for a 1.5-hour appointment intended

for administering the SCID-II interview and several questionnaires. After a complete

description of the study was provided, written informed consent was obtained from both the

patient and his partner. In the following 1.5 hours, SCID-II was administered to the

patient while the partner waited outside the room. After completion, the patient

was taken to another room to fill in the PDQ-4+, AUDIT, DUDIT and some other

questionnaires for the future treatment outcome study, while SCID-II was administered to the

partner. SCID-II administration was done by a trained and experienced professional

or a trained and supervised trainee.

Demographic variables

Age, marital status, the presence of children at home, education level and job status were

asked from all participating patients and age was asked from all participating partners (i.e.

informants). Additionally, substance abuse was assessed in 56% (n = 14) of all participating

patients. They filled out the Dutch translation of the Alcohol Use Disorders Identification Test

(AUDIT) by Schippers and Broekman (2010) and the Dutch translation of the Drug Use

Disorders Identification Test (DUDIT) by Kraanen (2008), two parallel self-report

instruments screening alcohol-related and drug-related problems, respectively. The AUDIT

contains 10 items and the DUDIT contains 11 items, which provide information on different

aspects of alcohol or drug use. Items are rated on a 3- or 5-point interval scale (Babor,

Higgins-Biddle, Saunders & Monteiro, 2001; Berman, Bergman, Palmstierna & Schlyter,

2003).

Measures

SCID-II. Both patients and their partners (i.e. informants) were administered the Dutch

translation of the Structured Clinical Interview for DSM-IV Axis II Personality Disorders

(SCID-II) by Weertman, Arntz and Kerkhofs (2000). The 134-item SCID-II consists of

twelve personality disorder modules with one or a few items for each diagnostic criterion.

Diagnostic criteria are scored absent, questionable or present by the interviewer. Using SCID-

II, diagnoses can be made either categorically or dimensionally (First et al., 1997). In this

study, SCID-II modules for assessing depressive personality disorder and passive-aggressive

personality disorder were left out. The partners were administered SCID-II in a slightly

different way, since the intention was to obtain informant reports on the patients. Therefore,

questions had to be reworded. It was stressed that the questions concerned the extent to which they

observed the personality disorder traits in question in their partner (i.e. the patient).

PDQ-4+. Participants filled out the Dutch translation of the Personality Diagnostic

Questionnaire-4+ (PDQ-4+) by Akkerhuis, Kupka, Van Groenestijn and Nolen (1996). The

99-item PDQ-4+ consists of twelve personality disorder scales with one item for each

diagnostic criterion. It also includes a 4-item Too Good Scale to assess underreporting and a

2-item Suspect Questionnaire Scale to identify individuals who are lying, responding

randomly or not taking the questionnaire seriously. Items are scored true or false. Using PDQ-

4+, diagnoses can be made either categorically or dimensionally (Bagby & Farvolden, 2004;

Hyler, 1994).

Preliminary explorative analysis

Assumptions for parametric tests and statistics. All dimensional variables were examined

by using histograms, stem-and-leaf plots, normal and detrended normal Q-Q plots and

boxplots to determine normality of distributions, the range of scores and outliers in order to

decide if parametric tests and statistics might be used. Linearity, homoscedasticity, direction

and strength of relationships were assessed by scatterplots to decide if correlations might be

calculated.

Reliability of PDQ-4+ and ability of PDQ-4+ validity scales. Before conducting main

analyses, reliability analysis of the PDQ-4+ was conducted by calculating mean inter-item

correlations for each scale and the ability of the PDQ-4+ validity scales was assessed by

means of linear regression analysis.
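
The mean inter-item correlation mentioned above can be sketched as follows. This is a minimal illustration and not the procedure actually run for this study: the function names, the example answers and the decision to skip item pairs in which one item has no variance are all assumptions.

```python
from itertools import combinations
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation; returns None when either variable is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return None
    return sxy / sqrt(sxx * syy)


def mean_inter_item_correlation(responses):
    """Average the Pearson correlations over all pairs of items.

    `responses` holds one row per respondent and one 0/1 column per item.
    Pairs involving a constant item are skipped (an assumption of this sketch).
    """
    items = list(zip(*responses))  # transpose: one tuple of answers per item
    correlations = [
        r
        for x, y in combinations(items, 2)
        if (r := pearson_r(x, y)) is not None
    ]
    if not correlations:
        raise ValueError("no item pair with non-zero variance")
    return sum(correlations) / len(correlations)


# Hypothetical true/false answers of four respondents on a three-item scale.
example = [
    [1, 1, 0],
    [0, 0, 1],
    [1, 1, 0],
    [0, 0, 1],
]
mic = mean_inter_item_correlation(example)
print(f"mean inter-item correlation = {mic:.3f}")
```

In the hypothetical data the first two items agree perfectly and the third runs opposite to both, so the average over the three pairs is negative.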

Main analysis

Patient-informant concordance SCID-II. Pearson correlation coefficients were used as a

measure for concordance of dimensional trait scores derived from either subject’s SCID-II

interview information or informant’s SCID-II interview information from partners of the

patients, as the experiences of the patients and their partners are not considered to be the same

and thus provide some part of unique information (Klonsky et al., 2002; Modestin & Puhan,

2000). There is no absolute standard for interpreting Pearson correlation coefficients.

According to Hinkle, Wiersema and Jurs (2003), values from .00 to .29 represent negligible

concordance, values between .30 and .49 represent low concordance, values between .50 and

.69 represent moderate concordance, values between .70 and .89 represent high concordance

and values from .90 to 1.00 represent almost perfect concordance. Although many other

proposals do exist, these benchmarks suggested by Hinkle et al. (2003) were adopted in this

study.
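
As an illustration of how these benchmarks are applied, the sketch below computes a Pearson correlation and attaches the corresponding label. The trait counts are hypothetical, and judging the correlation on its absolute size is an assumption of this sketch.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return sxy / sqrt(sxx * syy)


def hinkle_label(r):
    """Benchmark label for a correlation, judged on its absolute size."""
    size = abs(r)
    if size < 0.30:
        return "negligible"
    if size < 0.50:
        return "low"
    if size < 0.70:
        return "moderate"
    if size < 0.90:
        return "high"
    return "almost perfect"


# Hypothetical trait counts for five patient-informant couples.
patient = [4, 2, 5, 1, 3]
informant = [3, 2, 4, 0, 3]
r = pearson_r(patient, informant)
print(f"r = {r:.3f} ({hinkle_label(r)})")
```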

Kappa values (Cohen, 1960) were used as a measure for concordance of categorical

personality disorder diagnoses derived from either subject’s SCID-II interview information or

informant’s SCID-II interview information from partners of the patients, because they correct

for chance agreements on nominal categories. In small samples, when a diagnosis occurs at a

very low base rate, Kappa values do have high variability (Shrout, Spitzer & Fleiss, 1987).

Therefore, Kappa values were calculated only for diagnoses occurring in 5% or more of the

sample using SCID-II administered to the patients themselves as criterion. Additionally,

Kappa values were not calculated if one variable was a constant in the two-way table. As for

Pearson correlation coefficients, there is no absolute standard for interpreting Kappa values.

According to Spitzer and Fleiss (1974), values below 0.5 represent low concordance, those

between 0.5 and 0.7 represent moderate concordance and those greater than 0.7 represent high

concordance. Although many other proposals do exist, these benchmarks suggested by Spitzer

and Fleiss (1974) were adopted in this study.
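
For a two-way table of two binary diagnoses, Cohen's Kappa and the benchmarks above can be sketched as follows. The cell counts are hypothetical and the function names are illustrative. Returning `None` for a constant variable mirrors the rule described above; the treatment of values lying exactly on a benchmark boundary is an assumption.

```python
def cohen_kappa(both_pos, only_first, only_second, both_neg):
    """Cohen's kappa for a two-way table of two binary ratings.

    Returns None when either rating is constant, in which case kappa
    is not calculated.
    """
    n = both_pos + only_first + only_second + both_neg
    first_pos = both_pos + only_first
    second_pos = both_pos + only_second
    if n == 0 or first_pos in (0, n) or second_pos in (0, n):
        return None
    observed = (both_pos + both_neg) / n
    # Agreement expected by chance, from the marginal totals of both ratings.
    expected = (
        first_pos * second_pos + (n - first_pos) * (n - second_pos)
    ) / n**2
    return (observed - expected) / (1 - expected)


def spitzer_fleiss_label(kappa):
    """Label kappa with the Spitzer and Fleiss (1974) benchmarks."""
    if kappa < 0.5:
        return "low"
    if kappa <= 0.7:
        return "moderate"
    return "high"


# Hypothetical counts for 11 couples: 3 diagnoses made by both sources,
# 1 by the patient interview only, 1 by the informant interview only, 6 by neither.
kappa = cohen_kappa(3, 1, 1, 6)
print(f"kappa = {kappa:.3f} ({spitzer_fleiss_label(kappa)})")
```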

Diagnostic agreement PDQ-4+. Type A intraclass correlation coefficients (ICC) using an

absolute agreement definition (Cronbach, Gleser, Nanda & Rajaratnam, 1971; McGraw &

Wong, 1996; Shrout & Fleiss, 1979) were used as a measure for diagnostic agreement of

dimensional trait scores derived from either subject’s SCID-II interview information or PDQ-

4+ self-report information, as systematic variability due to the measures is considered

relevant, since it is expected that measures with SCID-II and PDQ-4+ provide the same

information. These intraclass correlations differ from Pearson correlations in that mean

differences between raters are classified as error, resulting in lower correlations (Cronbach et

al., 1971). The earlier mentioned benchmarks suggested by Hinkle et al. (2003) were adopted

for interpreting intraclass correlation coefficients as well.
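
The difference with a Pearson correlation can be made concrete with a small sketch. It assumes the single-measure form of the two-way absolute-agreement coefficient, written ICC(A,1) by McGraw and Wong (1996); the text above does not state whether single or average measures were used, and the scores are hypothetical.

```python
def icc_a1(scores):
    """Two-way, absolute-agreement, single-measure ICC, i.e. ICC(A,1).

    `scores` holds one row per subject and one column per measure.
    """
    n, k = len(scores), len(scores[0])
    if n < 2 or k < 2:
        raise ValueError("need at least two subjects and two measures")
    grand = sum(map(sum, scores)) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(col) / n for col in zip(*scores)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    # The column (measure) variance sits in the denominator, so a systematic
    # mean difference between the measures lowers the coefficient.
    denominator = ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    if denominator == 0:
        raise ValueError("ICC is undefined when all scores are identical")
    return (ms_rows - ms_error) / denominator


# Hypothetical trait counts: the second measure is always two points higher.
shifted = [[1, 3], [2, 4], [3, 5]]
identical = [[1, 1], [2, 2], [3, 3]]
print(f"ICC with a constant shift: {icc_a1(shifted):.3f}")
print(f"ICC with identical scores: {icc_a1(identical):.3f}")
```

The two columns of `shifted` are perfectly linearly related, so their Pearson correlation would be 1, while the absolute-agreement coefficient is clearly lower because the constant two-point difference counts as disagreement.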

Kappa values (Cohen, 1960) were used as a measure for diagnostic agreement of

categorical personality disorder diagnoses derived from either subject’s SCID-II interview

information or PDQ-4+ self-report information, for the same reason as they were chosen as a

measure for concordance of categorical personality disorder diagnoses.

Diagnostic efficiency PDQ-4+. Diagnostic efficiency was defined by sensitivity, specificity,

positive predictive power (PPP) and negative predictive power (NPP). Sensitivity refers to the

proportion of positives according to the standard (SCID-II) who are correctly identified as

such by the instrument in question (PDQ-4+). It represents the probability that someone with

a particular SCID-II diagnosis will have the same PDQ-4+ diagnosis too. The specificity

refers to the proportion of negatives according to the standard (SCID-II) who are correctly

identified as such by the instrument in question (PDQ-4+). It represents the probability that

someone without a SCID-II diagnosis will not have a PDQ-4+ diagnosis either. Positive

predictive power refers to the proportion of positives according to the instrument in question

(PDQ-4+) who are correctly identified as such. It represents the probability that someone with

a particular PDQ-4+ diagnosis will have the same SCID-II diagnosis too. Negative predictive

power refers to the proportion of negatives according to the instrument in question (PDQ-4+)

who are correctly identified as such. It represents the probability that someone without a

PDQ-4+ diagnosis will not have a SCID-II diagnosis either. As with Kappa values, the

conditional probability values reflecting sensitivity, specificity, positive predictive power

(PPP) and negative predictive power (NPP) were not calculated for diagnoses occurring in

less than 5% of the sample using SCID-II as criterion or if one variable was a constant in the

two-way table. In line with previous research (e.g. Bouvard et al., 2011), conditional

probability values ranging from .00 to .29 were considered to be low, values ranging from

.30 to .69 were considered moderate and values ranging from .70 to 1.00 were considered to

be high in this study.
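
The four conditional probabilities defined above follow directly from the cells of the two-way table, with SCID-II as the criterion. The sketch below uses hypothetical counts; the function and key names are illustrative only.

```python
def diagnostic_efficiency(tp, fp, fn, tn):
    """Sensitivity, specificity, PPP and NPP from a two-way table.

    tp: diagnosis on both SCID-II and PDQ-4+
    fp: diagnosis on PDQ-4+ only
    fn: diagnosis on SCID-II only
    tn: diagnosis on neither instrument
    A value is None when its denominator is zero.
    """

    def ratio(numerator, denominator):
        return numerator / denominator if denominator else None

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "PPP": ratio(tp, tp + fp),
        "NPP": ratio(tn, tn + fn),
    }


def efficiency_label(value):
    """Label a conditional probability with the cut-offs adopted above."""
    if value < 0.30:
        return "low"
    if value < 0.70:
        return "moderate"
    return "high"


# Hypothetical counts for 23 cases.
result = diagnostic_efficiency(tp=4, fp=6, fn=2, tn=11)
for name, value in result.items():
    print(f"{name}: {value:.3f} ({efficiency_label(value)})")
```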

Post-hoc analysis of mean differences between sources of information

After conducting the main analysis, paired-samples t-tests were conducted to evaluate the

statistical significance for observed mean differences between subject’s interview,

informant’s interview and self-report information. The eta squared statistic was used as a

measure of effect size. For interpreting eta squared statistics, benchmarks suggested by Cohen

(1988) were adopted. According to Cohen (1988), a value of .01 indicates a small effect, a

value of .06 indicates a moderate effect and a value of .14 indicates a large effect.
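
The text does not state how eta squared was computed. A formula commonly used for paired-samples t-tests is eta squared = t² / (t² + N − 1), with N the number of pairs; the sketch below assumes that formula, and the function names are illustrative. Applied to the rounded t-values reported in the Results, it gives values close to the reported effect sizes.

```python
def eta_squared_paired(t, n_pairs):
    """Eta squared for a paired-samples t-test: t^2 / (t^2 + N - 1)."""
    if n_pairs < 2:
        raise ValueError("need at least two pairs")
    return t**2 / (t**2 + n_pairs - 1)


def cohen_label(eta_squared):
    """Label eta squared with the Cohen (1988) benchmarks quoted above."""
    if eta_squared >= 0.14:
        return "large"
    if eta_squared >= 0.06:
        return "moderate"
    if eta_squared >= 0.01:
        return "small"
    return "below small"


# t(10) = 2.12 corresponds to 11 pairs, t(22) = -4.53 to 23 pairs.
for t, n in [(2.12, 11), (-4.53, 23)]:
    value = eta_squared_paired(t, n)
    print(f"t = {t}, N = {n}: eta squared = {value:.3f} ({cohen_label(value)})")
```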

Results

Sample characteristics

The mean age of the included patients was 40.0 years (SD = 9.5) and 42.5 years (SD = 9.5) for

the included partners. At entry into the study, 16% (n = 4) of the participants were single,

40% (n = 10) were in a relationship without being married, 36% (n = 9) were married and 8%

(n = 2) were separated or divorced. Of all included patients, 56% (n = 14) had children living

at home at the time. Most patients had a lower (40%, n = 10) or middle education (40%, n =

10), while the remaining part (20%, n = 5) only had primary education or had no education at

all. In total, 60% (n = 15) of all included patients had a job and were currently working at the

time of the study.

Substance abuse related to either alcohol (28.6%, n = 4), other substances (42.9%, n =

6) or both alcohol and other substances (28.6%, n = 4) was present in 50% (n = 7) of all 14

cases that filled out the AUDIT and DUDIT. The mean dimensional trait scores of personality

disorders and the prevalence of categorical personality disorder diagnoses are presented in

Table 1. Regarding SCID-II diagnoses derived from subject-based information as criterion,

not only antisocial personality disorder (48%, n = 12), but also borderline personality disorder

(36%, n = 9) turned out to be common in this sample. However, schizoid, schizotypal,

histrionic and narcissistic personality disorder were not diagnosed in this sample at all, again

regarding SCID-II diagnoses derived from subject-based information as criterion. Mean total

of reported personality disorder traits on SCID-II as well as mean total of categorical

personality disorder diagnoses on SCID-II were higher for participants aged between 31 and

40 years or aged 41 years or older, as well as for participants with primary education only or

no education at all. Unemployed participants and employed participants not working at the

time of the study also reported more personality disorder traits on average and had more

personality disorder diagnoses on average, so did participants in whom substance abuse was

present.

Preliminary explorative analysis

Assumptions for parametric tests and statistics. Most dimensional variables were

negatively skewed. The range was too small to calculate correlations for schizoid,

schizotypal, histrionic and narcissistic personality disorder trait scores. Linearity and

homoscedasticity were sufficient to calculate correlations for the remaining trait scores.

Reliability of PDQ-4+ and ability of PDQ-4+ validity scales. Mean inter-item correlations

for each PDQ-4+ scale are presented in Table 2. Internal consistency of the Dutch translation

of PDQ-4+ was poor for most scales. In total, 7 cases (30.4%) had a positive score on either

one or both validity scales. Multiple linear regression was used to assess the ability of the

PDQ-4+ validity scales to predict the number of items endorsed. Both scales were entered, R²

= .683, F(2, 19) = 8.291, p = .003. Positive scores on the Too Good Scale were significantly

related to a lower number of personality disorder criteria endorsed, b = -19.128, t(21) =

-3.123, p = .006. However, positive scores on the Suspect Questionnaire Scale were

significantly related to a higher number of personality disorder criteria endorsed, b = 17.538,

t(21) = 2.864, p = .010. Nevertheless, no cases were excluded in further analyses, yielding

analyses of true agreement without having sorted out any particular cases.

Main analysis

Patient-informant concordance SCID-II. One couple was excluded from the analyses because they were

in the process of divorcing at the time of the interviews and because their reports turned out to be strikingly

different, yielding 11 couples whose reports have been analyzed. Pearson correlation

coefficients reflecting concordance of dimensional trait scores and Kappa values reflecting

concordance of categorical personality disorder diagnoses were calculated for those couples

and are presented in Table 3. Pearson correlations for schizoid, schizotypal, histrionic and

narcissistic trait scores were not calculated, because the range was too small to calculate

correlations for those traits. Kappa values for those personality disorder diagnoses were not

calculated either, because the base rates were below 5%. Concordance was high for avoidant

trait scores (r = .709, p = .014) and moderate for borderline (r = .589, p = .057), obsessive-

compulsive (r = .585, p = .059) and cluster C trait scores (r = .657, p = .028), although only the

results for avoidant and cluster C trait scores were significant at the 5% significance level.

In addition, concordance was negligible for antisocial trait scores (r = .292, p = .384) and cluster

A trait scores, as well as for the total number of personality disorder traits (r = .285, p = .396).

Moreover, concordance of categorical diagnoses was low for paranoid (κ = -.222), antisocial

(κ = .353), borderline (κ = .377) and obsessive-compulsive (κ = .298) personality disorder.

The only personality disorder that showed better agreement was avoidant personality disorder

(κ = .542), which showed moderate agreement. Concordance of the total number of

categorical personality disorder diagnoses was also low (r = .489, p = .127).

Diagnostic agreement PDQ-4+. Intraclass correlation coefficients reflecting agreement of

dimensional trait scores and Kappa values reflecting agreement of categorical personality

disorder diagnoses were calculated for the 23 cases with both interview and self-report data and are presented in Table 4. Intraclass

correlations for schizoid, schizotypal, histrionic and narcissistic trait scores were not

calculated, because the range was too small to calculate correlations for those traits. Kappa

values for those personality disorder diagnoses were not calculated either, because the base

rates were below 5%. Agreement was moderate for dependent trait scores (ICC = .608, p <

.001), as well as the total number of C-criteria of antisocial personality disorder (ICC = .674,

p < .001). Both rates were significant at the 1% significance level. Agreement rates were low

or negligible for all other scales, including antisocial (ICC = .119, p = .292), which had the

lowest agreement, borderline (ICC = .328, p = .055) and the total number of personality

disorder traits. However, agreement rates for avoidant (ICC = .485, p = .001), cluster B (ICC

= .312, p = .032) and cluster C trait scores (ICC = .359, p = .016), as well as for the total

number of personality disorder traits (ICC = .224, p = .048) were significant at the 5%

significance level. For categorical diagnoses, agreement was low for all personality

disorders except for dependent personality disorder (κ = .623), which was moderate. The

lowest agreement rates were found for antisocial (κ = .094) and borderline personality

disorder (κ = .087), as well as for paranoid personality disorder (κ = -.039). Besides,

agreement of the total number of categorical personality disorder diagnoses was negligible

(ICC = .221, p = .054).

Diagnostic efficiency PDQ-4+. Diagnostic efficiency values of categorical personality

disorder diagnoses are also presented in Table 4. Values for schizoid, schizotypal, histrionic

and narcissistic personality disorder were not calculated, because the base rates for these

personality disorders were below 5%. Sensitivity and negative predictive power were high for

avoidant (sensitivity = 1.00, NPP = 1.00), dependent (sensitivity = 1.00, NPP = 1.00) and

obsessive-compulsive personality disorder (sensitivity = .833, NPP = .900). Negative

predictive power was also high for paranoid personality disorder (NPP = .750). However,

most diagnostic efficiency values, especially specificity and positive predictive power, were

only moderate in most cases. For antisocial personality disorder, in contrast, sensitivity and

negative predictive power were slightly lower, in favor of specificity and positive predictive

power, respectively.

Post-hoc analysis of mean differences between sources of information

Examining Table 1, informants seem to underreport personality disorder traits of their

partners on the SCID-II interview, compared to the patients themselves. To evaluate this

presumption, paired-samples t-tests were conducted on the total number of traits and the

number of categorical personality disorders derived from the subject’s interview on one hand

and the total number of traits and the number of categorical personality disorders derived

from the informant’s interview on the other hand. The results from these tests are presented in

Table 5. For dimensional trait scores, the total number derived from the informant’s interview

(M = 15.2, SD = 4.27) was significantly lower than the number derived from the subject’s

interview (M = 19.6, SD = 6.67), t (10) = 2.12, p = .030 (one-tailed). The eta squared statistic

(.313) indicated a large effect size. Also for categorical diagnoses, the total number derived

from the informant’s interview (M = 1.27, SD = .647) was significantly lower than the number

derived from the subject’s interview (M = 2.00, SD = 1.27), t (10) = 2.19, p = .027 (one-

tailed). Again, the eta squared statistic (.323) indicated a large effect size.
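
The effect sizes above appear consistent with the formula commonly used for paired-samples t-tests, eta squared = t^2 / (t^2 + df). The sketch below assumes that formula, together with the commonly cited Cohen (1988) cut-offs of .01, .06 and .14; neither is stated explicitly in the text, and the function names are hypothetical. Small deviations from the reported values are to be expected, because the t values are rounded.

```python
def eta_squared(t, df):
    """Eta squared for a paired-samples t-test: t^2 / (t^2 + df)."""
    return t ** 2 / (t ** 2 + df)


def cohen_label(eta_sq):
    """Label an effect size using the assumed Cohen (1988) cut-offs."""
    if eta_sq >= 0.14:
        return "large"
    if eta_sq >= 0.06:
        return "moderate"
    if eta_sq >= 0.01:
        return "small"
    return "negligible"


# t values and degrees of freedom as reported in the paragraph above.
for label, t, df in [("traits", 2.12, 10), ("diagnoses", 2.19, 10)]:
    value = eta_squared(t, df)
    print(f"{label}: eta squared = {value:.3f} ({cohen_label(value)})")
```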

Further examination of Table 1 shows that patients seem to overreport personality

disorder traits on the PDQ-4+ self-report, compared to their reports on the SCID-II interview.

To evaluate this presumption, paired-samples t-tests were conducted on the total number of

traits and the number of categorical personality disorders derived from the SCID-II subject’s

interview on the one hand and the total number of traits and the number of categorical personality

disorders derived from the PDQ-4+ self-report on the other hand. The results for these tests

are also presented in Table 5. For dimensional trait scores, the total number derived from the

PDQ-4+ self-report (M = 31.0, SD = 14.0) was significantly higher than the number derived

from the SCID-II subject’s interview (M = 18.6, SD = 8.30), t (22) = -4.53, p < .001 (one-

tailed). The eta squared statistic (.483) indicated a large effect size. Also for categorical

diagnoses, the total number derived from the PDQ-4+ self-report (M = 1.61, SD = 1.41) was

significantly higher than the number derived from the SCID-II subject’s interview (M = 2.00,

SD = 1.27), t (22) = -4.36, p < .001 (one-tailed). Again, the eta squared statistic (.464)

indicated a large effect size.

Table 1 further suggests that antisocial personality disorder is the only

exception to the above trend: although patients overreport personality disorder traits in general on the

PDQ-4+ self-report, they seem to underreport antisocial personality disorder traits on the

PDQ-4+. To evaluate this presumption, paired-samples t-tests were conducted on

the antisocial trait score and the total number of C-criteria derived from the SCID-II subject’s

interview on the one hand and the antisocial trait score and the total number of C-criteria derived

from the PDQ-4+ self-report on the other hand. Like those of the previous t-tests, the

results of these tests are presented in Table 5. For antisocial trait scores, the total number

derived from the PDQ-4+ self-report (M = 2.57, SD = 1.67) was not significantly lower than

the number derived from the SCID-II subject’s interview (M = 2.91, SD = 1.65), t (22) = .756,

p = .229 (one-tailed). Likewise, the eta squared statistic (.025) indicated a small magnitude of

mean difference. Also for C-criteria, the total number derived from the PDQ-4+ self-report

(M = 3.17, SD = 3.01) was not significantly lower than the number derived from the SCID-II

subject’s interview (M = 3.96, SD = 3.21), t (22) = 1.52, p = .071 (one-tailed). However, the

eta squared statistic (.655) indicated a moderate magnitude of mean difference.

Discussion

First, it was expected that SCID-II patient-informant concordance would be particularly low

in this sample, possibly indicating distorted self-descriptions. Indeed, concordance was poor

overall, especially for antisocial trait scores and diagnoses. Concordance of the total number

of traits and the total number of diagnoses was also poor. In general, the patients themselves

reported more personality disorder traits than their partners, so it cannot be concluded that

the patients participating in the aggression treatment program inherently provided distorted

(i.e. dishonest) self-descriptions.

Second, it was expected that PDQ-4+ diagnostic agreement with SCID-II would be

moderate for dimensional trait scores, but poor for categorical diagnoses. Agreement was

modest for dependent trait scores and diagnoses, as well as for the total number of antisocial

C-criteria. Agreement of all other dimensional scales, including antisocial and borderline

personality disorder scales, was poor, as was agreement of the total number of personality

disorder traits. Agreement rates for all categorical diagnoses but dependent personality

disorder, especially for antisocial, borderline and paranoid personality disorder, were also

poor, as was agreement of the total number of diagnoses. In sum, PDQ-4+ agreement with

SCID-II for most scales including antisocial and borderline trait scores was poor and

agreement rates for both antisocial and borderline personality disorder diagnoses were among

the lowest of all, while these two personality disorders were the most prevalent in this sample

when SCID-II diagnoses derived from subject-based information are taken as the criterion.

Third, it was expected that PDQ-4+ diagnostic efficiency would be poor with high

rates of false positives, indicating that PDQ-4+ might not be suitable as a substitute for the SCID-II

in personality disorder assessment. Indeed, PDQ-4+ yielded a high rate of false positive

diagnoses compared to the patient’s SCID-II interview, except for antisocial personality

disorder. Therefore, as expected, it can be concluded that PDQ-4+ is not suited at all as a substitute for a clinical

interview in this sample. Nor is it suitable as merely

a screening tool in this sample, because sensitivity and negative predictive power were

only moderate for antisocial personality disorder. This is a substantial drawback for the use of

PDQ-4+ as a screening tool for ruling out any personality disorder, since antisocial

personality disorder turned out to be the most common disorder in this sample when

SCID-II diagnoses derived from subject-based information are taken as the criterion.

The current findings from the preliminary explorative analysis on the reliability of

PDQ-4+ and ability of the PDQ-4+ validity scales are in line with previous research. Internal

consistency of the Dutch translation of PDQ-4+ was poor for most scales in this study, in

accordance with previous results presented by Bos et al. (2005) and Wilberg et al. (2000). The

Too Good Scale was related to underreporting and the Suspect Questionnaire Scale was

related to overreporting in this study, in accordance with previous results presented by Wilberg

et al. (2000) and De Reus et al. (2011).

In general, the current findings on SCID-II patient-informant concordance of this

study are also in line with previous research (Dreessen et al., 1998). Kappa values were

somewhat higher compared to previous findings by Zimmerman et al. (1988). However,

compared to findings by Schneider et al. (2004), the concordance rates found in this study

were very low. Compared to studies using the SCID-II Questionnaire for DSM-III-R (SCID-

II-P), all Pearson correlations except for antisocial trait scores were better (McKeeman &

Erickson, 1997), as were Kappa values except for antisocial, borderline and obsessive-compulsive

personality disorder (Modestin & Puhan, 2000).
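
To make the categorical concordance index concrete, the following minimal sketch computes Cohen's kappa from a 2x2 cross-tabulation of patient-based and informant-based diagnoses. The function name is hypothetical, and the cell counts are assumptions: the cross-tabulations are not reported, so the counts were chosen to be consistent with the prevalences in Table 3 (2 of 11 in both sources for paranoid, 3 of 11 in both sources for avoidant personality disorder).

```python
def cohens_kappa(both_pos, patient_only, informant_only, both_neg):
    """Cohen's kappa for two binary ratings of the same cases."""
    n = both_pos + patient_only + informant_only + both_neg
    observed = (both_pos + both_neg) / n
    patient_rate = (both_pos + patient_only) / n
    informant_rate = (both_pos + informant_only) / n
    # Agreement expected by chance, given each source's base rate.
    expected = (patient_rate * informant_rate
                + (1 - patient_rate) * (1 - informant_rate))
    if expected == 1:
        return None  # kappa is undefined when neither source varies
    return (observed - expected) / (1 - expected)


# Assumed counts: no case diagnosed as paranoid by both sources.
print(round(cohens_kappa(0, 2, 2, 7), 3))  # -0.222
# Assumed counts: two cases diagnosed as avoidant by both sources.
print(round(cohens_kappa(2, 1, 1, 7), 3))  # 0.542
```

Under these assumed counts the sketch reproduces the Kappa values listed in Table 3 for paranoid and avoidant personality disorder. It also shows why no Kappa can be computed for a disorder that neither source diagnosed: chance agreement is then perfect and the index is undefined.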

Some current findings on PDQ-4+ diagnostic agreement are not in line with previous

research, however. First, agreement was poor for most categorical personality disorder

diagnoses in this study, while Abdin et al. (2011) as well as Davison et al. (2001) found

moderate agreement for most personality disorder diagnoses even in prison populations.

Second, contrary to the findings of Davison et al. (2001), antisocial and borderline personality

disorder diagnoses showed the worst agreement of all in this study, instead of better agreement.

Third, agreement of antisocial trait scores was particularly low in this study, whereas findings

by Guy et al. (2008) indicate a strong dimensional association of antisocial traits.

With regard to PDQ-4+ diagnostic efficiency, the current finding that PDQ-4+ yielded

high false-positive rates and low false-negative rates of categorical personality disorder

diagnoses compared to the patient’s SCID-II interview information is in accordance with previous

findings (Bos et al., 2005; Fossati et al., 1998; De Reus et al., 2011; Wilberg et al., 2000).

However, in previous studies in which diagnostic efficiency values were calculated,

sensitivity and negative predictive power were not lower for antisocial personality disorder

(Bos et al., 2005; Fossati et al., 1998).

Although most findings are supported by previous research, two remarkable findings

must be noted. First, it is remarkable that for antisocial personality disorder and to a lesser

extent for borderline personality disorder, SCID-II patient-informant concordance seemed to

be generally worse compared to previous research. Second, as for patient-informant

concordance, it is also remarkable that PDQ-4+ agreement for antisocial personality disorder

and borderline personality disorder seemed to be generally worse compared to previous

research. These are important findings, since antisocial personality disorder and borderline

personality disorder were the most prevalent in this sample, as was mentioned earlier.

Some explanations for the first finding, the one on SCID-II patient-informant

concordance, can be given. One possible explanation lies in the way the SCID-II interviews

have been administered, since SCID-II still relies in part on the subjective interpretation of the

interviewer (First et al., 1997). A second explanation lies in the informant

reports provided by the partners. Overall, informants (i.e. the partners) reported fewer

personality disorder traits than the patients in the current study. This finding is in concordance

with previous findings of studies comparing personality disorder diagnoses derived from

patient’s and informant’s interview information (Dreessen et al., 1998; McKeeman &

Erickson, 1997). In another study, this trend was found for most but not all personality

disorders, however (Schneider et al., 2004). Besides, in some other studies, the trend was not

found at all (Bernstein et al., 1997) or even in a reversed direction (Modestin & Puhan, 2000;

Zimmerman et al., 1988).

Some explanations can be given for the current finding that informants (i.e. the

partners) reported fewer personality disorder traits than the patients. First of all, the willingness

to report personality disorder traits depends on the type of informant (Modestin & Puhan,

2000). Some partners in this study stood aloof, seeing their husband’s mental health problems

as something they had nothing to do with. Second, most of the patients themselves

were highly motivated to participate in the treatment program and willing to disclose

themselves. A third explanation for the current findings is cognitive dissonance, which refers

to the unpleasant state of psychological arousal resulting from an inconsistency within one’s

important attitudes, beliefs or behaviors (Festinger, 1957; Kenrick, Neuberg & Cialdini,

2010). In the resulting efforts to reduce it, women cope with physical and emotional abuse by

using cognitive strategies that help them perceive their partner in a more positive way while

staying in the abusive relationship (Herbert, Silver & Ellard, 1991). A fourth and final

explanation is traumatic bonding, which relates to strong emotional ties that develop between

two persons where one person intermittently harasses, beats, threatens, abuses, or intimidates

the other (Dutton & Painter, 1981). The last two explanations are only partial, however, since

not all relationships involved in this study had abusive characteristics.

Also for the second finding, the one on PDQ-4+ agreement, some explanations can be

given. It is difficult to explain these results, especially because better agreement rates were

found even in forensic populations (Abdin et al., 2011; Davison et al., 2001). One explanation

lies in the way the PDQ-4+ is approached by the patients, as shown by the substantial number

of patients who scored positive on one or both PDQ-4+ validity scales. The Too Good Scale was

related to underreporting and the Suspect Questionnaire Scale was related to overreporting, in

accordance with previous results presented by Wilberg et al. (2000) and De Reus et al. (2011). A

second explanation lies in the self-reports on the PDQ-4+. In line with previous research by

Bos et al. (2005), Fossati et al. (1998), De Reus et al. (2011) and Wilberg et al. (2000),

patients overreported personality disorder traits. However, they underreported antisocial traits

on the PDQ-4+ compared to SCID-II, in contrast to previous results (Bos et al., 2005; Fossati

et al., 1998; De Reus et al., 2011; Wilberg et al., 2000).

Some explanations can be given for the current finding that patients reported more

personality disorder traits on the PDQ-4+, except for antisocial traits. First, self-report

instruments do not provide a valid measure of personality disorder severity because they do

not establish the maladaptivity, distress or pervasiveness of each symptom, often resulting in

overdiagnosis of personality disorders (Widiger & Boyd, 2009). Second, the PDQ-4+ has a

dichotomous scale, while the SCID-II has a 3-point Likert scale. Thus, uncertainty about an

item might have resulted in a ‘questionable’ score on the SCID-II, rescored as ‘absent’ later

on, while it might have resulted in a ‘true’ score on the PDQ-4+, rescored as ‘present’ later

on. This difference in scoring procedures also could have resulted in overdiagnosis of

personality disorders. The finding that patients reported fewer antisocial traits on the PDQ-4+

might be due to one of the characteristics of antisocial personality disorder itself, which is

mentioned in the DSM-IV-TR (APA, 2000) and was mentioned earlier in this study:

deception. Thus, underreporting of antisocial traits could be a manifestation of the antisocial

traits themselves (APA, 2000).
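
The effect of the different scoring procedures described above can be sketched as follows. The response labels 'questionable', 'absent', 'true' and 'present' follow the text; the label 'false', the function name and the item responses are hypothetical and serve only as an illustration.

```python
# Rescoring rules: a 'questionable' SCID-II rating counts as absent,
# whereas any 'true' PDQ-4+ answer counts as present.
SCID_TO_BINARY = {"absent": 0, "questionable": 0, "present": 1}
PDQ_TO_BINARY = {"false": 0, "true": 1}


def trait_count(responses, mapping):
    """Count the traits scored as present after dichotomous rescoring."""
    return sum(mapping[response] for response in responses)


# Hypothetical responses of one patient to the same five criteria.
# On the two uncertain items the interviewer rates 'questionable',
# while the patient endorses the self-report item as 'true'.
scid_items = ["present", "questionable", "questionable", "absent", "present"]
pdq_items = ["true", "true", "true", "false", "true"]

print(trait_count(scid_items, SCID_TO_BINARY))  # 2
print(trait_count(pdq_items, PDQ_TO_BINARY))    # 4
```

In this illustration the same uncertainty about two criteria lowers the SCID-II trait count but raises the PDQ-4+ trait count, which is one way in which a dichotomous self-report could yield more traits than the interview.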

Two important comments, the first one about the current results on SCID-II patient-

informant concordance and the second one about the current results on PDQ-4+ diagnostic

agreement, must be made. First, the subjective view of the patient and the ‘pseudo-objective’

view of the informant (i.e. the partner in this study) reflect two different assessment

approaches to the personality: the more experiential on the one hand, the more observational

on the other hand. Therefore, absolute SCID-II patient-informant concordance is not to be

expected. Second, as mentioned before, the PDQ-4+ does not provide a valid measure of severity,

has a different scoring procedure and is susceptible to distortion, especially for antisocial

traits. Therefore, absolute PDQ-4+ diagnostic agreement with SCID-II is not to be expected

either.

Previous findings in various clinical and non-clinical populations indicate that both

patient and informant reports can make unique contributions to the assessment of personality

disorders and provide strong support for using both patient and informant reports in the

assessment of personality disorders (Klonsky et al., 2002; Zimmerman, 1994). In addition,

self-report instruments are currently receiving significant research attention, because they can

also make valuable contributions to the assessment of personality disorders (Widiger & Boyd,

2009). Although neither absolute SCID-II patient-informant concordance nor absolute PDQ-

4+ diagnostic agreement is to be expected, SCID-II patient-informant concordance as well as

PDQ-4+ diagnostic agreement turned out to be exceptionally low in this non-forensic

population of aggressive men for several reasons discussed earlier. Therefore, it seems that

informant reports and self-report instruments do not contribute to a more reliable and valid

personality disorder assessment in this non-forensic population of aggressive men.

Apart from the low contribution of informant reports and self-report instruments, it

may be questioned if even the SCID-II interview itself is suitable for personality disorder

assessment in this non-forensic population of aggressive men. It was expected that the

prevalence of only antisocial personality disorder would be high in this sample. However,

with SCID-II diagnoses derived from subject-based information as the criterion, the

prevalence of most other personality disorders, especially borderline personality disorder, was

unexpectedly high, even when compared to forensic populations (Abdin et al., 2011; Davison,

Leese & Taylor, 2001; Ullrich & Marneros, 2004). Therefore, it is likely that personality

disorders were overdiagnosed in this study, as well as in other studies that used SCID-II in the

same way. One possible explanation for overdiagnosis is that personality disorder diagnoses

cannot be made solely on the basis of a short structured clinical interview, as the criteria for

personality disorders in particular require much more inference on the part of the observer

(APA, 2000; Zimmerman, 1994). In addition, the DSM-IV lists general diagnostic criteria for

a personality disorder, which must be met in addition to the specific criteria for a particular

named personality disorder (APA, 2000). These criteria were not explicitly examined in the

diagnostic process, however, which could also have contributed to overdiagnosis of

personality disorders in this study.

This study has several notable strengths, including the concomitant use of interviews

and a self-report instrument to assess personality disorders, independent evaluations from

both patients and informants and the use of both categorical and dimensional approaches in

assessing personality disorders. However, this study also has some shortcomings which

should be noted. First, the sample size in this study was small, raising the possibility of Type

II errors. Second, a large number of analyses were conducted, increasing the risk of Type I

errors. Third, the Clinical Significance Scale, a short interview of the PDQ-4+, which assesses

whether reaching the threshold of a specific personality disorder is also clinically significant, was

omitted in this study. In previous research, the use of the Clinical Significance Scale was

found to improve the diagnostic agreement and diagnostic efficiency between the interview

and the self-report, although indices are still modest to moderate using this scale (Bouvard et

al., 2011; De Reus et al., 2011). However, the scale was omitted in this study because it does not

seem to enhance the intended time-saving effect of administering the PDQ-4+ versus

administering the SCID-II. A fourth and final limitation of this study is the lack of

agreement on a gold standard assessment instrument for personality disorders

(Zimmerman, 1994). This limits the generalizability of the current findings.

In summary, the present findings indicate that SCID-II informant interviews do not

have much additional value to the SCID-II interviews with the patients themselves: the

informants underreport personality disorder traits of their partners compared to the patients

themselves and the patients themselves do not appear to inherently provide distorted (i.e.

dishonest) self-descriptions on the SCID-II interview. Therefore, the SCID-II informant

interview can be omitted in future treatment outcome research in this non-forensic population

of aggressive men. In addition, the search for ways to reduce the amount of time and effort

asked of patients participating in the aggression treatment program has resulted in the

finding that the PDQ-4+ does not appear to be an adequate instrument to assess personality

disorders, even for screening purposes: the PDQ-4+ has poor psychometric properties and

overdiagnoses most personality disorders. Therefore, the PDQ-4+ should not be used as a

substitute for the SCID-II in future treatment outcome research in this non-forensic

population of aggressive men. Altogether, it can be concluded that administering only

the SCID-II to the patients themselves may suffice. However, concerns about the suitability of the

SCID-II itself for personality disorder assessment in this non-forensic population of

aggressive men have been raised, so there is still a long way to go before personality disorder

assessment is reliable and valid on the one hand, and time-saving and cost-efficient on the other

hand.

References

Abdin, E., Koh, K. G. W. W., Subramaniam, M., Guo, M. E., Leo, T., Teo, C., Tan, E. E., & Chong,

S. A. (2011). Validity of the Personality Diagnostic Questionnaire—4 (PDQ-4+) among

Mentally Ill Prison Inmates in Singapore. Journal of Personality Disorders, 25, 834-841.

Akkerhuis, G. W., Kupka, R. W., Groenestijn, M. A. C. van, & Nolen, W. A. (1996). PDQ 4+

vragenlijst voor persoonlijkheidskenmerken: experimentele versie. Lisse: Swets & Zeitlinger.

American Psychiatric Association (1994/2000). Diagnostic and Statistical Manual of Mental

Disorders (4th ed.). Washington, D.C.: American Psychiatric Association.

Babor, T., Higgins-Biddle, J. C., Saunders, J., & Monteiro, M. G. (2001). The Alcohol Use Disorders

Identification Test: Guidelines for Use in Primary Care. Second Edition. World Health

Organization.

Bagby, R. M., & Farvolden, P. (2004). The Personality Diagnostic Questionnaire-4 (PDQ-4). In M. J.

Hilsenroth, & D. L. Segal (Eds.), Comprehensive Handbook of Psychological Assessment 2.

Hoboken, NJ: John Wiley & Sons, Inc.

Bartko, J. J., & Carpenter, W. T. (1976). On the methods of reliability. Journal of Nervous and Mental

Disease, 163, 307-317.

Berman, A. H., Bergman, H., Palmstierna, T., & Schlyter, F. (2002). Evaluation of the Drug Use

Disorders Identification Test (DUDIT) in criminal justice and detoxification settings and in a

Swedish population sample. European Addiction Research, 11, 22-31.

Berman, A. H., Bergman, H., Palmstierna, T., & Schlyter, F. (2003). DUDIT: The Drug Use Disorders

Identification Test. Version 1.0. Stockholm: Karolinska Institutet, Department of Clinical

Neuroscience Section for Alcohol and Drug Dependence Research.

Bernstein, D. P., Kasapis, C., Bergman, A., Weld, E., Mitropoulou, V., Horvath, T., Klar, H. M., &

Silverman, J. (1997). Assessing Axis II disorders by informant interview. Journal of

Personality Disorders, 11, 158-167.

Bos, J. H., Velzen, C. J. M. van, & Meesters, Y. (2005). The assessment of personality disorders.

PDQ-4+ versus SCID-II. Nederlands Tijdschrift voor de Psychologie, 60, 107-115.

Bouvard, M., Vuachet, M., & Marchant, C. (2011). Examination of the Screening Properties of the

Personality Diagnostic Questionnaire 4+(PDQ-4+) in a non-clinical sample. Clinical

Neuropsychiatry, 8, 151-158.

Cohen, J. (1960). A coefficient of agreement for nominal scales. Educational and Psychological

Measurement, 20, 37-46.

Cohen, J. W. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). Hillsdale, NJ:

Lawrence Erlbaum Associates.

Cronbach, L. J., Gleser, G. C., Nanda, H., & Rajaratnam, N. (1971). The dependability of behavioral

measurements. New York: Wiley.

Davison, S., Leese, M., & Taylor, P. J. (2001). Examination of the screening properties of the

Personality Diagnostic Questionnaire–4+ (PDQ-4+) in a prison population. Journal of

Personality Disorders, 15, 180-194.

Dreessen, L., & Arntz, A. (1998). Short-interval test–retest interrater reliability of the Structured

Clinical Interview for DSM-III-R Personality Disorders (SCID-II) in outpatients. Journal of


Dreessen, L., Hildebrand, M., & Arntz, A. (1998). Patient-informant concordance on the Structured

Clinical Interview for DSM-III-R Personality Disorders (SCID-II). Journal of Personality

Disorders, 12, 149-161.

Dutton, D. G., & Painter, S. L. (1981). Traumatic Bonding: the development of emotional attachments

in battered women and other relationships of intermittent abuse. Victimology: An International

Journal, 1, 139-155.

Festinger, L. (1957). A theory of cognitive dissonance. Evanston, IL: Row Peterson.

First, M. B., Gibbon, M., Spitzer, R. L., Williams, J. B. W., & Benjamin, L. S. (1997). User’s Guide for

the Structured Clinical Interview for DSM-IV Axis II Personality Disorders (SCID II).

Washington, D.C.: American Psychiatric Press.

Fossati, A., Maffei, C., Bagnato, M., Donati, D., Donini, M., Fiorilli, M., Novella, L., & Ansoldi, M.

(1998). Brief communication: Criterion validity of the Personality Diagnostic Questionnaire-

4+ (PDQ-4+) in a mixed psychiatric sample. Journal of Personality Disorders, 12, 172-178.

Guy, L. S., Poythress, N. G., Douglas, K. S., Skeem, J. L., & Edens, J. F. (2008). Correspondence

Between Self-Report and Interview-Based Assessments of Antisocial Personality Disorder.

Psychological Assessment, 20, 47-54.

Herbert, T. B., Silver, R. C., & Ellard, J. H. (1991). Coping with an Abusive Relationship: I. How and

Why Do Women Stay? Journal of Marriage and Family, 53, 311-325.

Hinkle, D. E., Wiersma, W., & Jurs, S. G. (2003). Applied Statistics for the Behavioral Sciences (5th

ed.). Boston: Houghton Mifflin.

Hyler, S. E. (1994). PDQ-4+ Personality Diagnostic Questionnaire. New York: New York State

Psychiatric Institute.

Klonsky, E. D., Oltmanns, T. F., & Turkheimer, E. (2002). Informant-Reports of Personality Disorder:

Relation to Self-Reports and Future Research Directions. Clinical Psychology: Science and

Practice, 9, 300-311.

Kraanen, F. L. (2008). Drug Use Disorders Identification Test Authorized Dutch Translation.

Amsterdam: University of Amsterdam, Department of Clinical Psychology.

Lobbestael, J., Leurgans, M., & Arntz, A. (2011). Inter-Rater Reliability of the Structured Clinical

Interview for DSM-IV Axis I Disorders (SCID I) and Axis II Disorders (SCID II). Clinical

Psychology & Psychotherapy, 18, 75-79.

McKeeman, J. L., & Erickson, M. T. (1997). Self and informant ratings of SCID-II personality

disorder items for nonreferred college women: effects of item and participant characteristics.

Journal of Clinical Psychology, 53, 523-533.

McGraw, K. O., & Wong, S. P. (1996). Forming inferences about some intraclass correlation

coefficients. Psychological Methods, 4, 30-46.

Modestin, J., & Puhan, A. (2000). Comparison of assessment of personality disorder by patients and

informants. Psychopathology, 33, 265-270.

Mulder, R. T. (2002). Personality Pathology and Treatment Outcome in Major Depression: A Review.

American Journal of Psychiatry, 159, 359-371.

Ouimette, P. C., & Klein, D. N. (1995). Test–retest stability, moodstate dependence, and informant-

subject concordance of the SCID-Axis II questionnaire in a non-clinical sample. Journal of


Reus, R. J. M. de, Berg, J. F. van den, & Emmelkamp, P. M. G. (2011). Personality Diagnostic

Questionnaire 4+ is not Useful as a Screener in Clinical Practice. DOI: 10.1002/cpp.766.

Schippers, G. M., & Broekman, T. G. (2010). De AUDIT. Nederlandse vertaling van de Alcohol Use

Disorders Identification Test. Available from: http://www.mateinfo.nl/audit/audit-nl.pdf.

Schneider, B., Maurer, K., Sargk, D., Heiskel, H., Weber, B., Frölich, L., Georgi, K., Fritze, J., &

Seidler, A. (2004). Concordance of DSM-IV Axis I and II diagnoses by personal and

informant’s interview. Psychiatry Research, 127, 121-136.

Shrout, P. E., & Fleiss, J. L. (1979). Intraclass correlation: uses in assessing rater reliability.

Psychological Bulletin, 86, 420-428.

Shrout, P. E., Spitzer, R. L., & Fleiss, J. L. (1987). Quantification in psychiatric diagnosis revisited.

Archives of General Psychiatry, 44, 172-177.

Skodol, A. E., Pagano, M. E., Bender, D. S., Shea, M. T., Gunderson, J. G., Yen, S., et al. (2005).

Stability of functional impairment in patients with schizotypal, borderline, avoidant, or

obsessive-compulsive personality disorder over two years. Psychological Medicine, 35, 443-

451.

Spitzer, R. L., & Fleiss, J. L. (1974). A re-analysis of the reliability of psychiatric diagnosis. British

Journal of Psychiatry, 125, 341-347.

Tyrer, P., & Simmons, S. (2003). Treatment models for those with severe mental illness and comorbid

personality disorder. British Journal of Psychiatry, 182, 15-18.

Ullrich, S., & Marneros, A. (2004). Dimensions of personality disorders in offenders. Criminal

Behaviour and Mental Health, 14, 202-213.

Weertman, A., Arntz, A., Dreessen, L., Velzen, C. van, & Vertommen, S. (2003). Short-interval test-

retest interrater reliability of the Dutch version of the Structured Clinical Interview for DSM-

IV Personality Disorders (SCID-II). Journal of Personality Disorders, 17, 562-567.

Weertman, A., Arntz, A., & Kerkhofs, L. M. (2000). Gestructureerd Klinisch Interview voor DSM-IV

As-II Persoonlijkheidsstoornissen. Amsterdam: Harcourt Assessment B.V.

Widiger, T. A., & Boyd, S. E. (2009). Personality disorders assessment instruments. In J. N. Butcher

(Ed.), Oxford Handbook of Personality Assessment. New York: Oxford University Press, Inc.

Wilberg, T., Dammen, T., & Friis, S. (2000). Comparing Personality Diagnostic Questionnaire-4+

with Longitudinal, Expert, All Data (LEAD) standard diagnoses in a sample with a high

prevalence of axis I and axis II disorders. Comprehensive Psychiatry, 41, 295-302.

Zimmerman, M. (1994). Diagnosing personality disorders: A review of issues and research methods.

Archives of General Psychiatry, 51, 225-245.

Zimmerman, M., Pfohl, B., Coryell, W., Stangl, D., & Corenthal, C. (1988). Diagnosing personality

disorder in depressed patients: A comparison of patient and informant interviews. Archives of

General Psychiatry, 45, 733-737.

Table 1. Sample characteristics - mean dimensional trait scores of personality disorders and the prevalence of categorical personality disorder diagnoses

                            SCID-II patient (n = 25)      SCID-II partner (n = 12)      PDQ-4+ (n = 23)
Personality disorder/scale  M (SD)        Prevalence (%)  M (SD)        Prevalence (%)  M (SD)        Prevalence (%)
Paranoid                    2.44 (1.26)   20.0            2.50 (1.38)   25.0            4.48 (2.15)   47.8
Schizoid                    .960 (.978)   .000            .830 (.937)   .000            2.43 (1.97)   34.8
Schizotypal                 1.20 (1.19)   .000            .670 (.651)   .000            3.87 (2.42)   39.1
Antisocial A-criteria       2.92 (1.58)                   1.83 (1.40)                   2.57 (1.67)
Antisocial C-criteria       3.96 (3.09)                                                 3.17 (3.01)
Antisocial                                48.0                          41.7                          34.8
Borderline                  4.12 (2.59)   36.0            3.50 (2.32)   33.3            5.04 (2.12)   60.9
Histrionic                  .600 (1.00)   .000            .750 (1.29)   .000            1.65 (1.47)   4.30
Narcissistic                .520 (.918)   .000            1.58 (2.31)   .000            2.35 (1.30)   8.70
Avoidant                    1.84 (1.72)   20.0            1.50 (1.83)   25.0            3.35 (2.25)   47.8
Dependent                   1.36 (1.63)   8.00            1.67 (1.72)   0.00            2.00 (1.95)   17.4
Obsessive-compulsive        2.64 (1.75)   28.0            2.25 (1.29)   8.30            3.52 (1.15)   56.5
Cluster A                   4.60 (12.8)                   4.00 (1.54)                   10.5 (5.62)
Cluster B                   8.16 (4.66)                   7.67 (5.80)                   11.6 (4.95)
Cluster C                   5.84 (3.52)                   5.42 (3.03)                   8.87 (5.13)
Total number of traits      18.6 (7.95)                   17.1 (7.48)                   31.0 (14.0)
Total number of diagnoses   1.60 (1.53)                   1.50 (1.00)                   3.70 (2.44)

Cluster A (odd, eccentric) includes paranoid, schizoid and schizotypal; Cluster B (dramatic, emotional, erratic) includes antisocial,

borderline, histrionic and narcissistic; Cluster C (anxious) includes avoidant, dependent and obsessive–compulsive; Total number of traits

includes any personality disorder. M: Mean, SD: standard deviation.

Table 2. Preliminary explorative analysis - mean inter-item correlations of PDQ-4+ scales (n = 23)

Scale (N of items) Mean Variance Minimum Maximum Range

Paranoid (7) .347 .055 -.146 .742 .889

Schizoid (7) .254 .051 -.066 .677 .743

Schizotypal (9) .223 .052 -.178 .649 .826

Antisocial A-criteria (7) .197 .024 -.182 .467 .649

Antisocial C-criteria (13) .215 .052 -.204 .697 .901

Borderline (9) .138 .072 -.420 .636 1.06

Histrionic (8) .064 .048 -.331 .528 .859

Narcissistic (9) .056 .039 -.289 .611 .900

Avoidant (7) .337 .047 -.071 .763 .833

Dependent (8) .213 .060 -.163 1.00 1.16

Obsessive-compulsive (8) .240 .034 -.178 .572 .750

Too Good Scale (4) .154 .048 -.087 .549 .636

Suspect Questionnaire (1) - - - - -

Cluster A (22) .256 .052 -.311 .840 1.15

Cluster B (33) .090 .054 -.456 .792 1.25

Cluster C (23) .193 .050 -.322 1.00 1.32

Total number of traits (77) .143 .060 -.574 1.00 1.57

Cluster A (odd, eccentric) includes paranoid, schizoid and schizotypal; Cluster B (dramatic, emotional, erratic) includes antisocial, borderline, histrionic and narcissistic; Cluster C (anxious) includes avoidant, dependent and obsessive–compulsive; Total number of traits includes any personality disorder.

Two of the antisocial C-criteria component variables had zero variance and were removed from the scale; one of the Total component variables had zero variance and was removed from the scale.

-: One of the Suspect Questionnaire variables had zero variance and was removed from the scale, so too many items were deleted from the scale to perform the analysis.
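The summary statistics in Table 2 can be illustrated with a minimal sketch. It assumes each item is a column of 0/1 answers, drops zero-variance items as described in the note above, and uses the sample variance (n - 1) for the spread of the pairwise correlations, which may differ from what the statistics package used for the thesis reports. The function names and the example data are hypothetical and are not taken from the study.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equally long numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def inter_item_summary(items):
    """Summarise all pairwise correlations between the items of one scale.

    items: list of item columns (one list of answers per item).
    Returns None when fewer than two items with variance remain.
    """
    # Items answered identically by everyone have zero variance and
    # cannot be correlated, so they are removed first.
    usable = [col for col in items if len(set(col)) > 1]
    rs = [pearson(a, b) for a, b in combinations(usable, 2)]
    if not rs:
        return None
    mean = sum(rs) / len(rs)
    # Assumption: sample variance (n - 1) of the pairwise correlations.
    variance = sum((r - mean) ** 2 for r in rs) / (len(rs) - 1) if len(rs) > 1 else 0.0
    return {
        "n_items": len(usable),
        "mean": mean,
        "variance": variance,
        "minimum": min(rs),
        "maximum": max(rs),
        "range": max(rs) - min(rs),
    }


# Hypothetical answers of six respondents on a four-item scale.
example_items = [
    [1, 1, 0, 0, 1, 0],
    [1, 0, 0, 0, 1, 0],
    [0, 1, 0, 1, 1, 0],
    [1, 1, 1, 1, 1, 1],  # zero variance: dropped before correlating
]
summary = inter_item_summary(example_items)
print(summary)
```

A scale that is left with a single usable item, as happened for the Suspect Questionnaire, yields no pairwise correlations; the sketch returns None in that case.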


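Table 3. Patient-informant concordance SCID-II - correlations between dimensional personality disorder trait scores by patient’s and informant’s SCID-II interview and Kappa values of categorical personality disorder diagnoses by patient’s and informant’s SCID-II interview (n = 11)

| Personality disorder/scale | Source of information | M (SD) | Prev. (%) | r | κ (SE) |
|---|---|---|---|---|---|
| Paranoid | Patient’s interview | 2.36 (1.12) | 18.2 | .368 | -.222 (.108) |
| | Informant’s interview | 2.27 (1.19) | 18.2 | | |
| Schizoid | Patient’s interview | 1.00 (.775) | 0.00 | - | - |
| | Informant’s interview | .910 (.944) | 0.00 | | |
| Schizotypal | Patient’s interview | .910 (.831) | 0.00 | - | - |
| | Informant’s interview | .550 (.522) | 0.00 | | |
| Antisocial A-criteria | Patient’s interview | 3.45 (1.29) | | .292 | |
| | Informant’s interview | 1.73 (1.42) | | | |
| Antisocial | Patient’s interview | | 72.7 | | .353 (.197) |
| | Informant’s interview | | 36.4 | | |
| Borderline | Patient’s interview | 4.09 (2.47) | 36.7 | .589 | .377 (.291) |
| | Informant’s interview | 3.09 (1.92) | 27.3 | | |
| Histrionic | Patient’s interview | .640 (.809) | 0.00 | - | - |
| | Informant’s interview | .450 (.820) | 0.00 | | |
| Narcissistic | Patient’s interview | .820 (1.25) | 0.00 | - | - |
| | Informant’s interview | 1.00 (1.18) | 0.00 | | |
| Avoidant | Patient’s interview | 1.91 (1.87) | 27.3 | .709* | .542 (.285) |
| | Informant’s interview | 1.64 (1.86) | 27.3 | | |
| Dependent | Patient’s interview | 1.64 (1.75) | 9.09 | .379 | - |
| | Informant’s interview | 1.45 (1.64) | 0.00 | | |
| Obsessive-compulsive | Patient’s interview | 2.82 (1.72) | 36.4 | .585 | .298 (.246) |
| | Informant’s interview | 2.18 (1.33) | 9.09 | | |
| Cluster A | Patient’s interview | 4.27 (2.05) | | -.122 | |
| | Informant’s interview | 3.73 (1.27) | | | |
| Cluster B | Patient’s interview | 9.00 (3.77) | | .479 | |
| | Informant’s interview | 6.27 (3.38) | | | |
| Cluster C | Patient’s interview | 6.36 (3.98) | | .657* | |
| | Informant’s interview | 5.27 (3.13) | | | |
| Total number of traits | Patient’s interview | 19.6 (6.67) | | .285 | |
| | Informant’s interview | 15.3 (4.27) | | | |
| Total number of diagnoses | Patient’s interview | 2.00 (1.27) | | .489 | |
| | Informant’s interview | 1.27 (.647) | | | |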

Cluster A (odd, eccentric) includes paranoid, schizoid and schizotypal; Cluster B (dramatic, emotional, erratic) includes antisocial, borderline, histrionic and narcissistic; Cluster C (anxious) includes avoidant, dependent and obsessive–compulsive; Total number of traits includes any personality disorder.

M: mean; SD: standard deviation; Prev.: prevalence; r: Pearson correlation coefficient; κ: Cohen’s Kappa value; SE: standard error of Cohen’s Kappa value.

-: No correlation measure was calculated, because the range was too small to calculate correlations, because one variable was a constant in the two-way table, or because 5% or less of the sample were diagnosed as having the personality disorder in question.

*: p < 0.05; **: p < 0.01; ***: p < 0.001.
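As an illustration of the Kappa values in Table 3, the sketch below computes Cohen's Kappa from a 2x2 agreement table. The cell counts for the paranoid diagnosis are an inference, not data reported in the thesis: with n = 11 and a prevalence of 18.2% in both interviews, each source diagnosed two patients, and the reported κ of -.222 is reproduced if the two sources never agreed on a positive case. The function name is hypothetical, and the standard error is not computed here.

```python
def cohen_kappa(both_pos, patient_only, informant_only, both_neg):
    """Cohen's Kappa for two binary ratings of the same cases.

    Returns None when one of the two ratings is constant, which mirrors
    the "-" entries in the table.
    """
    n = both_pos + patient_only + informant_only + both_neg
    p_patient = (both_pos + patient_only) / n      # share positive per patient interview
    p_informant = (both_pos + informant_only) / n  # share positive per informant interview
    if p_patient in (0.0, 1.0) or p_informant in (0.0, 1.0):
        return None
    p_observed = (both_pos + both_neg) / n
    p_expected = p_patient * p_informant + (1 - p_patient) * (1 - p_informant)
    return (p_observed - p_expected) / (1 - p_expected)


# Inferred cell counts for the paranoid diagnosis (assumption, see lead-in).
paranoid = cohen_kappa(both_pos=0, patient_only=2, informant_only=2, both_neg=7)
print(round(paranoid, 3))  # -0.222

# Dependent: one patient-interview diagnosis, none by informants, so one
# rating is constant and no Kappa is returned.
print(cohen_kappa(both_pos=0, patient_only=1, informant_only=0, both_neg=10))  # None
```

A negative Kappa, as for the paranoid diagnosis, means the observed agreement is lower than the agreement expected by chance.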

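Table 4. Diagnostic agreement and diagnostic efficiency of PDQ-4+ - correlations between dimensional personality disorder trait scores by patient’s SCID-II interview and PDQ-4+ self-report, Kappa values of categorical personality disorder diagnoses by patient’s SCID-II interview and PDQ-4+ self-report and diagnostic efficiency values of categorical personality disorder diagnoses by PDQ-4+ self-report (n = 23)

| Personality disorder/scale | Source of information | M (SD) | Prev. (%) | ICC | κ (SE) | Sensitivity | Specificity | PPP | NPP |
|---|---|---|---|---|---|---|---|---|---|
| Paranoid | SCID-II interview | 2.43 (1.31) | 21.7 | .204 | -.039 (.143) | .600 | .333 | .200 | .750 |
| | PDQ-4+ self-report | 4.48 (2.15) | 65.2 | | | | | | |
| Schizoid | SCID-II interview | .960 (.976) | 0.00 | - | - | - | - | - | - |
| | PDQ-4+ self-report | 2.43 (1.97) | 34.8 | | | | | | |
| Schizotypal | SCID-II interview | 1.26 (1.21) | 0.00 | - | - | - | - | - | - |
| | PDQ-4+ self-report | 3.87 (2.42) | 39.1 | | | | | | |
| Antisocial A-criteria | SCID-II interview | 2.91 (1.65) | | .119 | | | | | |
| | PDQ-4+ self-report | 2.57 (1.67) | | | | | | | |
| Antisocial C-criteria | SCID-II interview | 3.96 (3.21) | | .674*** | | | | | |
| | PDQ-4+ self-report | 3.17 (3.01) | | | | | | | |
| Antisocial | SCID-II interview | | 43.5 | | .094 (.206) | .400 | .692 | .500 | .600 |
| | PDQ-4+ self-report | | 34.8 | | | | | | |
| Borderline | SCID-II interview | 4.30 (2.62) | 39.1 | .328 | .087 (.187) | .667 | .429 | .429 | .667 |
| | PDQ-4+ self-report | 5.04 (2.12) | 60.9 | | | | | | |
| Histrionic | SCID-II interview | .480 (.947) | 0.00 | - | - | - | - | - | - |
| | PDQ-4+ self-report | 1.65 (1.47) | 4.30 | | | | | | |
| Narcissistic | SCID-II interview | .430 (.788) | 0.00 | - | - | - | - | - | - |
| | PDQ-4+ self-report | 2.35 (1.30) | 8.70 | | | | | | |
| Avoidant | SCID-II interview | 1.83 (1.78) | 21.7 | .485* | .465 (.159) | 1.00 | .667 | .455 | 1.00 |
| | PDQ-4+ self-report | 3.35 (2.25) | 47.8 | | | | | | |
| Dependent | SCID-II interview | 1.39 (1.67) | 8.40 | .608*** | .623 (.236) | 1.00 | .908 | .500 | 1.00 |
| | PDQ-4+ self-report | 2.00 (1.95) | 17.4 | | | | | | |
| Obsessive-compulsive | SCID-II interview | 2.57 (1.75) | 26.1 | .209 | .263 (.160) | .833 | .529 | .385 | .900 |
| | PDQ-4+ self-report | 3.52 (2.15) | 56.5 | | | | | | |
| Cluster A | SCID-II interview | 4.65 (2.90) | | .135 | | | | | |
| | PDQ-4+ self-report | 10.5 (5.62) | | | | | | | |
| Cluster B | SCID-II interview | 8.13 (4.85) | | .312* | | | | | |
| | PDQ-4+ self-report | 11.6 (4.95) | | | | | | | |
| Cluster C | SCID-II interview | 5.78 (3.52) | | .359* | | | | | |
| | PDQ-4+ self-report | 8.87 (5.13) | | | | | | | |
| Total number of traits | SCID-II interview | 18.6 (8.30) | | .224* | | | | | |
| | PDQ-4+ self-report | 31.0 (14.0) | | | | | | | |
| Total number of diagnoses | SCID-II interview | 1.61 (1.41) | | .221 | | | | | |
| | PDQ-4+ self-report | 3.70 (2.44) | | | | | | | |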

Cluster A (odd, eccentric) includes paranoid, schizoid and schizotypal; Cluster B (dramatic, emotional, erratic) includes antisocial, borderline, histrionic and narcissistic; Cluster C (anxious) includes avoidant, dependent and obsessive–compulsive; Total number of traits includes any personality disorder.

M: mean; SD: standard deviation; Prev.: prevalence; ICC: Intraclass correlation coefficient; κ: Cohen’s Kappa value; SE: standard error of Cohen’s Kappa value; PPP: positive predictive power; NPP: negative predictive power.

-: No correlation measure was calculated, because the range was too small to calculate correlations, because one variable was a constant in the two-way table, or because 5% or less of the sample were diagnosed as having the personality disorder in question.

*: p < 0.05; **: p < 0.01; ***: p < 0.001.
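The diagnostic efficiency values in Table 4 follow from a 2x2 table that crosses the SCID-II diagnosis (treated as the gold standard) with the PDQ-4+ diagnosis. The sketch below uses cell counts for the avoidant diagnosis that are inferred from the reported figures rather than reported in the thesis: with n = 23, a prevalence of 21.7% gives 5 SCID-II diagnoses, 47.8% gives 11 PDQ-4+ diagnoses, and a sensitivity of 1.00 implies that all 5 SCID-II cases were also PDQ-4+ positive. The function name is hypothetical.

```python
def diagnostic_efficiency(tp, fp, fn, tn):
    """Sensitivity, specificity, PPP and NPP from 2x2 cell counts.

    tp/fn: gold-standard positives that the test flags / misses.
    fp/tn: gold-standard negatives that the test flags / clears.
    A value is None when its denominator is zero.
    """
    def ratio(numerator, denominator):
        return numerator / denominator if denominator else None

    return {
        "sensitivity": ratio(tp, tp + fn),  # share of true cases detected
        "specificity": ratio(tn, tn + fp),  # share of non-cases cleared
        "ppp": ratio(tp, tp + fp),          # share of positive tests that are correct
        "npp": ratio(tn, tn + fn),          # share of negative tests that are correct
    }


# Inferred counts for the avoidant diagnosis (assumption, see lead-in).
avoidant = diagnostic_efficiency(tp=5, fp=6, fn=0, tn=12)
for name, value in avoidant.items():
    print(f"{name}: {value:.3f}")
```

With these counts the sketch gives a sensitivity of 1.000, a specificity of 0.667, a PPP of 0.455 and an NPP of 1.000, which match the avoidant row. The low PPP shows how six false positive PDQ-4+ diagnoses against five true positives pull down the predictive value of a positive result.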


Table 5. Post-hoc analysis - results of paired-samples t-tests for mean differences between subject’s interview, informant’s interview and self-report information

| Paired-samples t-test | Variable (source of information) | M | SD | Dif. | SE (Dif.) | t | df | p | Eta sqrd. |
|---|---|---|---|---|---|---|---|---|---|
| 1 | Number of traits (SCID-II subject) | 19.6 | 6.67 | 4.36 | 2.06 | 2.12 | 10 | .030 | .313 |
| | Number of traits (SCID-II informant) | 15.2 | 4.27 | | | | | | |
| 2 | Number of diagnoses (SCID-II subject) | 2.00 | 1.27 | .727 | .333 | 2.19 | 10 | .027 | .323 |
| | Number of diagnoses (SCID-II informant) | 1.27 | .647 | | | | | | |
| 3 | Number of traits (SCID-II subject) | 18.8 | 8.30 | -12.4 | 2.75 | -4.53 | 22 | < .001 | .483 |
| | Number of traits (PDQ-4+ self-report) | 31.0 | 14.0 | | | | | | |
| 4 | Number of diagnoses (SCID-II subject) | 1.61 | 1.41 | -2.09 | .478 | -4.36 | 22 | < .001 | .464 |
| | Number of diagnoses (PDQ-4+ self-report) | 3.70 | 2.44 | | | | | | |
| 5 | Antisocial trait score (SCID-II subject) | 2.91 | 1.65 | .348 | .460 | .756 | 22 | .229 | .025 |
| | Antisocial trait score (PDQ-4+ self-report) | 2.57 | 1.67 | | | | | | |
| 6 | Antisocial C-criteria (SCID-II subject) | 3.96 | 3.21 | .783 | .514 | 1.52 | 22 | .071 | .095 |
| | Antisocial C-criteria (PDQ-4+ self-report) | 3.17 | 3.01 | | | | | | |

M: mean; SD: standard deviation; Dif.: mean difference; SE (Dif.): standard error of mean difference; t: test statistic; df: degrees of freedom; p: p-value (significance level); Eta sqrd.: eta squared (effect size).
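The t and eta squared columns of Table 5 can be related with a short sketch. It assumes the common formula for the effect size of a paired-samples t-test, eta squared = t² / (t² + df). The thesis does not state the formula in this section, but the reported values appear consistent with it up to rounding. The function names are hypothetical.

```python
def t_statistic(mean_diff, se_diff):
    """Paired-samples t: mean difference divided by its standard error."""
    return mean_diff / se_diff


def eta_squared(t, df):
    """Effect size for a paired-samples t-test (assumed formula)."""
    return t ** 2 / (t ** 2 + df)


# (label, reported t, df, reported eta squared) taken from Table 5.
reported = [
    ("traits, SCID-II vs PDQ-4+", -4.53, 22, 0.483),
    ("diagnoses, SCID-II vs PDQ-4+", -4.36, 22, 0.464),
    ("antisocial trait score", 0.756, 22, 0.025),
    ("antisocial C-criteria", 1.52, 22, 0.095),
]
for label, t, df, eta in reported:
    print(f"{label}: computed {eta_squared(t, df):.3f}, reported {eta:.3f}")

# t recomputed from the rounded difference and standard error of test 4;
# small deviations from the reported -4.36 come from rounding of the inputs.
print(round(t_statistic(-2.09, 0.478), 2))
```

For the two comparisons with df = 10 the same formula gives values within a few thousandths of the reported ones, which is the size of difference expected when t is rounded to two decimals.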
