
The challenge of Psychological Measurement

Maretha Prinsloo

2014

Introduction

As indicated by a number of researchers in the field of psychological measurement, such as Barrett (2013), Michell (1999, 2009), Borsboom et al (2009), Grice et al (2012) and others, no psychological characteristic varies as a quantity. Michell (2009) points out that an attribute is measurable if and only if it possesses both an ordinal and an additive structure. Since there is no evidence that psychological attributes are additively structured, the presumptions underlying the concept of validity in psychometrics rest on an unwarranted assumption (Barrett, 2013).

This poses fundamental challenges for the measurement of psychological constructs. The potential shortcomings of both the psychometric assessment techniques as well as the statistical analysis of the measured attributes may erode the essence of what is being measured.

Correlations, normal curves and standard deviations are techniques which deal with quantitative data, but fail to accommodate the complexity and meaning of psychological constructs per se. Grice et al (2012) point out that standard null hypothesis testing uses statistical methods that seem to abstract away from reality with every step, and they suggest a much more realistic and practical methodology. Freedman (1991) further points out that “statistical technique can seldom be an adequate substitute for good design, relevant data, and testing predictions against reality in a variety of settings” (p. 291).
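Freedman's point can be made concrete with a well-known illustration, Anscombe's (1973) quartet: data sets with entirely different shapes can produce virtually identical correlation coefficients. A minimal sketch using two of the four sets:

```python
import numpy as np

# Two of Anscombe's (1973) four data sets: one a noisy linear trend,
# the other a smooth curve, yet with near-identical summary statistics.
x  = np.array([10.0, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5])
y1 = np.array([8.04, 6.95, 7.58, 8.81, 8.33, 9.96,
               7.24, 4.26, 10.84, 4.82, 5.68])   # roughly linear
y2 = np.array([9.14, 8.14, 8.74, 8.77, 9.26, 8.10,
               6.13, 3.10, 9.13, 7.26, 4.74])    # perfectly curved

r1 = np.corrcoef(x, y1)[0, 1]
r2 = np.corrcoef(x, y2)[0, 1]
print(f"r1 = {r1:.3f}, r2 = {r2:.3f}")  # both approximately 0.816
```

The correlation coefficient alone cannot distinguish the two relationships; only inspecting the data – “testing predictions against reality” – reveals the difference.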

These insights have not guided common practice in traditional psychometrics though. Many psychological constructs, such as “g” in intelligence research and the “big five” in personality research, have been arrived at empirically using limited measurement methodologies combined with correlational analysis. The outcome of this approach has not always been plausible. An individual is, after all, unlikely to show a static personality profile of only a few specific, statistically derived characteristics. In other words, a person can hardly be compared to a group average. Although there may be an argument for the usefulness of the concept of group averages, the question remains whether a group average is even a meaningful thing to begin with, as it does not exist in reality.
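The point that a group average need not exist in reality can be made concrete. In the sketch below (with invented scores), a sample is drawn from two distinct sub-groups; the mean lands in a region where virtually no individual actually scores:

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical trait scores from two distinct sub-groups of 500 people each:
# one scoring consistently low (around 2), one consistently high (around 8).
scores = np.concatenate([rng.normal(2.0, 0.3, 500),
                         rng.normal(8.0, 0.3, 500)])

mean = scores.mean()
# Fraction of individuals scoring within one point of the group average:
near_mean = np.mean(np.abs(scores - mean) < 1.0)
print(f"group average = {mean:.1f}, share of people near it = {near_mean:.1%}")
```

The average (about 5) describes no one in the sample; every individual sits far from it.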

Psychology thus has to critically rethink this issue.

It would make sense to start off by clarifying the underlying assumptions regarding the nature of the subject matter. Psychological make-up seems multi-layered, dynamic and interactive. Behaviour is highly variable, contextualised and related to the actor’s worldview and values, or what s/he regards as appropriate and desirable, as well as his/her sense of personal purpose. It often reflects a unique existential theme for every individual, and the associated behavioural tendencies can almost be compared to the multiple facets of a kaleidoscope – all of which raise the bar for measurement practice.

In this article, various theoretical and measurement practices and challenges in psychometrics are considered, and an approach is proposed for the measurement of psychological factors by focusing on consciousness and cognition, both of which are holonically organised.

To broadly discuss these issues, the focus will be on theory, measurement, statistics and the interpretation of assessment results for predictive purposes.

Theory

Theory is about meaning. Since psychological functioning cannot easily be understood, encapsulated and observed, theoretical constructs in psychology are best specified within the context of an integrated and self-contained theoretical model which is rationally and philosophically anchored and if possible, empirically supported. Given the inadequate scientific status of psychology, such theoretical models cannot be regarded as reflective of real phenomena, and need to meet certain criteria of theory building while also partly transcending the shortcomings of the logical and empirical domains.

Criteria for theory building include: parsimony, practical utility, structural adequacy, generalizability, specificity, empirical adequacy and falsifiability (Prinsloo, 1992). A discussion of these criteria can also be found in Prinsloo & Barrett (2012).

In formulating a theoretical model – be that of behavioural tendencies or catalysts of behaviour, certain guidelines can be followed:

- Practice, observation and action research provide much richness in understanding psychological constructs. Psychology is an ever-emerging science, and real-life manifestations of complex and dynamic behavioural tendencies provide meaningful clues as to what is involved. It is therefore important that theory should stem from practice.

- Current theory and research findings need to be integrated to find intuitively appealing aspects that have repeatedly emerged over time and across theoretical perspectives. Given the limited scientific status of psychology as well as its complexity, these emerging insights need to be kept simple, almost descriptive and intuitive. As Taleb (2010) points out: “perhaps there is a realm of wisdom from which the logician is exiled” (p 255).

But has this approach to theory building been followed in traditional psychometrics, which primarily focuses on the constructs of Personality and Intelligence? In terms of the criteria of practical utility and structural adequacy, for example, it can be argued that the divide between Personality and Intelligence lacks substance and may not always be useful, especially from the perspective of human functioning as integrated, with the potential for almost any combination of process- and context-dependent traits or characteristics.

The popular typological approach in personality theory and measurement, which specifies polarities such as “introversion versus extraversion”, also seems flawed in terms of a number of the above mentioned criteria for theory building. For many individuals, both tendencies of intro- and extraversion may commonly manifest as part of their behavioural repertoire. Whether one or the other is triggered, depends on internal and external factors. The forced choice format of personality questionnaires, however, ensures outcomes in terms of the theoretically proposed constructs.

Traditional psychometrics thus fails to account for dynamic and interactive psychological effects. The question remains whether it is at all useful to measure self-reported behavioural tendencies and statistically derive ”latent traits” which are supposed to underlie behaviour, as these static and structured constructs may not be sufficient to capture the core aspects of psychological functioning.

But what are the alternatives? How can the uniqueness, dynamics and perhaps thematic nature of human behaviour best be accessed and understood?

One possibility in getting to understand and predict human behaviour, is to focus on the catalysts underlying the dynamic interplay of tendencies, or the way in which certain behaviours are activated and maintained.

These catalysts primarily seem to be the subject matter of Consciousness theory with its focus on concepts such as worldview, perceptual system, valuing orientation and frame of reference, as well as the information processing models describing cognitive functioning.

Consciousness models such as those of Graves, Loevinger, Gebser, Piaget, and others (Prinsloo, 2014) offer a starting point in this regard. The constructs proposed by these models all espouse a holonic structure (a term coined by Wilber, 2001) of organisation where each consecutive level transcends and includes the previous. In the case of ‘soft hierarchies’ such as these models, it is difficult to separate constructs given the significant overlap in the postulated levels of awareness involved in cognition, moral development and ego states.

The organisational structure of holonic concepts differs fundamentally from that of the typologies and traits of personality theory as well as from the construct of ability, GMA or “g” of the differential paradigm.

Value orientations and cognitive processes, as conceptualised holonically, act as catalysts or frames of reference which determine the interpretation of incoming stimuli. These constructs may be powerful for predictive purposes, probably more so than any one of the statistically derived concepts from traditional psychometrics. The measurement of these catalysts pose serious challenges though, as the test subject’s worldview or frame of reference determines the way in which the item content of an assessment technique is interpreted. In other words, the same test item may be understood to mean different things to people with different worldviews.

It can be concluded that whether measurement should focus on traditional constructs such as behaviour, traits, typologies and abilities on the one hand, or on holonically organised psychological catalysts of behaviour on the other, depends on the measurement purposes.

Measurement

Various testing methodologies including assessment centres, interviews, self-report questionnaires, simulation exercises and IQ tests are currently used to measure psychological constructs.

Assessment centres capitalise on expert observation of behaviour in controlled environments according to operationalised guidelines. However, the mere act of observation is bound to impact on the subject’s behaviour and the subjectivity of the observer may interfere with the findings.

Traditional psychometrics largely relies on self-report in measuring personality constructs. The results can be distorted by the test subject’s lack of awareness and objectivity, by efforts to manipulate the outcome of one’s results, and by subconscious defence mechanisms. This criticism applies to structured and unstructured interviews as well as to questionnaires.

IQ testing via highly structured linear tasks requiring timed, convergent, logical-analytical reasoning about specific content domains to determine “intellectual ability” only reflects a small aspect of what intellectual functioning entails.

These techniques aimed at measuring static constructs offer inadequate clarification of complex and dynamic systems functioning. Alternative assessment methodologies are thus called for.

Paul Barrett (2011), on the Psychometric forum, pointed out that a core challenge of assessment is to capture the output of a complex, self-organizing, adaptive system: “To predict these kinds of outputs requires innovative classifier (simple ordered classes or categories) construction that neither depends upon strong quantitative measurement models for their validity, nor simplistic nonsense such as true scores, latent variables, or the like.”

According to Barrett this may require a new kind of systems-modelling, where trajectory mapping for an individual is the goal rather than measurement of psychological attributes. It implies a differentiation between what is to be considered measurable versus classifiable, or loose-ordered aspects of a systems-network model. The problem seems not to be one of "psychometrics - measurability" but one of how to assess the operation and outcome-consequences of a dynamic complex system in motion, so to speak. The idea is to move from broad generalisations to more precisely modelling the individual. This he sees “not as a task for psychometrics, but rather as something quite new for a group of people who are more scientist-philosopher-heuristics specialists than ersatz statisticians” (Barrett, 2011).

This is a tall order. A number of principles may, however, guide efforts to assess integrated psychological functioning in this way.

The measurement of value orientations

In this article, the measurement of the holonically organised constructs of consciousness and cognition is proposed as a potential improvement on current psychometric practice. Consciousness theories abound, but are structured in terms of homogeneous principles (Prinsloo, 2014). For the purposes of this article, the Spiral Dynamics theory of Graves (Beck & Cowan, 1996) describing various value orientations, is focused on.

An eclectic approach is used in considering possible solutions for the measurement of value orientations.

Some wisdom in this regard is provided by hormesis as well as homeopathy. The term hormesis refers to a favourable biological response to low-level exposure to toxins or stressors. In homeopathy the principle of “like cures like” applies, implicating the power of resonance. The principle of interest here is that by diluting potentially healing agents in water until the substance per se is no longer present, but the water carries the essential chemical imprint of the substance, the healing properties are held to be retained.

The homeopathic principle of resonance provides a valuable metaphor for psychometrics, where the goal is to extract the essence, or the fundamental information structure, of a particular psychological orientation, the latter of which permeates all behavioural manifestations. This is achieved by presenting test items that may not obviously seem associated with a particular value orientation, and may even appear conflicting, but which reflect a powerful underlying structure which echoes that of a particular value orientation.

This is a different approach from common practice in psychometrics, which relies on item content and self-insight. Such reliance on content is potentially problematic given its dependence on the test subject’s perceptual filters. Creating and systematically varying stimuli that may not strictly be responded to in terms of content, but which may potentially resonate with a subject’s perceptual framework, transcends a measurement approach relying on obvious and transparent items. Content may add noise and is an unreliable indication of the catalysts that trigger behaviour.

An example is the tendency of those who show socially undesirable tendencies to deny them, whereas those same weaknesses are more readily acknowledged by less defensive and more aware individuals. From a traditional psychometrics perspective, these results seem counter-intuitive. A test item such as “I can be impulsive at times” is thus often denied by those who tend to be impulsive and accepted by those who are open-minded and self-aware – even though the latter may not be particularly impulsive. The “resonance” is not achieved by the item content (impulsivity) but by the open-mindedness to embrace self-criticism.

Another example could be that those who live a highly routinised and regulated life tend to reject the notion of being structured or of seeking structure, given their frustration with its self- or environmentally imposed limitations. In some of these cases, the individuals may want to liberate themselves, but experience dissonance given their needs for security and emotional certainty. Again, the focus is not on the item content (namely that of being structured), but on the subjectively experienced cost of this tendency.

These examples indicate that a measurement focus on perceptual frameworks as catalysts, offers a useful alternative to one strictly based on item content. Item content can, however, be leveraged as a means to an end in revealing the catalysts of behaviour.

When measuring the holonically organised constructs of Consciousness theory, the key challenge is to tap into the essence of each particular orientation, thereby enabling differentiation between the various overlapping layers. The focus is on the subjectively experienced purpose and meaning of everyday functioning.

The “essence” of each orientation or level of consciousness, can perhaps be understood in terms of Pribram’s work on structural monism. In exploring the mind-brain relationship, he proposed the concept of pluralistic monism in which "informational structure" remains neutral and consistent across different physical and metaphysical manifestations. For example, a particular tune in music remains essentially the same whether it is whistled, played by an orchestra, appears in the form of sheet music or is simply being dreamt about. In measuring overlapping and holonically organised constructs, the challenge is to extract the essence and core meaning of each layer which remains “neutral” regardless of behavioural manifestations.

Because these issues are of a subtle nature, their measurement requires a great degree of practical understanding of the constructs. To obtain this level of understanding requires of the researcher focused introspective awareness, compassionate sensitivity to others and systematic action research, which may enable the identification of seemingly small or subtle differences between the catalysts of behaviour. The principle involved is similar to that referred to by chaos theorists as ”sensitive dependence on initial conditions”. It implies that small variations in perception, which reside at the source of the allocated meaning, may result in significant differences between responses which may seem to occur randomly. This concept from chaos theory is a fundamental mechanism of evolution which underlies the potential for creative process and therefore immense variability, which also applies to human behaviour.

The next valuable guideline that can be utilised in measuring value orientations is Wilber’s (2006) concept of “altitude”, which he regards as central to the understanding of consciousness. According to Wilber, the degree of awareness is the central organising principle involved in perception. The inclusiveness of a person’s worldview is thus an important criterion in differentiating between value orientations. But how that manifests requires further clarification.

Given the holonic nature of the Spiral Dynamics theory (where the various “levels” are indicated by certain colours), higher-level value orientations are more inclusive. Whereas the core theme of the Purple value orientation is security and certainty, that of the Red system is power, Blue is structure, Orange is value creation, Green is harmony, Yellow is integration and Turquoise is awareness. Each of these successive themes transcends and includes the previous. For example, for Purple, close family and team members are important. Blue can, however, accommodate broader social groups and goals. For Green the emphasis is on humanity, and for Turquoise the proliferation of all life. The “level” or “altitude” from which a person views and experiences life determines how that person’s physical, emotional, intellectual and psychological resources would be capitalised on.

Figure 1: A graphic representation of the Spiral Dynamics model of Graves

But how can item content be structured to capitalise on the above-mentioned principles of initial conditions, resonance and altitude? Evidence of a particular active orientation or level of awareness could possibly be found in: (a) the consistency of associated responses regardless of seemingly contradictory item content; (b) the level of response (given the holonic or hierarchical organisation of the constructs); and (c) the expected overlap of themes.

Consistency

Pribram’s (1986) theory of structural monism explains the impact of what he refers to as subatomic coherence. The latter ensures the neutrality and retention of information structure across various manifestations – both of a tangible or physical, as well as of an intangible or metaphysical nature.

This concept can also be applied to psychological constructs such as that of value orientation – proposed as a catalyst of behaviour. Here, the concept of coherence translates to consistency in approach: regardless of the specific content involved, the underlying information structure of specific items may resonate with the underlying information structures of specific value orientations. The unique meaning attributed to the item content is a function of the perceptual framework or value orientation of the test subject and can be compared to the spectacular repetitive fractal patterns of chaos theory. Fractals reflect the emergence of an essential aspect of organisation regardless of the format of realisation.

The detection of and resonance between “deep” information structures can be expected to emerge consistently (given the coherence involved) regardless of more superficial characteristics (such as seemingly contradictory item content). For example, test subjects with a strong compassionate orientation may endorse a statement such as ”I regard charity as important”, yet also endorse a statement such as “I do not practice charity”. Here the item content may seem conflicting, but the particular value orientation of compassion is consistently triggered given its deeper structure. To interpret such seeming dissonance in terms of a “lie scale”, as is often done in psychometrics, is short-sighted.

Another example which demonstrates the consistency in approach of a particular value orientation, regardless of seemingly conflicting item content, is the tendency of the Red value orientation of the Spiral Dynamics theory to endorse seemingly contradictory statements such as:

- “I’m tough” and “I am sensitive”
- “I control my own life” and “External factors control my life”
- “I seek fun and sensory stimulation” and “I seek success via hard work”

By systematically varying item content, assessment techniques can thus be designed to identify consistent patterns associated with a particular value orientation.
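The idea of scoring consistent patterns across seemingly contradictory items can be sketched as follows. The items, keys and scoring rule here are entirely hypothetical and purely illustrative: an orientation is keyed to a set of items whose surface content conflicts, and the score is the proportion of the set answered in the keyed direction.

```python
# Hypothetical keying for the Red value orientation: items whose surface
# content seems contradictory but which share one underlying structure.
RED_KEY = {
    "I'm tough": True,
    "I am sensitive": True,
    "I control my own life": True,
    "External factors control my life": True,
}

def consistency_score(responses: dict[str, bool], key: dict[str, bool]) -> float:
    """Proportion of answered keyed items endorsed in the keyed direction."""
    hits = sum(responses.get(item) == keyed
               for item, keyed in key.items() if item in responses)
    answered = sum(item in responses for item in key)
    return hits / answered if answered else 0.0

# A respondent who endorses three of the four keyed items:
responses = {
    "I'm tough": True,
    "I am sensitive": True,
    "I control my own life": True,
    "External factors control my life": False,
}
print(consistency_score(responses, RED_KEY))  # 0.75
```

A high score across deliberately conflicting content would indicate that the deeper structure, rather than the surface content, is driving the responses.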

Level / Altitude

Given the holonic nature of many of the constructs dealt with in psychology, the various levels of organisation involve a great deal of overlap, as each consecutive level incorporates and transcends the previous one. A higher level of a holonic structure can be expected to show many of the characteristics of previous levels; however, it also shows a degree of uniqueness that is not reflected by preceding levels.

Assessment techniques can capitalise on the systematic variation of level-related aspects of the value orientation or cognitive process being measured.

As Taleb (2010) observed: “In theory there is no difference between theory and practice; in practice there is.” (p 213). This is because the reality of everyday life represents a higher level of realisation than theory. It encompasses and transcends theory. In the case of the valuing systems as specified by the Spiral Dynamics model (of which the levels are characterised by colours), there is no difference between Red ego needs and Green acceptance needs from a Red point of view, but a clear differentiation between the two from a higher level of Green organisation.

To demonstrate the principle of “level” it may be useful to consider the following responses to item content:

If the item reads: “I’m most concerned about”, and the response options are:

Spiral Dynamics theory: the various value orientations

| Response option | Purple | Red | Blue | Orange | Green | Yellow | Turquoise |
|---|---|---|---|---|---|---|---|
| My family | Yes | Yes | Yes | (Yes) | (Yes) | | |
| My team | (Yes) | | Yes | Yes | | | |
| My community | (Yes) | | Yes | | (Yes) | | |
| Stakeholders | | | Yes | Yes | (Yes) | (Yes) | |
| Humanity | | | | | Yes | (Yes) | Yes |
| Consciousness | | | | | Yes | Yes | Yes |
| Gaia | | | | | | (Yes) | Yes |

(A “Yes” in brackets indicates that the item content may be endorsed but that it does not reflect the essential orientation of that particular colour.)

Higher-level value orientations may endorse items normally associated with lower-level value orientations, but not vice versa. For example, Purple, Red, Blue and Orange are unlikely to endorse “Humanity” in the example above, but Green may endorse “Family”, “Team”, “Community”, “Stakeholders”, “Humanity” and “Consciousness”. Green should therefore, theoretically and for measurement purposes, not be linked to the “Family” response, as it does not differentiate between Green and the other colours that endorse “Family”. Green should ideally be linked positively to Humanity. Red, however, is unlikely to endorse “Humanity” and could be linked negatively to Humanity to emphasise level or altitude effects.
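The holonic endorsement rule described above – higher levels may endorse lower-level items, but not vice versa – can be expressed as a simple ordinal check. The level ordering follows the Spiral Dynamics sequence from the text; the function itself is an illustrative sketch, not part of any published scoring key.

```python
# Spiral Dynamics levels in holonic order, lowest to highest.
LEVELS = ["Purple", "Red", "Blue", "Orange", "Green", "Yellow", "Turquoise"]

def may_endorse(respondent_level: str, item_level: str) -> bool:
    """Holonic rule: each level transcends and includes the previous, so a
    respondent may endorse items keyed at or below their own level, but is
    unlikely to endorse items keyed to higher levels."""
    return LEVELS.index(respondent_level) >= LEVELS.index(item_level)

print(may_endorse("Green", "Purple"))  # True: Green may endorse "My family"
print(may_endorse("Red", "Green"))     # False: Red is unlikely to endorse "Humanity"
```

Asymmetries of this kind (endorsement in one direction only) are what make the level, rather than the item content, diagnostically informative.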

Another example to illustrate level or altitude effects can be found in the differences in response to socially undesirable personal traits. Higher level value orientations will embrace these aspects more readily than lower levels. To the question: “I can be:” Red and Green may respond seemingly counter-intuitively.

Answer:

- “defensive at times” – endorsed by Green and denied by Red
- “quite impulsive” – endorsed by Green and denied by Red

This can be interpreted in terms of Green being more open-minded and less defensive than Red.

Overlap

Subsequent levels of holonic organisation tend to endorse related perspectives which may not be as relevant at other levels:

| Theme | Purple | Red | Blue | Orange | Green | Yellow | Turquoise |
|---|---|---|---|---|---|---|---|
| Safety | Yes | Yes | | | | | |
| Sensory stimulation | Yes | Yes | | (Yes) | (Yes) | | |
| Power | | Yes | | Yes | (Yes) | (Yes) | |
| Order | | | Yes | (Yes) | | | |
| Status | | Yes | | Yes | | | |
| Relating | | | | (Yes) | Yes | | |
| Understanding | | | | (Yes) | (Yes) | Yes | Yes |
| Connecting | | | | | | Yes | Yes |

Some of this overlap may be due to the different interpretations of the item content. For example, the concept used in the item may be “Personal power”.

- Red may perceive this in terms of authority, positional power, dominance and will identify and endorse it given Red’s identity formation theme, need for recognition and external locus of control

- Green may primarily regard the trigger word as indicative of centredness and inner substance by which others can in turn be empowered. It identifies with and endorses the trigger of personal power given an internal locus of control and well established sense of self.

To measure psychological catalysts by relying on the above-mentioned factors of consistency, level and overlap is not straightforward though. Differences between these holonically organised constructs can be expected to be small, due to the complexity of the information structures involved and the circularity involved in measuring catalysts of behaviour such as perceptual frameworks and valuing systems.

The measurement of cognitive constructs

Few terms in psychology have elicited as much controversy as the concept of intelligence (Whiteley, 1977). Intelligence is typically defined in terms of concepts such as learning, problem solving, memory, executive control, judgement, speed and the ability to abstract. In reviewing all these conceptions, it appears that the notion of intelligence is quite arbitrary – or rather, the research community has until now failed to agree on a definition of the concept. Despite their obvious limitations, some of the definitions have, however, gained general acceptance, mainly because of their parsimony and intuitive appeal. Intelligence is therefore a problematic concept in that it is of a fuzzy nature, incorporating many important features for which no definite criteria can be determined.

Not only attempts to define intelligence, but the concept per se, have repeatedly been criticized. Some theorists regard the term intelligence as useful in everyday language, but do not view it as constituting an adequate scientific concept (Maraun, 1998) even though it is regarded as scientifically useful by others. Its measurement too, poses fundamental challenges.

A powerful research catalyst in the field of cognitive psychology can thus be found in questions regarding the structure of the mind and its bearing on thinking, reasoning and problem solving. Efforts to scientifically capture the essence of these complex, intangible and largely descriptive concepts have resulted in a polarisation of theoretical and methodological approaches, ranging from quantitative to qualitative observation and interpretation to mere speculation.

Traditionally, the measurement of intellectual functioning involves techniques including IQ testing, structured interviews, situational judgement tests, assessment centres, handwriting analysis, knowledge tests and trainability tests. Many of these techniques, especially IQ testing, capitalise on logical-analytical reasoning of a convergent nature within structured contexts using domain specific information. These methodologies mostly capture particular aspects of cognitive functioning only, but do not provide a comprehensive picture of cognitive capabilities and preferences from a holonic perspective.

In this article a holonic view and a new research approach are proposed by which cognitive functioning can be understood, measured and predicted, and its relationship to consciousness development addressed.

Here, cognition is thus not proposed as merely intellectual “ability”, which has been the dominant perspective within psychology – differential psychology and psychometrics in particular - for more than a century.

Within the spectrum of consciousness as postulated by various consciousness theorists, cognition, according to Wilber’s All Quadrants All Levels (AQAL) metatheory, merely represents a developmental “line” or “stream”, and does not encapsulate the essence or apex of consciousness.

Up to a point, cognitive factors enable the emergence of consciousness, and very importantly, the implementation of one’s world view, or level of awareness. But this does not imply a linear relationship between cognition and consciousness. People with high levels of cognitive capability, for example, can be found at any of the various levels of consciousness as hypothesised by consciousness theorists and developmental psychologists. As proposed in a previous article on consciousness theory (Prinsloo, 2012), the fractal nature of the various “streams” or “lines” of development, including cognition, reflects that of the overall evolutionary emergence of consciousness, all of which involve processes of increasing differentiation followed by increasing integration of subcomponents.

The concept of cognition, like consciousness, is postulated to be holonically organised and to consist of a number of information processing functions, including memory, exploration, analysis, structuring, transformation and metacognition (Prinsloo, 1992). Higher-level processing functions include and transcend the preceding levels. These processing functions are guided by metacognitive criteria such as clarity, relevance, accuracy, coherence, purposefulness and appropriateness. This theoretical model, which meets most of the criteria for theory building (Prinsloo, 1992, 2012), provides the necessary guidelines for the operationalisation and measurement of the holonically organised processing constructs.
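The “include and transcend” relation described above lends itself to a simple computational sketch. The following Python fragment is purely illustrative (the nesting rule is an assumption inferred from the description, not Cognadev’s implementation); it shows how a holon at each level of the hierarchy contains the functions of all preceding levels:

```python
# Illustrative sketch of a holonic hierarchy in which each processing level
# "includes and transcends" the one below it. Level names follow the model;
# the containment logic itself is an assumption for illustration only.

LEVELS = ["Memory", "Exploration", "Analysis",
          "Structuring", "Transformation", "Metacognition"]

def included_functions(level):
    """A holon at `level` contains itself plus every preceding level."""
    i = LEVELS.index(level)
    return LEVELS[:i + 1]

print(included_functions("Analysis"))       # ['Memory', 'Exploration', 'Analysis']
print(len(included_functions("Metacognition")))  # 6: the apex includes all levels
```

Under this reading, Metacognition is not one function among six but the level at which all lower-level processing becomes available for self-monitoring.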

This self-contained model of cognitive processing is graphically represented in Figures 2 and 3.

Figure 2: The Holonic structure of the Cognitive Processing Model

Memory: retention, recall, internalisation, automation
Exploration: search, scan, focus, investigate, clarify, hypothesize, discriminate, select
Analysis: differentiate (break up), compare, apply rules, identify relationships
Structuring: categorise, order, group, generalise, integrate, represent, abstract, conceptualise
Transformation: transfer, restructure, logical reasoning, lateral creation
Metacognition: self-awareness, self-monitoring, learn, strategise, use judgement and intuition

In Figure 2, Metacognition represents encompassing cognitive self-awareness and is a prerequisite for effectiveness and capability. Metacognition is involved in all the processing categories, guiding each via specific criteria, which form the basis of the Cognitive Process Profile (CPP) assessment. These metacognitive criteria are shown in Figure 3.

Figure 3: The Metacognitive Criteria which guide each Cognitive Process

The above-mentioned self-contained model of cognitive processing constructs enables the design of an assessment approach.

The key to the measurement of the various cognitive processing constructs is the depth and detail of the operationalization of the processes. In other words, each processing construct needs to be measured by tracking multiple (literally thousands of) “micro actions”, from which response tendencies can be identified.
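The idea of tracking micro actions can be sketched as a simple event log. The fragment below is hypothetical (the event names and the log structure are invented for illustration; the CPP’s actual tracking mechanism is not published in this form): every interface action is time-stamped and recorded so that response tendencies can later be derived algorithmically.

```python
# Hypothetical sketch of "micro action" tracking during a test item:
# each interface event is time-stamped and logged, so that algorithms can
# later classify response tendencies. Event names are invented examples.
import time

class MicroActionLog:
    def __init__(self):
        self.events = []

    def record(self, action, detail=""):
        # Store (timestamp, action, detail) for later algorithmic analysis.
        self.events.append((time.monotonic(), action, detail))

    def count(self, action):
        return sum(1 for _, a, _ in self.events if a == action)

log = MicroActionLog()
log.record("open_hint", "rule card 3")
log.record("move_piece", "A->B")
log.record("open_hint", "rule card 3")  # revisiting a hint: possible checking behaviour
print("hint consultations:", log.count("open_hint"))  # prints: hint consultations: 2
```

Aggregating thousands of such events per assessment is what would allow rule-based interpretation rather than a single right/wrong score.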

In measuring the processing constructs, the technique developed for this purpose, namely the Cognitive Process Profile (CPP), capitalises on largely unfamiliar task material which is not domain specific and does not depend on the previous knowledge and experience of the test taker. The assessment methodology elicits all possible thinking approaches, preferences and skills instead of merely logical-analytical or memory processes. The way in which processing functions are applied is externalised and tracked accurately at a micro level for optimal algorithmic interpretation. Subjective rater bias is avoided by using a rule-based expert system for the interpretation of results. The approach thus entails an automated and computerised assessment technique which is theoretically based as well as standardised, accurate, reliable and replicable. Not only current skill, but also learning potential, or cognitive modifiability and adaptability, are evaluated. The methodology is applicable cross-culturally, and the results are interpreted to predict real-life work performance.

The above-mentioned theoretical model and methodological approach for the measurement of cognitive processes reflect a holonic view of intellectual functioning, which has been used in the development not only of the Cognitive Process Profile (CPP) but also of the Learning Orientation Index (LOI), as provided by Cognadev (Prinsloo, CPP Research Manual, 2012). This approach differs fundamentally from the Differential approach’s focus on intellectual “ability”.

Statistical analysis

Statistical evidence for psychological findings is mostly presented as science, without clarifying the descriptive and speculative nature of the social sciences.

Psychometric research largely relies on statistical concepts such as averages, bell curves, significance and normalisation. The use of these concepts to understand psychological phenomena requires further consideration. In addressing the important goal of scientific prediction, the economist Nassim Taleb (2010) argues that it can be pursued either by looking at the world in terms of averages (where outliers are ignored) or in terms of exceptions (where extremes count), and that the subject matter determines which approach is the more valid. Taleb explains that in the case of averaging, equilibrium forces are at work, which negate randomness. This is typical of statistics, where the bell curve and standard deviation reign supreme. The bell curve transforms single observations to a completely abstract level, close to an average. He points out that bell curves and correlational techniques only work in fairly predictable environments, where measurements are controlled, reflect a certain structure and contain no extreme values. Correlations, which seem to form the backbone of psychometrics, do not indicate causality, although they are often interpreted that way. Constructs such as averages, group membership and categories, which are supposed to reflect the structure of reality, largely fail to explain and predict psychological phenomena. Taleb (2010) regards the bell curve as fraudulent; dismisses correlational evidence as “so what”; and regards the “standard deviation” as leaving much to be desired, being merely a property of the bell curve. With high variability in a system, meaning is lost through the calculation of averages and normal distributions. James Grice (Grice et al, 2012), the developer of the innovative Observation Oriented Modelling (OOM) approach, came to a similar conclusion.
In a social networking blog he stated: “Following the publication of OOM I hope to never conduct another misguided t-test, ANOVA, multiple regression, structural equation model, between-persons factor analysis, etc. for the remainder of my academic career. I am sure this will strike many as extreme, but evidence has been mounting in recent years that psychologists have let their methods determine their metaphysics for far too long.” The norming of test results in psychometrics is a specific case of the “averaging” approach. Norms are based on group averages, bell-curve distributions and standard deviations. Norming is aimed at correcting for potential error and bias, as well as at ensuring fairness in measurement, but it has obvious weaknesses. Normalised results showing a higher score on normalised construct A than on normalised construct B on a profile chart may incorrectly be interpreted as “this person is more A than s/he is B”. Unfortunately such misinterpretations are made by professionals, test users and test subjects alike. In addition, an emphasis on the selection of the norm group, often aimed at establishing equivalence, distracts attention from the more fundamental issues involved. According to Barrett (2012), establishing the equivalence of latent variables across groups is less important than the degree of criterion prediction involved. The use of ipsative norms, and of score calculations combining ipsative with group-based normed scores, is also questionable and normally done to enhance differentiation between construct scores. Barrett (2012) also comments on the tendency of commercially driven test providers to feel

justified in publishing large and significant differences between normed scores based on raw scores that hardly differ. Analytical techniques such as averaging and categorisation, as involved in norming, are inadequate to capture the randomness of nature. Taleb (2010) describes this approach as arriving at certainties by adding up and averaging uncertainties, and refers to it as a “masquerading problem”, where what one sees is less relevant than what one does not see. He also refers to categorisation and averaging as “platonifying” and points out that it cannot access the true nature of a phenomenon. Categorisation should thus, at best, be regarded as a means to an end rather than a primary theoretical goal. Simple answers may provide emotional security and certainty to the theorist, but are not sufficient to explain complex matters. Taleb (2010) suggests that a different kind of statistical approach is required to deal with randomness. He sees human functioning as an irregular affair, the randomness of which can be compared to natural phenomena. The latter (e.g. a landscape) may appear completely different from a distance (the generalised, abstract perspective, comparable to a theoretical model) than under a microscope (the experience, the “source code” and/or the immediate real-life manifestation of constructs). In psychometrics the rules need to prove the exceptions, and the exceptions the rules. The tremendous potential and variability of human behaviour which lies dormant in each individual, and which may be triggered under certain conditions, need to be accommodated. The question arises: if psychometrics hardly approximates a person’s performance under normal, stable conditions, what is its predictive power under unusual and changing conditions? Taleb (2010) is thus of the opinion that numerical values which deviate largely from the mean can hardly be predicted in terms of normal distributions.
Because of the variable nature of human behaviour, much valuable information is lost by fitting all of human potential under a normal curve. He points out that occurrences in the social sciences tend to show “fat tail” distributions, given the prominence of statistically unusual events. Typical methods of statistical inference, which are unreliable when dealing with a relatively large fraction of outliers, need to be replaced by heavy-tailed distributions and more appropriate, robust methods of statistical inference. Examples can be found in constructs such as talent and leadership potential, which are contextually anchored. These constructs should ideally be represented by flatter and wider distribution curves.
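Taleb’s contrast between thin- and fat-tailed settings can be illustrated with a small simulation. The sketch below is illustrative only: the distributions and parameters are chosen for demonstration, not drawn from any psychometric dataset. In a normal sample no single observation matters much; in a heavy-tailed (Pareto) sample a single extreme value can carry a visible share of the total.

```python
# Illustrative simulation of thin vs. fat tails: how much of the sample
# total is carried by the single largest observation?
import random

random.seed(7)
n = 100_000

# Thin-tailed sample: absolute values drawn from a normal distribution.
normal = [abs(random.gauss(0.0, 1.0)) for _ in range(n)]
# Heavy-tailed sample: Pareto draws with shape 1.5 (a "fat tail").
pareto = [random.paretovariate(1.5) for _ in range(n)]

shares = {}
for name, xs in (("normal", normal), ("pareto", pareto)):
    # Share of the sample total held by the single largest observation.
    shares[name] = max(xs) / sum(xs)
    print(f"{name}: largest value carries {shares[name]:.4%} of the total")
```

Averaging is informative in the first case and misleading in the second, which is exactly the distinction at stake when human potential is forced under a bell curve.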

A reliance on reductionistic statistics thus fails to accommodate the large qualitative differences between constructs in the social sciences. In Grice et al’s (2012) view, researchers in psychology have failed to demonstrate the true structure of the attributes they are studying. He points out that, without continuous quantitative structure, the application of techniques such as t-tests, ANOVAs, least-squares regressions and factor analysis to psychological data is dubious at best. Newer statistical methods such as SEM and multi-level modelling are no more valid, because psychologists have not yet demonstrated that they are actually measuring the attributes they claim to be measuring, such as intelligence, depression, anxiety and the Big Five personality traits. He concludes that psychology has, in a sense, failed to progress as a science because of its failure to differentiate between quantity and quality as distinguishable modes of being. Psychology ideally requires an emphasis on the potentially unique, often meaningful and symbolic aspects of human behaviour, its leverage points, as well as its cumulative and interactive dynamics. The use of traditional statistical techniques for the validation of psychometric tools is thus flawed. Of particular concern is the concept of “significance”.

The practice of establishing the significance of psychometric predictions has been criticised for almost as long as it has existed. Ziliak and McCloskey (2009), outspoken critics of statistical practice in econometrics and other sciences, conclude in “The cult of statistical significance” that “significant does not mean important and insignificant does not mean unimportant”. In other words, statistical significance at the 1%, 5% or any other arbitrary level is neither necessary nor sufficient for proving the discovery of a scientifically or commercially relevant result. In their view the development of science has suffered greatly from the generally accepted practice of allowing “statistical significance”, a qualitative, philosophical rule, to substitute for a quantitative, scientific magnitude. To quote them: “We and our small (if distinguished) group of fellow sceptics say that a finding of “statistical” significance, or the lack of it, statistical insignificance, is on its own valueless, a meaningless parlour game. Statistical significance should be a tiny part of an inquiry concerned with the size and importance of relationships. Unfortunately it has become a central and standard error of many sciences. The history of this "standard error" of science—the past 85 years of mistaking statistical significance for scientific importance….” (p. 2303) Morris DeGroot (1975) explains that even where there are high probabilities of rejecting the null hypothesis, the true value may differ only slightly from the null. Peter Kennedy, too, in his “A Guide to Econometrics” (1985), briefly mentions that a large sample always shows “significance”. Ziliak and McCloskey (2009) point out that only 8 of 294 published articles using a test of significance (where the significance level is the probability of rejecting the null hypothesis) failed to reject the null.
This realisation was inspired by Gosset (1942), who never saw “significance” as a substitute for finding out “how much”.

According to Gosset, a significant finding by itself may be “nearly valueless”. To him, the important thing is to have a low real error, not to have a “significant” result. Not only should the researcher be able to say “we have significant evidence for this or that in 19 out of 20 cases…” but also “we need to find out why it doesn't work in the twentieth.”
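Kennedy’s point that a large sample always shows “significance” is easy to demonstrate numerically. The sketch below is illustrative only: a hand-rolled two-sample z-test on simulated data, holding a trivially small true difference fixed while the sample size grows. The p-value collapses with n even though the effect itself remains negligible.

```python
# Illustrative demonstration that "significance" is a function of sample size:
# a fixed, negligible true difference becomes highly "significant" as n grows.
import math
import random

def mean(xs):
    return sum(xs) / len(xs)

def var(xs):
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

def z_test(a, b):
    """Two-sample z-test; returns (z, two-sided p) via the normal CDF."""
    se = math.sqrt(var(a) / len(a) + var(b) / len(b))
    z = (mean(a) - mean(b)) / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided p-value
    return z, p

random.seed(1)
results = {}
for n in (100, 10_000, 1_000_000):
    a = [random.gauss(0.00, 1.0) for _ in range(n)]
    b = [random.gauss(0.02, 1.0) for _ in range(n)]  # trivially small true difference
    z, p = z_test(a, b)
    results[n] = p
    print(f"n={n:>9}  |mean diff|={abs(mean(a) - mean(b)):.4f}  p={p:.6f}")
```

The “how much” question Gosset insisted on (the raw difference of about 0.02 standard deviations) is unchanged in every row; only the p-value moves.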

In any event, the rejection of null hypotheses based on correlation does not prove the existence of a meaningful relationship between constructs. Likewise, not having correlational evidence for the existence of a relationship is not the same as proof of its absence. Given the limitations of statistical techniques, a combination of analytical techniques and qualitative interpretations may be most appropriate when it comes to psychological constructs.

Instead of merely applying superficial technical fixes to fundamental challenges, greater awareness of the shortcomings of current practice is required, specifically at the interface between statistical calculations and real phenomena. Efforts to understand the issues under scrutiny could also include perceptive observation, intuition and action research, as action research offers an opportunity for experiential learning and for understanding the dynamics involved in human functioning. It complements and breathes life into the merely technical manipulation of data by contextualising and adding meaning to mathematical abstractions. Experiential learning yields important insights for optimising the design, refinement and application of any measurement and analytical approach. Given the limitations of descriptive and predictive statistical methods in psychology, Grice et al (2012) argue for a return to the common-sense realism of Aristotle, with its recognition of the explanatory value of the metaphysical concept of human nature and its role in causal explanation.

Application: Predicting behaviour

Psychology operates in a subtle domain, the complexity and dynamics of which are difficult to unravel. Averages, correlations and normal curves thus impose some order, but do not help us to understand the essence of human functioning. Although models based on an inadequate scientific approach may seem to confirm reality, the underlying dynamics related to the various constructs are not accounted for. Such models simply cannot be predictive, as in the case of history where, as Taleb (2010) points out, patterns can be identified but predictions cannot be made, seeing that “history does not reveal its mind to us” (p. 268). Taleb in fact regards the coherence of such models as “contagion” and mere speculation, which is scientifically undesirable. Although Taleb (2010) focused primarily on principles in economics, he offers words of wisdom which psychometrics may well heed in predicting human behaviour:

Derailers

Taleb (2010) emphasises that more mileage can be obtained from predicting potential derailers than from predicting potential successful performance: “… one cannot really tell if a successful person has skills, or if a person with skills will succeed – but we can pretty much predict the negative, that a person devoid of skills, will eventually fail” (p. 303). He reasons that a 1% modification of a system can lower fragility by 99%. In other words: “just work on removing the pebble in your shoe”.

It thus makes sense to identify and eliminate risks, or as he puts it: identify fragilities and bet on the collapse of the fragile system. This is a universal principle, and Taleb points out that the most robust contribution to knowledge consists in removing what we think is wrong, which he refers to as “subtractive epistemology” (p. 303). Knowledge grows by subtraction much more than by addition, and this also applies to the field of human behaviour. Karl Popper (1959) likewise emphasised the limitations of forecasting and regarded disconfirmation as a useful scientific technique, seeing that negative knowledge is more robust than mere confirmation.
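The subtractive logic can be sketched as a screening rule. The fragment below is a toy illustration with entirely hypothetical names, flags and scores: rather than ranking candidates by an average score, anyone with a known derailer is removed first, and only then is the best of the survivors selected.

```python
# Toy illustration of Taleb-style "subtractive" screening: remove known
# derailers first, then choose among the survivors. All data hypothetical.

candidates = [
    {"name": "A", "avg_score": 82, "derailers": []},
    {"name": "B", "avg_score": 91, "derailers": ["integrity_flag"]},
    {"name": "C", "avg_score": 78, "derailers": []},
]

# Subtractive step: anyone carrying a derailer is screened out, regardless
# of how high their average score is.
survivors = [c for c in candidates if not c["derailers"]]
best = max(survivors, key=lambda c: c["avg_score"])
print(best["name"])  # prints: A  (B scores highest overall but is removed first)
```

A pure averaging approach would have selected B; the subtractive rule bets instead on the collapse of the fragile candidate.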

Evidence of Potential

In addition, Taleb advises focusing on indications of potentially outstanding results, even if these are small. He refers to such indications as evidence of potential “black swans”.

By applying this guideline in leadership assessment, for example, the chances of identifying excellent talent are greatly enhanced, even though much potential may be missed. As Steve Jobs (Isaacson, 2011) observed: a good decision is to pick something exceptional while saying no to many other good ideas.

This is because the popular 80/20 principle actually manifests as a 99/1 principle in the real world, where, as Taleb (2010) points out, “almost everything has a winner takes all effect”.

Besides these useful suggestions from Taleb, there are also other considerations and principles that could improve the value of measurement, assessment and prediction. These are:

Simplicity

As indicated by a number of theorists, it seems that in complex environments simple methods of forecasting work better than complex ones. The “fast and frugal heuristics” of Gigerenzer and Goldstein (1996) are a good example. A simple approach is also well suited to the complexities of psychological functioning, where the holistic picture is indeed required for predictive purposes, but the emphasis should be on identifying small indications of potential derailers as well as of explosive potential.
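One of Gigerenzer and Goldstein’s heuristics, “take the best”, is simple enough to sketch directly. The cue data below are invented for illustration: two options are compared on cues ordered by validity, and the first cue that discriminates decides; all remaining cues are ignored.

```python
# Minimal sketch of Gigerenzer & Goldstein's "take the best" heuristic:
# decide on the first discriminating cue, in validity order, and stop.

def take_the_best(option_a, option_b, cues):
    """cues: cue names ordered from most to least valid."""
    for cue in cues:
        a, b = option_a[cue], option_b[cue]
        if a != b:               # first discriminating cue decides
            return "A" if a > b else "B"
    return "guess"               # no cue discriminates

# Hypothetical binary cue profiles for two cities
city_a = {"capital": 1, "has_team": 0, "on_river": 1}
city_b = {"capital": 1, "has_team": 1, "on_river": 0}

choice = take_the_best(city_a, city_b, ["capital", "has_team", "on_river"])
print(choice)  # prints: B  ("has_team" discriminates; "on_river" is never consulted)
```

The heuristic deliberately ignores most of the available information, yet in Gigerenzer and Goldstein’s simulations it performed on a par with multiple regression.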

Integrated views

Grice et al (2012) suggest that integrated models are likely to prove more capable of explaining the human psyche and human behaviour than variable-based models, because integrated models are more amenable to thinking about systems than about collections of variables to be measured and correlated. From an integrated and holistic perspective, a richer and more dynamic picture emerges. Grice thus developed the Observation Oriented Modelling (OOM) technique, which uses binary analytics anchored in real-life observation to move from a variable-based to a systems-based view of nature. Such a change in viewpoint, accompanied by innovative methods of data collection and analysis, may contribute to more effective psychological research in the future.
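The person-centred spirit of this approach can be sketched with a toy example. The fragment below is not Grice’s actual OOM algorithm; it only illustrates, with hypothetical data, the underlying counting idea: instead of correlating variables across a sample, count how many individual observations conform to an expected pattern.

```python
# Toy illustration in the spirit of Observation Oriented Modelling (OOM):
# count the individual observations that conform to an expected ordinal
# pattern, rather than computing an aggregate correlation. Hypothetical data;
# this is only the person-centred counting idea, not Grice's algorithm.

def percent_conforming(pairs):
    """Fraction of persons whose post-score exceeds their pre-score."""
    hits = sum(1 for pre, post in pairs if post > pre)
    return hits / len(pairs)

# Hypothetical pre/post scores for ten individuals
scores = [(12, 15), (10, 14), (11, 11), (9, 13), (14, 18),
          (8, 12), (13, 12), (10, 16), (12, 17), (11, 15)]

pcc = percent_conforming(scores)
print(f"{pcc:.0%} of individuals match the expected pattern")  # prints: 80% ...
```

The result is a statement about identifiable persons (eight of ten improved, and the two who did not can be examined individually), not about an abstracted average.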

Holonically organised constructs

Ken Wilber (2000) coined the term “holon” to reflect the essential organisational structure of the universe. It represents dynamically integrated and hierarchically organised systems in which each consecutive system includes and transcends the previous one. This principle also applies in psychology and is proposed as most useful in understanding integrated personality functioning.

The use of constructs such as consciousness and cognition seems critically important in assessing and predicting human potential. Wilber’s (2000) four-quadrant AQAL model, with its various developmental streams or lines, provides guidelines in this regard: the upper-left quadrant of the AQAL model reflects the holonic structure of consciousness development, of which cognitive development represents one line or stream of consciousness as a whole.

Conclusion

An emphasis on the holonically organised constructs of cognition as integrated with consciousness, as proposed by Wilber (2000) and applied in the Cognitive Process Profile (CPP) (Prinsloo, 1992, 2013), which measures cognition, and the Value Orientations (VO) (Beck & Cowan, 1996; Prinsloo, 2014), which assesses consciousness factors, meets most of the criteria for theory building and research in the social sciences. These theoretical models and measurement techniques provide a coherent, structurally adequate, parsimonious and plausible explanation of human awareness and behaviour, the power of which probably transcends that of traditional psychometrics.

The proposed integrated theoretical perspective in combination with appropriate research methodologies capitalising on depth of observation and understanding as well as philosophical realism, may yet lay the foundation for a more process oriented and contextualised approach to psychological measurement.

REFERENCES

Barrett, P. (2011). Various contributions on the Psychometric forum discussion platform

Barrett, P. (2013). Various contributions on the Psychometric forum discussion platform

Barrett, P. & Prinsloo, M. (2013). Rethinking Reliability and Validity of psychological measurements.

Beck, D.E. & Cowan, C.C. (1996). Spiral Dynamics: mastering values, leadership and change. Oxford: Blackwell.

Borsboom, D., Cramer, A.O.J., Kievit, R.A., Scholten, A.Z. & Franic, S. (2009). The end of construct validity. In Lissitz, R.W. (Ed.), The Concept of Validity: Revisions, New Directions, and Applications (Chapter 7, pp. 135-170). Charlotte: Information Age Publishing.

DeGroot, M.H. (1975). Probability and Statistics. Reading, MA: Addison-Wesley.

Freedman, D.A. (1991). Statistical models and shoe leather. Sociological Methodology, 21, 291-313.

Gigerenzer, G. & Goldstein, D.G. (1996). Reasoning the fast and frugal way: models of bounded rationality. Psychological Review, 103, 4, 650-669.

Gosset, W.S. (1908). The probable error of a mean. Biometrika, 6, 1-24. Reprinted in Gosset (1942).

Grice, J.W., Barrett, P.T., Schlimgen, L.A. & Abramson, C.I. (2012). Toward a Brighter Future for Psychology as an Observation Oriented Science. Behavioral Sciences, 2(1), 1-22. doi:10.3390/bs2010001

Isaacson, W. (2011). Steve Jobs: The Exclusive Biography. London: Simon & Schuster.

Kennedy, P. (1985). A Guide to Econometrics. MIT Press Books.

Maraun, M.D. (1998). Measurement as a Normative Practice: Implications of Wittgenstein's Philosophy for Measurement in Psychology. Theory & Psychology, 8, 4, 435-461.

Michell, J. (1990). An Introduction to the Logic of Psychological Measurement. New York: Lawrence Erlbaum.

Michell, J. (1994). Numbers as quantitative relations and the traditional theory of measurement. British Journal for the Philosophy of Science, 45, 389-406.

Michell, J. (1997). Quantitative science and the definition of measurement in Psychology. British Journal of Psychology, 88, 3, 355-383.

Michell, J. (1999). Measurement in Psychology: Critical History of a Methodological Concept. Cambridge: Cambridge University Press.

Michell, J. (2001). Teaching and mis-teaching measurement in psychology. Australian Psychologist, 36, 3, 211-217.

Michell, J. (2009). Invalidity in Validity. In Lissitz, R.W. (Ed.), The Concept of Validity: Revisions, New Directions, and Applications (Chapter 6, pp. 111-133). Charlotte: Information Age Publishing.

Popper, K. (1959). The Logic of Scientific Discovery. London: Hutchinson.

Pribram, K H (1986). The cognitive revolution and mind/brain issues. American Psychologist, Vol 41(5), May 1986, 507-520. doi: 10.1037/0003-066X.41.5.507

Prinsloo, M. (1992). A theoretical model and empirical technique for the study of problem solving processes. PhD. RAU, Johannesburg.

Prinsloo, M. (2012). CPP Research Manual. Johannesburg: Cognadev.

Prinsloo, M. (2014). Consciousness Models in Action: Comparisons. Integral Leadership Review, Apr-Jun 2014.

Prinsloo, M. & Barrett, P. (2013). Cognition: Theory, Measurement, Implications. Integral Leadership Review, Jun 2013.

Taleb, N. (2010). The Black Swan: the impact of the highly improbable. New York: Random House Trade Paperbacks.

Whiteley, S.E. (1977). Information processing on intelligence test items: Some response components. Applied Psychological Measurement, 1, 465-476.

Wilber, K. (1981). Up from Eden: A transpersonal view of human evolution. Garden City, NY: Anchor Press.

Wilber, K. (1995). Sex, ecology, spirituality: The spirit of evolution. Boston: Shambhala.

Wilber, K. (2000). Integral psychology: Consciousness, spirit, psychology, therapy. Boston: Shambhala.

Wilber, K. (2003). Excerpt D: The look of a feeling—The importance of post/structuralism. Retrieved from http://wilber.shambhala.com/html/books/kosmos/excerptD/part1.cfm

Wilber, K. (2006). Integral spirituality: A startling new role for religion in the modern and postmodern world. Boston: Shambhala.

Ziliak, S.T. & McCloskey, D.N. (2009). The cult of statistical significance: How the standard error costs us jobs, justice, lives. Ann Arbor: University of Michigan Press.