Content & Statistical Validity
TRANSCRIPT
Seied Beniamin Hosseini, BIMS, University of Mysore
November 2016
Determining Validity
- Predictive Validity
- Convergent & Divergent Validity
- Construct & Content Validity
- Discriminant Validity
- Face Validity
Validity refers to measuring what we intend to measure. If math and vocabulary truly represent intelligence, then a math and vocabulary test might be said to have high validity when used as a measure of intelligence.
For example: in developing a nursing licensure exam, experts in the field of nursing would identify the information and skills required to be an effective nurse and then choose (or rate) items that represent those areas of knowledge and skill.
Basic Procedure for Assessing Content Validity
1. Describe the content domain
2. Compare the structure of the test with the structure of the content domain
3. Determine the areas of the content domain that are measured by each test item
For example: with respect to educational achievement tests, a test is considered content valid when the proportion of material covered in the test approximates the proportion of material covered in the course.
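This proportional-coverage idea can be sketched as a small comparison; the topic names and weights below are invented for illustration, not taken from any real course.

```python
# Sketch: comparing the share of course material per topic with the
# share of test items per topic (all numbers are hypothetical).

course_hours = {"algebra": 20, "geometry": 10, "statistics": 10}  # hours taught
test_items   = {"algebra": 25, "geometry": 15, "statistics": 10}  # items on the test

total_hours = sum(course_hours.values())
total_items = sum(test_items.values())

# Gap between course share and test share for each topic; a content-valid
# test keeps these gaps small.
gaps = {t: abs(course_hours[t] / total_hours - test_items[t] / total_items)
        for t in course_hours}

for topic, gap in gaps.items():
    print(f"{topic}: coverage gap {gap:.0%}")
```

A large gap for any topic suggests the test over- or under-represents that part of the content domain.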
Lawshe (1975) proposed that each rater should respond to the following question for each item in content validation:
"Is the skill or knowledge measured by this item
- Essential
- Useful, but not essential
- Not necessary
to the performance of the construct?"
Content validity requires recognized subject matter experts to evaluate whether test items assess the defined content, and it involves more rigorous statistical tests than the assessment of face validity does.
Content validity is most often addressed in academic and vocational testing, where test items need to reflect the knowledge actually required for a given topic area (e.g., history) or job skill (e.g., accounting).
One widely used method of measuring content validity was developed by C. H. Lawshe, built around the question: "Is the skill or knowledge measured by this item 'essential,' 'useful, but not essential,' or 'not necessary' to the performance of the construct?"
Content validity is different from face validity. Face validity assesses whether the test "looks valid" to the examinees who take it, the administrative personnel who decide on its use, and other technically untrained observers.
The Content Validity Ratio (CVR)

CVR = (n_e − N/2) / (N/2)

where n_e is the number of SME panelists indicating "essential" and N is the total number of SME panelists.

Values range from +1 to −1; positive values indicate that at least half the SMEs rated the item as essential.
The mean CVR across items may be used as an indicator of overall test content validity.
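The CVR formula above can be computed directly; the panel ratings in this sketch are made up for illustration.

```python
# Sketch: Lawshe's content validity ratio (CVR) for one test item,
# rated by a panel of subject matter experts (hypothetical ratings).

def cvr(n_essential: int, n_panelists: int) -> float:
    """CVR = (n_e - N/2) / (N/2); ranges from -1 to +1."""
    half = n_panelists / 2
    return (n_essential - half) / half

ratings = ["essential", "essential", "useful", "essential",
           "not necessary", "essential", "essential"]      # 7 SMEs
n_e = ratings.count("essential")
print(f"CVR = {cvr(n_e, len(ratings)):+.2f}")
```

Averaging `cvr` across all items would give the overall indicator mentioned above.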
Here SME stands for subject matter expert: a person with recognized expertise in the content domain the test is meant to cover (not the business sense of "small-to-medium enterprise").
Reasonable conclusions use quantitative, statistical, and qualitative data.

Type I error: finding a difference or correlation when none exists.
Type II error: finding no difference when one exists.
Statistical validity
It is the degree to which conclusions about the relationship among variables based on the data are correct or ‘reasonable’.
Statistical conclusion validity involves ensuring the use of adequate sampling procedures, appropriate statistical tests, and reliable measurement procedures
Low statistical power
Power is the probability of correctly rejecting the null hypothesis when it is false.
Low power occurs when the sample size of the study is too small given other factors (small effect size, large group variability, unreliable measures, etc.).
Experiments with low power have a higher probability of incorrectly accepting the null hypothesis; that is, committing a type II error and concluding that there is no effect when there actually is one (i.e., there is real covariation between the cause and the effect).
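As a sketch of the power/sample-size relationship described above, the following approximates the power of a two-sided one-sample z-test; the effect size and alpha are illustrative, and a real study would use whatever test is actually planned.

```python
# Sketch: power of a two-sided one-sample z-test as sample size grows.
import math
from statistics import NormalDist

def power(effect_size: float, n: int, alpha: float = 0.05) -> float:
    """P(reject H0) when the true standardized effect is effect_size."""
    nd = NormalDist()
    z_crit = nd.inv_cdf(1 - alpha / 2)   # e.g. about 1.96 for alpha = 0.05
    shift = effect_size * math.sqrt(n)   # how far the test statistic is shifted
    return nd.cdf(shift - z_crit) + nd.cdf(-shift - z_crit)

for n in (10, 50, 200):
    p = power(0.3, n)                    # a smallish effect size
    print(f"n={n:3d}  power={p:.2f}  type II error rate={1 - p:.2f}")
```

With a small effect and a small sample, power stays well below the conventional 0.80 target, so the type II error rate is high.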
Violated assumptions of the test statistics
Most statistical tests involve assumptions about the data that make the analysis suitable for testing a hypothesis. Violating those assumptions can lead to incorrect inferences about the cause-effect relationship, and can make tests more or less likely to commit type I or type II errors.
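As one concrete (and deliberately crude) example of checking an assumption before testing, sample skewness can flag data too asymmetric for a normality-assuming test; the data and the threshold here are invented for illustration.

```python
# Sketch: flag strongly skewed data before applying a test that
# assumes approximate normality (threshold and data are illustrative).
import math

def skewness(xs):
    """Sample skewness: mean of cubed standardized deviations."""
    n = len(xs)
    mean = sum(xs) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in xs) / n)
    return sum(((x - mean) / sd) ** 3 for x in xs) / n

reaction_times = [0.31, 0.35, 0.33, 0.38, 0.34, 0.90, 1.20]  # heavy right tail
if abs(skewness(reaction_times)) > 1.0:
    print("strongly skewed: the normality assumption is doubtful;")
    print("consider a transformation or a nonparametric test")
```

In practice, formal diagnostics (normality tests, residual plots, variance checks) would replace this rough cutoff.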
Fishing and the error rate problem
Each hypothesis test carries a fixed risk of a type I error. The more often the researcher tests the same data, the higher the chance of observing a type I error and making an incorrect inference about the existence of a relationship. If a researcher searches or "fishes" through the data, testing many different hypotheses to find a significant effect, the type I error rate is inflated.
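The inflation described above can be made concrete for the idealized case of independent tests; the per-test alpha of 0.05 is the conventional choice.

```python
# Sketch: chance of at least one type I error ("false positive")
# across m independent tests, each run at per-test alpha = 0.05.

def familywise_error(m: int, alpha: float = 0.05) -> float:
    """P(at least one false positive in m independent tests)."""
    return 1 - (1 - alpha) ** m

for m in (1, 5, 20):
    print(f"{m:2d} tests -> P(>=1 type I error) = {familywise_error(m):.2f}")
```

This is why multiple-comparison corrections (e.g., Bonferroni's alpha/m) lower the per-test threshold when many hypotheses are tested.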