computational - actnext · scoring engines are all exemplars of software components that draw on...

COMPUTATIONAL PSYCHOMETRICS

A FIELD GUIDE

OR

A THOROUGH EXAMINATION OF THE FLORA AND FAUNA

INHABITING A NEWLY DISCOVERED ECOSYSTEM

BY: ACTNEXT

IOWA CITY

WHAT MAKES A COMPUTATIONAL PSYCHOMETRICS ECOSYSTEM?

Computational psychometrics is an emerging field of study spanning multiple disciplines, many of which are undergoing a renaissance in their own right. Advances in computational horsepower and artificial intelligence, innovations in software and hardware development, new insights into the science of learning, and theoretical advances in the understanding and modeling of knowledge acquisition are all pushing the boundaries of what can be measured and how it can be done. This brings us to our computational psychometrics ecosystem.

As with an environmental ecosystem, no part of the computational psychometrics ecosystem is wholly separate from another. We treat all of the developments listed above as holistic progress in psychometric science, with advances in one field making innovations in another possible via cross-pollination. We have divided our ecosystem into four broad areas, or habitats, representing active areas of research and development driving the evolution of learning, measurement, and navigation in the 21st century.

Our ecosystem is new, ever-changing, and full of wonderful things waiting to be unearthed. We’re glad you’re here to explore with us, and we hope you find this guide handy as you trek through the weeds and over the hills on your journey of discovery in the land of computational psychometrics. There’s a lot going on, so take notes. We left a few pages blank in the back for you. Bon voyage!

LEARNING SCIENCE

MULTIMODAL ANALYTICS

PSYCHOMETRIC RESEARCH

ENGINEERING & DEVELOPMENT

Learning science is an interdisciplinary approach to understanding how humans learn. It blends principles from design, cognitive science, educational psychology, anthropology, neuroscience, and sociology to take a holistic view of learning. Learning scientists use design-based research methods, which not only explain how learning occurs but also design solutions to educational problems.

LEARNING SCIENCE

Evidence-centered design focuses on constructing assessments in a way that ensures they measure what they are intended to measure and, in turn, provide the evidence to support the claims made from the assessment results.

EVIDENCE CENTERED DESIGN

Any systematic method of obtaining information, used to draw inferences about characteristics of people, objects, or programs; a systematic process to measure or evaluate the characteristics or performance of individuals, programs, or other entities, for purposes of drawing inferences. (AERA)

ASSESSMENT

HABITAT: LEARNING SCIENCE

An assessment process used by teachers and students during instruction that provides feedback to adjust ongoing teaching and learning with the goal of improving students’ achievement of intended instructional outcomes. (AERA)

FORMATIVE ASSESSMENT

The assessment of a test taker’s knowledge and skills typically carried out at the completion of a program of learning, such as the end of an instructional unit. (AERA)

SUMMATIVE ASSESSMENT


The degree to which test scores for a group of test takers are consistent over repeated applications of a measurement procedure and hence are inferred to be dependable and consistent for an individual test taker; the degree to which scores are free of random errors of measurement for a given group. (AERA)

RELIABILITY

The degree to which an instrument accurately and consistently measures what it is intended to measure.

RELIABILITY AND VALIDITY

The degree to which accumulated evidence and theory support a specific interpretation of test scores for a given use of a test. If multiple interpretations of a test score for different uses are intended, validity evidence for each interpretation is needed. (AERA)

VALIDITY


Multimodal analytics enable us to understand and explore educational problems in real-world environments without being limited to a single type of sensing. Instead, researchers can draw upon a large number of modalities, including audio, video, gestures, electrodermal activity, emotions, and cognitive load, among others. Sensors and data mining techniques have both reached a level of maturity that allows researchers to tackle new research questions and develop new educational interventions.


HABITAT: MULTIMODAL ANALYTICS

Computerized vision is often meant to mimic human vision by automatically extracting information from images. Typical tasks include object detection, recognition, search, and grouping of image content.

COMPUTERIZED VISION

A type of deep neural network architecture in which two networks are trained in competition with one another. One network, the generator, produces synthetic data; the other, the discriminator, tries to tell that synthetic data apart from real examples. Instead of classifying existing data against a set of labels, the generator learns to approximate the distribution the real data came from.

GENERATIVE ADVERSARIAL NETWORKS

Refers to the use of computers to replicate intelligent human behavior. This is often further divided into “narrow AI,” which solves a small set of tasks (e.g., Siri), and “general AI,” which replicates human intelligence (e.g., HAL 9000 from 2001: A Space Odyssey).

ARTIFICIAL INTELLIGENCE

A branch of artificial intelligence that identifies patterns in data, frequently using automated techniques that do not require hypothesis creation or definition of features in advance. While often producing extremely accurate patterns, ML approaches can be difficult to interpret.

MACHINE LEARNING

Deep Learning is a specific subfield of machine learning. It is a mathematical framework that puts emphasis on learning from successive layers of increasingly meaningful representations from data. These layered representations are nearly always learned via models called artificial neural networks. The number of layers that contribute to a model is referred to as the depth of the model.

DEEP LEARNING



A computational model composed of a collection of connected nodes called artificial neurons, which loosely model the neurons in a biological brain. These models can be trained using machine learning algorithms to learn complex patterns in data and perform tasks such as classification and prediction. Artificial neural networks have been around since the mid-20th century but have gained much popularity with recent advances in deep learning.

ARTIFICIAL NEURAL NETWORK
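The idea of connected artificial neurons arranged in layers can be illustrated with a minimal sketch. Everything named here (`sigmoid`, `neuron`, `forward`, and the list-of-layers layout) is an assumption made for illustration, not something taken from this guide, and the sketch covers only the forward pass, not training.

```python
import math


def sigmoid(x):
    """Squash any real number into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def neuron(inputs, weights, bias):
    """One artificial neuron: a weighted sum of its inputs, then an activation."""
    weighted_sum = sum(w * x for w, x in zip(weights, inputs)) + bias
    return sigmoid(weighted_sum)


def forward(inputs, layers):
    """Pass inputs through each layer in turn.

    `layers` is a list of layers; each layer is a list of (weights, bias)
    pairs, one pair per neuron. The depth of the model is len(layers).
    """
    activations = inputs
    for layer in layers:
        activations = [neuron(activations, w, b) for w, b in layer]
    return activations


# Illustrative network: 2 inputs -> 2 hidden neurons -> 1 output neuron.
# The weights are made up; training would normally learn them from data.
network = [
    [([0.5, -0.5], 0.0), ([1.0, 1.0], -1.0)],
    [([1.0, -1.0], 0.0)],
]
output = forward([1.0, 2.0], network)
```

Each output is a value between 0 and 1, which a classifier might read as a probability.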

An approach to machine learning that does not have an outcome variable; it is therefore used to discover groupings or structure shared across the input data.

UNSUPERVISED LEARNING

An approach to machine learning that has an outcome variable of interest that is used to identify meaningful patterns in data related to that variable.

SUPERVISED LEARNING

A branch of machine learning that represents human language, written or spoken, with digitally created features and applies algorithms to extract patterns and insights. These insights can be used to provide feedback on student writing or to create models that replicate human scoring.

NATURAL LANGUAGE PROCESSING

A system that provides real-time, personalized feedback to learners. Intelligent Tutoring Systems leverage computer technology to deliver personalized learning that simulates one-on-one instruction from a human teacher.

INTELLIGENT TUTORING SYSTEM


At its core, computational psychometrics is about remixing. Blending traditional psychometric theory with emerging computational techniques requires innovative methods and components. In this area we consider some of the building blocks we use to infuse solutions with technologies as we develop new innovations within the field.


Innovation extends into new, cutting-edge experiences that we can enable for learners. Advances in artificial intelligence and machine learning, along with the rise of smart speakers and hubs in homes, enable the development and deployment of learning assistants that can advise, recommend, and participate in ongoing conversations about learning.

VOICE-BASED ASSISTANT

Learning experiences aren’t limited to the classroom, nor do they necessarily require traditional computer resources. Mobile experiences, like voice, unlock new possibilities to gather evidence of learning and deliver instructional content anytime and anywhere.

MOBILE APPS

HABITAT: ENGINEERING & DEVELOPMENT

Learning object repositories are specialized containers that maintain lists of content elements that can be beneficial to learners and educators. They use metadata to classify the content into various types and applications, and they are searchable, so that learning and assessment systems can retrieve the right resources at the right time for a learner.

LEARNING OBJECT REPOSITORIES

Metadata is essentially data about data. It helps systems understand the context of data assets and elements, enabling critical processes like searching, selecting, sorting, and categorizing. In conjunction with a taxonomy, metadata acts as a set of markers connecting assessment and instructional content.

METADATA
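How metadata makes a learning object repository searchable can be sketched in a few lines. The schema below (`LearningObject`, its fields, and skill codes such as `MATH.FRAC.ADD`) is entirely hypothetical and stands in for whatever metadata a real repository defines.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class LearningObject:
    """A content element plus the metadata that describes it (hypothetical schema)."""
    object_id: str
    title: str
    content_type: str  # e.g. "video", "item", "passage"
    skill_codes: frozenset = field(default_factory=frozenset)  # taxonomy markers


def search(repository, content_type=None, skill_code=None):
    """Return the objects whose metadata matches every filter that was supplied."""
    results = []
    for obj in repository:
        if content_type is not None and obj.content_type != content_type:
            continue
        if skill_code is not None and skill_code not in obj.skill_codes:
            continue
        results.append(obj)
    return results


# A tiny in-memory repository with made-up content.
repository = [
    LearningObject("lo-1", "Adding fractions video", "video", frozenset({"MATH.FRAC.ADD"})),
    LearningObject("lo-2", "Fraction addition quiz item", "item", frozenset({"MATH.FRAC.ADD"})),
    LearningObject("lo-3", "Main idea passage", "passage", frozenset({"READ.MAIN_IDEA"})),
]

# The shared skill code links instructional content (lo-1) to assessment content (lo-2).
matches = search(repository, skill_code="MATH.FRAC.ADD")
```

Because the video and the quiz item carry the same skill code, a system that sees a learner struggle on the item can retrieve the matching instruction.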


The challenge for many learning and assessment systems is how to efficiently collect all of the raw learning analytics we need, at sufficient scale and in a way that lets us easily link, combine, and investigate the data. Data lakes act as pools of learning data, enabling the necessary workflow activities.

DATA LAKES

An application programming interface (API) specifies what a software component can do and how to interact with it. An API allows a developer to encapsulate component functionality in a clear, defined way, e.g., as a recommendation or diagnostic engine, a data access gateway, or a natural language parser.

APPLICATION PROGRAMMING INTERFACE
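The split between what a component can do and how it does it can be sketched with a small interface. The names `RecommendationEngine`, `MostRecentFirst`, and `show_recommendations` are hypothetical, chosen only to illustrate encapsulation; they do not describe any real service.

```python
from typing import Protocol


class RecommendationEngine(Protocol):
    """Hypothetical interface: states what the component can do, not how."""

    def recommend(self, learner_id: str, max_items: int) -> list[str]:
        ...


class MostRecentFirst:
    """One possible implementation hidden behind the interface."""

    def __init__(self, catalog: list[str]) -> None:
        self._catalog = catalog

    def recommend(self, learner_id: str, max_items: int) -> list[str]:
        # A real engine would use learner data; this toy version ignores learner_id.
        return list(reversed(self._catalog))[:max_items]


def show_recommendations(engine: RecommendationEngine, learner_id: str) -> list[str]:
    # The caller depends only on the interface, so engines can be swapped freely.
    return engine.recommend(learner_id, max_items=2)


engine = MostRecentFirst(["unit-1", "unit-2", "unit-3"])
picks = show_recommendations(engine, learner_id="learner-42")
```

Any other class with a matching `recommend` method could replace `MostRecentFirst` without changing the calling code.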


Learning and assessment systems typically integrate with software components like APIs and need to process data from sources like data lakes. If every provider of these services defined its own format, innovation would be inhibited. Content standards authored by standards bodies enable easier integration between platforms and components.

CONTENT STANDARDS

Recommenders, diagnostic engines, and scoring engines are all examples of software components that draw on many of these elements. They are typically deployed in the cloud, define an API, use content standards, can access data from a data lake, and leverage taxonomies and metadata.

RECOMMENDATION / DIAGNOSTIC ENGINES



In computational psychometrics, taxonomies can be used to represent and transmit knowledge of a particular domain, like academic subjects or areas of study. This knowledge is typically hierarchical, classifying elements like categories under a subject, strands within a category, or skills within a taxonomic level.

TAXONOMIES
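The hierarchical structure described above maps naturally onto a nested data structure. The sketch below uses an invented taxonomy (subject, category, strand, skills) and a hypothetical helper, `find_path`, to show how a system might locate a skill within the hierarchy.

```python
# Hypothetical taxonomy: subject -> category -> strand -> list of skills.
TAXONOMY = {
    "Mathematics": {
        "Number": {
            "Fractions": ["Add fractions", "Compare fractions"],
        },
        "Geometry": {
            "Shapes": ["Classify triangles"],
        },
    },
}


def find_path(node, target, path=()):
    """Return the tuple of labels leading from the root to `target`, or None."""
    if isinstance(node, list):  # leaf level: a list of skill names
        return path + (target,) if target in node else None
    for label, child in node.items():
        if label == target:
            return path + (label,)
        found = find_path(child, target, path + (label,))
        if found is not None:
            return found
    return None


skill_path = find_path(TAXONOMY, "Compare fractions")
```

The returned path places a skill in context, which is what lets metadata tagged with that skill connect to everything above it in the hierarchy.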

Software-based services like APIs or databases need to be robust so that computational psychometric innovations are reliable and stable when put into practice. Hosting these services in the cloud enables automatic, elastic scaling that can grow to accommodate spikes in activity and shrink to stay cost-efficient.

CLOUD-BASED SERVICES

Psychometrics is the study and development of methodology to measure and assess psychological constructs, such as cognitive ability or collaborative problem solving. Its statistical methodology has not changed significantly since the 1950s. In this section we discuss psychometric models aimed at modernizing that methodology. In particular, many of these models explicitly model learning and changes in ability.

PSYCHOMETRIC RESEARCH

Any system that tracks change (in parameters) over time, in real time.

RATING SYSTEMS

A way of rating and ranking individuals from the outcomes of pairwise contests, using a paired-comparison model closely related to those in item response theory. This rating system is very common in games such as chess and Scrabble.

ELO RATING SYSTEM
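The Elo update can be sketched in a few lines. The constants used here (a scale of 400 and a K-factor of 32) are conventional chess values assumed for illustration; the function names are likewise our own.

```python
def expected_score(rating_a, rating_b):
    """Probability-like expected score for player A against player B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update(rating_a, rating_b, score_a, k=32.0):
    """Return the new (rating_a, rating_b) after one contest.

    score_a is 1.0 if A won, 0.5 for a draw, and 0.0 if A lost.
    The K-factor `k` (assumed to be 32 here) controls how fast ratings move.
    """
    expected_a = expected_score(rating_a, rating_b)
    delta = k * (score_a - expected_a)
    # Whatever A gains, B loses, so the total rating in the pool is conserved.
    return rating_a + delta, rating_b - delta


# Two equally rated players: the winner gains half the K-factor.
new_a, new_b = update(1500.0, 1500.0, score_a=1.0)
```

In a learning setting, the same update could pit a learner against an item, with a correct response counted as a win for the learner.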

A way of continuously tracking ability, based on one’s responses to questions, via an urn of red and black marbles. Unlike other rating systems (such as Elo, TrueSkill, and Glicko), Urnings ratings have known statistical properties.

URNINGS RATING SYSTEM

HABITAT: PSYCHOMETRIC RESEARCH

A model for how mastery of a skill evolves over time in Intelligent Tutoring Systems.

BAYESIAN KNOWLEDGE TRACING
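A minimal sketch of the standard Bayesian Knowledge Tracing update follows. The parameter values (learn, guess, and slip probabilities) are invented for illustration, and the function name `bkt_update` is our own.

```python
def bkt_update(p_known, correct, p_learn=0.1, p_guess=0.2, p_slip=0.1):
    """Update the probability that a skill is mastered after one response.

    p_known: prior probability the learner has already mastered the skill
    p_learn: chance of acquiring the skill at this practice opportunity
    p_guess: chance of answering correctly without mastery
    p_slip:  chance of answering incorrectly despite mastery
    (All default values are illustrative assumptions.)
    """
    if correct:
        evidence = p_known * (1 - p_slip)
        posterior = evidence / (evidence + (1 - p_known) * p_guess)
    else:
        evidence = p_known * p_slip
        posterior = evidence / (evidence + (1 - p_known) * (1 - p_guess))
    # The learner may also acquire the skill during this opportunity.
    return posterior + (1 - posterior) * p_learn


# Track mastery across a short, made-up response sequence.
p = 0.5
for response in [True, True, False, True]:
    p = bkt_update(p, response)
```

A tutoring system might move the learner on once `p` passes a mastery threshold; the threshold is a design choice, not part of the model itself.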

Item response theory is one approach to psychometric testing. It comprises a set of generalized statistical models that measure an individual student’s academic ability and estimate item characteristics (e.g., item difficulty and discrimination) from students’ binary responses to test items.

ITEM RESPONSE THEORY
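One widely used member of this family is the two-parameter logistic (2PL) model, sketched below as an example rather than as the only form IRT takes. The function and parameter names are our own.

```python
import math


def prob_correct(theta, difficulty, discrimination=1.0):
    """2PL item response function.

    theta:          the test taker's ability
    difficulty:     the ability level at which a correct answer is a coin flip
    discrimination: how sharply the item separates lower from higher ability
    """
    return 1.0 / (1.0 + math.exp(-discrimination * (theta - difficulty)))


# When ability equals difficulty, the chance of a correct response is one half.
p_matched = prob_correct(theta=0.0, difficulty=0.0)
# A more able test taker has a better chance on the same item.
p_stronger = prob_correct(theta=1.0, difficulty=0.0)
```

With the discrimination fixed at 1 for every item, this reduces to the Rasch model.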



A statistical model that accounts for the hierarchical structure (latent tree) in observed variables (the leaves of the tree).

LATENT TREE MODEL

Data that are available for collection during the assessment process and instructional phase, e.g., response accuracy and response time.

PROCESS DATA

A unified view of statistical models used in fields as diverse as statistical physics, psychometrics, spatial statistics, econometrics, neuroscience, and machine learning. Because they all involve main effects and pairwise interactions, network psychometrics brings these models together as different instantiations of one model.

NETWORK PSYCHOMETRICS

Statistical models for main effects and pairwise interactions, where a graph expresses the conditional (in)dependence structure between the random variables in the model.

GRAPHICAL MODELS
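For binary variables, one common model with main effects and pairwise interactions is the Ising-type model. The notation below is assumed for illustration: \mu_i for main effects, \sigma_{ij} for pairwise interactions, and Z for the normalizing constant.

```latex
p(x_1, \dots, x_n)
  = \frac{1}{Z}
    \exp\!\Bigl( \sum_{i} \mu_i x_i + \sum_{i<j} \sigma_{ij}\, x_i x_j \Bigr),
\qquad
\sigma_{ij} = 0
  \;\Longrightarrow\;
  X_i \perp X_j \mid \{X_k : k \neq i, j\}.
```

In the graph, an edge between nodes i and j is drawn only when \sigma_{ij} is nonzero, so a missing edge signals that the two variables are conditionally independent given all the others.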


A measurement approach in which the observed score is conceived of as a true score plus random error. Like item response theory, it measures an individual’s academic performance and item characteristics, but it models individual performance as the sum of a true score and an error score.

CLASSICAL TEST THEORY
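In symbols, the decomposition and the reliability coefficient it leads to can be written as follows. The notation (X, T, E, and \rho_{XX'}) is assumed for illustration, as is the usual condition that true scores and errors are uncorrelated.

```latex
X = T + E,
\qquad
\sigma_X^2 = \sigma_T^2 + \sigma_E^2,
\qquad
\rho_{XX'} = \frac{\sigma_T^2}{\sigma_X^2}
           = 1 - \frac{\sigma_E^2}{\sigma_X^2}.
```

Read this way, reliability is the share of observed-score variance that comes from true-score differences rather than from random error.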


LEARNING SCIENCE


PSYCHOMETRIC RESEARCH


Many thanks to all who contributed their time and energy toward the completion of this field guide.
