😶‍🌫️ Psych
  • Preface
  • [4/9/2025] A One-Stop Calculator and Guide for 95 Effect-Size Variants
  • [4/9/2025] the people make the place
  • [4/9/2025] Personality predicts things
  • [3/31/2025] Response surface analysis with multilevel data
  • [3/11/2025] A Complete Guide to Natural Language Processing
  • [3/4/2025] Personality - Self and Identity
  • [3/1/2025] Updating Vocational Interests Information
  • [2/25/2025] Abilities & Skills
  • [2/22/2025] APA table format
  • [2/19/2025] LLMs that replace human participants can harmfully misportray and flatten identity groups
  • [2/18/2025] Research Methods Knowledge Base
  • [2/17/2025] Personality - Motives/Interests
  • [2/11/2025] Trait structure
  • [2/10/2025] Higher-order construct
  • [2/4/2025] RL for CAT
  • [2/4/2025] DoWhy | An end-to-end library for causal inference
  • [2/4/2025] DAGitty — draw and analyze causal diagrams
  • [2/2/2025] Personality States
  • [2/2/2025] Psychometric Properties of Automated Video Interview Competency Assessments
  • [2/2/2025] How to diagnose abhorrent science
  • [1/28/2025] LLM and personality/interest items
  • [1/28/2025] Personality - Dispositions
  • [1/28/2025] Causal inference in statistics
  • [1/27/2025] Personality differences between birth order categories and across sibship sizes
  • [1/27/2025] Nomological network meta-analysis
  • [1/25/2025] Classic Papers on Scale Development/Validation
  • [1/17/2025] Personality Reading
  • [1/15/2025] Artificial Intelligence: Redefining the Future of Psychology
  • [1/13/2025] R for Psychometrics
  • [12/24/2024] Comparison of interest congruence indices
  • [12/24/2024] Most recent article on interest fit measures
  • [12/24/2024] Grammatical Redundancy in Scales: Using the “ConGRe” Process to Create Better Measures
  • [12/24/2024] Confirmatory Factor Analysis with Word Embeddings
  • [12/24/2024] Can ChatGPT Develop a Psychometrically Sound Situational Judgment Test?
  • [12/24/2024] Using NLP to replace human content coders
  • [11/21/2024] AI Incident Database
  • [11/20/2024] Large Language Model-Enhanced Reinforcement Learning
  • [11/05/2024] Self-directed search
  • [11/04/2024] Interview coding and scoring
  • [11/04/2024] What if there were no personality factors?
  • [11/04/2024] BanditCAT and AutoIRT
  • [10/29/2024] LLM for Literature/Survey
  • [10/27/2024] Holland's Theory of Vocational Choice and Adjustment
  • [10/27/2024] Item Response Warehouse
  • [10/26/2024] EstCRM - Samejima's Continuous IRT Model
  • [10/23/2024] Idiographic Personality Gaussian Process for Psychological Assessment
  • [10/23/2024] The experience sampling method (ESM)
  • [10/21/2024] Ecological Momentary Assessment (EMA)
  • [10/20/2024] Meta-Analytic Structural Equation Modeling
  • [10/20/2024] Structure of vocational interests
  • [10/17/2024] LLMs for psychological assessment
  • [10/16/2024] Can Deep Neural Networks Inform Theory?
  • [10/16/2024] Cognition & Decision Modeling Laboratory
  • [10/14/2024] Time-Invariant Confounders in Cross-Lagged Panel Models
  • [10/13/2024] Polynomial regression
  • [10/13/2024] Bayesian Mixture Modeling
  • [10/10/2024] Response surface analysis (RSA)
  • [10/10/2024] Text-Based Personality Assessment with LLM
  • [10/09/2024] Circular unidimensional scaling: A new look at group differences in interest structure.
  • [10/07/2024] Video Interview
  • [10/07/2024] Relationship between Measurement and ML
  • [10/07/2024] Conscientiousness × Interest Compensation (CONIC) model
  • [10/03/2024] Response modeling methodology
  • [10/02/2024] Conceptual Versus Empirical Distinctions Among Constructs
  • [10/02/2024] Construct Proliferation
  • [09/23/2024] Psychological Measurement Paradigm through Interactive Fiction Games
  • [09/20/2024] A Computational Method to Reveal Psychological Constructs From Text Data
  • [09/18/2024] H is for Human and How (Not) To Evaluate Qualitative Research in HCI
  • [09/17/2024] Automated Speech Recognition Bias in Personnel Selection
  • [09/16/2024] Congruency Effect
  • [09/11/2024] privacy, security, and trust perceptions
  • [09/10/2024] Measurement, Scale, Survey, Questionnaire
  • [09/09/2024] Reporting Systematic Reviews
  • [09/09/2024] Evolutionary Neuroscience
  • [09/09/2024] On Personality Measures and Their Data
  • [09/09/2024] Two Dimensions of Professor-Student Rapport Differentially Predict Student Success
  • [09/05/2024] The SAPA Personality Inventory
  • [09/05/2024] Moderated mediation
  • [09/03/2024] BiGGen Bench
  • [09/02/2024] LMSYS Chatbot Arena
  • [09/02/2024] Introduction to Measurement Theory Chapters 1, 2 (2.1-2.8) and 3.
  • [09/01/2024] HCI measurement
  • [08/30/2024] Randomization Test
  • [08/30/2024] Interview Quantitative Statistical
  • [08/29/2024] Cascading Model
  • [08/29/2024] Introduction: The White House (IS_202)
  • [08/29/2024] Circular unidimensional scaling
  • [08/28/2024] Sex and Gender Differences (Neur_542_Week2)
  • [08/26/2024] Workplace Assessment and Social Perceptions (WASP) Lab
  • [08/26/2024] Computational Organizational Research Lab
  • [08/26/2024] Reading List (Recommended by Bo)
  • [08/20/2024] Illinois NeuroBehavioral Assessment Laboratory (INBAL)
  • [08/14/2024] Quantitative text analysis
  • [08/14/2024] Measuring complex psychological and sociological constructs in large-scale text
  • [08/14/2024] LLM for Social Science Research
  • [08/14/2024] GPT for multilingual psychological text analysis
  • [08/12/2024] Questionable Measurement Practices and How to Avoid Them
  • [08/12/2024] NLP for Interest (from Dan Putka)
  • [08/12/2024] O*NET Interest Profiler (Long and Short Scale)
  • [08/12/2024] O*NET Interests Data
  • [08/12/2024] The O*NET-SOC Taxonomy
  • [08/12/2024] ML Ratings for O*NET
  • [08/09/2024] Limited ability of LLMs to simulate human psychological behaviours
  • [08/08/2024] A large-scale, gamified online assessment
  • [08/08/2024] Text-Based Trait and Cue Judgments
  • [08/07/2024] Chuan-Peng Lab
  • [08/07/2024] Modern psychometrics: The science of psychological assessment
  • [08/07/2024] Interactive Survey
  • [08/06/2024] Experimental History
  • [08/06/2024] O*NET Research reports
  • [07/30/2024] Creating a psychological assessment tool based on interactive storytelling
  • [07/24/2024] My Life with a Theory
  • [07/24/2024] NLP for Interest Job Ratings
  • [07/17/2024] Making vocational choices
  • [07/17/2024] Taxonomy of Psychological Situation
  • [07/12/2024] PathChat 2
  • [07/11/2024] Using games to understand the mind
  • [07/10/2024] Gamified Assessments
  • [07/09/2024] Poldracklab Software and Data
  • [07/09/2024] Consensus-based Recommendations for Machine-learning-based Science
  • [07/08/2024] Using AI to assess personal qualities
  • [07/08/2024] AI Psychometrics And Psychometrics Benchmark
  • [07/02/2024] Prompt Engineering Guide
  • [06/28/2024] Observational Methods and Qualitative Data Analysis 5-6
  • [06/28/2024] Observational Methods and Qualitative Data Analysis 3-4
  • [06/28/2024] Interviewing Methods 5-6
  • [06/28/2024] Interviewing Methods 3-4
  • [06/28/2024] What is Qualitative Research 3
  • [06/27/2024] APA Style
  • [06/27/2024] Statistics in Psychological Research 6
  • [06/27/2024] Statistics in Psychological Research 5
  • [06/23/2024] Bayesian Belief Network
  • [06/18/2024] Fair Comparisons in Heterogeneous Systems Evaluation
  • [06/18/2024] What should we evaluate when we use technology in education?
  • [06/16/2024] Circumplex Model
  • [06/12/2024] Ways of Knowing in HCI
  • [06/09/2024] Statistics in Psychological Research 1-4
  • [06/08/2024] Mathematics for Machine Learning
  • [06/08/2024] Vocational Interests SETPOINT Dimensions
  • [06/07/2024] How's My PI Study
  • [06/06/2024] Best Practices in Supervised Machine Learning
  • [06/06/2024] SIOP
  • [06/06/2024] Measurement, Design, and Analysis: An Integrated Approach (Chu Recommended)
  • [06/06/2024] Classical Test Theory
  • [06/06/2024] Introduction to Measurement Theory (Bo Recommended)
  • [06/03/2024] EDSL: AI-Powered Research
  • [06/03/2024] Perceived Empathy of Technology Scale (PETS)
  • [06/02/2024] HCI area - Quantitative and Qualitative Modeling and Evaluation
  • [05/26/2024] Psychometrics with R
  • [05/26/2024] Programming Grammar Design
  • [05/25/2024] Psychometric Network Analysis
  • [05/23/2024] Item Response Theory
  • [05/22/2024] Nature Human Behaviour (Jan - 20 May, 2024)
  • [05/22/2024] Nature Human Behaviour - Navigating the AI Frontier
  • [05/22/2024] Computer Adaptive Testing
  • [05/22/2024] Personality Scale (Jim Shared)
  • [05/22/2024] Reliability
  • [05/19/2024] Chatbot (Jim Shared)
  • [05/17/2024] GOMS and Keystroke-Level Model
  • [05/17/2024] The Psychology of Human-Computer Interaction
  • [05/14/2024] Computational Narrative (Mark's Group)
  • [05/14/2024] Validity Coding
  • [05/14/2024] LLM as an Evaluator
  • [05/14/2024] Social Skill Training via LLMs (Diyi's Group)
  • [05/14/2024] AI Persona
  • [05/09/2024] Psychological Methods Journal Sample Articles
  • [05/08/2024] Meta-Analysis
  • [05/07/2024] Mturk
  • [05/06/2024] O*NET Reports and Documents
  • [05/04/2024] NLP and Chatbot on Personality Assessment (Tianjun)
  • [05/02/2024] Reads on Construct Validation
  • [04/25/2024] Reads on Validity
  • [04/18/2024] AI for Assessment
  • [04/17/2024] Interest Assessment
  • [04/16/2024] Personality Long Reading List (Jim)
    • Personality Psychology Overview
      • Why Study Personality Assessment
    • Dimensions and Types
    • Reliability
    • Traits: Two Views
    • Validity - Classical Articles and Reflections
    • Validity - Recent Proposals
    • Multimethod Perspective and Social Desirability
    • Paradigm of Personality Assessment: Multivariate
    • Heritability of personality traits
    • Classical Test-Construction
    • IRT
    • Social desirability in scale construction
    • Traits and culture
    • Paradigms of personality assessment: Empirical
    • Comparison of personality test construction strategies
    • Clinical versus Actuarial (AI) Judgement and Diagnostics
    • Decisions: Importance of base rates
    • Paradigms of Personality Assessment: Psychodynamic
    • Paradigms of Assessment: Interpersonal
    • Paradigms of Personality Assessment: Personological
    • Retrospective reports
    • Research Paradigms
    • Personality Continuity and Change

[04/18/2024] AI for Assessment



A Conceptual Framework for Investigating and Mitigating Machine-Learning Measurement Bias (MLMB) in Psychological Assessment

Summary

This paper discusses growing concerns about bias and unfairness in the use of artificial intelligence (AI) and machine learning (ML) for psychological assessment. It introduces the concept of machine-learning measurement bias (MLMB) and provides a conceptual framework for investigating and mitigating MLMB from a psychometric perspective.

  • Concerns over bias and unfairness in AI and ML applications

  • Introduction of machine-learning measurement bias (MLMB)

  • Definition of MLMB as differential functioning of trained ML models between subgroups

  • Manifestation of MLMB in differential predicted score levels and predictive accuracies across subgroups

  • Sources of bias in ML models: data bias and algorithm-training bias

  • Importance of addressing measurement bias in ML assessments to avoid disparities and discrimination

  • Lack of methodological guidelines for defining and investigating ML bias

  • Focus on bias in ML measurements used to infer individuals' psychological attributes

  • Proposal of a conceptual framework for investigating and mitigating MLMB

  • Emphasis on the need for new statistical and algorithmic procedures to address bias

Takeaway

How is machine-learning measurement bias defined in the paper?

  • Machine-learning measurement bias (MLMB) is defined in the paper as the differential functioning of a trained ML model between subgroups. One empirical manifestation is a trained ML model producing different predicted score levels for individuals from different subgroups despite their having the same ground-truth level on the underlying construct of interest; another is the model yielding differential predictive accuracies across the subgroups.
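
To see both manifestations concretely, here is a minimal simulation sketch (illustrative only, not the authors' code): the training labels carry a subgroup-dependent shift (a data bias) and the features are noisier for one subgroup, so the trained model shows both different predicted levels at the same ground-truth value and different predictive accuracy across subgroups. All variable names and values are hypothetical.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
n = 4000

# Two subgroups with identical ground-truth distributions on the construct.
group = rng.integers(0, 2, n)                 # subgroup indicator (0 / 1)
theta = rng.normal(0, 1, n)                   # ground-truth construct level

# Behavioral feature: noisier for subgroup 1 (hypothetical data bias).
feature = theta + rng.normal(0, 1, n) * (0.5 + 0.5 * group)

# Label bias: observed training labels are shifted upward for subgroup 1.
y_observed = theta + 0.5 * group + rng.normal(0, 0.3, n)

X = np.column_stack([feature, group])
model = LinearRegression().fit(X, y_observed)
pred = model.predict(X)

for g in (0, 1):
    mask = group == g
    # Manifestation 1: different predicted levels despite equal ground truth.
    print(f"group {g}: mean theta = {theta[mask].mean():+.2f}, "
          f"mean prediction = {pred[mask].mean():+.2f}")
    # Manifestation 2: differential accuracy against the unbiased ground truth.
    r = np.corrcoef(theta[mask], pred[mask])[0, 1]
    print(f"group {g}: accuracy vs. ground truth r = {r:.2f}")
```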

Psychological Measurement in the Information Age: Machine-Learned Computational Models

Summary

This paper discusses how psychological science can benefit from and contribute to emerging approaches in computing and information sciences, focusing on machine-learned computational models (MLCMs). The authors highlight the potential of MLCMs to transform psychological measurement by combining the prowess of computers with human inferencing abilities, enabling real-time analysis of unstructured data sets and improving objectivity. They explain the process of developing MLCMs through supervised machine learning and contrast them with traditional computational models. The paper emphasizes the importance of considering context and intended use when interpreting MLCM performance, as well as addressing concerns related to fairness, bias, interpretability, and responsible use.

  • Psychological science can benefit from emerging approaches in computing and information sciences

  • Machine-learned computational models (MLCMs) can transform psychological measurement

  • MLCMs combine computer capabilities with human inferencing abilities

  • MLCMs enable real-time analysis of unstructured data sets and improve objectivity

  • Development of MLCMs involves supervised machine learning techniques (see the sketch after this list)

  • MLCMs are contrasted with traditional computational models

  • Importance of considering context and intended use when interpreting MLCM performance

  • Addressing concerns related to fairness, bias, interpretability, and responsible use in MLCMs adoption

  • Advocacy for the adoption of MLCMs in psychological science for enhanced measurement practices and research advancements
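
As a concrete illustration of the supervised pipeline described above, here is a minimal sketch of a "standard" (prespecified-features) text-based MLCM: human-coded ratings act as the supervisory annotations, n-grams are the features, and machine-human convergence on held-out text is the check. The mini-corpus and ratings are invented for illustration.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

# Invented mini-corpus standing in for human-annotated open-ended responses.
careful = ["i plan ahead and double check my work",
           "i keep detailed lists and finish tasks early",
           "i review every detail before submitting"]
casual = ["i improvise and go with the flow",
          "i often start things at the last minute",
          "i lose track of deadlines sometimes"]
texts = careful * 5 + casual * 5                            # 30 documents
coder_ratings = [4.5, 4.0, 4.8] * 5 + [2.0, 1.8, 2.2] * 5   # the annotations

X_train, X_test, y_train, y_test = train_test_split(
    texts, coder_ratings, test_size=0.3, random_state=0)

# Standard MLCM recipe: prespecified n-gram features plus a shallow learner.
vec = TfidfVectorizer(ngram_range=(1, 2))
model = Ridge(alpha=1.0).fit(vec.fit_transform(X_train), y_train)

# Once trained, the model scores unseen text with no human in the loop.
machine_scores = model.predict(vec.transform(X_test))
print("machine-human convergence r =",
      round(float(np.corrcoef(y_test, machine_scores)[0, 1]), 2))
```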

How Well Can an AI Chatbot Infer Personality? Examining Psychometric Properties of Machine-inferred Personality Scores

Summary

This paper explores the feasibility of measuring personality indirectly through an Artificial Intelligence (AI) chatbot. The study examines the psychometric properties of machine-inferred personality scores, including reliability, factorial validity, convergent and discriminant validity, and criterion-related validity. The research involved undergraduate students engaging with an AI chatbot and completing a self-report Big-Five personality measure. Key findings indicate that machine-inferred personality scores showed acceptable reliability, comparable factor structure to self-reported scores, good convergent validity but relatively poor discriminant validity, low criterion-related validity, and incremental validity over self-reported scores in some analyses.

  • The study explores measuring personality through an AI chatbot using machine learning algorithms.

  • Participants were undergraduate students who engaged with an AI chatbot and completed a self-report Big-Five personality measure.

  • Machine-inferred personality scores showed acceptable reliability, a factor structure comparable to self-reported scores, good convergent validity but poor discriminant validity, low criterion-related validity, and incremental validity over self-reported scores in some analyses (illustrated in the sketch after this list).

  • The research emphasizes the need for further validation and examination of the psychometric properties of machine-inferred personality scores.

  • The study discusses the potential of AI-based personality assessment and its implications for future research and practical applications.
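
Below is a minimal sketch of the convergent/discriminant checks applied to machine-inferred Big Five scores, using simulated placeholder data rather than the study's. The machine scores are constructed to track the target trait while also sharing variance across traits, which inflates the heterotrait cross-method correlations, a rough analogue of the "good convergent, weaker discriminant validity" pattern.

```python
import numpy as np
import pandas as pd

traits = ["O", "C", "E", "A", "N"]
rng = np.random.default_rng(1)
n = 300

# Placeholder data: independent latent traits plus method noise.
latent = rng.normal(size=(n, 5))
self_report = pd.DataFrame(latent + rng.normal(0, 0.5, (n, 5)), columns=traits)

# Machine scores blend the target trait with variance shared across traits.
shared = latent.mean(axis=1, keepdims=True)
machine = pd.DataFrame(0.5 * latent + 0.8 * shared + rng.normal(0, 0.5, (n, 5)),
                       columns=traits)

# Cross-method correlations (rows: machine-inferred, columns: self-report).
cross = pd.DataFrame([[machine[t].corr(self_report[s]) for s in traits]
                      for t in traits], index=traits, columns=traits)
print(cross.round(2))

convergent = float(np.diag(cross).mean())   # same trait, different method
discriminant = float(cross.values[~np.eye(5, dtype=bool)].mean())
print(f"mean convergent r = {convergent:.2f}; "
      f"mean heterotrait-heteromethod r = {discriminant:.2f}")
```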

Machine learning uncovers the most robust self-report predictors of relationship quality across 43 longitudinal couples studies

Summary

This paper explores the predictors of relationship quality in romantic relationships using machine learning techniques across 43 longitudinal datasets from 29 laboratories. The study aimed to quantify the predictability of relationship quality and identify the key predictors. The main findings include that relationship-specific predictors such as perceived-partner commitment, appreciation, and sexual satisfaction, as well as individual difference predictors like life satisfaction and attachment styles, were significant in predicting relationship quality. Actor-reported variables were found to be more predictive than partner-reported variables, and individual differences and partner reports did not have additional predictive effects beyond actor-reported relationship-specific variables. The study also found that changes in relationship quality over time were largely unpredictable from self-report variables. This research contributes to understanding the factors influencing relationship quality and highlights the importance of individual perceptions and experiences in shaping relationship outcomes.

  • Relationship quality is a crucial psychological construct with significant implications for health and well-being.

  • Machine learning techniques, specifically Random Forests, were used to analyze 43 longitudinal datasets from 29 laboratories (see the model-comparison sketch after this list).

  • Key predictors of relationship quality included perceived-partner commitment, appreciation, sexual satisfaction, and individual differences like life satisfaction and attachment styles.

  • Actor-reported variables were more predictive than partner-reported variables.

  • Individual differences and partner reports did not add predictive value beyond actor-reported relationship-specific variables.

  • Changes in relationship quality over time were largely unpredictable from self-report variables.
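
The headline model comparison can be sketched as follows, with simulated stand-in data in place of the 43 datasets: fit Random Forests to an "actor-reported relationship variables only" predictor set and to an "all predictors" set, then compare cross-validated performance. Comparable scores would indicate no incremental predictive value for the added variables. All names and effect sizes below are invented.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(2)
n = 600

# Stand-in data: actor-reported relationship variables drive the outcome;
# individual differences and partner reports merely correlate with them.
actor_rel = rng.normal(size=(n, 4))     # e.g., commitment, appreciation, ...
indiv_diff = 0.5 * actor_rel[:, :2] + rng.normal(0, 1, (n, 2))
partner_rep = 0.4 * actor_rel + rng.normal(0, 1, (n, 4))
quality = actor_rel @ np.array([0.5, 0.4, 0.3, 0.2]) + rng.normal(0, 0.7, n)

def cv_r2(X):
    model = RandomForestRegressor(n_estimators=200, random_state=0)
    return cross_val_score(model, X, quality, cv=5, scoring="r2").mean()

print("actor relationship only:", round(cv_r2(actor_rel), 3))
print("all predictors:         ",
      round(cv_r2(np.hstack([actor_rel, indiv_diff, partner_rep])), 3))
# Comparable R^2 values imply the extra sets add no incremental validity.
```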

Figure captions (carried over from the articles above):

From the MLMB paper:

  • Fig. 1. Simplified process of machine-learning modeling.

  • Fig. 2. Measurement bias (MB) and machine-learning measurement bias (MLMB). Case 1 represents a noncompensatory bias that creates different predicted subgroup distributions despite the same underlying subgroup distributions; Case 2 represents a compensatory bias that creates equivalent predicted subgroup distributions even though measurement bias is present.

  • Fig. 4. Expanding the Brunswik lens model to identify the sources of machine-learning measurement bias, illustrated with personality as the focal construct. Highlighted areas represent possible sources of MLMB; "platform-based personality" is the personality construct measured by the input data (e.g., online personality assessed via social media data) used in machine-learning models to predict self-report personality.

From the MLCM paper:

  • Fig. 2. The four main approaches to computational modeling, which differ in whether features, parameters, and structure are prespecified. Handcrafted and traditional psychological models are more explanatory, whereas standard machine-learning and deep-neural-learning models are more predictive; the approaches are not mutually exclusive, and blended approaches combine explanation and prediction goals. Deep-neural-learning models can take raw data directly, whereas the other model types first compute prespecified features from the raw data. Parameters are prespecified for handcrafted models only, and model structure is prespecified for both types of traditional models. Annotations are human-provided labels that supply the supervisory signal needed in the model-training phase of supervised machine learning; traditional models technically need a response variable, but such variables are not considered annotations.

  • Fig. 3. The basic pipeline for training standard machine-learned computational models (MLCMs); some steps are involved in the training process only and are skipped once an MLCM has been trained.

  • Fig. 4. Standard versus deep-learning approaches for training an MLCM to classify types of spoken discourse from audio. In the standard approach, n-grams derived from the training data feed a random-forest model; in the deep-learning approach, contextual semantics are learned from large corpora in a pretraining phase, and the deep neural network is then fine-tuned on the training data.

  • Fig. 5. Selected example cases of MLCMs in four domains of psychological assessment, aligned with four bands of action for the input modality and psychological construct assessed.

From the PNAS paper:

  • Antecedents and consequences of relationship quality: a schematic depiction of the field of relationship science. Relationship scientists use an extensive assortment of overlapping individual-difference and relationship-specific constructs; these constructs predict how couple members behave toward and interact with each other, which in turn affects relationship quality and a variety of consequential outcomes, all embedded in social networks and broader cultural and historical structures.

  • Mediational pathway implied by the current findings: the equivalent predictive power of the "all predictors" and "actor relationship" models implies that any effects of self-reported individual differences or partner-reported relationship variables on relationship quality are likely mediated by the actor-reported relationship variables. Interaction terms (individual difference × relationship variable, actor × partner) are not depicted because they are likely quite small, and broader contextual constructs are not depicted because they were not examined in this study.
Article links:

  • A Conceptual Framework for Investigating and Mitigating Machine-Learning Measurement Bias (MLMB) in Psychological Assessment: https://journals-sagepub-com.proxy2.library.illinois.edu/doi/full/10.1177/25152459211061337

  • Psychological Measurement in the Information Age: Machine-Learned Computational Models: https://journals-sagepub-com.proxy2.library.illinois.edu/doi/full/10.1177/09637214211056906

  • Machine learning uncovers the most robust self-report predictors of relationship quality across 43 longitudinal couples studies (Proceedings of the National Academy of Sciences)