
[06/02/2024] HCI area - Quantitative and Qualitative Modeling and Evaluation


The original blog is here. It is a cool blog; I have only added some PDFs to this copy.

Introduction

Two activities go hand in hand in the majority of HCI research: modeling and evaluation. Modeling addresses what you know about the user and, often, their surrounding social and physical environment. A variety of existing models, such as the Model Human Processor, and modeling techniques, such as Contextual Inquiry, address differing domains and levels of specificity. Models may be used to predict performance, organize field data, and describe potential interactions with a computer interface. As you read, examine the various models and modeling techniques that provide the foundation for the research. When will these models be useful in other research settings? What do you need to know to complete a model? How can you gather that information?

One use of models is to inform the evaluation of an interface. The two activities are linked because the specificity and domain of the models constrain the questions that can be addressed in an evaluation. You will notice that specific, quantitative models are used to inform specific, quantitative evaluations. Likewise, more general, qualitative models are often the basis for various qualitative studies. The feasibility of combining various evaluation techniques is influenced by the compatibility of the underlying models: if the models make conflicting assumptions about the user, perhaps even disagreeing on what can or cannot be known, then the validity of combining the evaluation techniques is in question.

One of the distinguishing characteristics of the HCI area in Computer Science is the importance of evaluating how any computer-assisted system impacts its intended user population. Evaluation in HCI (and other human-centered disciplines) is quite different from evaluation in other areas of Computer Science, mainly because it is sometimes hard to construct experiments or observations that give definitive quantitative answers regarding the merit of one system over another. Instead, evaluation in HCI consists of demonstrating a scientific approach to answering questions about a system's relative merit in its context of use. This approach can draw on a myriad of techniques. Sometimes a very reliable quantitative result is derivable, as is the case in narrowly focused human motor observations such as a Fitts' Law experiment or a Keystroke-Level Model analysis. Other times, when the impact on work practices is sought, it is nearly impossible to control all influences in a natural setting. A student of HCI should become familiar with the variety of evaluation techniques and develop a sense of when each is suitable.

One of the best ways to develop the ability to critique evaluation approaches is to read examples of evaluation work in the literature. As you read, critique the research based on the repeatability of the experimentation (could a competent researcher reproduce the findings following the procedures described by the authors?) and the strength of the analysis and conclusions (did the authors do enough to convince you of their evaluation results?). These criteria are particularly suited to assessing quantitative results; although they can also be used in the assessment of qualitative research, another useful criterion there is to ask about the depth of explanation of the particular phenomena being reported.

One way to organize the information that you gather is to fill in this simple 2x2 matrix:

|              | Modeling | Evaluation |
| ------------ | -------- | ---------- |
| Quantitative |          |            |
| Qualitative  |          |            |

You should pay attention to the horizontal connections between modeling and evaluation techniques. Likewise, notice the connections and disconnections between quantitative and qualitative techniques.

General Resources

Modeling

Fitts' Law, the Model Human Processor, and GOMS

Many quantitative models arise from the Human Factors literature. Some models are best suited for describing expert (decision-free), simple motor and cognitive activities. The most well-known examples are Fitts' Law and GOMS. There have been numerous examinations of Fitts' Law in the context of graphical user interface design, and Bill Buxton has published several papers on applications and extensions of Fitts' Law. A good example is the paper below (a small computational sketch of the law follows it):

  • I. Scott MacKenzie, William Buxton. (1992) Extending Fitts' Law to Two-Dimensional Tasks. Proceedings of ACM CHI'92 Conference on Human Factors in Computing Systems pp. 219-226.
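
To make the model concrete, here is a minimal sketch of the Shannon formulation of Fitts' Law used in MacKenzie's work, MT = a + b * log2(D/W + 1). The intercept `a` and slope `b` below are illustrative placeholders, not fitted values; in practice they are estimated by regression from pointing data for a particular device and user population.

```python
import math

# Fitts' Law, Shannon formulation: MT = a + b * log2(D / W + 1),
# where D is the distance to the target and W is its width.
# NOTE: a and b are placeholder constants for illustration only;
# real values must be fit from empirical pointing data.

def fitts_movement_time(distance: float, width: float,
                        a: float = 0.1, b: float = 0.15) -> float:
    """Predicted movement time in seconds to acquire a target."""
    index_of_difficulty = math.log2(distance / width + 1)  # in bits
    return a + b * index_of_difficulty

# A large, nearby target is predicted to be faster than a small, distant one:
print(fitts_movement_time(distance=100, width=50))  # ID ~ 1.58 bits
print(fitts_movement_time(distance=800, width=10))  # ID ~ 6.34 bits
```

Note how the model reduces an interface comparison to two interpretable parameters: `a` captures fixed overhead time and `b` the cost per bit of target difficulty.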

GOMS is based on a well-known model of human cognition and behavior, the Model Human Processor. The following book describes this model as well as the Keystroke-Level Model, first defined by Card, Moran and Newell (a worked sketch of the Keystroke-Level Model appears after the references below):

  • Card, S.K., Moran, T.P., & Newell, A. (1983). The Psychology of Human-Computer Interaction. Lawrence Erlbaum.

This work is the foundation for the GOMS family of evaluation techniques. GOMS has been one of the few widely known theoretical concepts in human-computer interaction. Two recent and good survey articles on the history and applications of GOMS are:

  • Bonnie E. John and David E. Kieras. (1996) Using GOMS for User Interface Design and Evaluation: Which Technique? ACM Transactions on Computer-Human Interaction, v.3 n.4 p.287-319.

  • Bonnie E. John and David E. Kieras. (1996) The GOMS Family of User Interface Analysis Techniques: Comparison and Contrast. ACM Transactions on Computer-Human Interaction, v.3 n.4, p.320-351.
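
As a concrete illustration, here is a minimal sketch of a Keystroke-Level Model prediction. The operator durations are the commonly cited KLM estimates from this literature (e.g., K of about 0.2 s for a skilled typist, M of about 1.35 s for mental preparation); treat them as rough defaults rather than authoritative constants.

```python
# Keystroke-Level Model (KLM): predict expert, error-free task time by
# summing standard operator times. Values are commonly cited estimates.
KLM_SECONDS = {
    "K": 0.20,  # press a key or button (skilled typist)
    "P": 1.10,  # point with a mouse to a target on screen
    "B": 0.10,  # press or release the mouse button
    "H": 0.40,  # home hands between keyboard and mouse
    "M": 1.35,  # mental preparation
}

def klm_time(operators: str) -> float:
    """Total predicted time in seconds for a sequence of KLM operators."""
    return sum(KLM_SECONDS[op] for op in operators)

# Example: prepare, point at a menu, click, prepare, then type three keys.
print(round(klm_time("MPBMKKK"), 2))  # 4.5 seconds
```

Comparing two candidate interaction sequences is then just a matter of encoding each as an operator string and comparing the sums, which is exactly the kind of specific, quantitative question this class of models supports.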

Other Theories of Human Cognition

Other theories of human cognition examine the relationship between information in the head, such as a plan, and information in the world, such as a written to-do list. These theories may be the basis for both qualitative and quantitative models and evaluation techniques. Three contrasting theories are situated action, activity theory, and distributed cognition:

  • Suchman, L. A. (1987). Plans and Situated Actions: The Problem of Human-Machine Communication. Cambridge: Cambridge University Press. (Also see http://jupiter.informatik.umu.se/~mjson/hcipd/suchman.html.) Also available as Human-Machine Reconfigurations: Plans and Situated Actions, 2nd edition (2006).

  • Nardi, B. (1996). Activity Theory and HCI & Studying Context. In Bonnie Nardi (Ed.), Context and Consciousness: Activity Theory and Human-Computer Interaction. Cambridge: MIT Press.

  • Halverson, C. (2002). Activity Theory and Distributed Cognition: Or What Does CSCW Need to Do with Theories? Journal of CSCW.

  • Hutchins, E. (1995). Cognition in the Wild. MIT Press. (Also see http://www.cogs.susx.ac.uk/users/yvonner/dcog.html.)

Interaction Models

Some useful models explicitly place the user interacting with a computer interface. These models can be used to compare different interface designs. See:

  • Hutchins, Hollan, and Norman (1986). Direct Manipulation Interfaces. In Donald Norman and Stephen Draper (Eds.), User Centered System Design, pp. 87-124.

  • Michel Beaudouin-Lafon (2000) Instrumental interaction: an interaction model for designing post-WIMP user interfaces, Proceedings of CHI'2000, pages 446-453.

Contextual Inquiry and Design

Contextual Inquiry is a set of methods for gathering qualitative information about human activity in a complex, social setting. A variety of models are used to represent these multivariate environments. Contextual Design is a methodology for using these models to inform an interface design.

  • Beyer, H., & Holtzblatt, K. (1998). Contextual Design: Defining Customer-Centered Systems. San Francisco: Morgan Kaufmann.

Gathering Qualitative Data

A common method for gathering qualitative data is interviewing. This short book is an indispensable guide:

  • Interviewing as Qualitative Research, by I.E. Seidman

Two texts that cover the collection and analysis of qualitative data are:

  • Analysing Social Settings by Lofland and Lofland

  • Strauss, A., & Corbin, J. Basics of Qualitative Research: Grounded Theory Procedures and Techniques (now in its 3rd edition).

Many researchers now promote using ethnographic techniques to gather data about complex, social settings. As an example, see:

  • Hughes, Sommerville, Bentley, & Randall. (1993). Designing with ethnography: Making work visible. Interacting with Computers, 5(2), 239-253.

Evaluation

Quantitative vs. Qualitative

The most basic distinction is between a quantitative and a qualitative evaluation. In a quantitative evaluation, the purpose is to come up with some objective metric of human performance that can be used to compare interaction phenomena. This can be contrasted with a qualitative evaluation, in which the purpose is to derive a deeper understanding of the human interaction experience. A typical example of a quantitative evaluation is the empirical user study, a controlled experiment in which some hypothesis about interaction is tested through direct measurement (a minimal sketch appears below). A typical example of a qualitative evaluation is an open-ended interview with relevant users.
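
As a minimal sketch of the quantitative side, the snippet below compares task-completion times for two hypothetical interface designs with an independent-samples t-test. The numbers are invented purely for illustration, and a real study would also check the test's assumptions and report effect sizes.

```python
from scipy import stats

# Task-completion times in seconds for two interface designs.
# These numbers are invented for illustration only.
design_a = [12.1, 10.4, 11.8, 13.0, 9.7, 12.5, 11.2, 10.9]
design_b = [14.2, 13.5, 15.1, 12.9, 14.8, 13.7, 15.5, 14.0]

# Independent-samples t-test: do the mean completion times differ?
result = stats.ttest_ind(design_a, design_b)
print(f"t = {result.statistic:.2f}, p = {result.pvalue:.4f}")
```

A small p-value licenses only the narrow claim that the measured means differ; why users were slower is the kind of question the qualitative techniques above are designed to answer.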

Evaluation Techniques

There are a number of established evaluation techniques that are useful in different situations. When reading about these techniques, focus on understanding when each technique is valid and what model of human behavior underlies it.

Cognitive Walkthrough

The cognitive walkthrough is another example of a theory-based evaluation technique.

  • Peter Polson, Clayton Lewis, John Rieman, and Cathleen Wharton (1992). Cognitive Walkthroughs: A Method for Theory-Based Evaluation of User Interfaces. International Journal of Man-Machine Studies, v.36 n.5, p.741-773.

Discount usability

In contrast to the theory-based techniques, there is a range of evaluation techniques that are more practically based, relying less on a foundational theory of human performance or cognition. Two good examples of this class of evaluation techniques are questionnaires and heuristic evaluation; see Saul Greenberg's discussion of these techniques. A scoring sketch for one standard questionnaire follows.
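
As one concrete questionnaire example, here is a sketch of the standard scoring rule for the System Usability Scale (SUS). SUS is not named in the text above; it is brought in only to illustrate the mechanics of a typical discount-usability instrument. Odd items are positively worded and contribute (response - 1), even items are negatively worded and contribute (5 - response), and the total is scaled by 2.5 to a 0-100 range.

```python
def sus_score(responses: list[int]) -> float:
    """Score a 10-item System Usability Scale questionnaire (0-100).

    Responses are on a 1-5 scale. Odd-numbered items (index 0, 2, ...)
    are positively worded; even-numbered items are negatively worded.
    """
    assert len(responses) == 10 and all(1 <= r <= 5 for r in responses)
    contributions = [
        (r - 1) if i % 2 == 0 else (5 - r)
        for i, r in enumerate(responses)
    ]
    return sum(contributions) * 2.5

# One participant's responses to items 1-10:
print(sus_score([4, 2, 5, 1, 4, 2, 5, 2, 4, 1]))  # 85.0
```

Scores from several participants are typically averaged, which is about as "discount" as quantitative evaluation gets: cheap to administer, but it says little about why an interface scored as it did.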

Summative vs. Formative

An important question to ask when performing evaluation is when to perform it with respect to the overall life cycle of a system. Formative evaluation occurs before much has been invested in implementing a design, whereas summative evaluation occurs after a full system has been deployed. Many evaluation techniques can be employed in either a formative or a summative mode, but it is important to understand how a technique's role differs when it is applied before versus after an artifact has been implemented. You must also take into account the co-evolutionary influence of human tasks and interaction technology.

What is enough evaluation?

It is also important to understand that within the HCI research community there are different expectations for evaluation. We should not expect the same amount of evaluation effort in a paper about a toolkit supporting multimodal gesture recognition as in a paper concerned with the impact of some existing technology in a domestic environment. When reading a research paper in the HCI area, you need to determine what the appropriate expectations for user-centered evaluation should be and judge accordingly. Remember that all systems have users (a programmer uses a toolkit), and proper consideration of the needs of that user should always be apparent in HCI research.

Surveys and detailed coverage of many modeling and evaluation techniques are provided in CS 6750: Introduction to HCI. Follow-on courses such as CS 6455: User Interface Design and Evaluation address qualitative methods, while courses such as PSYC 6018: Principles of Research Design and PSYC 7101: Engineering Psychology I: Methods address quantitative methods. Being familiar with both, as well as being able to pick which methods are suitable for your own research, is an essential skill for an HCI researcher.

Many papers in the SIGCHI conference series on Human Factors in Computing Systems, also known as the CHI conference, include significant modeling and evaluation work, both quantitative and qualitative. This is also true of the CSCW conference series (both the ACM CSCW conference and the European ECSCW series), though CSCW research tends to include more qualitative modeling and evaluation. The ACM UIST conference usually does not emphasize modeling and evaluation as much, but there are occasional stellar papers that provide a judicious balance between technology development and evaluation.

