Tools of Machine Learning in Psychometrics
Rudolf Debelak & David Goretzko
Full day short course (9:00am – 5:00pm)
Short Course #2
Participants of this workshop will explore basic concepts and methods of machine learning and deep learning and will learn basic applications and examples in the open-source language R. Examples of such applications include the prediction of numerical variables as well as natural language processing.
This workshop will cover four main topics: a) a theoretical overview of some core techniques and concepts of machine learning, b) basic concepts of deep learning and recent architectures such as transformers, c) the implementation of these methods in R packages, and d) natural language processing, text generation and other applications of these methods.
Intended Audience
Participants should already have some basic knowledge about working with R, and should be familiar with basic statistical models, such as linear and logistic regression and decision trees. The target audience of this workshop encompasses undergraduate and graduate students as well as researchers and practitioners in academic and industry roles who are interested in machine learning and related topics such as deep learning and natural language processing, and how they might apply these tools to their own work.
Summary
Machine learning, deep learning and related topics such as pre-trained large language models have found numerous applications in academic research and industry. A wealth of research has discussed possible applications of these methods in psychometrics and psychological testing, including the automated scoring of essays and text generation.
Modern tools of machine learning build on more basic, widely known concepts and models from statistics, such as the linear and logistic regression models. Machine learning models aim to generalize these well-known models in several respects, such as the modeling of nonlinear relationships and predictions based on non-numerical data, such as texts and pictures.
The resulting flexibility of machine learning methods leads to important challenges, such as the estimation of model parameters, the selection of a suitable model and model architecture for the problem at hand, the assessment of the accuracy of the model predictions, and the explanation of those predictions. Moreover, a wealth of software packages in R allows the application of these methods. This workshop aims to guide participants through the available algorithms, point to practical software implementations in R, and demonstrate how these techniques can be applied to solve problems in psychology and education. We will also cover central concepts of machine learning, such as the definition of training, validation and test data, and the use of pre-trained models, which are crucial for successful applications of machine learning models.
The purpose of this workshop is threefold:
- We illustrate modern algorithms of machine learning, such as random forests or artificial neural networks, and shed light on how they build on well-known, simpler models such as decision trees or linear regression.
- We demonstrate how simple and advanced machine learning models can be applied using R software, including advanced methods from the field of natural language processing.
- We outline the basic ideas of modern applications of machine learning methods, such as natural language processing, and discuss their application in R.
The workshop will include theoretical introductions as well as practical examples and exercises in R software. We encourage workshop participants to bring their own laptops with R pre-installed so that they can easily follow these examples. R is available for Windows, Linux and macOS operating systems. Hand-outs and R scripts will be made available before the workshop.
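The distinction between training, validation and test data mentioned above can be illustrated with a minimal base R sketch. It is not taken from the workshop materials: the built-in `mtcars` data set, the roughly 60/20/20 split, the seed, the two candidate models and the helper `rmse()` are all assumptions chosen for illustration. The training data are used to fit the candidate models, the validation data to choose between them, and the test data only once, to estimate the prediction error of the chosen model.

```r
# Minimal sketch (assumed example, not workshop material):
# training / validation / test split using base R only.
set.seed(1)                       # arbitrary seed, for reproducibility

data(mtcars)                      # built-in example data, 32 rows
n <- nrow(mtcars)

# Shuffle the row indices and split them roughly 60 / 20 / 20.
idx   <- sample(n)
n_tr  <- floor(0.6 * n)
n_val <- floor(0.2 * n)
train <- mtcars[idx[1:n_tr], ]
valid <- mtcars[idx[(n_tr + 1):(n_tr + n_val)], ]
test  <- mtcars[idx[(n_tr + n_val + 1):n], ]

# Hypothetical helper: root mean squared prediction error for mpg.
rmse <- function(model, newdata) {
  sqrt(mean((newdata$mpg - predict(model, newdata = newdata))^2))
}

# Two candidate models, both fitted on the training data only.
candidates <- list(
  linear    = lm(mpg ~ wt, data = train),
  quadratic = lm(mpg ~ poly(wt, 2), data = train)
)

# Model selection uses the validation data ...
val_rmse <- sapply(candidates, rmse, newdata = valid)
best     <- names(which.min(val_rmse))

# ... and the test data are touched once, for the final error estimate.
test_rmse <- rmse(candidates[[best]], test)

print(val_rmse)
cat("Selected model:", best, "\n")
cat("Test RMSE:", test_rmse, "\n")
```

With only 32 observations the resulting error estimates are very noisy; in practice, resampling schemes such as cross-validation are often preferred over a single split for data sets of this size.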
References
- Kjell, O., Giorgi, S., & Schwartz, H. A. (2023). The text package: An R package for analyzing and visualizing human language using natural language processing and transformers. Psychological Methods, 28(6), 1478–1498. https://doi.org/10.1037/met0000542
- Pargent, F., Schoedel, R., & Stachl, C. (2023). Best practices in supervised machine learning: A tutorial for psychologists. Advances in Methods and Practices in Psychological Science, 6(3). https://doi.org/10.1177/25152459231162559
- Urban, C. J., & Gates, K. M. (2021). Deep learning: A primer for psychologists. Psychological Methods, 26(6), 743–773. https://doi.org/10.1037/met0000374
About the instructors
Rudolf Debelak
Rudolf Debelak is a Senior Researcher at the Chair of Psychological Methods, Evaluation and Statistics at the University of Zurich, Switzerland. His research interests include psychometrics, with a focus on item response theory, machine learning, and the mathematical and statistical foundations of psychological research methods. He has degrees in psychology and mathematics and received a PhD in Quantitative Psychology from the University of Vienna as well as a Habilitation in Psychological Methods from the University of Zurich. His teaching includes basic and advanced courses on statistics, data science in R and Python, and machine learning. Before working in academia, he was employed in the psychological test industry for several years.
David Goretzko
David Goretzko is an Assistant Professor of Methodology and Statistics at Utrecht University (UU). He holds degrees in physics, psychology and statistics and received both a PhD and Habilitation in Psychological Methods from Ludwig-Maximilians-University (LMU) in Munich. His research combines psychometrics and latent variable modeling with machine learning and meta-heuristics. Beyond this, he explores the potential of machine learning and its cost-sensitive extensions to address substantive research questions in psychology and psychological assessment. His broad teaching background includes statistics, data science, machine learning and R programming courses in Bachelor, Master and PhD programs, summer schools and workshops, as well as statistical consulting for PhD students and faculty members at both UU and LMU.