Susu Zhang, Columbia University
Uncovering cross-situational consistency with canonical correlation analysis of process data
The process data collected from computer-based interactive items, which document the sequence of actions that test-takers perform during problem-solving, can contain a great amount of information about the test-takers: they not only reflect test-takers’ proficiency on the measured trait but may also reveal consistent behavioral patterns that carry over from one assessment item to another, including problem-solving strategies, response styles, and preferences. Through these cross-situational consistencies in test-takers’ problem-solving processes, test developers can get a glimpse of each test-taker’s cognitive and behavioral profile, as well as the different types of individual differences that the items can reveal. The current study proposes an approach to uncovering consistent behavioral patterns in pairs of items through canonical correlation analysis (CCA) of the log data. Actions that are highly correlated across two test items (e.g., individuals who use strategy A on item 1 are very likely to use strategy B on item 2) can be identified by extracting and inspecting the canonical variables. Three approaches to extracting canonical variables are considered: (1) linear CCA applied to process features extracted with multidimensional scaling, (2) deep CCA applied directly to the action sequences, and (3) sparse CCA of n-gram features extracted from the log data. Interactive visualization tools are developed to aid the empirical interpretation of the extracted canonical variables. The proposed methods are applied to the log data from the Problem Solving in Technology-Rich Environments assessment in the 2012 Programme for the International Assessment of Adult Competencies.
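As a rough illustration of approach (1), the sketch below runs a textbook linear CCA on two feature matrices, one per item. It is not the study's implementation: the function name `linear_cca`, the ridge term `reg`, and the synthetic data are assumptions made for this example, and the feature-extraction step (multidimensional scaling of the action sequences) is not shown. The data are simulated so that one feature of item 1 and one feature of item 2 share a latent behavioral tendency.

```python
import numpy as np


def linear_cca(X, Y, n_components=2, reg=1e-6):
    """Textbook linear CCA via whitening and SVD (illustrative sketch)."""
    # Center each feature block.
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    n = X.shape[0]

    # Sample covariances; a small ridge term (an assumption of this
    # sketch) keeps the within-block covariances invertible.
    Sxx = X.T @ X / (n - 1) + reg * np.eye(X.shape[1])
    Syy = Y.T @ Y / (n - 1) + reg * np.eye(Y.shape[1])
    Sxy = X.T @ Y / (n - 1)

    def inv_sqrt(S):
        # Inverse matrix square root of a symmetric positive-definite matrix.
        w, V = np.linalg.eigh(S)
        return V @ np.diag(w ** -0.5) @ V.T

    Kx, Ky = inv_sqrt(Sxx), inv_sqrt(Syy)

    # Singular values of the whitened cross-covariance are the
    # canonical correlations.
    U, s, Vt = np.linalg.svd(Kx @ Sxy @ Ky)
    A = Kx @ U[:, :n_components]     # canonical weights for item 1 features
    B = Ky @ Vt.T[:, :n_components]  # canonical weights for item 2 features
    return A, B, s[:n_components]


# Synthetic stand-in for process features of 500 test-takers on two items.
rng = np.random.default_rng(0)
n = 500
z = rng.normal(size=(n, 1))  # hypothetical shared behavioral tendency
X = np.hstack([z + 0.3 * rng.normal(size=(n, 1)), rng.normal(size=(n, 3))])
Y = np.hstack([rng.normal(size=(n, 2)), -z + 0.3 * rng.normal(size=(n, 1))])

A, B, corrs = linear_cca(X, Y)

# Canonical variables for the first component, one per item.
u1 = (X - X.mean(axis=0)) @ A[:, 0]
v1 = (Y - Y.mean(axis=0)) @ B[:, 0]
first_pair_corr = np.corrcoef(u1, v1)[0, 1]
print("canonical correlations:", corrs)
```

In this toy setting, the features with the largest absolute weights in the first columns of `A` and `B` are the ones that co-vary across the two items, which is the kind of inspection the canonical variables are meant to support.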