Towards more replicable content analysis for learning analytics

Publisher:
ACM
Publication Type:
Conference Proceeding
Citation:
LAK23: 13th International Learning Analytics and Knowledge Conference, 2023, pp. 303-314
Issue Date:
2023-03-13
File:
3576050.3576096.pdf (Published version, Adobe PDF, 529.92 kB)
Abstract:
Content analysis (CA) is a method frequently used in the learning sciences and, as such, increasingly applied in learning analytics (LA). Despite this ubiquity, CA is a subtle method, with many complexities and decision points affecting the outcomes it generates. Although it appears to be a neutral quantitative approach, coding CA constructs requires attention to decision making and context that aligns it with a more subjective, qualitative interpretation of data. Despite these challenges, we increasingly see the labels in CA-derived datasets used as training sets for machine learning (ML) methods in LA. However, the scarcity of widely shareable datasets means research groups usually work independently to generate labelled data, with few attempts made to compare practice and results across groups. A risk is emerging that different groups are coding constructs in different ways, leading to results that will not prove replicable. We report on two replication studies using a previously reported construct. A failure to achieve high inter-rater reliability suggests that coding of this scheme is not currently replicable across different research groups. We point to the potential dangers this result poses for those who would use ML to automate the detection of various educationally relevant constructs in LA.
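
For readers approaching the reliability result, the sketch below shows the kind of agreement check the replication hinges on. The abstract does not state which reliability statistic the study used; Cohen's kappa is assumed here purely for illustration, and the coder labels are hypothetical.

# A minimal sketch of an inter-rater reliability check between two
# independent coders applying the same CA scheme. Cohen's kappa is an
# assumption for illustration; the paper's actual statistic is not
# stated in this record, and the label data below are hypothetical.
from sklearn.metrics import cohen_kappa_score

# Hypothetical codes assigned by two coders to the same ten text
# segments (1 = construct present, 0 = construct absent).
coder_a = [1, 0, 1, 1, 0, 1, 0, 0, 1, 1]
coder_b = [1, 0, 0, 1, 0, 1, 1, 0, 1, 0]

# Cohen's kappa corrects raw percent agreement for chance agreement;
# values near 0 indicate agreement no better than chance.
kappa = cohen_kappa_score(coder_a, coder_b)
print(f"Cohen's kappa: {kappa:.2f}")

Under common benchmarks such as Landis and Koch's, kappa values below roughly 0.6 fall short of substantial agreement; a failure of this kind across research groups is what the paper identifies as a risk for ML training sets built on CA-derived labels.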