Towards more replicable content analysis for learning analytics
- Publisher: ACM
- Publication Type: Conference Proceeding
- Citation: LAK23: 13th International Learning Analytics and Knowledge Conference, 2023, pp. 303-314
- Issue Date: 2023-03-13
Filename | Description | Size
---|---|---
3576050.3576096.pdf | Published version | 529.92 kB
This item is closed access and not available.
Content analysis (CA) is a method frequently used in the learning sciences and so increasingly applied in learning analytics (LA). Despite this ubiquity, CA is a subtle method, with many complexities and decision points affecting the outcomes it generates. Although it appears to be a neutral quantitative approach, coding CA constructs requires attention to decision-making and context that aligns it with a more subjective, qualitative interpretation of data. Despite these challenges, we increasingly see the labels in CA-derived datasets used as training sets for machine learning (ML) methods in LA. However, the scarcity of widely shareable datasets means research groups usually work independently to generate labelled data, with few attempts made to compare practice and results across groups. A risk is emerging that different groups are coding constructs in different ways, leading to results that will not prove replicable. We report on two replication studies using a previously reported construct. A failure to achieve high inter-rater reliability suggests that coding of this scheme is not currently replicable across different research groups. We point to potential dangers in this result for those who would use ML to automate the detection of various educationally relevant constructs in LA.
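The abstract's findings turn on inter-rater reliability between coders applying the same scheme. As an illustration only (the abstract does not name the reliability statistic used), below is a minimal sketch of Cohen's kappa, a common chance-corrected agreement measure; the coding labels and the `cohen_kappa` helper are hypothetical, not taken from the paper.

```python
from collections import Counter

def cohen_kappa(coder_a, coder_b):
    """Cohen's kappa: agreement between two coders, corrected for chance."""
    assert len(coder_a) == len(coder_b) and coder_a
    n = len(coder_a)
    # Observed proportion of items where the two coders agree.
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Expected agreement if each coder assigned labels at their own base rates.
    freq_a, freq_b = Counter(coder_a), Counter(coder_b)
    labels = set(freq_a) | set(freq_b)
    expected = sum((freq_a[l] / n) * (freq_b[l] / n) for l in labels)
    return (observed - expected) / (1 - expected)

# Hypothetical labels from two coders applying a shared CA scheme.
a = ["claim", "evidence", "claim", "other", "evidence", "claim"]
b = ["claim", "claim",    "claim", "other", "evidence", "other"]
print(f"kappa = {cohen_kappa(a, b):.2f}")  # ~0.48: only moderate agreement
```

A kappa well below the conventional 0.8 threshold, as in this made-up example, is the kind of result that would cast doubt on whether a coding scheme transfers across research groups, and hence on labels derived from it as ML training data.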
Please use this identifier to cite or link to this item: