Reinforcement learning in information searching

University of Sheffield
Publication Type: Journal article
Bai, Chen, Gan, Liren, and Cen, Yonghua 2013, 'Reinforcement learning in information searching', Information Research, vol. 18, no. 1, pp. 1-24.
Introduction. The study seeks to answer two questions: how do university students learn, without instruction, to use correct strategies when conducting scholarly information searches? And what differences in learning mechanisms exist between users at different cognitive levels?
Method. Two groups of users, thirteen first-year undergraduate students (freshmen) and thirty-four final-year undergraduate students (seniors), were recruited for an experimental study and carried out ten different search tasks independently. Five reinforcement learning models were introduced to quantitatively simulate the micro-process by which users teach themselves search expertise through trial and error.
Analysis. The experimental data were divided into two parts. The first 70% of the data was used to estimate the parameters of each model. The estimated models were then fitted to the remaining 30%. The model that best fitted the data of the users in each group was used to explain that group's learning behaviour.
Results. Most undergraduates tended to repeat the strategies that had brought success in their earlier experiences. Freshmen's learning behaviour showed marked Markov properties: they always selected a strategy according to the feedback obtained in their most recent search activity. Seniors' strategy adjustment depended on the accumulated effect of past strategy adoptions, and they displayed strong characteristics of rational thinking.
Conclusions. In the process of learning search expertise, users demonstrate reinforcement characteristics. Moreover, users at different cognitive levels exhibit different reinforcement patterns. Theoretical and practical implications are proposed from the perspectives of training programme design, adaptive information retrieval system design and information behaviour model development.
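The analysis procedure described above (estimate parameters on the first 70% of a user's trials, then evaluate on the remaining 30%) can be illustrated with a minimal sketch. The abstract does not specify the five models, so everything here is an assumption made for illustration: the single-parameter propensity model, the softmax choice rule, the names `forgetting` and `sensitivity`, and the grid search are hypothetical and are not the authors' method. In this toy model, a `forgetting` value near 1 means only the most recent feedback matters (a Markov-like pattern), while a value near 0 means rewards accumulate over all past trials.

```python
import math

def softmax(values, sensitivity):
    """Turn strategy propensities into choice probabilities."""
    m = max(values)  # subtract the maximum for numerical stability
    exps = [math.exp(sensitivity * (v - m)) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def log_likelihood(trials, n_strategies, forgetting, sensitivity=2.0,
                   propensities=None):
    """Log-likelihood of observed (strategy, reward) trials.

    Hypothetical model: every propensity decays by (1 - forgetting)
    after each trial, and the chosen strategy then gains its reward.
    Returns the log-likelihood and the final propensities.
    """
    q = list(propensities) if propensities is not None else [0.0] * n_strategies
    ll = 0.0
    for strategy, reward in trials:
        probs = softmax(q, sensitivity)
        ll += math.log(probs[strategy])
        q = [(1.0 - forgetting) * v for v in q]
        q[strategy] += reward
    return ll, q

def fit_and_validate(trials, n_strategies, train_fraction=0.7):
    """Estimate `forgetting` on the first part of the trials,
    then score the held-out remainder with the estimated value."""
    split = int(len(trials) * train_fraction)
    train, test = trials[:split], trials[split:]
    grid = [i / 10 for i in range(11)]  # candidate values 0.0 .. 1.0
    best = max(grid, key=lambda f: log_likelihood(train, n_strategies, f)[0])
    # Carry the learned propensities forward into the held-out trials.
    _, q_end = log_likelihood(train, n_strategies, best)
    test_ll, _ = log_likelihood(test, n_strategies, best, propensities=q_end)
    return best, test_ll

# Made-up example data: (index of chosen strategy, 1 = success / 0 = failure).
example_trials = [(0, 1), (0, 1), (1, 0), (0, 1), (0, 1),
                  (1, 0), (0, 1), (0, 1), (0, 1), (0, 1)]
best_forgetting, held_out_ll = fit_and_validate(example_trials, n_strategies=2)
print(best_forgetting, held_out_ll)
```

Comparing the held-out log-likelihood across several candidate models, each fitted in this way, is one plausible reading of "the model best fitting the data"; the study itself may have used a different fit criterion.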