Students' semantic mistakes in writing seven different types of SQL queries

Publication Type:
Conference Proceeding
Citation:
Annual Conference on Innovation and Technology in Computer Science Education, ITiCSE, 2016, 11-13 July 2016, pp. 272-277
Issue Date:
2016-07-11
Filename:
1-2016-Semantic-DB.pdf
Description:
Accepted Manuscript version
Size:
418.31 kB (Adobe PDF)
Computer science researchers have extensively studied the mistakes of novice programmers. In comparison, little attention has been given to the mistakes of people who are novices at writing database queries. This paper represents the first large-scale analysis of students' semantic mistakes in writing different types of SQL SELECT statements. Over 160,000 snapshots of SQL queries were collected from over 2,300 students across nine years. We describe the most common semantic mistakes that these students made when writing different types of SQL statements, and suggest reasons behind those mistakes. We mapped the semantic mistakes we identified in our data to different semantic categories found in the literature. Our findings show that the majority of semantic mistakes are of the type "omission". Most of these omissions happen in queries that require a JOIN, a subquery, or a GROUP BY operator. We conclude that it is important to explicitly teach students techniques for choosing the appropriate type of query when designing a SQL query.
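As an illustration only, the sketch below shows what an omission in a JOIN query might look like. It is not taken from the paper's data: the `customer` and `orders` tables, their columns, and the rows are hypothetical, and treating a missing join condition as an "omission" is an assumption made for this example. Both queries run without error, which is what makes the mistake semantic rather than syntactic.

```python
import sqlite3

# Hypothetical schema and rows, invented purely for this illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL);
    INSERT INTO customer VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO orders VALUES (10, 1, 5.0), (11, 1, 7.5), (12, 2, 3.0);
""")

# Join condition omitted: every customer is paired with every order.
omitted = conn.execute(
    "SELECT c.name, o.total FROM customer c, orders o"
).fetchall()

# Intended query: each order is paired only with the customer who placed it.
intended = conn.execute(
    "SELECT c.name, o.total FROM customer c "
    "JOIN orders o ON o.customer_id = c.id"
).fetchall()

print(len(omitted))   # 2 customers x 3 orders = 6 rows
print(len(intended))  # 3 rows, one per order
```

The database accepts the first query and returns a result, so the only signal that something is wrong is that the row count is larger than the number of orders.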