Automated Classification and Assignment of Java Questions to Teaching Units and Problem Areas
The purpose of this thesis is to address the large number of uncategorized Java questions, drawn from multiple sources, that the client currently uses in its programming courses, by developing solutions and researching possible future options.
Niklas Baumgartner, 2023
Bachelor Thesis, Fachhochschule Nordwestschweiz
Supervising lecturer: Thomas Hanne
Keywords: programming questions, Java questions, classification, tagging, difficulty
There is currently a large number of uncategorized Java questions in use, primarily on the Moodle platform. These questions need to be categorized not only by the keywords they contain, but also by their approximate difficulty. At present, the questions carry some tags with which they can be identified, but these tags are not sufficient. In addition, the metadata of the exercises contains no measure of difficulty.
This thesis has three main objectives. The first is to provide a way of assigning tags to the Java questions currently in use on Moodle. The second is to analyze the available literature to determine whether there are any currently accepted ways of categorizing Java questions by difficulty and, if so, what they are. The third and final objective is to provide the client with such a way of categorizing the Java questions.
This thesis demonstrates various ways of addressing the stated problem and, based on the available literature, identifies which solution approaches are useful and which are not. It also contains scripts that already solve some of the stated problems. Most notably, these scripts can categorize Java questions by assigning them tags based on their source code, and can assign them a rough difficulty estimate based on the question’s code complexity. However, some limitations apply. First, many of the datasets either contained corrupted data, such as truncated Java code, or were too small to deliver meaningful results. Second, checking the correctness of the assigned difficulty labels is nearly impossible without either manual review or a comparison with a large set of scores from a single group of students. Overall, however, the goals set out in this project are mostly fulfilled.
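To illustrate what source-based tagging and a complexity-based difficulty estimate can look like, the following is a minimal sketch and not the scripts developed in the thesis. The tag names, the regular expressions, the branch-counting heuristic (a rough stand-in for cyclomatic complexity), and the difficulty thresholds are all assumptions chosen for illustration. The sketch also does not strip comments or string literals, so keywords appearing there would be counted.

```python
import re

# Hypothetical tag set: each tag is triggered by a simple pattern in the Java source.
TAG_PATTERNS = {
    "loops": r"\b(?:for|while|do)\b",
    "conditionals": r"\b(?:if|else|switch)\b",
    "arrays": r"\w\s*\[\s*\]|\bnew\s+\w+\s*\[",
    "exceptions": r"\b(?:try|catch|throw|throws|finally)\b",
    "inheritance": r"\b(?:extends|implements|super)\b",
}

# Tokens that introduce an additional execution path (assumed heuristic).
BRANCH_PATTERN = re.compile(r"\b(?:if|for|while|case|catch)\b|&&|\|\|")


def assign_tags(java_source: str) -> list[str]:
    """Return the sorted list of tags whose pattern occurs in the source."""
    return sorted(
        tag for tag, pattern in TAG_PATTERNS.items()
        if re.search(pattern, java_source)
    )


def estimate_difficulty(java_source: str) -> str:
    """Map an approximate branch count to a coarse difficulty label."""
    complexity = 1 + len(BRANCH_PATTERN.findall(java_source))
    if complexity <= 2:  # thresholds are illustrative assumptions
        return "easy"
    if complexity <= 5:
        return "medium"
    return "hard"


SAMPLE_QUESTION = """
public class Sum {
    public static int sum(int[] values) {
        int total = 0;
        for (int v : values) {
            if (v > 0) {
                total += v;
            }
        }
        return total;
    }
}
"""

if __name__ == "__main__":
    print(assign_tags(SAMPLE_QUESTION))
    print(estimate_difficulty(SAMPLE_QUESTION))
```

A pattern-based approach like this is easy to audit but brittle; parsing the code into a syntax tree would be more robust, provided the input is not truncated, which the limitations above suggest cannot always be assumed.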
Degree programme: Business Information Technology (Bachelor)
Subject area of the thesis: