A significant part of professional communication development in engineering is the ability to learn and understand technical vocabulary. Mastering such vocabulary is often a desired learning outcome of engineering education. To promote this goal, this research investigates the development of a tool that creates wordlists of characteristic discipline-specific vocabulary for a given course. These wordlists make the vocabulary that students are required to learn explicit and, when used as a teaching aid, can promote greater accessibility in the learning environment.
Literature, including work in higher education, diversity and language learning, suggests that designing accessible learning environments can increase the quality of instruction and learning for all students. Studying the student/instructor interface using the framework of Universal Instructional Design identified vocabulary learning as an invisible barrier in engineering education. A preliminary investigation of this barrier suggested that students have difficulty assessing their understanding of technical vocabulary. Subsequently, computing word frequency in engineering course material was investigated as an approach for characterizing this barrier. However, it was concluded that a more nuanced method was necessary.
This research program was built on previous work in the fields of linguistics and computer science, and led to the design of an algorithm. The developed algorithm is based on a statistical technique called Term Frequency-Inverse Document Frequency. Comparator sets of documents are used to hierarchically identify characteristic terms in a target document, such as course materials from a previous term of study. To process the target material, the approach draws on a standardized artifact of the engineering learning environment as its dataset: a repository of 2254 engineering final exams from the University of Toronto.
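The core idea of Term Frequency-Inverse Document Frequency can be illustrated with a minimal sketch. This is not the algorithm developed in this research: it assumes a single flat comparator set rather than the hierarchical comparison described above, and the function names, the tokenizer, the scoring formula (relative term frequency multiplied by the logarithm of inverse document frequency) and the sample sentences are all hypothetical choices made for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def characteristic_terms(target, comparators, top_n=5):
    """Rank the target's terms by TF-IDF against a flat comparator set.

    Assumption: one common TF-IDF variant is used here; the weighting
    in the actual tool may differ.
    """
    target_counts = Counter(tokenize(target))
    total = sum(target_counts.values())
    comparator_vocab = [set(tokenize(doc)) for doc in comparators]
    n_docs = len(comparator_vocab) + 1  # comparators plus the target itself

    scores = {}
    for term, count in target_counts.items():
        # Number of documents containing the term (the target always does).
        doc_freq = 1 + sum(term in vocab for vocab in comparator_vocab)
        tf = count / total
        idf = math.log(n_docs / doc_freq)
        scores[term] = tf * idf

    # Highest score first; ties are broken alphabetically for stable output.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:top_n]


# Made-up sample text standing in for a target exam and comparator exams.
target = (
    "Compute the shear stress and the bending moment in the beam. "
    "Find the shear force."
)
comparators = [
    "Compute the current in the circuit and find the voltage.",
    "Find the derivative and compute the integral of the function.",
]

ranked = characteristic_terms(target, comparators)
for term, score in ranked:
    print(f"{term}: {score:.3f}")
```

In this sketch, words that occur in every document, such as "the" or "compute", receive a score of zero, while words that are frequent in the target but absent from the comparators rise to the top of the list. This is the property that plain word frequency lacks.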
After producing wordlists for ten courses, with the goal of highlighting characteristic discipline-specific terms, the effectiveness of the approach was evaluated by comparing the computed results to the judgment of subject-matter experts. The overall data show a good correlation between the program's output and the judgments of the subject-matter experts. The results indicated a balance between accuracy and feasibility, and suggested that this approach could mimic subject-matter expertise to create a list of discipline-specific vocabulary from course materials.