Early dropout prediction via machine learning in professional online courses
DOI:
https://doi.org/10.5944/ried.23.2.26356
Keywords:
distance study, dropout, machine learning, forecasting, mathematical models, algorithms
Abstract
Despite its advantages, e-learning is prone to student dropout. Previous studies show that machine-learning techniques can be applied to records of student interactions with the platform to predict abandonment. Along these lines, this work seeks predictive dropout models for virtual courses lasting between six and sixteen weeks, using Moodle logs from the first two weeks. The models' sensitivity, specificity and precision were evaluated, but priority was given to the extent to which the models made it easier to avoid attrition through cost-effective retention actions. Specifically, data were used from several cohorts of four courses with different subjects and durations, all four taught by the Secretariat of Extension of the National Technological University of the Argentine Republic, Regional Buenos Aires, between February 2018 and October 2019. Different algorithms were used to generate predictive models and to optimize them so as to mitigate the economic losses caused by attrition. The study examined whether any one algorithm produced the best models for all courses, and whether it was better to build a separate model per course or a single model for the combined data set of the four courses. The results show that successful predictive models can be built, and that a neural network produced the best models in three of the four courses. Fitting a separate model to each course gave better results than a single combined model.
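The evaluation described in the abstract combines standard classification metrics with a cost criterion. The following is only a minimal sketch of that idea, not the authors' implementation: the function names, the example data, and the cost model (a fixed cost per retention action, a fixed loss per dropout, and a fixed success rate for the action) are all hypothetical assumptions introduced for illustration.

```python
from typing import Dict, List, Sequence, Tuple


def confusion(y_true: Sequence[int], y_score: Sequence[float],
              threshold: float) -> Tuple[int, int, int, int]:
    """Return (tp, fp, tn, fn), where 1 = dropped out and 0 = completed."""
    tp = fp = tn = fn = 0
    for truth, score in zip(y_true, y_score):
        predicted = 1 if score >= threshold else 0
        if predicted == 1 and truth == 1:
            tp += 1
        elif predicted == 1 and truth == 0:
            fp += 1
        elif predicted == 0 and truth == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Sensitivity, specificity and precision from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn) if tp + fn else 0.0,
        "specificity": tn / (tn + fp) if tn + fp else 0.0,
        "precision": tp / (tp + fp) if tp + fp else 0.0,
    }


def expected_loss(tp: int, fp: int, tn: int, fn: int, action_cost: float,
                  dropout_loss: float, success_rate: float) -> float:
    """Hypothetical cost model (an assumption, not taken from the article).

    Every flagged student receives a retention action. A missed dropout
    costs the full loss; a flagged dropout is lost only if the action fails.
    """
    return ((tp + fp) * action_cost
            + fn * dropout_loss
            + tp * (1 - success_rate) * dropout_loss)


def best_threshold(y_true: Sequence[int], y_score: Sequence[float],
                   thresholds: List[float], **costs: float) -> float:
    """Pick the decision threshold with the lowest expected loss."""
    return min(
        thresholds,
        key=lambda t: expected_loss(*confusion(y_true, y_score, t), **costs),
    )


# Made-up example: true outcomes and model scores for eight students.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_score = [0.9, 0.7, 0.4, 0.6, 0.2, 0.1, 0.3, 0.8]

print(metrics(*confusion(y_true, y_score, 0.5)))
print(best_threshold(y_true, y_score, [0.3, 0.5, 0.7],
                     action_cost=10, dropout_loss=100, success_rate=0.5))
```

Under a cost model like this one, the threshold that minimizes expected loss need not be the one that maximizes any single metric, which is one way to read the abstract's emphasis on cost-effective retention actions over raw predictive accuracy.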
License
Copyright (c) 2020 RIED. Revista Iberoamericana de Educación a Distancia

This work is licensed under a Creative Commons Attribution 4.0 International License.
The articles that are published in this journal are subject to the following terms:
1. The authors grant the exploitation rights of the work accepted for publication to RIED, guarantee the journal the right to be the first to publish the research undertaken, and permit the journal to distribute the published work under the license indicated in point 2.
2. The articles are published in the electronic edition of the journal under a Creative Commons Attribution 4.0 International (CC BY 4.0) license. You can copy and redistribute the material in any medium or format, adapt, remix, transform, and build upon the material for any purpose, even commercially. You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use.
3. Conditions for self-archiving. Authors are encouraged to disseminate electronically the OnlineFirst version (the version assessed and accepted for publication) of their articles before publication, always with reference to its publication by RIED, to promote earlier circulation and dissemination and thereby a possible increase in citation and reach among the academic community.

