Early dropout prediction via machine learning in professional online courses

Authors

Ignacio Urteaga, Laura Siri, Guillermo Garófalo

DOI:

https://doi.org/10.5944/ried.23.2.26356

Keywords:

distance study, dropout, machine learning, forecasting, mathematical models, algorithms

Abstract

Despite its advantages, e-learning is prone to student dropout. Previous studies show that machine-learning techniques can be applied to records of student interactions with the platform to predict abandonment. Along these lines, this work seeks predictive dropout models for virtual courses lasting between six and sixteen weeks, using Moodle logs from the first two weeks. The models' sensitivity, specificity, and precision were evaluated, but priority was given to how far each model helped prevent attrition through cost-effective retention actions. The data came from several cohorts of four courses with different subjects and durations, all taught by the Extension Secretariat of the National Technological University of Argentina, Buenos Aires Regional Faculty, between February 2018 and October 2019. Different algorithms were used to generate predictive models and to optimize them so as to mitigate the economic losses caused by attrition. The study examined whether any single algorithm produced the best models for all courses, and whether it was preferable to build a separate model per course or a single model for the pooled data from all four. The results show that successful predictive models can be built, and that a neural network produced the best models in three of the four courses. Fitting a separate model to each course gave better results than a single pooled model.
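The evaluation criteria named in the abstract can be illustrated with a minimal sketch. It computes sensitivity, specificity, and precision from a confusion matrix, plus a simple net-benefit figure for a retention action. The abstract does not give the study's cost function, so `net_benefit`, its parameters, and every number below are hypothetical assumptions for illustration only, not the authors' method or results.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    """Confusion-matrix counts, with 'dropout' as the positive class."""
    tp: int  # dropouts correctly flagged as at risk
    fp: int  # completers wrongly flagged as at risk
    tn: int  # completers correctly left unflagged
    fn: int  # dropouts the model missed


def sensitivity(c: Confusion) -> float:
    # Share of actual dropouts that the model flagged.
    return c.tp / (c.tp + c.fn)


def specificity(c: Confusion) -> float:
    # Share of actual completers that the model left unflagged.
    return c.tn / (c.tn + c.fp)


def precision(c: Confusion) -> float:
    # Share of flagged students who really dropped out.
    return c.tp / (c.tp + c.fp)


def net_benefit(c: Confusion, loss_per_dropout: float,
                cost_per_action: float, success_rate: float) -> float:
    """Hypothetical saving from contacting every flagged student.

    Assumes each retention action has a fixed cost and retains a true
    dropout with probability `success_rate` (illustrative assumption).
    """
    recovered = c.tp * success_rate * loss_per_dropout
    spent = (c.tp + c.fp) * cost_per_action
    return recovered - spent


# Invented counts for a cohort of 200 students (not data from the study).
example = Confusion(tp=30, fp=20, tn=140, fn=10)
print(sensitivity(example), specificity(example), precision(example))
print(net_benefit(example, loss_per_dropout=100.0,
                  cost_per_action=5.0, success_rate=0.25))
```

Under a criterion of this kind, a model with lower precision can still be preferable if retention actions are cheap relative to the loss from each dropout, which is one way to read the priority the abstract gives to cost-effective retention.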

Author Biographies

Ignacio Urteaga, Universidad Tecnológica Nacional (UTN) - Universidad del Salvador (USAL)

Full professor of postgraduate courses in computer science, business analysis, and project management under the Secretaría de Extensión Universitaria of the Universidad Tecnológica Nacional, Argentina. Full professor and lead researcher at the Universidad del Salvador, in the Master's program in Information Systems Management. Former manager of data research departments at major companies. MBA, PGP, ITILp, and physicist.

Laura Siri, Universidad de Buenos Aires - Universidad Tecnológica Nacional

Communication studies graduate of the Universidad de Buenos Aires. Lecturer in that university's Communication Sciences degree program, in the area of technology policy. Member of the teaching staff of the Extension area of the Universidad Tecnológica Nacional, Argentina. Member of research teams since 1994, with published articles, books, and book chapters. Journalist and editor specializing in computing. Regional lead for online social media for enterprise systems at a global hardware, software, and services company between 2011 and 2015.

Guillermo Garófalo, Universidad Tecnológica Nacional (UTN)

History teacher and member of the teaching staff of the University Extension area of the Universidad Tecnológica Nacional, Argentina.

Published

2020-07-01

How to Cite

Urteaga, I., Siri, L., & Garófalo, G. (2020). Early dropout prediction via machine learning in professional online courses. RIED. Revista Iberoamericana De Educación a Distancia, 23(2), 147–167. https://doi.org/10.5944/ried.23.2.26356