Vol: 57(71) No: 4 / December 2012 

Application of Semantic Clustering in Question Generation Engine
László Kovács
Department of Information Technology, University of Miskolc, 3595 Miskolc, Egyetem-város, Hungary, phone: 06 46 565-111, e-mail: kovacs@iit.uni-miskolc.hu, web: http://iit.uni-miskolc.hu
László Bednarik
Department of Information Technology, University of Miskolc, 3595 Miskolc, Egyetem-város, Hungary, e-mail: laszlobednarik@gmail.com


Keywords: text mining, question generation, semantic clustering, e-learning

Abstract
The aim of an automated question generation engine is to create assessment questions for a given textbook. This paper describes the process of generating 'multi-choice' type questions. One of the main phases of the process is generating candidate words for the placeholder position. This component usually relies on a thesaurus of the given domain, but for some minor languages and domain topics no such thesaurus exists. The paper presents a methodology and an algorithm for building a local similarity graph for the words contained in the text and for using this graph to generate candidate words. The presented method is tested on the Hungarian language. The last part of the paper evaluates the implemented prototype system.
