Vol: 53(67) No: 2 / June 2008        

Knowledge Base Representation in a Grammar Induction System with Extended Conceptual Graph
Erika Baksa-Varga
Department of Information Technology, University of Miskolc, Faculty of Mechanical Engineering and Informatics, H-3515 Miskolc-Egyetemváros, Hungary, phone: (36 46) 565-136, e-mail: vargae@iit.uni-miskolc.hu
László Kovács
Department of Information Technology, University of Miskolc, Faculty of Mechanical Engineering and Informatics, H-3515 Miskolc-Egyetemváros, Hungary, e-mail: kovacs@iit.uni-miskolc.hu


Keywords: knowledge representation, semantic models

Abstract
The goal of our research is to develop a grammar induction system based on statistical learning methods. Our approach relies on a model in which free text is annotated with a semantic ontology description. In this model, the main task is to learn the association rules between free-text sentences and ontology representations. The novelties of the project are the use of an ontology to support grammar induction and the sentence-based approach. The paper starts with Peirce’s theory of signs and the ideas derived from it as the background for defining signal processing in intelligent artificial agents, which build their internal knowledge base from the signals they receive. The knowledge base is represented by an ontology. As a first step, we look for an appropriate formalism for the semantic model that represents the knowledge base of agents that is to be converted into sentences. For this purpose we have prepared a comparative review of semantic models. According to our analysis, traditional semantic modeling languages are not adequate for reflecting the process of conceptualization. We therefore developed the Extended Conceptual Graph, a novel semantic model for representing the knowledge base in a grammar induction system, which we introduce in this paper. The paper presents the basic requirements of the model and a detailed description with formal definitions of its elements, together with a graphical formalism for its visualization and a representative example that illustrates its application.
