Vol: 55(69) No: 1 / March 2010

CRISP-DM as a Framework for Discovering Knowledge in Small and Medium Sized Enterprises Data

Z. Bošnjak
Department of Business Information Systems, University of Novi Sad, Faculty of Economics, Segedinski put 9-11, 24000 Subotica, Serbia, phone: (381) (0)24-628-045, e-mail: b.zita@ef.uns.ac.rs

O. Grljević
Department of Business Information Systems, University of Novi Sad, Faculty of Economics, Segedinski put 9-11, 24000 Subotica, Serbia, phone: (381) (0)24-628-166, e-mail: oliverag@ef.uns.ac.rs

S. Bošnjak
Department of Business Information Systems, University of Novi Sad, Faculty of Economics, Segedinski put 9-11, 24000 Subotica, Serbia, phone: (381) (0)24-628-004, e-mail: bsale@ef.uns.ac.rs

Keywords: data mining, CRISP-DM methodology, data analysis models, data exploration

Abstract

Discovering knowledge from vast amounts of data has become a promising area, but it is also an intricate, uncertain and time-consuming process. The complexity of data collection, the fluctuations in data quality and their impact on the discovery process, and the applicability of the results all call for extensive research and accumulated experience to overcome the difficulties that can jeopardize the knowledge discovery in data (KDD) process as a whole. In this article we describe the limitations and challenges of knowledge discovery that we encountered while analyzing data on small and medium-sized enterprises (SMEs).