Vol: 53(67) No: 3 / September 2008

Document Conversions Using Grid-based e-Infrastructure for Digital Libraries
Dana Petcu
Department of Computer Science, West University of Timisoara, B-dul Vasile Parvan 4, 300223 Timisoara, Romania, phone: (+40) 256-592-370, e-mail: petcu@info.uvt.ro, web: http://web.info.uvt.ro/~petcu
Silviu Panica
Department of Computer Science, West University of Timisoara, B-dul Vasile Parvan 4, 300223 Timisoara, Romania, e-mail: silviu@info.uvt.ro, web: http://web.info.uvt.ro/~silviu


Keywords: grid computing, digital libraries, image extraction, efficiency

Abstract
Document conversions in digital libraries are time-consuming processes. Grid infrastructures have proved useful in supporting computationally intensive processes. Several scenarios are identified in which document conversions are requested and response time is an issue. The results of two experiments are presented, confirming that using a Grid infrastructure is a convenient technical solution.
