Vol: 61(75) No: 1 / March 2016

Provenance Based Runtime Manipulation and Dynamic Execution Framework for Scientific Workflows

Eszter Kail
Obuda University, John von Neumann Faculty of Informatics, Bécsi str. 96/b., H-1034, Budapest, Hungary, phone: (+36) 555-5555, e-mail: kail.eszter@nik.uni-obuda.hu

Anna Bánáti
Obuda University, John von Neumann Faculty of Informatics, Bécsi str. 96/b., H-1034, Budapest, Hungary, e-mail: banati.anna@nik.uni-obuda.hu

Péter Kacsuk
MTA SZTAKI, LPDS, Kende str. 13-17, H-1111, Budapest, Hungary, e-mail: miklos.kozlovszky@sztaki.mta.hu

Miklós Kozlovszky
University of Westminster, 115 New Cavendish Street, London W1W 6UW, United Kingdom, e-mail: peter.kacsuk@sztaki.mta.hu

Keywords: scientific workflow, dynamic execution, interaction, user steering, provenance.

Abstract

Scientific workflows (swf) are commonly used to model and execute large-scale scientific experiments. From the scientist's perspective, workflow execution is a black box: the scientist submits the workflow and, at the end, receives either the result or a notification that the run failed. For long-running experiments, or when a workflow is still in an experimental phase, this may not be acceptable. Scientists need feedback on the current status of the execution, on failures, and on intermediate results in order to save energy and time and to make adequate decisions about how to continue. They therefore need to monitor the experiment during its execution so that they can fine-tune it, or analyze provenance data and dynamically intervene in the execution. To the best of our knowledge, most existing solutions for dynamic execution aim at better optimisation but do not solve the problem of real user steering.
To support the scientist with a dedicated user interaction tool, we introduced the concept of intervention points (iPoints), at which the user temporarily takes over control and can interfere with the execution: change parameters or data, stop or restart the workflow, or even deviate from the original workflow model during enactment. We plan to implement our solution in the IWIR language, which was designed within the framework of the SHIWA project to provide interoperability between four well-known Scientific Workflow Management Systems (SWfMS).
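The intervention-point idea from the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation and does not use IWIR or any SWfMS API: the names Task, Decision, run_workflow and steer are hypothetical, the workflow is assumed to be a simple linear sequence of tasks, and a callback stands in for the human user. Deviation from the workflow model is not modelled here; only parameter changes, stop and restart are.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# All names below are hypothetical and chosen only for illustration.
Params = Dict[str, float]
Record = Tuple[str, Params, float]  # (task name, parameters used, result)


@dataclass
class Decision:
    """What the user chose at an intervention point."""
    action: str = "continue"  # "continue", "stop" or "restart"
    new_params: Params = field(default_factory=dict)


@dataclass
class Task:
    name: str
    run: Callable[[Params], float]
    is_ipoint: bool = False  # hand control to the user after this task


def run_workflow(tasks: List[Task], params: Params,
                 ask_user: Callable[[str, float, List[Record]], Decision],
                 max_restarts: int = 3) -> List[Record]:
    """Run tasks in order; at every iPoint let `ask_user` steer the run."""
    params = dict(params)  # do not mutate the caller's dictionary
    provenance: List[Record] = []
    restarts = 0
    i = 0
    while i < len(tasks):
        task = tasks[i]
        result = task.run(params)
        provenance.append((task.name, dict(params), result))
        if task.is_ipoint:
            decision = ask_user(task.name, result, provenance)
            params.update(decision.new_params)
            if decision.action == "stop":
                break
            if decision.action == "restart" and restarts < max_restarts:
                restarts += 1
                i = 0
                continue
        i += 1
    return provenance


def steer(task_name: str, result: float, provenance: List[Record]) -> Decision:
    # Stand-in for a human: if the intermediate result is too large,
    # halve the step size and restart the workflow.
    if result > 10:
        last_step = provenance[-1][1]["step"]
        return Decision(action="restart", new_params={"step": last_step / 2})
    return Decision()


tasks = [
    Task("simulate", lambda p: p["step"] * 8, is_ipoint=True),
    Task("analyze", lambda p: p["step"] + 1),
]
prov = run_workflow(tasks, {"step": 2.0}, steer)
for record in prov:
    print(record)
```

In this sketch the provenance list is the only information the steering callback sees, which mirrors the abstract's point that decisions at an iPoint are based on intermediate results and provenance data gathered so far.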