Authors: Vijay S. Kumar, Tahsin Kurc, Varun Ratnakar, Jihie Kim, Gaurang Mehta, Karan Vahi, Yoonju Lee Nelson, P. Sadayappan, Ewa Deelman, Yolanda Gil, Mary Hall, Joel Saltz
Publish Date: 2010/04/10
Volume: 13, Issue: 3, Pages: 315-333
Abstract
Data analysis processes in scientific applications can be expressed as coarse-grain workflows of complex data processing operations with data flow dependencies between them. Performance optimization of these workflows can be viewed as a search for a set of optimal values in a multidimensional parameter space consisting of input performance parameters to the applications that are known to affect their execution times. While some performance parameters, such as grouping of workflow components and their mapping to machines, do not affect the accuracy of the analysis, others may dictate trading the output quality of individual components, and of the whole workflow, for performance. This paper describes an integrated framework which is capable of supporting performance optimizations along multiple such parameters. Using two real-world applications in the spatial, multidimensional data analysis domain, we present an experimental evaluation of the proposed framework.
Keywords: