Pwrake: A Distributed Workflow Engine for E-Science

About the Talk

November 13, 2010 5:25 AM

New Orleans

e-Science is scientific research enabled by widely distributed computational resources shared in collaboration among several institutes. One of the issues in making use of e-Science infrastructure is defining complex workflows, that is, compositions of many tasks and their dependencies. We propose employing Rake as a workflow definition language. In contrast to a Makefile, a Rakefile is written in an internal DSL and takes advantage of Ruby's scripting power, which is required to define complex scientific workflows. To execute Rake workflows on distributed computing resources, we are developing Pwrake, a Parallel Workflow extension for Rake. Pwrake is designed to work on Gfarm, a wide-area distributed file system. Gfarm provides a unified file system and consistent file timestamps across distributed computers, as well as high-performance parallel I/O. We show the power of Rake as a workflow language and the scalable performance of distributed computing using Pwrake and Gfarm.
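
As a rough illustration of what "Ruby's scripting power" can mean for workflow definition, the sketch below uses plain Rake (not Pwrake or Gfarm) to generate one task per input in an ordinary Ruby loop and then a final task that depends on all of them. The input names, the task names `process_*` and `combine`, and the `log` array are hypothetical and chosen only for this example; a real scientific workflow would more likely use `file` tasks over actual data files.

```ruby
require 'rake'
extend Rake::DSL # makes `task` available in a stand-alone script, as in a Rakefile

# Hypothetical inputs; a real workflow would list data files here.
inputs = %w[a b c]
log = [] # records the order in which tasks actually run

# An ordinary Ruby loop defines one independent task per input.
stage_tasks = inputs.map do |name|
  task "process_#{name}" do |t|
    log << t.name
  end
end

# The final task declares every generated task as a prerequisite.
task :combine => stage_tasks.map(&:name) do |t|
  log << t.name
end

# Invoking the final task runs its prerequisites first.
Rake::Task[:combine].invoke
puts log.join(' -> ')
```

Because the `process_*` tasks do not depend on one another, they are the kind of work that a parallel executor could, in principle, distribute across machines; plain Rake runs them one after another.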
