Project details for epac

Screenshot epac 0.10

by jinpengli - October 9, 2013, 14:00:15 CET



Embarrassingly Parallel Array Computing (EPAC) is a machine learning workflow builder. Building a machine learning workflow is like playing with Lego: users combine machine learning building blocks into sequential pipelines (Pipe primitive) and bind them into parallel branches. Parallel branches may stem from different pipelines (Methods primitive) processing the same array, or from the same pipeline applied to different arrays, as is the case with resampled data obtained by cross-validation (CV primitive) or permutation.
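The three ideas above (sequential pipelines, parallel branches over the same array, and the same pipeline over resampled arrays) can be sketched in plain Python. This is only an illustration of the concepts, not the EPAC API: the helpers `pipe`, `methods` and `cv_splits`, and the toy steps `center` and `scale_by_max`, are hypothetical names introduced for this example.

```python
# Plain-Python sketch of the concepts; this is NOT the EPAC API.
# pipe, methods and cv_splits are hypothetical helper names.

def pipe(*steps):
    """Chain steps sequentially: the output of one step feeds the next."""
    def run(data):
        for step in steps:
            data = step(data)
        return data
    return run

def methods(**branches):
    """Apply several independent pipelines to the same input array."""
    def run(data):
        return {name: branch(data) for name, branch in branches.items()}
    return run

def cv_splits(n_samples, n_folds):
    """Yield (train, test) index lists for a simple interleaved k-fold split."""
    indices = list(range(n_samples))
    for k in range(n_folds):
        test = indices[k::n_folds]
        train = [i for i in indices if i % n_folds != k]
        yield train, test

def center(xs):
    """Toy building block: subtract the mean."""
    mean = sum(xs) / len(xs)
    return [x - mean for x in xs]

def scale_by_max(xs):
    """Toy building block: divide by the largest absolute value."""
    peak = max(abs(x) for x in xs) or 1.0
    return [x / peak for x in xs]

data = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]

# Two parallel branches, each a sequential pipeline, sharing one input.
workflow = methods(
    centered=pipe(center),
    centered_scaled=pipe(center, scale_by_max),
)

# The same workflow applied to a different training array for each fold.
results = [
    workflow([data[i] for i in train])
    for train, _ in cv_splits(len(data), 3)
]
```

Every entry of `results` is independent of the others, which is what makes this kind of workflow embarrassingly parallel.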

EPAC allows a “write it once, run it everywhere” workflow construction: the same workflow can be executed on a local multi-core computer or on a remote cluster (via soma-workflow [1]). EPAC parallel execution is based on the MapReduce paradigm, which provides an efficient match between computing resources and workflows. Large arrays (10 GB) can be processed thanks to EPAC memory management features (joblib [2] or memory mapping), which avoid unnecessary array duplication.
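A minimal sketch of the map-reduce idea with a swappable execution backend is shown here, using only the Python standard library. It does not use EPAC or soma-workflow: `run_workflow`, `resample_mean` and `summarize` are hypothetical names, and a thread pool merely stands in for whatever local or remote backend runs the map step. Memory mapping is not shown.

```python
from concurrent.futures import ThreadPoolExecutor
import random

def resample_mean(task):
    """Map step: mean of one bootstrap resample, independent of all others."""
    data, seed = task
    rng = random.Random(seed)  # seeded so every backend gives the same result
    sample = [rng.choice(data) for _ in data]
    return sum(sample) / len(sample)

def summarize(values):
    """Reduce step: combine the per-task results into one summary."""
    values = list(values)
    return {"n": len(values), "min": min(values), "max": max(values)}

def run_workflow(data, n_resamples, mapper=map):
    """The workflow is written once; only `mapper` changes with the backend."""
    tasks = [(data, seed) for seed in range(n_resamples)]
    return summarize(mapper(resample_mean, tasks))

data = [1.0, 2.0, 3.0, 4.0, 5.0]

# Serial execution with the built-in map.
serial = run_workflow(data, 20)

# Same workflow, with the map step dispatched to a pool of workers.
with ThreadPoolExecutor(max_workers=4) as pool:
    parallel = run_workflow(data, 20, mapper=pool.map)
```

Because the map tasks share nothing but their read-only input, the serial and pooled runs produce the same summary.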

EPAC enables users to define their own bricks, which can be plugged into the workflow. User-defined bricks (classes) are automatically packed for remote execution without any additional configuration. This feature, based on dill [3], makes the workflow easy to extend and reinforces the “write it once, run it everywhere” capability.
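As a rough illustration of what plugging in a user-defined brick can look like, the sketch below defines two small classes that share one method and runs them in sequence. The `transform` method name, the classes `ClipOutliers` and `Square`, and the `run_pipeline` helper are all assumptions made for this example; the interface EPAC actually expects may differ, and the dill-based packing for remote execution is not shown.

```python
class ClipOutliers:
    """Hypothetical user-defined brick: clip values into [low, high]."""

    def __init__(self, low, high):
        self.low = low
        self.high = high

    def transform(self, xs):
        return [min(max(x, self.low), self.high) for x in xs]


class Square:
    """Hypothetical user-defined brick: square every value."""

    def transform(self, xs):
        return [x * x for x in xs]


def run_pipeline(bricks, xs):
    """Run bricks in order; any object with a transform() method fits."""
    for brick in bricks:
        xs = brick.transform(xs)
    return xs


output = run_pipeline([ClipOutliers(-2, 2), Square()], [-5, -1, 0, 3])
```

The runner relies only on the shared method, so a new brick can be added without changing the code that executes the pipeline.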

In summary, EPAC gives users a way to easily build machine learning workflows, to run them faster with a minimal memory footprint, and to easily extend them with their own bricks. Feel free to try EPAC on

[1] Soizic Laguitton et al., Soma-workflow: A unified and simple interface to parallel computing resources, HBM Annual Meeting 2011, Quebec

[2] Ogrisel et al., joblib: running Python function as pipeline jobs,

[3] M.M. McKerns, L. Strand, T. Sullivan, A. Fang, M.A.G. Aivazis, "Building a framework for predictive science", Proceedings of the 10th Python in Science Conference, 2011

Changes to previous version:

Initial Announcement on

Supported Operating Systems: Linux, Windows, macOS
Data Formats: NumPy
Tags: Machine Learning, Array Computing, Workflow Builder