- Description:
KeplerWeka integrates all the functionality of the WEKA Machine Learning Workbench [1] into the open-source scientific workflow system Kepler [2]. This includes classification, clustering, attribute selection and association rules. Data can be read from multiple data sources, pre-processed, visualized (ROC curves, cost curves, ...) and saved in various formats (file or database). Schemes can be evaluated via random splits, cross-validation or dedicated train and test sets. The WEKA Experimenter is also available in the workflow as a separate component (an "actor" in Kepler terms). In contrast to the standalone application, the Experiment actor is not limited to files stored on disk; it can also be run with data generated in the workflow. Parameter sweeps of classifiers can likewise be fed into the Experiment actor using the ClassifierSetupGenerator actor.
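As a rough illustration of the cross-validation evaluation mentioned above, the following sketch partitions instance indices into k folds and derives the matching training set for each fold. It uses only the Java standard library; the class and method names (CrossValidationSketch, testFolds, trainingIndices) are hypothetical and are not part of WEKA or KeplerWeka, whose evaluator actors may split the data differently (for example, with stratification).

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical helper, not a WEKA or KeplerWeka class.
public class CrossValidationSketch {

    /** Shuffles the indices 0..n-1 and deals them into k test folds of near-equal size. */
    static List<List<Integer>> testFolds(int n, int k, long seed) {
        if (k < 2 || k > n) {
            throw new IllegalArgumentException("need 2 <= k <= n");
        }
        List<Integer> indices = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            indices.add(i);
        }
        // A fixed seed keeps the split reproducible between runs.
        Collections.shuffle(indices, new Random(seed));

        List<List<Integer>> folds = new ArrayList<>();
        for (int f = 0; f < k; f++) {
            folds.add(new ArrayList<>());
        }
        // Deal the shuffled indices round-robin, so fold sizes differ by at most one.
        for (int i = 0; i < n; i++) {
            folds.get(i % k).add(indices.get(i));
        }
        return folds;
    }

    /** Training set for a fold: every index that is not in that fold's test set. */
    static List<Integer> trainingIndices(List<List<Integer>> folds, int testFold) {
        List<Integer> train = new ArrayList<>();
        for (int f = 0; f < folds.size(); f++) {
            if (f != testFold) {
                train.addAll(folds.get(f));
            }
        }
        return train;
    }

    public static void main(String[] args) {
        List<List<Integer>> folds = testFolds(10, 3, 1L);
        for (int f = 0; f < folds.size(); f++) {
            System.out.println("fold " + f + ": test=" + folds.get(f)
                    + " train=" + trainingIndices(folds, f));
        }
    }
}
```

Each instance appears in exactly one test fold, so a scheme trained on the remaining folds is always evaluated on data it has not seen. A random split corresponds to using a single such partition, and a dedicated train and test set replaces the partitioning altogether.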
The workflow engine Kepler is based on the Ptolemy II system [3] for heterogeneous, concurrent modeling and design. Although Ptolemy II was not originally intended for scientific workflows, it provides a mature platform for building and executing workflows, and supports multiple models of computation.
[1] Ian H. Witten and Eibe Frank (2005) "Data Mining: Practical machine learning tools and techniques", 2nd Edition, Morgan Kaufmann, San Francisco.
[2] Kepler Project, http://kepler-project.org/
[3] Ptolemy II, http://ptolemy.berkeley.edu/ptolemyII/
- Changes to previous version:
- Now compatible with Kepler 2.0
- New version of WEKA included (patched 3.7.2 release); WEKA's new package manager works in conjunction with Kepler
- Renamed actor Count to ConditionalTee, introduced new Count actor
- Removed actors OutputLogger, MultiSync, TwinSync
- Supported Operating Systems: Agnostic
- Data Formats: None
- Tags: Association Rules, Attribute Selection, Classification, Clustering, Workflow
Other available revisions
Version 20101008 (October 9, 2010, 05:27:13)
- Now compatible with Kepler 2.0
- New version of WEKA included (patched 3.7.2 release); WEKA's new package manager works in conjunction with Kepler
- Renamed actor Count to ConditionalTee, introduced new Count actor
- Removed actors OutputLogger, MultiSync, TwinSync
Version 20090409 (April 9, 2009, 05:13:45)
- New actors: Rename, EvaluationValuePicker, SequencePlotter, ObjectToString, OutputLogger
- Updated actors: Evaluators (CV, random split, testset), Classifying, Clustering, Filter
- Added scripts for updating the icons (update_icons.sh) and ontologies (update_ontologies.sh), useful for developers
Version 20090112 (November 5, 2008, 05:55:02)
- Project page moved to sf.net.