Project details for python weka wrapper

by fracpete - April 28, 2014, 23:18:48 CET

Description:

A thin Python wrapper that uses the javabridge Python library to communicate with a Java Virtual Machine executing Weka API calls. Offers all major APIs, like data generators, loaders, savers, filters, classifiers, clusterers, attribute selection, associations and experiments. However, it does not provide any graphical frontend at this stage.

Changes to previous version:

Initial Announcement on mloss.org.

Supported Operating Systems: Agnostic
Data Formats: Arff, Csv, Libsvm, Xrff
Tags: Machine Learning, Weka

Other available revisions

Version Changelog Date
0.3.1
  • added "get_tags" class method to "Tags" class for easier instantiation of Tag arrays
  • added "find" method to "Tags" class to locate "Tag" object that matches the string
  • fixed "__getitem__" and "__setitem__" methods of the "Tags" class
  • added "GridSearch" meta-classifier with convenience properties to module "weka.classifiers"
  • added "SetupGenerator" and various parameter classes to "weka.core.classes"
  • added "MultiSearch" meta-classifier with convenience properties to module "weka.classifiers"
  • added "quote"/"unquote" and "backquote"/"unbackquote" methods to "weka.core.classes" module
  • added "main" method to "weka.core.classes" for operations on options: join, split, code
  • added support for option handling to "weka.core.classes" module
April 23, 2015, 00:06:57
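
The 0.3.1 entries on option handling (join, split, quote/unquote) are about moving between a command-line string and a list of options. As a rough illustration of that round trip only, here is a minimal sketch built on Python's standard `shlex` module. The names `split_options` and `join_options` are hypothetical and are not the wrapper's API, and Weka's own quoting rules may differ in detail.

```python
import shlex


def split_options(cmdline):
    """Split a command-line string into a list of options, honouring quotes."""
    return shlex.split(cmdline)


def join_options(options):
    """Join a list of options into one string, quoting entries where needed."""
    return " ".join(shlex.quote(opt) for opt in options)


# A nested option string: the filter and its own options travel as one quoted value.
cmdline = (
    'weka.classifiers.meta.FilteredClassifier '
    '-F "weka.filters.unsupervised.attribute.Remove -R 1" '
    '-W weka.classifiers.trees.J48'
)
options = split_options(cmdline)
roundtrip = split_options(join_options(options))
```

Splitting the joined string again should give back the same list, which is the property that matters when options are nested inside other options.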
0.3.0
  • added method "ndarray_to_instances" to "weka.converters" module for converting Numpy 2-dimensional array into "Instances" object
  • added method "plot_learning_curve" to "weka.plot.classifiers" module for creating learning curves for multiple classifiers for a specific metric
  • added plotting of experiments with "plot_experiment" method in "weka.plot.experiments" module
  • "Instance.create_instance" method now takes list of tuples (index, internal float value) when generating sparse instances
  • added "weka.core.database" module for loading data from a database
  • added "make_copy" class method to "Clusterer" class
  • added "make_copy" class method to "Associator" class
  • added "make_copy" class method to "Filter" class
  • added "make_copy" class method to "DataGenerator" class
  • most classes (like Classifier and Filter) now have a default classname value in the constructor
  • added "TextDirectoryLoader" class to "weka.core.converters"
  • moved all methods from "weka.core.utils" to "weka.core.classes"
  • fixed "Attribute.index_of" method for determining label index
  • fixed "Attribute.add_string_value" method (used incorrect JNI parameter)
  • "create_instance" and "create_sparse_instance" methods of class "Instance" now ensure that list values are float
  • added "to_help" method to "OptionHandler" class which outputs a help string generated from the base class's "globalInfo" and "listOptions" methods
  • fixed "test_model" method of "Evaluation" class when supplying a "PredictionOutput" object (previously generated "No dataset structure provided!" exception)
  • added "batch_finished" method to "Filter" class for incremental filtering
  • added "line_plot" method to "weka.plot.dataset" module for plotting dataset using internal format (one line plot per instance)
  • added "is_serializable" property to "JavaObject" class
  • added "has_class" convenience property to "Instance" class
  • added "__repr__" method to "JavaObject" classes (simply calls "toString()" method)
  • added "Stemmer" class in module "weka.core.stemmers"
  • added "Stopwords" class in module "weka.core.stopwords"
  • added "Tokenizer" class in module "weka.core.tokenizers"
  • added "StringToWordVector" filter class in module "weka.filters"
  • added simple workflow engine (see documentation on Flow)
April 15, 2015, 12:37:22
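
The 0.3.0 entry on sparse instances mentions a list of (index, internal float value) tuples. To illustrate that representation only, the sketch below derives such a list from a dense row in plain Python. `to_sparse_tuples` is a hypothetical helper and not part of python-weka-wrapper, and dropping zero values is an assumption about what "sparse" means here.

```python
def to_sparse_tuples(values):
    """Return (index, float value) pairs for the non-zero entries of a dense row."""
    return [(index, float(value)) for index, value in enumerate(values) if value != 0]


dense_row = [0, 1.5, 0, 0, 2]
sparse_row = to_sparse_tuples(dense_row)
```

Only positions 1 and 4 carry a value in this row, so only those two pairs are kept.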
0.2.2
  • added convenience methods "no_class" (to unset class) and "has_class" (class set?) to "Instances" class
  • switched to using faster method objects for methods "classify_instance"/"distribution_for_instance" in "Classifier" class
  • switched to using faster method objects for methods "cluster_instance"/"distribution_for_instance" in "Clusterer" class
  • switched to using faster method objects for methods "class_index", "is_missing", "get/set_value", "get/set_string_value", "weight" in "Instance" class
  • switched to using faster method objects for methods "input", "output", "outputformat" in "Filter" class
  • switched to using faster method objects for methods "attribute", "attribute_by_name", "num_attributes", "num_instances", "class_index", "class_attribute", "set_instance", "get_instance", "add_instance" in "Instances" class
January 5, 2015, 03:43:56
0.2.1
  • added unit testing framework
  • added method "refresh_cache()" to "weka/core/packages.py" to allow user to refresh local cache
  • method "get_classname" in "weka.core.utils" now handles Python objects and class objects as well
  • added convenience method "get_jclass" to "weka.core.utils" to instantiate a Java class
  • added a "JavaArray" wrapper for arrays, which allows getting/setting elements and iterating
  • added property "classname" to class "JavaObject" for easy access to classname of underlying object
  • added class method "parse_matlab" for parsing Matlab matrix strings to "CostMatrix" class
  • "predictions" method of "Evaluation" class now returns "None" if predictions are discarded
  • "Associator.get_capabilities()" method is now a property: "Associator.capabilities"
  • added wrapper classes for Java enums: "weka.core.classes.Enum"
  • fixed retrieval of "sumSq" in "Stats" class (used by "AttributeStats")
  • fixed "cluster_instance" method in "Clusterer" class
  • fixed "filter" and "clusterer" properties in clusterer classes ("SingleClustererEnhancer", "FilteredClusterer")
  • added "crossvalidate_model" method to "ClusterEvaluation"
  • added "get_prc" method to "plot/classifiers.py" for calculating the area under the precision-recall curve
  • "Filter.filter" method now handles list of "Instances" objects as well, applying the filter sequentially to all the datasets (allows generation of compatible train/test sets)
January 5, 2015, 00:30:01
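
The 0.2.1 entry on "parse_matlab" refers to Matlab matrix strings such as "[0 1; 5 0]", where semicolons separate rows and whitespace separates cells. The following is a minimal pure-Python sketch of parsing that notation into nested lists. `parse_matlab_matrix` is a hypothetical function for illustration and does not reproduce the wrapper's "CostMatrix" implementation.

```python
def parse_matlab_matrix(text):
    """Parse a Matlab-style matrix string like "[0 1; 5 0]" into a list of float rows."""
    text = text.strip()
    if not (text.startswith("[") and text.endswith("]")):
        raise ValueError("matrix string must be enclosed in square brackets")
    rows = []
    for row in text[1:-1].split(";"):
        # Cells may be separated by whitespace or commas.
        cells = row.replace(",", " ").split()
        if cells:
            rows.append([float(cell) for cell in cells])
    if len({len(row) for row in rows}) > 1:
        raise ValueError("all rows must have the same number of cells")
    return rows


matrix = parse_matlab_matrix("[0 1; 5 0]")
```

The result is a 2x2 list of floats that could then be handed to whatever cost-matrix structure is in use.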
0.2.0

NB: This release is not backwards compatible!

  • requires "JavaBridge" 1.0.9 at least
  • moved from Java-like get/set ("getIndex()" and "setIndex(int)") to nicer Python properties
  • using Python properties (including read-only ones) wherever possible
  • added "weka.core.version" for accessing the Weka version currently in use
  • added "jwrapper" and "jclasswrapper" methods to "JavaObject" class (the mother of all objects in python-weka-wrapper) to allow generic access to an object's methods: http://pythonhosted.org//javabridge/highlevel.html#wrapping-java-objects-using-reflection
  • added convenience methods "class_is_last()" and "class_is_first()" to "weka.core.Instances" class
  • added convenience methods "delete_last_attribute()" and "delete_first_attribute()" to "weka.core.Instances" class
December 22, 2014, 09:21:53
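
The 0.2.0 move from Java-like "getIndex()"/"setIndex(int)" calls to Python properties can be pictured with a small standalone class. This is a sketch of the general pattern only: `WrappedRange` and its attributes are hypothetical and are not classes from python-weka-wrapper.

```python
class WrappedRange:
    """Hypothetical stand-in for a wrapped object exposing properties instead of get/set."""

    def __init__(self, index):
        self._index = int(index)

    @property
    def index(self):
        """Read/write property replacing a getIndex()/setIndex(int) pair."""
        return self._index

    @index.setter
    def index(self, value):
        self._index = int(value)

    @property
    def is_first(self):
        """Read-only property: no setter is defined, so assignment fails."""
        return self._index == 0


obj = WrappedRange(2)
obj.index = 0  # attribute-style assignment instead of obj.setIndex(0)
```

Code written against the old get/set style has to be changed to attribute access, which is why a change of this kind breaks backwards compatibility.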
0.1.17
  • fixed "setup.py" to download Weka 3.7.12 instead of 3.7.11 (this time correct URL)
December 17, 2014, 21:43:28
0.1.16
  • fixed setup.py to download Weka 3.7.12 instead of 3.7.11
December 17, 2014, 21:38:31
0.1.15
  • fixed "Instance.set_value" method: https://github.com/fracpete/python-weka-wrapper/issues/24
  • added sub-section "From source" to section on installing the library
  • upgraded to Weka 3.7.12
December 17, 2014, 21:11:43
0.1.14
  • fixed setup.py to include the jars again when using eggs (via include_package_data etc)
  • added detailed instructions for installing the library
December 16, 2014, 01:58:17
0.1.13
  • added "get_class" method to "weka.core.utils" which returns the Python class object associated with the classname in dot-notation
  • "from_commandline" method in "weka.core.utils" now takes an optional "classname" argument, which is the classname (in dot-notation) of the wrapper class to return - instead of the generic "OptionHandler"
  • added "Kernel" and "KernelClassifier" convenience classes to better handle kernel based classifiers
November 1, 2014, 09:59:54
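
The 0.1.13 entry on "get_class" describes resolving a classname in dot-notation to a Python class object. A minimal sketch of that idea with the standard `importlib` module follows. `resolve_class` is a hypothetical name, and the sketch makes no claim about how "weka.core.utils" implements it.

```python
import importlib


def resolve_class(classname):
    """Return the Python class object for a dotted name like "package.module.Class"."""
    module_name, _, class_name = classname.rpartition(".")
    if not module_name:
        raise ValueError("expected a dotted name such as 'package.module.Class'")
    module = importlib.import_module(module_name)
    return getattr(module, class_name)


# Demonstrated with a standard-library class so the example needs no extra packages.
cls = resolve_class("collections.OrderedDict")
instance = cls(a=1)
```

The same lookup works for any importable module, which is what lets a string from a command line select the wrapper class to instantiate.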
0.1.12
  • added "create_string" class method to the "Attribute" class for creating a string attribute
  • ROC/PRC curves can now consist of multiple plots (i.e. multiple class labels)
  • switched command-line option handling from "getopt" to "argparse"
  • fixed Instance.get_dataset(self) method
  • added iterators for: rows/attributes in dataset, values in dataset row
  • incremental loaders can be iterated now
October 17, 2014, 00:16:26
0.1.11
  • moved wekaexamples module to separate github project: https://github.com/fracpete/python-weka-wrapper-examples
  • added "stratify", "train_cv" and "test_cv" methods to the Instances class
  • fixed "to_summary" method of the Evaluation class: failed when providing a custom title
September 25, 2014, 00:39:02
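
The "train_cv" and "test_cv" methods added in 0.1.11 relate to cross-validation folds: one fold is held out for testing and the rest is used for training. The sketch below shows that split on a plain Python list. The helper names are hypothetical, and the rule used here (contiguous folds, with leftover rows going to the first folds) is an assumption for illustration rather than a statement of how the wrapper or Weka computes folds.

```python
def _fold_bounds(num_rows, num_folds, fold):
    """Return (start, end) of the held-out slice for the given zero-based fold."""
    base, extra = divmod(num_rows, num_folds)
    start = fold * base + min(fold, extra)
    size = base + (1 if fold < extra else 0)
    return start, start + size


def cv_test_fold(rows, num_folds, fold):
    """Rows held out for testing in the given fold."""
    start, end = _fold_bounds(len(rows), num_folds, fold)
    return rows[start:end]


def cv_train_fold(rows, num_folds, fold):
    """Rows used for training in the given fold (everything except the test slice)."""
    start, end = _fold_bounds(len(rows), num_folds, fold)
    return rows[:start] + rows[end:]


rows = list(range(10))
test_rows = cv_test_fold(rows, 3, 1)
train_rows = cv_train_fold(rows, 3, 1)
```

Every row lands in exactly one test fold, and for any fold the train and test parts together cover the whole dataset.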
0.1.10
  • fixed adding custom classpath using jvm.start(class_path=[...])
August 29, 2014, 05:00:14
0.1.9
  • added static methods to Instances class: summary, merge_instances, append_instances
  • added methods to Instances class: delete_with_missing, equal_headers
August 29, 2014, 04:58:45
0.1.8
  • fixed installer: MANIFEST.in now includes CHANGES.rst and DESCRIPTION.rst as well
June 26, 2014, 02:38:12
0.1.7
  • fixed weka/plot/dataset.py imports to avoid error when testing for matplotlib availability
  • Instance.create_instance (weka/core/dataset.py) now accepts Python list and Numpy array
June 26, 2014, 02:14:16
0.1.6
  • added troubleshooting section for Mac OSX to documentation
  • recompiled helper jars with 1.6 rather than 1.7
  • added link to Google Group
May 29, 2014, 06:02:41
0.1.5
  • added CostMatrix support in the classifier evaluation
  • fixed various retrievals of double arrays (accessed them incorrectly as float arrays), like distributionForInstance for a classifier
  • Instances object can now retrieve all (internal) values of an attribute/column as numpy array
  • added plotting of cluster assignments to weka.plot.clusterers
  • fixed weka.core.utils.from_commandline method
  • fixed weka.classifiers.PredictionOutput (get/set_header methods)
  • predictions can be turned into an Instances object now using weka.classifiers.predictions_to_instances
May 23, 2014, 06:40:42
0.1.4
  • dependencies for plotting are now optional (pygraphviz, PIL, matplotlib)
  • plots now support custom titles
May 19, 2014, 03:02:32
0.1.3
  • improved documentation
  • added PRC curve plot
  • aligned to PEP8 style guidelines
  • fixed a variety of minor bugs (in less commonly used methods)
  • fixed lib directory reference in make files for Java helper classes
May 17, 2014, 13:37:54
0.1.2
  • added matrix plot
  • added scatter plot for two attributes
  • fixes in constructors of classes
  • added MultiFilter convenience class
  • predictions (of classifiers) can now be collected and output using the PredictionOutput class
  • added support for attribute statistics
May 13, 2014, 07:11:07
0.1.1
  • constructors now take list of commandline options as well
  • added Weka package support (list/install/uninstall)
  • ROC plotting for classifiers
  • improved code documentation
  • added more examples
  • added more datasets
  • using javabridge 1.0.1 now
May 2, 2014, 03:35:38
0.1.0

Initial Announcement on mloss.org.

April 28, 2014, 23:18:48
