Project details for python weka wrapper

python weka wrapper 0.3.12

by fracpete - February 18, 2018, 04:29:24 CET

Description:

A thin Python wrapper that uses the javabridge Python library to communicate with a Java Virtual Machine executing Weka API calls. Offers all major APIs, like data generators, loaders, savers, filters, classifiers, clusterers, attribute selection, associations and experiments. Weka packages can be listed/installed/uninstalled as well. It does not provide any graphical frontend, but some basic plotting and graph visualizations are available through matplotlib and pygraphviz. A simple workflow engine was added with release 0.3.0.
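As a rough illustration of the wrapper in use, the following sketch cross-validates a J48 tree on an ARFF file. The function name cross_validate_j48 and its parameters are hypothetical, chosen for this example; the Weka classnames are the standard ones, and the imports are done inside the function so that the snippet can be loaded on a machine without javabridge.

```python
def cross_validate_j48(arff_path, folds=10, seed=1):
    """Cross-validate J48 on an ARFF file and return Weka's summary string.

    Hypothetical helper for illustration; arff_path is supplied by the caller.
    """
    # Imported lazily so this snippet can be loaded without javabridge/Weka.
    import weka.core.jvm as jvm
    from weka.core.converters import Loader
    from weka.classifiers import Classifier, Evaluation
    from weka.core.classes import Random

    jvm.start()  # launches the JVM that executes the Weka API calls
    try:
        loader = Loader(classname="weka.core.converters.ArffLoader")
        data = loader.load_file(arff_path)
        data.class_is_last()  # treat the last attribute as the class
        classifier = Classifier(classname="weka.classifiers.trees.J48")
        evaluation = Evaluation(data)
        evaluation.crossvalidate_model(classifier, data, folds, Random(seed))
        return evaluation.summary()
    finally:
        # Assumption: the JVM is not needed again in this process, since
        # javabridge does not support restarting it after a stop.
        jvm.stop()
```

Calling print(cross_validate_j48("iris.arff")) would print the evaluation summary, assuming a file of that name exists in the working directory.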

Changes to previous version:
  • upgraded to Weka 3.9.2
  • properly initializing package support now, rather than adding package jars to classpath
  • added weka.core.ClassHelper Java class for obtaining classes and static fields, as javabridge only uses the system class loader
Supported Operating Systems: Agnostic
Data Formats: Arff, Csv, Libsvm, Xrff
Tags: Machine Learning, Weka

Other available revisions

Version Changelog Date
0.3.12
  • upgraded to Weka 3.9.2
  • properly initializing package support now, rather than adding package jars to classpath
  • added weka.core.ClassHelper Java class for obtaining classes and static fields, as javabridge only uses the system class loader
February 18, 2018, 04:29:24
0.3.11
  • added check_for_modified_class_attribute method to FilterClassifier class
  • added complete_classname method to weka.core.classes module, which allows completion of partial classnames like .J48 to weka.classifiers.trees.J48 if there is a unique match; JavaObject.new_instance and JavaObject.check_type now make use of this functionality, allowing for instantiations like Classifier(cls=".J48")
  • jvm.start(system_cp=True) no longer fails with a KeyError: 'CLASSPATH' if there is no CLASSPATH environment variable defined
  • Libraries mtl.jar, core.jar and arpack_combined_all.jar were added as is to the weka.jar in the 3.9.1 release instead of adding their content to it. Repackaged weka.jar to fix this issue.
August 23, 2017, 01:17:24
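
The classname completion introduced in 0.3.11 can be pictured with a small pure-Python sketch. This is not the library's implementation: the function below takes the list of known classnames as an explicit argument (an assumption made to keep the example self-contained) and only accepts a partial name when exactly one classname ends with it.

```python
def complete_classname(partial, known_classnames):
    """Expand a partial name such as ".J48" to its unique full classname."""
    if not partial.startswith("."):
        return partial  # treated as a full classname already
    matches = [name for name in known_classnames if name.endswith(partial)]
    if len(matches) != 1:
        raise ValueError(
            "%r matches %d classnames, expected exactly one"
            % (partial, len(matches)))
    return matches[0]


# Illustrative subset of classnames; a real lookup would cover all of Weka.
KNOWN = [
    "weka.classifiers.trees.J48",
    "weka.classifiers.trees.RandomForest",
    "weka.filters.unsupervised.attribute.Remove",
]
print(complete_classname(".J48", KNOWN))  # weka.classifiers.trees.J48
```

Rejecting ambiguous or unknown partial names, rather than picking the first match, keeps an instantiation from silently resolving to the wrong class.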
0.3.10
  • "types.double_matrix_to_ndarray" no longer assumes a square matrix (https://github.com/fracpete/python-weka-wrapper/issues/48)
  • "len(Instances)" now returns the number of rows in the dataset (module "weka.core.dataset")
  • added method "insert_attribute" to the "Instances" class
  • added class method "create_relational" to the "Attribute" class
  • upgraded Weka to 3.9.1
January 4, 2017, 10:21:33
0.3.9
  • plot_learning_curve method of module weka.plot.classifiers now accepts a list of test sets; in the label template string, * stands for the index of the test set
  • added missing_value() methods to weka.core.dataset module and Instance class
  • output variable y for convenience method create_instances_from_lists in module weka.core.dataset is now optional
  • added convenience method create_instances_from_matrices to weka.core.dataset module to easily create an Instances object from numpy matrices (x and y)
October 18, 2016, 22:55:00
0.3.8
  • works now with javabridge 1.0.14
May 9, 2016, 04:17:42
0.3.7
  • upgraded Weka to 3.9.0
May 4, 2016, 00:02:47
0.3.6
  • Loader.load_file method now checks whether the dataset file really exists; without this check, a previously loaded file was loaded again without an error message (seems to be a Weka issue)
  • replaced org.pentaho.packageManagement with weka.core.packageManagement as the package management code is now part of Weka rather than a third-party library
  • jvm.start() no longer tries to load packages, which avoids an error message if $HOME/wekafiles/packages does not exist yet
April 2, 2016, 11:57:09
0.3.5
  • added support for weka.core.BatchPredictor to class Classifier in module weka.classifiers
  • upgraded Weka to revision 12410 (post 3.7.13) to avoid performance bottleneck when using setOptions method
  • fixed class SetupGenerator from module weka.core.classes
  • added load_any_file method to the weka.core.converters module
  • added save_any_file method to the weka.core.converters module
  • if GridSearch instantiation (module weka.classifiers) fails, it now outputs a message asking whether the package is installed and the JVM was started with package support
January 29, 2016, 05:22:58
0.3.4
  • added convenience method create_instances_from_lists to weka.core.dataset module to easily create an Instances object from numeric lists (x and y)
  • added get_object_tags method to Tags class from module weka.core.classes, to allow obtaining weka.core.Tag array from the method of a JavaObject rather than a static field (MultiSearch)
  • updated MultiSearch wrapper in module weka.classifiers to work with the multi-search package version 2016.1.15 or later
January 29, 2016, 05:21:27
0.3.3
  • updated to Weka 3.7.13
  • documentation now covers the API as well
September 26, 2015, 06:11:42
0.3.2
  • The "packages" parameter of the "weka.core.jvm.start()" function can be used for specifying an alternative Weka home directory now as well
  • added "train_test_split" method to "weka.core.Instances" class to easily create train/test splits
  • "evaluate_train_test_split" method of "weka.classifiers.Evaluation" class now uses the "train_test_split" method
June 28, 2015, 23:09:13
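
The idea behind the "train_test_split" method from 0.3.2 can be sketched on a plain Python list. The library method works on an Instances object and Weka's own random number generator; the function below is only an approximation, and its signature (percentage as the training share, optional seed) is an assumption made for this example.

```python
import random


def train_test_split(rows, percentage, seed=None):
    """Split rows into (train, test); percentage is the train share (0-100)."""
    if not 0 < percentage < 100:
        raise ValueError("percentage must be between 0 and 100 (exclusive)")
    rows = list(rows)  # copy, so the caller's data stays untouched
    if seed is not None:
        random.Random(seed).shuffle(rows)  # reproducible shuffle
    train_size = int(round(len(rows) * percentage / 100.0))
    return rows[:train_size], rows[train_size:]


train, test = train_test_split(range(10), 66)
print(len(train), len(test))  # 7 3
```

Without a seed the original order is preserved, which is useful for time-ordered data; with a seed the split is shuffled but repeatable.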
0.3.1
  • added "get_tags" class method to "Tags" class for easier instantiation of Tag arrays
  • added "find" method to "Tags" class to locate "Tag" object that matches the string
  • fixed "getitem" and "setitem" methods of the "Tags" class
  • added "GridSearch" meta-classifier with convenience properties to module "weka.classifiers"
  • added "SetupGenerator" and various parameter classes to "weka.core.classes"
  • added "MultiSearch" meta-classifier with convenience properties to module "weka.classifiers"
  • added "quote"/"unquote" and "backquote"/"unbackquote" methods to "weka.core.classes" module
  • added "main" method to "weka.core.classes" for operations on options: join, split, code
  • added support for option handling to "weka.core.classes" module
April 23, 2015, 00:06:57
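
The split and join operations on options added in 0.3.1 can be illustrated with Python's standard shlex module. This is a stand-in only: Weka applies its own quoting and escaping rules, so the joined string below is not guaranteed to match what Weka produces. The function names split_options and join_options are chosen for this sketch.

```python
import shlex


def split_options(cmdline):
    """Split an option string into a list, keeping quoted nested options whole."""
    return shlex.split(cmdline)


def join_options(options):
    """Join a list of options into one string, quoting where necessary."""
    return " ".join(shlex.quote(option) for option in options)


options = split_options('-W "weka.classifiers.trees.J48 -C 0.3" -S 1')
print(options)  # ['-W', 'weka.classifiers.trees.J48 -C 0.3', '-S', '1']
```

The nested classifier specification stays a single list element, which is what allows a meta-classifier's base classifier options to survive a round trip through split and join.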
0.3.0
  • added method "ndarray_to_instances" to "weka.converters" module for converting Numpy 2-dimensional array into "Instances" object
  • added method "plot_learning_curve" to "weka.plot.classifiers" module for creating learning curves for multiple classifiers for a specific metric
  • added plotting of experiments with "plot_experiment" method in "weka.plot.experiments" module
  • "Instance.create_instance" method now takes list of tuples (index, internal float value) when generating sparse instances
  • added "weka.core.database" module for loading data from a database
  • added "make_copy" class method to "Clusterer" class
  • added "make_copy" class method to "Associator" class
  • added "make_copy" class method to "Filter" class
  • added "make_copy" class method to "DataGenerator" class
  • most classes (like Classifier and Filter) now have a default classname value in the constructor
  • added "TextDirectoryLoader" class to "weka.core.converters"
  • moved all methods from "weka.core.utils" to "weka.core.classes"
  • fixed "Attribute.index_of" method for determining label index
  • fixed "Attribute.add_string_value" method (used incorrect JNI parameter)
  • "create_instance" and "create_sparse_instance" methods of class "Instance" now ensure that list values are float
  • added "to_help" method to "OptionHandler" class which outputs a help string generated from the base class's "globalInfo" and "listOptions" methods
  • fixed "test_model" method of "Evaluation" class when supplying a "PredictionOutput" object (previously generated "No dataset structure provided!" exception)
  • added "batch_finished" method to "Filter" class for incremental filtering
  • added "line_plot" method to "weka.plot.dataset" module for plotting dataset using internal format (one line plot per instance)
  • added "is_serializable" property to "JavaObject" class
  • added "has_class" convenience property to "Instance" class
  • added "repr" method to "JavaObject" classes (simply calls "toString()" method)
  • added "Stemmer" class in module "weka.core.stemmers"
  • added "Stopwords" class in module "weka.core.stopwords"
  • added "Tokenizer" class in module "weka.core.tokenizers"
  • added "StringToWordVector" filter class in module "weka.filters"
  • added simple workflow engine (see documentation on Flow)
April 15, 2015, 12:37:22
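
The sparse format mentioned for 0.3.0, a list of (index, internal float value) tuples, can be illustrated by expanding it into a dense row. The helper below is hypothetical and pure Python; it does not call the library and assumes that unlisted attributes default to 0.0.

```python
def sparse_to_dense(pairs, num_attributes, default=0.0):
    """Expand (index, value) pairs into a dense list of floats."""
    dense = [float(default)] * num_attributes
    for index, value in pairs:
        if not 0 <= index < num_attributes:
            raise IndexError("attribute index %d out of range" % index)
        dense[index] = float(value)  # ensure values are stored as floats
    return dense


print(sparse_to_dense([(1, 2.5), (3, 1)], 5))  # [0.0, 2.5, 0.0, 1.0, 0.0]
```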
0.2.2
  • added convenience methods "no_class" (to unset class) and "has_class" (class set?) to "Instances" class
  • switched to using faster method objects for methods "classify_instance"/"distribution_for_instance" in "Classifier" class
  • switched to using faster method objects for methods "cluster_instance"/"distribution_for_instance" in "Clusterer" class
  • switched to using faster method objects for methods "class_index", "is_missing", "get/set_value", "get/set_string_value", "weight" in "Instance" class
  • switched to using faster method objects for methods "input", "output", "outputformat" in "Filter" class
  • switched to using faster method objects for methods "attribute", "attribute_by_name", "num_attributes", "num_instances", "class_index", "class_attribute", "set_instance", "get_instance", "add_instance" in "Instances" class
January 5, 2015, 03:43:56
0.2.1
  • added unit testing framework
  • added method "refresh_cache()" to "weka/core/packages.py" to allow user to refresh local cache
  • method "get_classname" in "weka.core.utils" now handles Python objects and class objects as well
  • added convenience method "get_jclass" to "weka.core.utils" to instantiate a Java class
  • added a "JavaArray" wrapper for arrays, which allows getting/setting elements and iterating
  • added property "classname" to class "JavaObject" for easy access to classname of underlying object
  • added class method "parse_matlab" for parsing Matlab matrix strings to "CostMatrix" class
  • "predictions" method of "Evaluation" class now returns "None" if predictions are discarded
  • "Associator.get_capabilities()" method is now a property: "Associator.capabilities"
  • added wrapper classes for Java enums: "weka.core.classes.Enum"
  • fixed retrieval of "sumSq" in "Stats" class (used by "AttributeStats")
  • fixed "cluster_instance" method in "Clusterer" class
  • fixed "filter" and "clusterer" properties in clusterer classes ("SingleClustererEnhancer", "FilteredClusterer")
  • added "crossvalidate_model" method to "ClusterEvaluation"
  • added "get_prc" method to "plot/classifiers.py" for calculating the area under the precision-recall curve
  • "Filter.filter" method now handles list of "Instances" objects as well, applying the filter sequentially to all the datasets (allows generation of compatible train/test sets)
January 5, 2015, 00:30:01
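
To show what parsing a Matlab matrix string involves (see "parse_matlab" in 0.2.1), here is a pure-Python sketch. It is not the library's code, which builds a "CostMatrix" object; this version returns nested lists, and accepting commas as cell separators is an assumption of the example.

```python
def parse_matlab(text):
    """Parse a Matlab-style matrix string such as "[0 1; 2 0]" into rows of floats."""
    body = text.strip()
    if not (body.startswith("[") and body.endswith("]")):
        raise ValueError("matrix must be enclosed in square brackets")
    rows = []
    for row_text in body[1:-1].split(";"):
        # cells may be separated by whitespace and/or commas
        cells = row_text.replace(",", " ").split()
        rows.append([float(cell) for cell in cells])
    if len(set(len(row) for row in rows)) != 1 or not rows[0]:
        raise ValueError("all rows must have the same, non-zero length")
    return rows


print(parse_matlab("[0 1; 2 0]"))  # [[0.0, 1.0], [2.0, 0.0]]
```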
0.2.0

NB: This release is not backwards compatible!

  • requires "JavaBridge" 1.0.9 at least
  • moved from Java-like get/set ("getIndex()" and "setIndex(int)") to nicer Python properties
  • using Python properties (including read-only ones) wherever possible
  • added "weka.core.version" for accessing the Weka version currently in use
  • added "jwrapper" and "jclasswrapper" methods to "JavaObject" class (the mother of all objects in python-weka-wrapper) to allow generic access to an object's methods: http://pythonhosted.org//javabridge/highlevel.html#wrapping-java-objects-using-reflection
  • added convenience methods "class_is_last()" and "class_is_first()" to "weka.core.Instances" class
  • added convenience methods "delete_last_attribute()" and "delete_first_attribute()" to "weka.core.Instances" class
December 22, 2014, 09:21:53
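
The move from Java-like getters and setters to Python properties in 0.2.0 follows a standard Python pattern, sketched below with a toy class. ToyAttribute is hypothetical and unrelated to the library's own classes; it only shows a read-only property next to a read/write one.

```python
class ToyAttribute(object):
    """Toy stand-in for a wrapper class; not part of python-weka-wrapper."""

    def __init__(self, name, index):
        self._name = name
        self._index = index

    @property
    def name(self):
        """Read-only property: there is no setter, so assignment fails."""
        return self._name

    @property
    def index(self):
        """Read/write property replacing getIndex()/setIndex(int)."""
        return self._index

    @index.setter
    def index(self, value):
        self._index = int(value)


attribute = ToyAttribute("petallength", 2)
attribute.index = 3
print(attribute.name, attribute.index)  # petallength 3
```

Code written against the older get/set style has to be changed to attribute access, which is why the release is flagged as not backwards compatible.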
0.1.17
  • fixed "setup.py" to download Weka 3.7.12 instead of 3.7.11 (this time correct URL)
December 17, 2014, 21:43:28
0.1.16
  • fixed setup.py to download Weka 3.7.12 instead of 3.7.11
December 17, 2014, 21:38:31
0.1.15
  • fixed "Instance.set_value" method: https://github.com/fracpete/python-weka-wrapper/issues/24
  • added sub-section "From source" to section on installing the library
  • upgraded to Weka 3.7.12
December 17, 2014, 21:11:43
0.1.14
  • fixed setup.py to include the jars again when using eggs (via include_package_data etc)
  • added detailed instructions for installing the library
December 16, 2014, 01:58:17
0.1.13
  • added "get_class" method to "weka.core.utils" which returns the Python class object associated with the classname in dot-notation
  • "from_commandline" method in "weka.core.utils" now takes an optional "classname" argument, which is the classname (in dot-notation) of the wrapper class to return - instead of the generic "OptionHandler"
  • added "Kernel" and "KernelClassifier" convenience classes to better handle kernel based classifiers
November 1, 2014, 09:59:54
0.1.12
  • added "create_string" class method to the "Attribute" class for creating a string attribute
  • ROC/PRC curves can now consist of multiple plots (i.e., multiple class labels)
  • switched command-line option handling from "getopt" to "argparse"
  • fixed Instance.get_dataset(self) method
  • added iterators for: rows/attributes in dataset, values in dataset row
  • incremental loaders can be iterated now
October 17, 2014, 00:16:26
0.1.11
  • moved wekaexamples module to separate github project: https://github.com/fracpete/python-weka-wrapper-examples
  • added "stratify", "train_cv" and "test_cv" methods to the Instances class
  • fixed "to_summary" method of the Evaluation class: failed when providing a custom title
September 25, 2014, 00:39:02
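
The "train_cv" and "test_cv" methods added in 0.1.11 return the training and test portion of a given cross-validation fold. The sketch below shows the general idea on a plain list; the function names cv_test_fold and cv_train_fold are made up for this example, and giving the first folds one extra row when the data does not divide evenly is an assumption about the fold layout, not a statement about Weka's exact behaviour.

```python
def _fold_bounds(num_rows, num_folds, fold):
    """Return (start, end) of the test slice for the given 0-based fold."""
    if not 0 <= fold < num_folds:
        raise ValueError("fold must be in range [0, num_folds)")
    base, extra = divmod(num_rows, num_folds)
    start = fold * base + min(fold, extra)
    size = base + (1 if fold < extra else 0)  # first folds absorb the remainder
    return start, start + size


def cv_test_fold(rows, num_folds, fold):
    """Rows held out for testing in this fold."""
    start, end = _fold_bounds(len(rows), num_folds, fold)
    return rows[start:end]


def cv_train_fold(rows, num_folds, fold):
    """All rows except the ones held out for testing in this fold."""
    start, end = _fold_bounds(len(rows), num_folds, fold)
    return rows[:start] + rows[end:]


data = list(range(10))
print(cv_test_fold(data, 3, 0))   # [0, 1, 2, 3]
print(cv_train_fold(data, 3, 0))  # [4, 5, 6, 7, 8, 9]
```

For a stratified cross-validation the rows would be reordered first (the role of "stratify") so that each contiguous fold has a similar class distribution.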
0.1.10
  • fixed adding custom classpath using jvm.start(class_path=[...])
August 29, 2014, 05:00:14
0.1.9
  • added static methods to Instances class: summary, merge_instances, append_instances
  • added methods to Instances class: delete_with_missing, equal_headers
August 29, 2014, 04:58:45
0.1.8
  • fixed installer: MANIFEST.in now includes CHANGES.rst and DESCRIPTION.rst as well
June 26, 2014, 02:38:12
0.1.7
  • fixed weka/plot/dataset.py imports to avoid error when testing for matplotlib availability
  • Instance.create_instance (weka/core/dataset.py) now accepts Python list and Numpy array
June 26, 2014, 02:14:16
0.1.6
  • added troubleshooting section for Mac OSX to documentation
  • recompiled helper jars with 1.6 rather than 1.7
  • added link to Google Group
May 29, 2014, 06:02:41
0.1.5
  • added CostMatrix support in the classifier evaluation
  • fixed various retrievals of double arrays (accessed them incorrectly as float arrays), like distributionForInstance for a classifier
  • Instances object can now retrieve all (internal) values of an attribute/column as numpy array
  • added plotting of cluster assignments to weka.plot.clusterers
  • fixed weka.core.utils.from_commandline method
  • fixed weka.classifiers.PredictionOutput (get/set_header methods)
  • predictions can be turned into an Instances object now using weka.classifiers.predictions_to_instances
May 23, 2014, 06:40:42
0.1.4
  • dependencies for plotting are now optional (pygraphviz, PIL, matplotlib)
  • plots now support custom titles
May 19, 2014, 03:02:32
0.1.3
  • improved documentation
  • added PRC curve plot
  • aligned to PEP8 style guidelines
  • fixed variety of little bugs (not so commonly used methods)
  • fixed lib directory reference in make files for Java helper classes
May 17, 2014, 13:37:54
0.1.2
  • added matrix plot
  • added scatter plot for two attributes
  • fixes in constructors of classes
  • added MultiFilter convenience class
  • predictions (of classifiers) can now be collected and output using the PredictionOutput class
  • added support for attribute statistics
May 13, 2014, 07:11:07
0.1.1
  • constructors now take list of commandline options as well
  • added Weka package support (list/install/uninstall)
  • ROC plotting for classifiers
  • improved code documentation
  • added more examples
  • added more datasets
  • using javabridge 1.0.1 now
May 2, 2014, 03:35:38
0.1.0

Initial Announcement on mloss.org.

April 28, 2014, 23:18:48
