Projects supporting the ARFF data format.
Showing items 1-20 of 34 (page 1 of 2).

DynaML 1.4.1

by mandar2812 - April 20, 2017, 18:32:33 CET [ Project Homepage BibTeX Download ] 181 views, 22 downloads, 1 subscription

About: DynaML is a Scala environment for conducting research and education in Machine Learning. DynaML comes packaged with a powerful library of classes implementing predictive models and a Scala REPL where one can not only build custom models but also experiment with data workflows.

Changes:

Initial Announcement on mloss.org.


scikit multilearn 0.0.5

by niedakh - February 25, 2017, 03:51:59 CET [ Project Homepage BibTeX BibTeX for corresponding Paper Download ] 2306 views, 504 downloads, 3 subscriptions

About: A native Python, scikit-compatible implementation of a variety of multi-label classification algorithms.

Changes:
  • a general matrix-based label space clusterer has been added which can cluster the output space using any scikit-learn compatible clusterer (incl. k-means)
  • support for more single-class and multi-class classifiers: you can now use problem transformation approaches with your favourite neural network/deep learning libraries (theano, tensorflow, keras, scikit-neuralnetworks)
  • support for label powerset based stratified kfold added
  • graph-tool clusterer supports weighted graphs again and includes stochastic blockmodel calibration
  • bugs were fixed in classifier chains and hierarchical neuro-fuzzy classifiers
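
The label powerset idea behind the stratified k-fold item above can be illustrated in plain Python. This is a minimal sketch under stated assumptions, not scikit-multilearn's implementation: the function name `label_powerset_transform` is hypothetical, and labels are assumed to be given as rows of 0/1 indicators.

```python
from typing import Dict, List, Sequence, Tuple


def label_powerset_transform(
    label_rows: Sequence[Sequence[int]],
) -> Tuple[List[int], Dict[Tuple[int, ...], int]]:
    """Map every distinct label combination to a single class id (hypothetical helper)."""
    combo_to_class: Dict[Tuple[int, ...], int] = {}
    classes: List[int] = []
    for row in label_rows:
        combo = tuple(row)
        # The first time a combination is seen it gets the next free class id.
        if combo not in combo_to_class:
            combo_to_class[combo] = len(combo_to_class)
        classes.append(combo_to_class[combo])
    return classes, combo_to_class


# Four samples, three binary labels each.
Y = [[1, 0, 1], [0, 1, 0], [1, 0, 1], [0, 0, 0]]
classes, mapping = label_powerset_transform(Y)
print(classes)  # [0, 1, 0, 2]
```

Because each sample now carries one class id, any ordinary single-label stratified splitter can be applied to `classes` to keep label combinations balanced across folds.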

NaN toolbox 3.1.2

by schloegl - January 22, 2017, 12:24:59 CET [ Project Homepage BibTeX Download ] 57695 views, 11707 downloads, 3 subscriptions

About: NaN-toolbox is a statistics and machine learning toolbox for handling data with and without missing values.

Changes:

Changes in v.3.1.2
  • improve configuration and build system
  • support of more platforms (including Octave 4.2.0) improved

Changes in v.3.0.3
  • improve compatibility for Octave on Windows

Changes in v.3.0.1
  • fix packaging for Octave

Changes in v.2.8.5
  • bug fix: trimmean
  • compiler support for gcc-5 and clang
  • fix typos

For details see the CHANGELOG at http://pub.ist.ac.at/~schloegl/matlab/NaN/CHANGELOG
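
To illustrate what "handling data with missing values" can mean for a basic statistic, here is a minimal Python sketch. It is not code from the toolbox: the function name `nan_mean` is hypothetical, and the sketch assumes that missing values are encoded as NaN, as the toolbox's name suggests.

```python
import math
from typing import Iterable


def nan_mean(values: Iterable[float]) -> float:
    """Mean of the non-NaN values; NaN if every value is missing (hypothetical helper)."""
    total = 0.0
    count = 0
    for v in values:
        # Skip missing entries instead of letting NaN propagate into the sum.
        if not math.isnan(v):
            total += v
            count += 1
    return total / count if count else math.nan


print(nan_mean([1.0, float("nan"), 3.0]))  # 2.0
```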


Java Statistical Analysis Tool 0.0.7

by EdwardRaff - January 15, 2017, 22:21:50 CET [ Project Homepage BibTeX Download ] 2523 views, 616 downloads, 2 subscriptions

About: General purpose Java Machine Learning library for classification, regression, and clustering.

Changes:

See the GitHub releases tab for change information.


python weka wrapper3 0.1.2

by fracpete - January 4, 2017, 10:27:40 CET [ Project Homepage BibTeX Download ] 2144 views, 402 downloads, 3 subscriptions

About: A thin Python3 wrapper that uses the javabridge Python library to communicate with a Java Virtual Machine executing Weka API calls.

Changes:
  • "typeconv.double_matrix_to_ndarray" no longer assumes a square matrix (https://github.com/fracpete/python-weka-wrapper3/issues/4)
  • "len(Instances)" now returns the number of rows in the dataset (module "weka.core.dataset")
  • added method "insert_attribute" to the "Instances" class
  • added class method "create_relational" to the "Attribute" class
  • upgraded Weka to 3.9.1

python weka wrapper 0.3.10

by fracpete - January 4, 2017, 10:21:33 CET [ Project Homepage BibTeX Download ] 42765 views, 8481 downloads, 3 subscriptions

About: A thin Python wrapper that uses the javabridge Python library to communicate with a Java Virtual Machine executing Weka API calls.

Changes:
  • "types.double_matrix_to_ndarray" no longer assumes a square matrix (https://github.com/fracpete/python-weka-wrapper/issues/48)
  • "len(Instances)" now returns the number of rows in the dataset (module "weka.core.dataset")
  • added method "insert_attribute" to the "Instances" class
  • added class method "create_relational" to the "Attribute" class
  • upgraded Weka to 3.9.1

ADAMS 16.12.1

by fracpete - December 22, 2016, 05:24:00 CET [ Project Homepage BibTeX BibTeX for corresponding Paper Download ] 26553 views, 4935 downloads, 3 subscriptions

About: The Advanced Data mining And Machine learning System (ADAMS) is a novel, flexible workflow engine aimed at quickly building and maintaining real-world, complex knowledge workflows.

Changes:

Some highlights:

  • Over 80 new actors, nearly 30 new conversions
  • Weka Investigator -- the big brother of the Weka Explorer, or how to be more efficient with fewer clicks using multiple datasets in multiple sessions and multiple predefined outputs per evaluation run
  • Weka Multi-Experimenter -- simple interface for running Weka and ADAMS experiments.
  • File commander -- dual-pane file manager (inspired by Norton/Midnight commander) that allows you to manage local and remote files (ftp, sftp, smb); usually faster than native file managers (like Windows Explorer, Nautilus, Caja) when handling tens of thousands of files in a single directory
  • experimental deeplearning4j module
  • module for querying/consuming webservices using Groovy
  • basic terminal-based GUI for remote machines (e.g. cloud)
  • many interactive actors can now be used in a headless environment as well
  • Fixed a memory leak introduced by Java's logging framework
  • Flow editor now has predefined rules for swapping actors, e.g. Trigger with Tee or ConditionalTrigger, maintaining as many options as possible (including any sub-actors).
  • improved imaging and PDF support

AMIDST Toolbox 0.6.0

by ana - October 14, 2016, 19:35:27 CET [ Project Homepage BibTeX BibTeX for corresponding Paper Download ] 6287 views, 1118 downloads, 4 subscriptions

About: A Java Toolbox for Scalable Probabilistic Machine Learning.

Changes:
  • Added sparklink module implementing the integration with Apache Spark.
  • Fluent pattern in latent-variable-models
  • Predefined model implementing concept drift detection

Detailed information can be found on the toolbox's web page.


JMLR JKernelMachines 3.0

by dpicard - May 4, 2016, 17:53:28 CET [ Project Homepage BibTeX BibTeX for corresponding Paper Download ] 36768 views, 7900 downloads, 4 subscriptions

Rating: 4.5 stars (based on 4 votes)

About: Machine learning library in Java for easy development of new kernels and kernel algorithms.

Changes:

Version 3.0

Warning: this version is incompatible with previous code.

  • change license to BSD 3-clause
  • change package name to net.jkernelmachines
  • change to maven build system (available through central)
  • online training interfaces to allow continuous online learning
  • add a new budget-oriented kernel classifier
  • new kernel and processing especially for strings

JaTeCS 1.0.0

by aesuli - April 5, 2016, 17:23:12 CET [ Project Homepage BibTeX Download ] 2718 views, 556 downloads, 2 subscriptions

About: Jatecs is an open source Java library focused on automatic text categorization.

Changes:

Initial Announcement on mloss.org.


ELKI 0.7.1

by erich - March 14, 2016, 13:44:02 CET [ Project Homepage BibTeX BibTeX for corresponding Paper Download ] 25642 views, 4504 downloads, 4 subscriptions

About: ELKI is a framework for implementing data-mining algorithms with support for index structures, that includes a wide variety of clustering and outlier detection methods.

Changes:

Additions and improvements from ELKI 0.7.0 to 0.7.1:

Algorithm additions:

  • GriDBSCAN: DBSCAN using grid partitioning (Minkowski distances only)

  • Compare-Means and Sort-Means k-means variations (much faster than traditional k-means)

  • Visualization of dendrograms.

Important bug fixes:

  • Classes with no package ("default package") would cause errors.

  • The fast power function implementation was sometimes returning incorrect results.

  • Random sampling was sometimes not sampling from the full data set.

UI improvements:

  • The file input source will now automatically choose the Arff parser for .arff files.

  • MiniGUI now allows choosing other applications.

  • MiniGUI now displays the command line in a separate field.

  • MiniGUI displays an error message if an incorrect classpath or JAyatana (on Ubuntu) is detected.

  • Export to PNG now works; we added a workaround for an open Batik bug.

Smaller changes:

  • Many smaller bug fixes.

  • C-Index for cluster evaluation now can process larger data sets.

  • OPTICS output of undefined reachability fixed.

  • External distance matrices are easier to use and perform additional checks.

  • Precomputed distance matrices can answer range and kNN queries.

  • Voronoi visualization can be switched in the menu now.

  • Improved backwards command line compatibility with additional aliases.

  • Added generated @since annotations in JavaDoc.

  • Many new unit tests, renamed to the Java conventions.

  • Low-level reading of service files, to have faster startup.
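
Since every project on this page reads ARFF, and the ELKI notes above mention choosing a parser from the .arff extension, a small Python sketch of both ideas may help. It is illustrative only and is not ELKI's parser: the names `parse_simple_arff` and `choose_parser` are hypothetical, and the reader handles only dense, unquoted ARFF files with `%` comments and `?` for missing values.

```python
from typing import Callable, List, Optional, Tuple

ParsedArff = Tuple[str, List[str], List[List[Optional[str]]]]

ARFF_TEXT = """\
% toy data set
@relation weather
@attribute temperature numeric
@attribute outlook {sunny,rainy}
@data
21.5,sunny
?,rainy
"""


def parse_simple_arff(text: str) -> ParsedArff:
    """Return (relation name, attribute names, data rows) for a simple dense ARFF text."""
    relation = ""
    attributes: List[str] = []
    rows: List[List[Optional[str]]] = []
    in_data = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("%"):
            continue  # skip blank lines and comments
        if in_data:
            # '?' marks a missing value; keep everything else as a raw string.
            cells = [cell.strip() for cell in line.split(",")]
            rows.append([None if cell == "?" else cell for cell in cells])
            continue
        keyword = line.split(None, 1)[0].lower()
        if keyword == "@relation":
            relation = line.split(None, 1)[1]
        elif keyword == "@attribute":
            attributes.append(line.split(None, 2)[1])
        elif keyword == "@data":
            in_data = True
    return relation, attributes, rows


def choose_parser(filename: str) -> Optional[Callable[[str], ParsedArff]]:
    """Pick the ARFF reader for .arff files; None means 'no parser known here'."""
    return parse_simple_arff if filename.lower().endswith(".arff") else None


parser = choose_parser("weather.arff")
relation, attributes, rows = parser(ARFF_TEXT)
print(relation, attributes, rows)
```

Quoted attribute names, sparse rows, and date or string attributes are deliberately left out; a real reader needs all of them.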


PyScriptClassifier 0.3.0

by cjb60 - November 25, 2015, 04:07:51 CET [ Project Homepage BibTeX BibTeX for corresponding Paper Download ] 3560 views, 874 downloads, 2 subscriptions

About: Easily prototype WEKA classifiers and filters using Python scripts.

Changes:

0.3.0

  • Filters have now been implemented.
  • Classifier and filter classes satisfy base unit tests.

0.2.1

  • Can now choose to save the script in the model using the -save flag.

0.2.0

  • Added Python 3 support.
  • Added uses decorator to prevent non-essential arguments from being passed.
  • Fixed nasty bug where imputation, binarisation, and standardisation would not actually be applied to test instances.
  • GUI in WEKA now displays the exception as well.
  • Fixed bug where single quotes in attribute values could mess up args creation.
  • ArffToPickle now recognises class index option and arguments.
  • Fixed nasty bug where filters were not being saved and were made from scratch from test data.

0.1.1

  • ArffToArgs gets temporary folder in a platform-independent way, instead of assuming /tmp/.
  • Can now save args in ArffToPickle using save.

0.1.0

  • Initial release.

Apache Mahout 0.11.1

by gsingers - November 9, 2015, 16:12:06 CET [ Project Homepage BibTeX Download ] 24462 views, 6196 downloads, 3 subscriptions

About: Apache Mahout is an Apache Software Foundation project with the goal of creating both a community of users and a scalable, Java-based framework consisting of many machine learning algorithm [...]

Changes:

Apache Mahout introduces a new math environment we call Samsara, for its theme of universal renewal. It reflects a fundamental rethinking of how scalable machine learning algorithms are built and customized. Mahout-Samsara is here to help people create their own math while providing some off-the-shelf algorithm implementations. At its core are general linear algebra and statistical operations along with the data structures to support them. You can use it as a library or customize it in Scala with Mahout-specific extensions that look something like R. Mahout-Samsara comes with an interactive shell that runs distributed operations on a Spark cluster. This makes prototyping or task submission much easier and allows users to customize algorithms with a whole new degree of freedom.

Mahout Algorithms include many new implementations built for speed on Mahout-Samsara. They run on Spark 1.3+ and some on H2O, which means as much as a 10x speed increase. You’ll find robust matrix decomposition algorithms as well as a Naive Bayes classifier and collaborative filtering. The new spark-itemsimilarity enables the next generation of cooccurrence recommenders that can use entire user click streams and context in making recommendations.


KEEL Knowledge Extraction based on Evolutionary Learning 3.0

by keel - September 18, 2015, 12:38:54 CET [ Project Homepage BibTeX Download ] 2374 views, 580 downloads, 1 subscription

About: KEEL (Knowledge Extraction based on Evolutionary Learning) is an open source (GPLv3) Java software tool that can be used for a large number of different knowledge data discovery tasks. KEEL provides a simple GUI based on data flow to design experiments with different datasets and computational intelligence algorithms (paying special attention to evolutionary algorithms) in order to assess the behavior of the algorithms. It contains a wide variety of classical knowledge extraction algorithms, preprocessing techniques (training set selection, feature selection, discretization, and imputation methods for missing values, among others), computational intelligence based learning algorithms, hybrid models, statistical methodologies for contrasting experiments, and so forth. It supports a complete analysis of new computational intelligence proposals in comparison to existing ones. Moreover, KEEL has been designed with a two-fold goal: research and education. KEEL is also coupled with KEEL-dataset, a webpage that aims to provide machine learning researchers with a set of benchmarks for analyzing the behavior of learning methods. Specifically, it offers benchmarks already formatted in KEEL format for classification (such as standard, multi-instance, or imbalanced data), semi-supervised classification, regression, time series, and unsupervised learning. A set of low-quality data benchmarks is also maintained in the repository.

Changes:

Initial Announcement on mloss.org.


JMLR Mulan 1.5.0

by lefman - February 23, 2015, 21:19:05 CET [ Project Homepage BibTeX BibTeX for corresponding Paper Download ] 28298 views, 8796 downloads, 2 subscriptions

About: Mulan is an open-source Java library for learning from multi-label datasets. Multi-label datasets consist of training examples of a target function that has multiple binary target variables. This means that each item of a multi-label dataset can be a member of multiple categories or annotated by many labels (classes). This is actually the nature of many real world problems such as semantic annotation of images and video, web page categorization, direct marketing, functional genomics and music categorization into genres and emotions.

Changes:

Learners

  • MLCSSP.java: Added the MLCSSP algorithm (from ICML 2013)
  • Enhancements of multi-target regression capabilities
  • Improved CLUS support
  • Added pairwise classifier and pairwise transformation

Measures/Evaluation

  • Providing training data in the Evaluator is unnecessary in the case of specific measures.
  • Examples with missing ground truth are not skipped for measures that handle missing values.
  • Added logistic and squared error losses and measures

Bug fixes

  • IndexOutOfBounds in calculation of MiAP and GMiAP
  • Bug fix in Rcut.java
  • When in rank/score mode the meta-data contained additional unnecessary attributes. (Newton Spolaor)

API changes

  • Upgrade to Java 7
  • Upgrade to Weka 3.7.10

Miscellaneous

  • Small changes and improvements in the wrapper classes for the CLUS library
  • ENTCS13FeatureSelection.java (new experiment)
  • Enumeration is now used for specifying the type of meta-data. (Newton Spolaor)

Hub Miner 1.1

by nenadtomasev - January 22, 2015, 16:33:51 CET [ Project Homepage BibTeX BibTeX for corresponding Paper Download ] 4957 views, 873 downloads, 2 subscriptions

About: Hubness-aware Machine Learning for High-dimensional Data

Changes:
  • BibTeX support for all algorithm implementations, making all of them easy to reference (via algref package).

  • Two more hubness-aware approaches (meta-metric-learning and feature construction)

  • An implementation of Hit-Miss networks for analysis.

  • Several minor bug fixes.

  • The following instance selection methods were added: HMScore, Carving, Iterative Case Filtering, ENRBF.

  • The following clustering quality indexes were added: Fowlkes-Mallows, Calinski-Harabasz, PBM, G+, Tau, Point-Biserial, Hubert's statistic, McClain-Rao, C-root-k.

  • Some more experimental scripts have been included.

  • Extensions in the estimation of hubness risk.

  • Alias and weighted reservoir methods for weight-proportional random selection.
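
As an illustration of weight-proportional selection with a reservoir, the following Python sketch uses one well-known scheme, the A-Res algorithm of Efraimidis and Spirakis. Hub Miner itself is a Java library and its implementation may differ; the function name `weighted_reservoir_sample` is hypothetical.

```python
import heapq
import random
from typing import Iterable, List, Tuple, TypeVar

T = TypeVar("T")


def weighted_reservoir_sample(
    items: Iterable[Tuple[T, float]], k: int, rng: random.Random
) -> List[T]:
    """One-pass sample of up to k items without replacement, favouring large weights."""
    if k <= 0:
        return []
    heap: List[Tuple[float, int, T]] = []  # min-heap ordered by key, then by index
    for index, (item, weight) in enumerate(items):
        if weight <= 0:
            continue  # items with non-positive weight can never be selected
        # A-Res key: u ** (1 / w) with u uniform in [0, 1); larger weights give larger keys.
        key = rng.random() ** (1.0 / weight)
        entry = (key, index, item)
        if len(heap) < k:
            heapq.heappush(heap, entry)
        elif key > heap[0][0]:
            # The new key beats the smallest key currently in the reservoir.
            heapq.heapreplace(heap, entry)
    return [item for _, _, item in heap]


stream = [("a", 1.0), ("b", 2.0), ("c", 0.0), ("d", 5.0)]
sample = weighted_reservoir_sample(stream, 2, random.Random(0))
print(sample)
```

The stream is read once and only k entries are kept in memory, which is the point of a reservoir method; the index in each heap entry only breaks ties so that the items themselves never need to be comparable.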


JEMLA 1.0

by bathaeian - January 4, 2015, 08:34:49 CET [ Project Homepage BibTeX Download ] 2132 views, 665 downloads, 3 subscriptions

About: Java package for calculating entropy for machine learning applications. It implements several methods for handling missing values, so it can be used as a lab for examining them.

Changes:

Discretization of numerical values has been added to calculate the mode of values and the fractional replacement of missing ones. A class diagram is available at http://profs.basu.ac.ir/bathaeian/free_space/jemla.rar
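
To show how the treatment of missing values changes an entropy estimate, here is a minimal Python sketch comparing two simple strategies: dropping missing entries and replacing them with the mode. It is not JEMLA code (JEMLA is a Java package); all function names are hypothetical, and missing values are assumed to be represented as `None`.

```python
import math
from collections import Counter
from typing import Hashable, List, Optional, Sequence


def entropy(values: Sequence[Hashable]) -> float:
    """Shannon entropy in bits of the empirical distribution of `values`."""
    n = len(values)
    if n == 0:
        return 0.0
    counts = Counter(values)
    return sum(-(c / n) * math.log2(c / n) for c in counts.values())


def drop_missing(values: Sequence[Optional[Hashable]]) -> List[Hashable]:
    """Strategy 1: ignore missing entries entirely."""
    return [v for v in values if v is not None]


def fill_with_mode(values: Sequence[Optional[Hashable]]) -> List[Optional[Hashable]]:
    """Strategy 2: replace each missing entry with the most frequent observed value."""
    observed = drop_missing(values)
    if not observed:
        return list(values)  # nothing observed, so there is no mode to fill with
    mode = Counter(observed).most_common(1)[0][0]
    return [mode if v is None else v for v in values]


labels = ["yes", "no", None, "yes"]
h_dropped = entropy(drop_missing(labels))   # distribution over 3 observed values
h_filled = entropy(fill_with_mode(labels))  # distribution over 4 values after filling
print(h_dropped, h_filled)
```

Filling with the mode makes the majority value more dominant, so the second estimate is lower than the first for this sample.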


pySPACE 1.2

by krell84 - October 29, 2014, 15:36:28 CET [ Project Homepage BibTeX BibTeX for corresponding Paper Download ] 6939 views, 1319 downloads, 1 subscription

About: pySPACE is the abbreviation for "Signal Processing and Classification Environment in Python using YAML and supporting parallelization". It is modular software for processing large data streams that has been specifically designed to enable distributed execution and empirical evaluation of signal processing chains. Various signal processing algorithms (so-called nodes) are available within the software, ranging from finite impulse response filters through data-dependent spatial filters (e.g. CSP, xDAWN) to established classifiers (e.g. SVM, LDA). pySPACE incorporates the concept of nodes and node chains from the MDP framework. Due to its modular architecture, the software can easily be extended with new processing nodes and more general operations. Large-scale empirical investigations can be configured using simple text configuration files in the YAML format, executed on different (distributed) computing modalities, and evaluated using an interactive graphical user interface.

Changes:

Improved testing, improved documentation, Windows compatibility, more algorithms.


JMLR Waffles 2014-07-05

by mgashler - July 20, 2014, 04:53:54 CET [ Project Homepage BibTeX BibTeX for corresponding Paper Download ] 43141 views, 10778 downloads, 2 subscriptions

About: Script-friendly command-line tools for machine learning and data mining tasks. (The command-line tools wrap functionality from a public domain C++ class library.)

Changes:

Added support for CUDA GPU-parallelized neural network layers, and several other new features. Full list of changes at http://waffles.sourceforge.net/docs/changelog.html


JMLR MOA Massive Online Analysis Nov-13

by abifet - April 4, 2014, 03:50:20 CET [ Project Homepage BibTeX BibTeX for corresponding Paper Download ] 20231 views, 6886 downloads, 1 subscription

About: Massive Online Analysis (MOA) is a real time analytic tool for data streams. It is a software environment for implementing algorithms and running experiments for online learning from evolving data streams. MOA includes a collection of offline and online methods as well as tools for evaluation. In particular, it implements boosting, bagging, and Hoeffding Trees, all with and without Naive Bayes classifiers at the leaves. MOA supports bi-directional interaction with WEKA, the Waikato Environment for Knowledge Analysis, and it is released under the GNU GPL license.

Changes:

New version November 2013
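
The Hoeffding Trees named in the MOA description take their name from the Hoeffding bound, which tells a stream learner how many examples it needs before an observed average can be trusted. The Python sketch below computes that bound; it is not MOA's Java code, and the function name `hoeffding_bound` is hypothetical.

```python
import math


def hoeffding_bound(value_range: float, delta: float, n: int) -> float:
    """Epsilon such that, with probability at least 1 - delta, the true mean of a
    variable with the given range lies within epsilon of the mean of n observations."""
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


# The bound tightens as more stream examples are observed.
for n in (100, 1000, 10000):
    print(n, hoeffding_bound(value_range=1.0, delta=1e-6, n=n))
```

A Hoeffding tree typically splits a leaf once the observed gap between the two best split candidates exceeds this epsilon, so the decision is made from a finite prefix of the stream.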

