85 projects found that use Java as the programming language (showing items 1-20).

python weka wrapper 0.3.5

by fracpete - January 29, 2016, 05:22:58 CET [ Project Homepage BibTeX Download ] 22368 views, 4665 downloads, 3 subscriptions

About: A thin Python wrapper that uses the javabridge Python library to communicate with a Java Virtual Machine executing Weka API calls.

Changes:
  • added support for weka.core.BatchPredictor to class Classifier in module weka.classifiers
  • upgraded Weka to revision 12410 (post 3.7.13) to avoid a performance bottleneck when using the setOptions method
  • fixed class SetupGenerator from module weka.core.classes
  • added load_any_file method to the weka.core.converters module
  • added save_any_file method to the weka.core.converters module
  • if GridSearch instantiation (module weka.classifiers) fails, it now outputs a message indicating whether the package is installed and whether the JVM was started with package support

KeLP 2.0.1

by kelpadmin - January 13, 2016, 12:47:31 CET [ Project Homepage BibTeX BibTeX for corresponding Paper Download ] 5711 views, 1416 downloads, 3 subscriptions

About: Kernel-based Learning Platform (KeLP) is a Java framework that supports the implementation of kernel-based learning algorithms, as well as an agile definition of kernel functions over generic data representations, e.g. vectorial data or discrete structures. The framework has been designed to decouple kernel functions and learning algorithms through the definition of specific interfaces. Once a new kernel function has been implemented, it can be automatically adopted in all the available kernel-machine algorithms. KeLP includes different online and batch learning algorithms for classification, regression and clustering, as well as several kernel functions, ranging from vector-based to structural kernels. It makes it possible to build complex kernel-machine-based systems, leveraging JSON/XML interfaces to instantiate prediction models without writing a single line of code.

Changes:

In addition to minor bug fixes, this release includes:

  • Soft Confidence Weighted Classification algorithm: a brand new online learning algorithm from Wang, J., Zhao, P., Hoi, S.C.: Exact soft confidence-weighted learning. In Proceedings of the ICML 2012. ACM, New York, NY, USA (2012)

  • Optimization of the kernel caching mechanism

  • The Smooth Partial Tree Kernel and the Partial Tree Kernel now allow specifying a maximum branching factor (parameter: maxSubseqLeng) for the tree fragments considered by the kernel operation.

Check out this new version from our repositories. API Javadoc is already available. Your suggestions are very valuable to us, so download and try KeLP 2.0.1!


ADAMS 0.4.12

by fracpete - December 21, 2015, 22:48:18 CET [ Project Homepage BibTeX BibTeX for corresponding Paper Download ] 17578 views, 3498 downloads, 3 subscriptions

About: The Advanced Data mining And Machine learning System (ADAMS) is a novel, flexible workflow engine aimed at quickly building and maintaining real-world, complex knowledge workflows.

Changes:

Some highlights of this release:

  • added adams-nlp package for some basic natural language processing (Stanford parser, tweet parsing)
  • VLC-based video player
  • Fonts can now be customized via the preferences dialog (e.g. for better Unicode support)
  • Flows can be saved/loaded with custom encodings
  • Many tweaks to the search, preview browser and flow editor to improve interaction

ELKI 0.7.0

by erich - November 27, 2015, 18:23:16 CET [ Project Homepage BibTeX BibTeX for corresponding Paper Download ] 17382 views, 3185 downloads, 4 subscriptions

About: ELKI is a framework for implementing data-mining algorithms with support for index structures; it includes a wide variety of clustering and outlier detection methods.

Changes:

Additions and Improvements from ELKI 0.6.0:

ELKI is now available on Maven: https://search.maven.org/#artifactdetails|de.lmu.ifi.dbs.elki|elki|0.7.0|jar

Please clone https://github.com/elki-project/example-elki-project for a minimal project example.

Uncertain data types, and clustering algorithms for uncertain data.

Major refactoring of distances: removal of Distance values, and removal of support for non-double-valued distance functions (in particular, DoubleDistance was removed). While this reduces the generality of ELKI, it allowed us to remove about 2.5% of the codebase, because optimized code paths for double distances are no longer needed. Generics for distances were present in almost every distance-based algorithm, and we were also happy to reduce the use of generics this way. Support for non-double-valued distances can trivially be added again, e.g. by adding the specialization one level higher: at the query level instead of the distance level. In this process, we also removed the generics from NumberVector. The object-based get was deprecated for a good reason long ago; accessors such as doubleValue are more efficient (even for non-DoubleVectors).
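
To make the new convention concrete, here is a minimal, hedged sketch of a distance computation written against the primitive doubleValue accessor instead of the removed object-based get. Only the NumberVector accessors mentioned above (getDimensionality, doubleValue) and the ELKI 0.7.x package path are assumed; the wiring into ELKI's distance-function interfaces is deliberately omitted.

    import de.lmu.ifi.dbs.elki.data.NumberVector; // package path as of ELKI 0.7.x (assumption)

    public class ManhattanSketch {
      // Sketch only: a Manhattan-style distance using the primitive double
      // accessors favored by the refactoring described above.
      public static double distance(NumberVector a, NumberVector b) {
        final int dim = a.getDimensionality();
        double sum = 0.;
        for (int d = 0; d < dim; d++) {
          sum += Math.abs(a.doubleValue(d) - b.doubleValue(d));
        }
        return sum;
      }
    }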

Dropped some long-deprecated classes.

K-means:

  • speedups for some initialization heuristics.

  • K-means++ initialization no longer squares distances (again).

  • the farthest-point heuristic now uses the minimum instead of the sum (and was renamed).

  • additional evaluation criteria.

  • Elkan's and Hamerly's faster k-means variants.

CLARA clustering.

X-means.

Hierarchical clustering:

  • Renamed naive algorithm to AGNES.

  • Anderberg's algorithm (faster than AGNES, slower than SLINK).

  • CLINK for complete linkage clustering in O(n²) time, O(n) memory.

  • Simple extraction from HDBSCAN.

  • "Optimal" extraction from HDBSCAN.

  • HDBSCAN, in two variants.

LSDBC clustering.

EM clustering was refactored and moved into its own package. The new version is much more extensible.

OPTICS clustering:

  • Added a list-based variant of OPTICS alongside our heap-based one.

  • FastOPTICS (contributed by Johannes Schneider).

  • Improved OPTICS Xi cluster extraction.

Outlier detection:

  • KDEOS outlier detection (SDM14).

  • k-means-based outlier detection (distance to centroid) and a Silhouette-coefficient-based approach (which does not work too well on the toy data sets - the lowest silhouettes are usually where two clusters touch).

  • bug fix in kNN weight, when distances are tied and kNN yields more than k results.

  • kNN and kNN weight outlier have their k parameter changed: the old 2NN outlier is now the 1NN outlier, as commonly understood in the classification literature (1 nearest neighbor other than the query object; whereas in the database literature the 1NN is usually the query object itself). You can easily get the old result back by decreasing k by one.

  • LOCI implementation is now only O(n^3 log n) instead of O(n^4).

  • Local Isolation Coefficient (LIC).

  • IDOS outlier detection with intrinsic dimensionality.

  • Baseline intrinsic dimensionality outlier detection.

  • Variance-of-Volumes outlier detection (VOV).

Parallel computation framework, and some parallelized algorithms

  • Parallel k-means.

  • Parallel LOF and variants.

LibSVM format parser.

kNN classification (with index acceleration).

Internal cluster evaluation:

  • Silhouette index.

  • Simplified Silhouette index (faster).

  • Davies-Bouldin index.

  • PBM index.

  • Variance-Ratio Criterion.

  • Sum of squared errors.

  • C-Index.

  • Concordant pair indexes (Gamma, Tau).

  • Different noise handling strategies for internal indexes.

Statistical dependence measures:

  • Distance correlation dCor.

  • Hoeffding's D.

  • Some divergence / mutual information measures.

Distance functions:

  • Big refactoring.

  • Time series distances refactored; they now allow variable-length series.

  • Hellinger distance and kernel function.

Preprocessing:

  • Faster MDS implementation using power iterations.

Indexing improvements:

  • Precomputed distance matrix "index".

  • iDistance index (static only).

  • Inverted-list index for sparse data and cosine/arccosine distance.

  • Cover tree index (static only).

  • Additional LSH hash functions.

Frequent Itemset Mining:

  • Improved APRIORI implementation.

  • FP-Growth added.

  • Eclat (basic version only) added.

Uncertain clustering:

  • Discrete and continuous data models.

  • FDBSCAN clustering.

  • UKMeans clustering.

  • CKMeans clustering.

  • Representative Uncertain Clustering (Meta-algorithm).

  • Center-of-mass meta Clustering (allows using other clustering algorithms on uncertain objects).

Mathematics:

  • Several estimators for intrinsic dimensionality.

MiniGUI has two new "secret" options, -minigui.last and -minigui.autorun, which load the last saved configuration and run it, for convenience.

The logging API has been extended to make logging more convenient in a number of places (saving some lines for progress logging and timing).


PROFET 1.0.0

by Hamda - November 26, 2015, 13:20:28 CET [ Project Homepage BibTeX Download ] 553 views, 139 downloads, 2 subscriptions

About: Software for Automatic Construction and Inference of DBNs Based on Mathematical Models

Changes:

Initial Announcement on mloss.org.


PyScriptClassifier 0.3.0

by cjb60 - November 25, 2015, 04:07:51 CET [ Project Homepage BibTeX BibTeX for corresponding Paper Download ] 1510 views, 379 downloads, 2 subscriptions

About: Easily prototype WEKA classifiers and filters using Python scripts.

Changes:

0.3.0

  • Filters have now been implemented.
  • Classifier and filter classes satisfy base unit tests.

0.2.1

  • Can now choose to save the script in the model using the -save flag.

0.2.0

  • Added Python 3 support.
  • Added uses decorator to prevent non-essential arguments from being passed.
  • Fixed nasty bug where imputation, binarisation, and standardisation would not actually be applied to test instances.
  • GUI in WEKA now displays the exception as well.
  • Fixed bug where single quotes in attribute values could mess up args creation.
  • ArffToPickle now recognises class index option and arguments.
  • Fix nasty bug where filters were not being saved and were made from scratch from test data.

0.1.1

  • ArffToArgs gets temporary folder in a platform-independent way, instead of assuming /tmp/.
  • Can now save args in ArffToPickle using save.

0.1.0

  • Initial release.

Apache Mahout 0.11.1

by gsingers - November 9, 2015, 16:12:06 CET [ Project Homepage BibTeX Download ] 19821 views, 5180 downloads, 3 subscriptions

About: Apache Mahout is an Apache Software Foundation project with the goal of creating both a community of users and a scalable, Java-based framework consisting of many machine learning algorithm [...]

Changes:

Apache Mahout introduces a new math environment we call Samsara, for its theme of universal renewal. It reflects a fundamental rethinking of how scalable machine learning algorithms are built and customized. Mahout-Samsara is here to help people create their own math while providing some off-the-shelf algorithm implementations. At its core are general linear algebra and statistical operations along with the data structures to support them. You can use it as a library or customize it in Scala with Mahout-specific extensions that look something like R. Mahout-Samsara comes with an interactive shell that runs distributed operations on a Spark cluster. This makes prototyping and task submission much easier and allows users to customize algorithms with a whole new degree of freedom.

Mahout algorithms include many new implementations built for speed on Mahout-Samsara. They run on Spark 1.3+ and some on H2O, which means as much as a 10x speed increase. You'll find robust matrix decomposition algorithms as well as a Naive Bayes classifier and collaborative filtering. The new spark-itemsimilarity enables the next generation of cooccurrence recommenders that can use entire user click streams and context in making recommendations.


Cognitive Foundry 3.4.2

by Baz - October 30, 2015, 06:53:03 CET [ Project Homepage BibTeX BibTeX for corresponding Paper Download ] 24789 views, 4198 downloads, 4 subscriptions

About: The Cognitive Foundry is a modular Java software library of machine learning components and algorithms designed for research and applications.

Changes:
  • General:
    • Upgraded MTJ to 1.0.3.
  • Common:
    • Added a package for hash function computation, including Eva, FNV-1a, MD5, Murmur2, Prime, SHA1, SHA2
    • Added callback-based forEach implementations to Vector and InfiniteVector, which can be faster for iterating through some vector types.
    • Optimized DenseVector by removing a layer of indirection.
    • Added method to compute set of percentiles in UnivariateStatisticsUtil and fixed issue with percentile interpolation.
    • Added utility class for enumerating combinations.
    • Adjusted ScalarMap implementation hierarchy.
    • Added method for copying a map to VectorFactory and moved createVectorCapacity up from SparseVectorFactory.
    • Added method for creating square identity matrix to MatrixFactory.
    • Added Random implementation that uses a cached set of values.
  • Learning:
    • Implemented feature hashing.
    • Added factory for random forests.
    • Implemented uniform distribution over integer values.
    • Added Chi-squared similarity.
    • Added KL divergence.
    • Added general conditional probability distribution.
    • Added interfaces for Regression, UnivariateRegression, and MultivariateRegression.
    • Fixed null pointer exception that can happen in K-means with an empty cluster.
    • Fixed name of maxClusters property on AgglomerativeClusterer (was called maxMinDistance).
  • Text:
    • Improvements to LDA Gibbs sampler.

KEEL Knowledge Extraction based on Evolutionary Learning 3.0

by keel - September 18, 2015, 12:38:54 CET [ Project Homepage BibTeX Download ] 888 views, 269 downloads, 1 subscription

About: KEEL (Knowledge Extraction based on Evolutionary Learning) is an open-source (GPLv3) Java software tool that can be used for a large number of different knowledge data discovery tasks. KEEL provides a simple GUI based on data flow to design experiments with different datasets and computational intelligence algorithms (paying special attention to evolutionary algorithms) in order to assess the behavior of the algorithms. It contains a wide variety of classical knowledge extraction algorithms, preprocessing techniques (training set selection, feature selection, discretization, imputation methods for missing values, among others), computational intelligence based learning algorithms, hybrid models, statistical methodologies for contrasting experiments, and so forth. It makes it possible to perform a complete analysis of new computational intelligence proposals in comparison to existing ones. Moreover, KEEL has been designed with a two-fold goal: research and education. KEEL is also coupled with KEEL-dataset, a webpage that aims to provide machine learning researchers with a set of benchmarks to analyze the behavior of the learning methods. Concretely, it is possible to find benchmarks already formatted in KEEL format for classification (such as standard, multi-instance or imbalanced data), semi-supervised classification, regression, time series and unsupervised learning. Also, a set of low quality data benchmarks is maintained in the repository.

Changes:

Initial Announcement on mloss.org.


WEKA 3.7.13

by mhall - September 11, 2015, 04:55:02 CET [ Project Homepage BibTeX BibTeX for corresponding Paper Download ] 53855 views, 8000 downloads, 4 subscriptions

Rating: 4 out of 5 stars (based on 6 votes)

About: The Weka workbench contains a collection of visualization tools and algorithms for data analysis and predictive modelling, together with graphical user interfaces for easy access to this [...]

Changes:

In core weka:

  • Numerically stable implementation of variance calculation in core Weka classes - thanks to Benjamin Weber
  • Unified expression parsing framework (with compiled expressions) is now employed by filters and tools that use mathematical/logical expressions - thanks to Benjamin Weber
  • Developers can now specify GUI and command-line options for their Weka schemes via a new unified annotation-based mechanism
  • ClassConditionalProbabilities filter - replaces the value of a nominal attribute in a given instance with its probability given each of the possible class values
  • GUI package manager's available list now shows both packages that are not currently installed, and those installed packages for which there is a more recent version available that is compatible with the base version of Weka being used
  • ReplaceWithMissingValue filter - allows values to be randomly replaced with missing values (with a user-specified probability). Useful for experimenting with methods for imputing missing values; a usage sketch follows this list
  • WrapperSubsetEval can now use plugin evaluation metrics
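
As a rough illustration of how such a filter is typically applied from code, the hedged sketch below uses the standard setInputFormat/Filter.useFilter idiom from the core Weka API. The package path of ReplaceWithMissingValue and the input file name are assumptions; the replacement probability is left at its default because the exact option name is not quoted here.

    import weka.core.Instances;
    import weka.core.converters.ConverterUtils.DataSource;
    import weka.filters.Filter;
    // Package path assumed from Weka's usual layout for unsupervised attribute
    // filters; verify against the 3.7.13 Javadoc.
    import weka.filters.unsupervised.attribute.ReplaceWithMissingValue;

    public class MissingValueDemo {
      public static void main(String[] args) throws Exception {
        Instances data = DataSource.read("iris.arff"); // placeholder dataset
        ReplaceWithMissingValue filter = new ReplaceWithMissingValue();
        filter.setInputFormat(data);
        Instances withMissing = Filter.useFilter(data, filter);
        System.out.println(withMissing.numInstances() + " instances after filtering");
      }
    }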

In packages:

  • alternatingModelTrees package - alternating trees for regression
  • timeSeriesFilters package, contributed by Benjamin Weber
  • distributedWekaSpark package - wrapper for distributed Weka on Spark
  • wekaPython package - execution of CPython scripts and wrapper classifier/clusterer for Scikit Learn schemes
  • MLRClassifier in RPlugin now provides access to almost all classification and regression learners in MLR 2.4

Java Data Mining Package 0.3.0

by arndt - August 19, 2015, 15:44:46 CET [ Project Homepage BibTeX BibTeX for corresponding Paper Download ] 1474 views, 277 downloads, 3 subscriptions

About: A Java library for machine learning and data analytics

Changes:

Initial Announcement on mloss.org.


jLDADMM 1.0

by dqnguyen - August 19, 2015, 12:52:36 CET [ Project Homepage BibTeX Download ] 998 views, 244 downloads, 2 subscriptions

About: The Java package jLDADMM is released to provide alternative choices for topic modeling on normal or short texts. It provides implementations of the Latent Dirichlet Allocation topic model and the one-topic-per-document Dirichlet Multinomial Mixture model (i.e. mixture of unigrams), using collapsed Gibbs sampling. In addition, jLDADMM supplies a document clustering evaluation to compare topic models.

Changes:

Initial Announcement on mloss.org.


Universal Java Matrix Package 0.3.0

by arndt - July 31, 2015, 14:23:14 CET [ Project Homepage BibTeX BibTeX for corresponding Paper Download ] 12866 views, 2445 downloads, 3 subscriptions

About: The Universal Java Matrix Package (UJMP) is a data processing tool for Java. Unlike JAMA and Colt, it supports multi-threading and is therefore much faster on current hardware. It not only supports matrices with double values, but handles every type of data as a matrix through a common interface, e.g. CSV files, Excel files, images, WAVE audio files, tables in SQL databases, and much more.
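
A minimal, hedged sketch of that common interface, assuming the org.ujmp.core.Matrix entry point and its Factory accessor as documented for the 0.3.x API (treat the exact factory and aggregate method names as assumptions):

    import org.ujmp.core.Matrix;

    public class UjmpSketch {
      public static void main(String[] args) {
        // Create a dense random matrix and compute a product via the common interface.
        Matrix a = Matrix.Factory.rand(100, 20);   // Factory accessor assumed from the 0.3.x API
        Matrix gram = a.mtimes(a.transpose());
        System.out.println("mean of Gram matrix: " + gram.getMeanValue());
      }
    }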

Changes:

Updated to version 0.3.0


RiVal 0.1

by alansaid - July 29, 2015, 12:39:54 CET [ Project Homepage BibTeX BibTeX for corresponding Paper Download ] 1107 views, 285 downloads, 2 subscriptions

About: Rival is an open source Java toolkit for recommender system evaluation. It provides a simple way to create evaluation results comparable across different recommendation frameworks.

Changes:

Initial Announcement on mloss.org.


ABACOC Adaptive Ball Cover for Classification 2.0

by kikot - May 29, 2015, 11:57:28 CET [ BibTeX BibTeX for corresponding Paper Download ] 3545 views, 892 downloads, 3 subscriptions

About: Incremental (online) nonparametric classifier. You can classify both points (standard) and matrices (multivariate time series). Java and Matlab code is already available.

Changes:

version 2: parameterless system, constant model size, prediction confidence (for active learning).

NEW!! C++ version at: https://github.com/ilaria-gori/ABACOC


About: FAST is an implementation of Hidden Markov Models with Features. It allows features to modify both emissions and transition probabilities.

Changes:

Initial Announcement on mloss.org.


BLOG 0.9.1

by jxwuyi - April 27, 2015, 06:52:05 CET [ Project Homepage BibTeX BibTeX for corresponding Paper Download ] 1766 views, 385 downloads, 3 subscriptions

About: Bayesian Logic (BLOG) is a probabilistic modeling language. It is designed for representing relations and uncertainties among real world objects.

Changes:

Initial Announcement on mloss.org.


java machine learning platform 1.0

by openpr_nlpr - April 2, 2015, 09:02:14 CET [ Project Homepage BibTeX BibTeX for corresponding Paper Download ] 1673 views, 288 downloads, 2 subscriptions

About: Jmlp is a Java platform for both machine learning experiments and applications. I have tested it on Windows, but it should also work on Linux thanks to the cross-platform nature of Java. It contains classical classification algorithms (Discrete AdaBoost.MH, Real AdaBoost.MH, SVM, KNN, MCE, MLP, NB) and feature reduction methods (KPCA, PCA, Whiten), etc.

Changes:

Initial Announcement on mloss.org.


Hivemall 0.3

by myui - March 13, 2015, 17:08:22 CET [ Project Homepage BibTeX BibTeX for corresponding Paper Download ] 8200 views, 1367 downloads, 3 subscriptions

About: Hivemall is a scalable machine learning library running on Hive/Hadoop.

Changes:
  • Supported Matrix Factorization
  • Added support for TF-IDF computation
  • Supported AdaGrad/AdaDelta
  • Supported AdaGradRDA classification
  • Added normalization scheme

JMLR Mulan 1.5.0

by lefman - February 23, 2015, 21:19:05 CET [ Project Homepage BibTeX BibTeX for corresponding Paper Download ] 21391 views, 7510 downloads, 2 subscriptions

About: Mulan is an open-source Java library for learning from multi-label datasets. Multi-label datasets consist of training examples of a target function that has multiple binary target variables. This means that each item of a multi-label dataset can be a member of multiple categories or annotated by many labels (classes). This is actually the nature of many real-world problems such as semantic annotation of images and video, web page categorization, direct marketing, functional genomics, and music categorization into genres and emotions.
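
For orientation, the hedged sketch below shows the typical Mulan workflow of loading a multi-label dataset (an ARFF file plus an XML header listing the label attributes) and cross-validating a learner. The file names are placeholders; the class names follow Mulan's documented API but should be verified against version 1.5.0.

    import mulan.classifier.transformation.BinaryRelevance;
    import mulan.data.MultiLabelInstances;
    import mulan.evaluation.Evaluator;
    import mulan.evaluation.MultipleEvaluation;
    import weka.classifiers.trees.J48;

    public class MulanSketch {
      public static void main(String[] args) throws Exception {
        // "emotions.arff"/"emotions.xml" are placeholder dataset files.
        MultiLabelInstances dataset =
            new MultiLabelInstances("emotions.arff", "emotions.xml");
        BinaryRelevance learner = new BinaryRelevance(new J48());
        Evaluator evaluator = new Evaluator();
        MultipleEvaluation results = evaluator.crossValidate(learner, dataset, 10);
        System.out.println(results);
      }
    }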

Changes:

Learners

  • MLCSSP.java: Added the MLCSSP algorithm (from ICML 2013)
  • Enhancements of multi-target regression capabilities
  • Improved CLUS support
  • Added pairwise classifier and pairwise transformation

Measures/Evaluation

  • Providing training data in the Evaluator is unnecessary in the case of specific measures.
  • Examples with missing ground truth are not skipped for measures that handle missing values.
  • Added logistic and squared error losses and measures

Bug fixes

  • IndexOutOfBounds in calculation of MiAP and GMiAP
  • Bug fix in Rcut.java
  • When in rank/score mode the meta-data contained additional unnecessary attributes. (Newton Spolaor)

API changes

  • Upgrade to Java 7
  • Upgrade to Weka 3.7.10

Miscellaneous

  • Small changes and improvements in the wrapper classes for the CLUS library
  • ENTCS13FeatureSelection.java (new experiment)
  • Enumeration is now used for specifying the type of meta-data. (Newton Spolaor)
