- Description:
This software is written in C++ and contains routines for statistical classification, probability estimation, and interpolation/non-linear regression. Two variable-bandwidth kernel methods are used: k-nearest neighbours (KNN) and a balloon estimator based on Gaussian kernels, called Adaptive Gaussian Filtering (AGF). A library of easy-to-use, single-call functions is included (you call one function per estimate, with no initialization required), as well as command-line executables.
The statistical classification routines are particularly powerful: they let you generate a pre-trained model by searching for the class borders. The borders can then be used to make rapid classifications that still return estimates of the conditional probabilities.
Clustering routines are a recent addition.
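To illustrate the two estimators named in the description, here is a minimal C++ sketch of conditional-probability estimation with KNN and with a Gaussian-kernel balloon estimator. It is not libAGF's API: the function names (`knn_cond_prob`, `gauss_cond_prob`) are hypothetical, and tying the Gaussian bandwidth to the distance of the k-th nearest neighbour is an assumption made for brevity, not necessarily the rule AGF uses.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

using Point = std::vector<double>;

// Squared Euclidean distance between two points of equal dimension.
double sqdist(const Point &a, const Point &b) {
  assert(a.size() == b.size());
  double d2 = 0.0;
  for (std::size_t i = 0; i < a.size(); ++i) {
    const double diff = a[i] - b[i];
    d2 += diff * diff;
  }
  return d2;
}

// (squared distance, class label) of every training point relative to x.
std::vector<std::pair<double, int>> distances(const std::vector<Point> &train,
                                              const std::vector<int> &label,
                                              const Point &x) {
  assert(train.size() == label.size());
  std::vector<std::pair<double, int>> d;
  d.reserve(train.size());
  for (std::size_t i = 0; i < train.size(); ++i)
    d.emplace_back(sqdist(train[i], x), label[i]);
  return d;
}

// KNN: P(c|x) is estimated as the fraction of the k nearest neighbours
// that belong to class c. Labels are assumed to lie in 0..nclass-1.
std::vector<double> knn_cond_prob(const std::vector<Point> &train,
                                  const std::vector<int> &label, int nclass,
                                  const Point &x, std::size_t k) {
  auto d = distances(train, label, x);
  assert(k >= 1 && k <= d.size());
  std::partial_sort(d.begin(), d.begin() + k, d.end());
  std::vector<double> p(nclass, 0.0);
  for (std::size_t i = 0; i < k; ++i) p[d[i].second] += 1.0 / k;
  return p;
}

// Balloon estimator: Gaussian weights whose bandwidth depends on the test
// point. ASSUMPTION: sigma is the distance to the k-th nearest neighbour.
std::vector<double> gauss_cond_prob(const std::vector<Point> &train,
                                    const std::vector<int> &label, int nclass,
                                    const Point &x, std::size_t k) {
  auto d = distances(train, label, x);
  assert(k >= 1 && k <= d.size());
  std::nth_element(d.begin(), d.begin() + (k - 1), d.end());
  const double sigma2 = d[k - 1].first;  // squared bandwidth
  assert(sigma2 > 0.0);                  // x must not coincide with its k-th neighbour
  std::vector<double> p(nclass, 0.0);
  double total = 0.0;
  for (const auto &di : d) {
    const double w = std::exp(-di.first / (2.0 * sigma2));
    p[di.second] += w;
    total += w;
  }
  for (double &pc : p) pc /= total;  // normalize so the estimates sum to 1
  return p;
}
```

Both functions take the training points, their integer labels, the number of classes, the test point and k, and return one probability estimate per class. The KNN estimate is piecewise constant, while the Gaussian-weighted estimate varies smoothly with the test point.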
- Changes to previous version:
New in Version 0.9.8:
bug fixes: SVM file conversion now works properly and is more general
non-hierarchical multi-borders has 3 options for solving for the conditional probabilities: matrix inversion, voting, and matrix inversion overridden by voting, with re-normalization
multi-borders now works with external binary classifiers
random numbers resolve ties when selecting classes based on probabilities
a pair of routines, sort_discrete_vectors and search_discrete_vectors, for classification based on n-dimensional binning (still experimental)
command options have changed, with many new additions; see the QUICKSTART file or run the relevant commands for details
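The random tie-breaking mentioned in the change list can be illustrated with a short C++ sketch. This is not the libAGF implementation: `select_class` is a hypothetical helper, and treating only exactly equal probabilities as a tie is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Return the index of the largest probability. If several classes share the
// maximum, one of them is chosen uniformly at random.
// ASSUMPTION: a tie means exact floating-point equality.
template <class Rng>
std::size_t select_class(const std::vector<double> &p, Rng &rng) {
  assert(!p.empty());
  std::vector<std::size_t> best;  // indices of all classes tied for the maximum
  double pmax = p[0];
  for (std::size_t i = 0; i < p.size(); ++i) {
    if (p[i] > pmax) {  // new maximum: discard the earlier candidates
      pmax = p[i];
      best.clear();
    }
    if (p[i] == pmax) best.push_back(i);
  }
  std::uniform_int_distribution<std::size_t> pick(0, best.size() - 1);
  return best[pick(rng)];
}
```

Passing the generator in by reference (for example a seeded `std::mt19937`) keeps results reproducible between runs. Without the random draw, a tie would always go to the lowest class index, which biases the output towards that class.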
- BibTeX Entry: Download
- Corresponding Paper BibTeX Entry: Download
- Supported Operating Systems: Agnostic
- Data Formats: Ascii, Binary
- Tags: Clustering, Nonparametric Density Estimation, Supervised Learning, Interpolation, Inverse Methods, Kernel Estimation, Nonlinear Regression, Probability Estimation, Statistical Classification
- Archive: download here
Comments
-
- Peter Mills (on March 15, 2012, 05:04:16)
- I had hoped to have multi-class border-classification ready by now, but the simple generalization I had envisioned for implementing it won't work in all cases. The idea was to use matrix inversion to solve for the conditional probabilities, but quite obviously (in retrospect) you can solve for the class without being able to determine all the conditional probabilities. Likely we need two cases: one where all the conditional probabilities can be found, and one where only that of the retrieved class can be found, and these two cases need to interoperate. A recursive or hierarchical model would seem to be the best solution here. I realize that there is literature relating to the problem of creating multi-class classifications from two-class ones; however, I do not currently have access to commercial journals as I am not affiliated with an academic or research institution. It is also an enjoyable challenge to try to figure these things out for yourself, from scratch, so to speak. Likewise, I had hoped to have the optimal-bandwidth Gaussian PDF estimation ready. I had made some progress on it, but the test cases were not giving consistent results and I have failed to work on it in the intervening months.
-
- Peter Mills (on April 15, 2014, 04:55:05)
- Multi-borders classification is now ready. I am very pleased (and pleasantly surprised) with how well it works.
-
- Peter Mills (on January 23, 2016, 23:46:19)
- The libAGF library has been combined with two other libraries and moved to GitHub under the project libmsci: https://github.com/peteysoft/libmsci