- Description:
Somoclu is a C++ tool for training self-organizing maps on large data sets using massively parallel resources. It relies on OpenMP for multicore execution and builds on MPI to distribute the workload across the nodes of a cluster. It can also accelerate training with CUDA if graphics processing units are available. A sparse kernel is included, which is useful for high-dimensional but sparse data, such as the vector spaces common in text mining workflows. Python, Julia, R, and MATLAB interfaces facilitate use in data analysis. The code is released under the GNU GPLv3 licence.
Key features:
- Fast execution by parallelization: OpenMP, MPI, and CUDA are supported.
- Python, Julia, R, and MATLAB interfaces for the dense multicore CPU kernel.
- Planar and toroid maps.
- Rectangular and hexagonal grids.
- Gaussian and bubble neighborhood functions.
- Both dense and sparse input data are supported.
- Large emergent maps of several hundred thousand neurons are feasible.
- Integration with Databionic ESOM Tools.
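To make the map and neighborhood options above concrete, here is a minimal sketch in plain Python. It is not Somoclu's code: the function names, the `(column, row)` node convention, and the default `coeff=0.5` are illustrative assumptions, and only the rectangular grid is covered. The Gaussian form follows the formula quoted in the change notes, exp(-d^2/(2*(coeff*radius)^2)).

```python
import math


def grid_distance(a, b, n_columns, n_rows, toroid=False):
    """Euclidean distance between nodes a and b, given as (column, row),
    on a rectangular grid. Hypothetical helper, not part of Somoclu."""
    dx = abs(a[0] - b[0])
    dy = abs(a[1] - b[1])
    if toroid:
        # On a toroid map the edges wrap around, so use the shorter path.
        dx = min(dx, n_columns - dx)
        dy = min(dy, n_rows - dy)
    return math.hypot(dx, dy)


def gaussian_neighborhood(distance, radius, coeff=0.5):
    """Gaussian weight exp(-d^2 / (2 * (coeff * radius)^2)).
    The default coeff is an arbitrary value chosen for illustration."""
    return math.exp(-distance ** 2 / (2.0 * (coeff * radius) ** 2))


def bubble_neighborhood(distance, radius):
    """Bubble weight: 1 inside the radius, 0 outside."""
    return 1.0 if distance <= radius else 0.0


# Opposite ends of a 10-column row are far apart on a planar map
# but adjacent on a toroid map.
planar = grid_distance((0, 0), (9, 0), n_columns=10, n_rows=10)
wrapped = grid_distance((0, 0), (9, 0), n_columns=10, n_rows=10, toroid=True)
```

In this sketch a smaller `coeff` narrows the Gaussian, so nodes far from the winning neuron receive smaller updates at the same radius.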
- Changes to previous version:
- New: The coefficient of the Gaussian neighborhood function exp(-||x-y||^2/(2*(coeff*radius)^2)) is now exposed in all interfaces as a parameter.
- New: get_bmu function in the Python interface to get the best matching units given an activation map.
- Changed: Updated PCA initialization in the Python interface to work with sk-learn 0.18 onwards.
- Changed: Radii can be float values.
- Fixed: Only positive values were written back to codebook during update.
- Fixed: Sparse data is read correctly when there are class labels.
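As a rough illustration of what getting best matching units from an activation map involves, the sketch below uses plain Python. It assumes the activation map holds one row per data instance and one distance per neuron, with neurons stored in row-major order; both the layout and the name `best_matching_units` are assumptions made for this example, not the Python interface's actual implementation.

```python
def best_matching_units(activation_map, n_columns):
    """Return the (column, row) of the closest neuron for each data instance.

    activation_map: list of lists, one inner list of distances per instance.
    Assumes neuron index = row * n_columns + column (row-major layout).
    """
    bmus = []
    for distances in activation_map:
        # The best matching unit is the neuron with the smallest distance.
        flat_index = min(range(len(distances)), key=distances.__getitem__)
        row, column = divmod(flat_index, n_columns)
        bmus.append((column, row))
    return bmus


# Two data instances on a hypothetical map with 3 columns and 2 rows.
example_map = [
    [0.9, 0.2, 0.7, 0.5, 0.8, 0.6],
    [0.4, 0.9, 0.8, 0.7, 0.3, 0.1],
]
bmus = best_matching_units(example_map, n_columns=3)
```

If the activation map stored similarities rather than distances, the selection would use the largest value instead of the smallest.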