Project details for Somoclu

Somoclu 1.3

by peterwittek - March 31, 2014, 07:53:05 CET

Description:

Somoclu is a C++ tool for training self-organizing maps on large data sets using massively parallel resources. It relies on OpenMP for multicore execution and builds on MPI to distribute the workload across the nodes of a cluster. It can also accelerate training with CUDA if graphics processing units are available. A sparse kernel is included, which is useful for high-dimensional but sparse data, such as the vector spaces common in text mining workflows. Python, R, and MATLAB interfaces facilitate use in data analysis. The code is released under the GNU GPLv3 licence.
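
To illustrate why a sparse kernel pays off on high-dimensional sparse data, here is a minimal Python sketch of a best matching unit search that only touches the nonzero entries of the input vector. It relies on the identity ||w - x||^2 = ||w||^2 - 2 w.x + ||x||^2. This is a generic technique and not necessarily how Somoclu implements its sparse kernel; the names squared_norms and sparse_bmu and the dictionary representation of the sparse vector are assumptions made for the example.

```python
def squared_norms(codebook):
    """Precompute ||w||^2 for every dense codebook vector."""
    return [sum(w * w for w in node) for node in codebook]


def sparse_bmu(x, codebook, norms):
    """Return the index of the codebook vector closest to sparse vector x.

    x is a dict mapping dimension index -> nonzero value (hypothetical
    representation chosen for this sketch).
    """
    x_norm = sum(v * v for v in x.values())
    best_index, best_dist = -1, float("inf")
    for i, node in enumerate(codebook):
        # The dot product only visits the nonzero dimensions of x.
        dot = sum(v * node[j] for j, v in x.items())
        dist = norms[i] - 2.0 * dot + x_norm
        if dist < best_dist:
            best_index, best_dist = i, dist
    return best_index


# Toy codebook with three nodes in a four-dimensional space.
codebook = [
    [0.0, 0.0, 0.0, 0.0],
    [1.0, 0.0, 0.0, 2.0],
    [0.0, 3.0, 0.0, 0.0],
]
norms = squared_norms(codebook)
x = {0: 1.0, 3: 2.0}  # dense equivalent: [1.0, 0.0, 0.0, 2.0]
print(sparse_bmu(x, codebook, norms))
```

The cost per node is proportional to the number of nonzero entries of x rather than to the full dimensionality, which is where the saving comes from on text-mining style data.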

Key features:

  • Fast execution by parallelization: OpenMP, MPI, and CUDA are supported.

  • Planar and toroid maps.

  • Both dense and sparse input data are supported.

  • Large emergent maps of several hundred thousand neurons are feasible.

  • Integration with Databionic ESOM Tools.

  • Python, R, and MATLAB interfaces for the dense CPU kernel.
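
The difference between planar and toroid maps comes down to how the distance between two neurons on the grid is measured. The following Python sketch assumes a square grid and Euclidean grid distance; the function name grid_distance and its parameters are hypothetical and are not taken from Somoclu's code.

```python
import math


def grid_distance(a, b, n_columns, n_rows, map_type="planar"):
    """Euclidean distance between grid positions a and b, given as (x, y).

    On a toroid map the grid wraps around in both directions, so the
    shorter way around each axis is used.
    """
    dx = abs(a[0] - b[0])
    dy = abs(a[1] - b[1])
    if map_type == "toroid":
        dx = min(dx, n_columns - dx)
        dy = min(dy, n_rows - dy)
    return math.hypot(dx, dy)


# Opposite edges of a 10 x 5 map are far apart on a planar map
# but adjacent on a toroid map.
print(grid_distance((0, 0), (9, 0), 10, 5, "planar"))
print(grid_distance((0, 0), (9, 0), 10, 5, "toroid"))
```

Because a toroid map has no edges, neurons on the border have as many neighbours as those in the centre, which avoids border effects in the neighbourhood update.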

Changes to previous version:
  • Python, R, and MATLAB interfaces added.
  • Learning rate parameter included.
  • Linear and exponential cooling strategies added for radius and learning rate.
  • Command-line interface made more user-friendly.
  • Default radius depends on both X and Y of the map.
  • Bug fixes: CUDA build without MPI, best matching unit passing without MPI, coordinate order in best matching unit file.
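
As a rough illustration of what linear and exponential cooling mean for the radius or the learning rate, the Python sketch below interpolates a parameter from a start value to an end value over the training epochs. The exact formulas used by Somoclu are not given here, so these are common textbook forms and an assumption; the function names and the choice to reach the end value in the last epoch are hypothetical.

```python
def linear_cooling(start, end, epoch, n_epochs):
    """Decrease the value by equal steps from start to end."""
    fraction = epoch / (n_epochs - 1) if n_epochs > 1 else 0.0
    return start - (start - end) * fraction


def exponential_cooling(start, end, epoch, n_epochs):
    """Decrease the value by a constant factor per epoch.

    Assumes start > 0 and end > 0 (geometric interpolation).
    """
    fraction = epoch / (n_epochs - 1) if n_epochs > 1 else 0.0
    return start * (end / start) ** fraction


# Compare both schedules for a radius cooling from 10 to 1 over 11 epochs.
for epoch in range(11):
    print(epoch,
          round(linear_cooling(10.0, 1.0, epoch, 11), 3),
          round(exponential_cooling(10.0, 1.0, epoch, 11), 3))
```

With the same start and end values, the exponential schedule drops faster in the early epochs and spends more epochs at small values, so fine-tuning gets a larger share of the training time.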
Supported Operating Systems: Linux, OS X
Data Formats: ASCII, LIBSVM, ESOM
Tags: CUDA, Self-Organizing Maps, MPI, ESOM, OpenMP

Other available revisions

Version Changelog Date
1.4
  • Better Windows support.
  • Completed CUDA support for Python and R interfaces.
  • Faster compilation by removing unnecessary flags for nvcc.
  • Support for CUDA 6.5.
  • Bug fixes: R version no longer needs separate code.
September 5, 2014, 13:01:14
1.3.1
  • Initial Windows support through GCC on Windows.
  • Better I/O separation for the Python, R, and MATLAB interfaces.
  • Bug fixes: major MPI initialization bug fixed.
April 10, 2014, 06:41:38
1.3
  • Python, R, and MATLAB interfaces added.
  • Learning rate parameter included.
  • Linear and exponential cooling strategies added for radius and learning rate.
  • Command-line interface made more user-friendly.
  • Default radius depends on both X and Y of the map.
  • Bug fixes: CUDA build without MPI, best matching unit passing without MPI, coordinate order in best matching unit file.
March 31, 2014, 07:53:05
1.2
  • Massive improvements in OpenMP parallelization.
  • MPI libraries are no longer mandatory.
  • Best matching units are saved.
  • Option for specifying an initial codebook for the map.
  • ESOM .lrn input format added.
  • Parsing of white-space characters corrected.
  • Long-named command line switches for specifying SOM dimensions.
  • Fine-grained control of which interim files to save across epochs.
  • Option in Makefile for building shared library.
December 17, 2013, 04:31:05
1.1.2

Toroid maps were added. The initial radius is exposed as a parameter via the command-line interface. Formats of codebook and U-matrix export are compatible with Databionic ESOM Tools for advanced visualisation. Bug fixes: codebook update with compact support was removed, NaN entries no longer appear in U-matrices.

November 28, 2013, 03:20:22
1.0

Initial Announcement on mloss.org.

May 14, 2013, 06:21:13
