Project details for xSNE Stochastic Neighbor Embedding methods with novel neighborhood probabilities

[Screenshot: xSNE Stochastic Neighbor Embedding methods with novel neighborhood probabilities, version 1.2]

by emstrick - August 20, 2013, 11:02:21 CET

Description:

Stochastic neighbor embedding (SNE) originally aims at reconstructing given distance relations in a low-dimensional Euclidean space. This can be regarded as a general approach to multi-dimensional scaling, but the reconstruction is based solely on the definition of input (and output) neighborhood probabilities. The present implementation also handles dissimilarity- or score-induced neighborhood topologies and uses quasi-second-order, gradient-based (l-)BFGS optimization.

Neighbor relationships in the embedding space ('scatter plots') are estimated as probabilities of Gaussian or Student-t distributions. The input probabilities can be derived from Gaussians over pairwise Euclidean input distances or, as in the present case, from novel score neighborhood probabilities that do not require extra settings such as 'Gaussian width' or 'perplexity'.
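The two output-space neighborhood models can be illustrated with a minimal Python sketch (the package itself is Matlab). The function name `output_probabilities`, the unit kernel width and the toy points are assumptions made for illustration only; this is not the code of the package, and it does not cover the novel score-based input probabilities.

```python
import math

def output_probabilities(Y, kernel="gaussian"):
    """Row-normalized neighbor probabilities q[i][j] for embedded points Y."""
    n = len(Y)
    Q = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue  # a point is never its own neighbor
            d2 = sum((a - b) ** 2 for a, b in zip(Y[i], Y[j]))
            if kernel == "gaussian":
                Q[i][j] = math.exp(-d2)        # SNE: Gaussian kernel
            else:
                Q[i][j] = 1.0 / (1.0 + d2)     # t-SNE: Student-t kernel, 1 degree of freedom
        row_sum = sum(Q[i])
        Q[i] = [q / row_sum for q in Q[i]]     # each row becomes a distribution
    return Q

# Toy embedding: two close points and one distant point.
Y = [[0.0, 0.0], [1.0, 0.0], [4.0, 0.0]]
Q_gauss = output_probabilities(Y, "gaussian")
Q_t = output_probabilities(Y, "student")
```

In this toy example the heavy-tailed Student-t kernel gives the distant point a much larger neighbor probability than the Gaussian kernel does, which is the property t-SNE relies on to relieve crowding.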

The functionality is {SNE, t-SNE} x {symmetric, asymmetric}, where SNE is ordinary (yet well-working) stochastic neighbor embedding and t-SNE tries to avoid the 'crowding' problem by assuming a Student-t rather than a Gaussian neighborhood density in the output space.

'Symmetric' refers to enforcing symmetric neighborhood relationships (as originally proposed for t-SNE), which yields visually appealing plots; asymmetric relationship reconstruction might be preferred for a better representation quality of the embedded point cloud.
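As a rough illustration of the symmetric variant, the following Python sketch symmetrizes row-conditional probabilities in the way described in the t-SNE paper, (p(j|i) + p(i|j)) / (2n), and evaluates the KL divergence used as embedding cost. Whether the package symmetrizes in exactly this way is an assumption; the names `symmetrize` and `kl_divergence` and the example matrix are hypothetical.

```python
import math

def symmetrize(P):
    """Joint probabilities (p[j|i] + p[i|j]) / (2n) from row-conditional ones."""
    n = len(P)
    return [[(P[i][j] + P[j][i]) / (2.0 * n) for j in range(n)] for i in range(n)]

def kl_divergence(P, Q, eps=1e-12):
    """Sum over all pairs of p * log(p / q); zero entries of P contribute nothing."""
    total = 0.0
    for p_row, q_row in zip(P, Q):
        for p, q in zip(p_row, q_row):
            if p > 0.0:
                total += p * math.log(p / max(q, eps))  # eps guards against q == 0
    return total

# Asymmetric example: each row sums to one, but P[i][j] != P[j][i].
P = [[0.0, 0.9, 0.1],
     [0.5, 0.0, 0.5],
     [0.2, 0.8, 0.0]]
P_sym = symmetrize(P)

# Uniform joint distribution over the six off-diagonal pairs, for comparison.
U = [[0.0 if i == j else 1.0 / 6.0 for j in range(3)] for i in range(3)]
cost = kl_divergence(P_sym, U)
```

After symmetrization the whole matrix, rather than each row, sums to one, so the cost compares a single joint distribution of pairs instead of one distribution per point.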

As an additional feature, the embedding quality of individual data points is assessed by the contribution of each embedded point's placement to the cost function, i.e. the sum of absolute KL-divergence gradients caused by that point.
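One possible reading of this per-point quality measure is sketched below in Python for the asymmetric Gaussian case. It uses the gradient of the original SNE formulation, dC/dy_i = 2 * sum_j (p(j|i) - q(j|i) + p(i|j) - q(i|j)) * (y_i - y_j), and sums the absolute gradient components of each point. The helper names `gaussian_q`, `sne_gradient` and `point_scores` are hypothetical, and the actual computation in xsne_fun.m may differ (for example by constant factors).

```python
import math

def gaussian_q(Y):
    """Row-normalized Gaussian neighbor probabilities q[j|i] of embedded points Y."""
    n = len(Y)
    Q = []
    for i in range(n):
        row = [0.0 if i == j else
               math.exp(-sum((a - b) ** 2 for a, b in zip(Y[i], Y[j])))
               for j in range(n)]
        s = sum(row)
        Q.append([q / s for q in row])
    return Q

def sne_gradient(P, Y):
    """Gradient of the asymmetric SNE cost (sum of row-wise KL divergences) w.r.t. Y."""
    n, dim = len(Y), len(Y[0])
    Q = gaussian_q(Y)
    grad = [[0.0] * dim for _ in range(n)]
    for i in range(n):
        for j in range(n):
            w = P[i][j] - Q[i][j] + P[j][i] - Q[j][i]
            for k in range(dim):
                grad[i][k] += 2.0 * w * (Y[i][k] - Y[j][k])
    return grad

def point_scores(P, Y):
    """Per-point score: sum of absolute gradient components (assumed reading)."""
    return [sum(abs(g) for g in row) for row in sne_gradient(P, Y)]

# Target probabilities taken from a reference layout, then one point is misplaced.
Y_good = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]]
P = gaussian_q(Y_good)
Y_bad = [[0.0, 0.0], [1.0, 0.0], [3.0, 2.0]]
scores_good = point_scores(P, Y_good)
scores_bad = point_scores(P, Y_bad)
```

Under these assumptions a perfectly reconstructed layout yields scores of zero, while points whose neighborhoods are poorly reproduced receive larger scores and can be highlighted in the scatter plot.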

Acknowledgements:

  • The work on SNE and t-SNE ("Visualizing Data using t-SNE", JMLR 9, pp. 2579-2605, 2008) and the freely available implementations by Laurens van der Maaten are highly appreciated.

  • The great (l-)BFGS optimizer (fminlbfgs.m) by Dirk-Jan Kroon, found at http://www.mathworks.de/matlabcentral/fileexchange/23245 and included here, is STRONGLY acknowledged.

Changes to previous version:
  • gradient in xsne_fun.m fixed! (constant factor m was missing)

  • symmetry option re-introduced, enabling both symmetric and asymmetric versions of SNE and t-SNE

Supported Operating Systems: Platform Independent
Data Formats: Matlab
Tags: Dimension Reduction, MDS, Multidimensional Scaling, SNE, Stochastic Neighbor Embedding, Neighborhood Probability Estimation

Other available revisions

Version 1.2 (August 20, 2013, 11:02:21)
  • gradient in xsne_fun.m fixed! (constant factor m was missing)

  • symmetry option re-introduced, enabling both symmetric and asymmetric versions of SNE and t-SNE

Version 1.1 (November 23, 2012, 15:10:26)
  • scoretoprob.m replaced by d2p.m

  • protein score data set added

  • trank.m computes (mid/max-tied) ranks along columns of matrix

  • local P- neighborhood probability estimation added

  • experimental soft_rank_SNE added for minimizing KL between probabilities of exceedance in source and embedding space

  • symmetry option removed, because it was strange in the previous version

Version 1.0 (July 23, 2012, 12:18:24)

Negligible changes for consolidating the code.
