mloss.org Harry http://mloss.org Updates and additions to Harry en Wed, 30 Jul 2014 16:15:26 -0000 Harry 0.3 http://mloss.org/software/view/530/ <html><p>Harry is a small tool for comparing strings and measuring their similarity. The tool supports several common distance and kernel functions for strings as well as some exotic similarity measures. The focus of Harry lies on implicit similarity measures, that is, comparison functions that do not give rise to an explicit vector space. Examples of such similarity measures are the Levenshtein distance and the Jaro-Winkler distance.
</p>
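<p>To make the notion of an implicit similarity measure concrete, the following is a minimal Python sketch of the Levenshtein distance. It is an illustration only and not Harry's implementation; the function name <code>levenshtein</code> is an assumption chosen for this example. Note that the two strings are compared directly, without first mapping them to vectors.</p>

```python
def levenshtein(a: str, b: str) -> int:
    """Return the minimum number of single-character edits turning a into b.

    Illustrative sketch only; the function name is hypothetical and this is
    not the code used by Harry.
    """
    # Only the previous row of the dynamic-programming table is kept.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # turning the first i characters of a into "" takes i deletions
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]
```

<p>For example, <code>levenshtein("kitten", "sitting")</code> evaluates to 3: two substitutions and one insertion.</p>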
<p>Harry is implemented using OpenMP, such that the computation time for a set of strings scales linearly with the number of available CPU cores. Moreover, efficient implementations of several similarity measures, effective caching of similarity values and low-overhead locking further speed up the computation.
</p>
<p>Harry complements the tool Sally, which embeds strings in a vector space and allows computing vectorial similarity measures, such as the cosine distance and the bag-of-words kernel.
</p>
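<p>For contrast with the implicit measures, here is a minimal Python sketch of a vectorial measure: the cosine distance between bag-of-words vectors. It is not Sally's implementation. Whitespace tokenisation, lowercasing, the function names, and returning 1.0 when either string has no words are all assumptions made for this example.</p>

```python
import math
from collections import Counter


def bag_of_words(text: str) -> Counter:
    """Explicit embedding: map a string to its word-count vector."""
    return Counter(text.lower().split())


def cosine_distance(x: str, y: str) -> float:
    """Return 1 minus the cosine similarity of the two bag-of-words vectors."""
    u, v = bag_of_words(x), bag_of_words(y)
    # Only words present in both strings contribute to the dot product.
    dot = sum(u[w] * v[w] for w in u.keys() & v.keys())
    norm = (math.sqrt(sum(c * c for c in u.values()))
            * math.sqrt(sum(c * c for c in v.values())))
    if norm == 0.0:
        return 1.0  # assumed convention for strings without any words
    return 1.0 - dot / norm
```

<p>Here the strings are first embedded as vectors and then compared, whereas a measure such as the Levenshtein distance operates on the strings themselves.</p>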
<p>A tutorial is available here: http://www.mlsec.org/harry/tutorial.html
</p></html> Konrad Rieck Wed, 30 Jul 2014 16:15:26 -0000 http://mloss.org/software/rss/comments/530 http://mloss.org/software/view/530/ sequence analysis, string kernels, similarity measures, string distances