Projects that are tagged with data sets.


DCABags 0.7

by wbuntine - June 5, 2014, 05:34:44 CET, 18221 views, 4358 downloads, 0 subscriptions

About: Document/Text preprocessing for topic models: suite of Perl scripts for preprocessing text collections to create dictionaries and bag/list files for use by topic modelling software.

Changes:

Moved distribution and code across to GitHub. Changed the "ldac" format to use a 0 offset for word indices. Added "document frequency" (df) filtering on the selection of tokens for linkTables. Experimenting with linkParse, but it is still generally unusable.
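
To illustrate the two format-related changes above, here is a minimal Python sketch of document-frequency filtering followed by output in an "ldac"-style bag format with 0-offset word indices. It is not DCABags code (the suite itself is written in Perl); the function names, the `min_df` threshold, and the toy documents are all hypothetical, and the line layout assumes the usual LDA-C convention of `N index:count ...`, where N is the number of distinct terms in the document.

```python
from collections import Counter


def build_vocab(docs, min_df=2):
    """Keep tokens found in at least min_df documents; assign 0-based ids.

    min_df is a hypothetical threshold chosen for this illustration.
    """
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))  # count each token once per document
    kept = sorted(tok for tok, n in df.items() if n >= min_df)
    return {tok: i for i, tok in enumerate(kept)}


def to_ldac_line(tokens, vocab):
    """Format one document as 'N idx:count ...' with 0-offset word indices."""
    counts = Counter(vocab[t] for t in tokens if t in vocab)
    pairs = " ".join(f"{i}:{c}" for i, c in sorted(counts.items()))
    return f"{len(counts)} {pairs}".rstrip()


# Toy collection, already tokenized (assumed input shape).
docs = [
    ["topic", "model", "text"],
    ["topic", "model", "model"],
    ["text", "corpus"],
]
vocab = build_vocab(docs, min_df=2)
lines = [to_ldac_line(d, vocab) for d in docs]
```

In this toy run, "corpus" appears in only one document and is dropped by the df filter, so the third document keeps a single term. The first retained token gets index 0 rather than 1, which is the point of the offset change.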


mldata.org svn-r1070-Apr-2011

by sonne - April 8, 2011, 10:15:49 CET, 10586 views, 2641 downloads, 0 subscriptions

About: The source code of the mldata.org site - a community portal for machine learning data sets.

Changes:

Initial Announcement on mloss.org.