Pattern
http://mloss.org
Updates and additions to Pattern
en
Fri, 31 Aug 2012 02:26:01 -0000

Pattern 2.4
<html><p>"Pattern" bundles a diverse range of Python functionality for working with text. It contains tools for web data mining, such as a uniform API for web services (Google, Yahoo, Bing, Twitter, Wikipedia, Flickr, Facebook, RSS), an HTML DOM parser, a web crawler, and a PDF parser. It has wrappers for SQLite and MySQL databases, and a Datasheet class for working with CSV files. It has a transformation-based tagger/chunker for English and Dutch, sentiment lexicons, a WordNet interface, and an n-gram search algorithm. It also has algorithms for tf-idf, cosine similarity, LSA, k-means and hierarchical clustering, and Naive Bayes, KNN, and SVM classifiers. It has a helper module for writing HTML canvas graphics in the web browser (no plugins needed), and tools for directed graphs, graph centrality, graph partitioning, and spring-based graph visualization. </p> <p>The package is well-documented at: </p> <p>It comes bundled with 30+ example scripts and 350+ unit tests. </p> <p>Please let us know if you find any bugs! </p></html>
Tom De Smedt, Walter Daelemans
Fri, 31 Aug 2012 02:26:01 -0000
semantic analysis, natural language processing, information extraction, data visualization, tfidf, csv, k nearest neighbor, html, dutch, english, german