Web mining, Visualization
Pattern
Pattern is a web mining module for the Python programming language.
It bundles tools for data retrieval (Google + Twitter + Wikipedia API, web spider, HTML DOM parser), text analysis (rule-based shallow parser, WordNet interface, syntactic + semantic n-gram search algorithm, tf-idf + cosine similarity + LSA metrics), clustering and classification (k-means, k-NN, SVM), and data visualization (graph networks).
The module is bundled with 30+ example scripts and 350+ unit tests.
Goto: http://www.clips.ua.ac.be/pages/pattern
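To make the tf-idf and cosine similarity metrics mentioned in the Pattern description more concrete, here is a minimal sketch in plain Python. It does not use Pattern's own API; the function names tf_idf and cosine_similarity and the sample documents are hypothetical, chosen only for illustration, and tokenization is a naive whitespace split.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: weight} dict per document (hypothetical helper)."""
    # Naive tokenization: lowercase and split on whitespace.
    tokenized = [doc.lower().split() for doc in documents]
    n = len(tokenized)
    # Document frequency: the number of documents each term appears in.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        # Weight = relative term frequency * log(inverse document frequency).
        vectors.append({
            term: (count / total) * math.log(n / df[term])
            for term, count in counts.items()
        })
    return vectors


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse {term: weight} vectors."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # An all-zero vector is treated as similar to nothing.
    return dot / (norm_a * norm_b)


docs = [
    "the cat sat on the mat",
    "the cat chased the mouse",
    "stock prices fell sharply",
]
vectors = tf_idf(docs)
print(cosine_similarity(vectors[0], vectors[1]))  # shared terms: "the", "cat"
print(cosine_similarity(vectors[0], vectors[2]))  # no shared terms
```

Documents that share terms score above zero, and documents with no terms in common score exactly zero. A library such as Pattern would typically add proper tokenization, stemming and stop-word handling on top of this basic idea.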
Orange
Open-source data visualization and analysis for novices and experts. Data mining through visual programming or Python scripting. Components for machine learning. Add-ons for bioinformatics and text mining. Packed with features for data analytics.
Classify
Goto: http://orange.biolab.si/
Install
# sudo pip install Orange
# sudo pip install pattern