
Web mining, Visualization

manager@ 2013. 3. 13. 16:03






Pattern 

Pattern is a web mining module for the Python programming language.

It bundles tools for data retrieval (Google + Twitter + Wikipedia API, web spider, HTML DOM parser), text analysis (rule-based shallow parser, WordNet interface, syntactical + semantical n-gram search algorithm, tf-idf + cosine similarity + LSA metrics), clustering and classification (k-means, k-NN, SVM), and data visualization (graph networks).

The module is bundled with 30+ example scripts and 350+ unit tests.
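
To make the tf-idf and cosine similarity metrics mentioned above more concrete, here is a minimal sketch in plain Python. It does not use Pattern's own API; the function names `tf_idf` and `cosine_similarity` and the sample sentences are made up for illustration, and the weighting (relative term frequency times log inverse document frequency) is one common variant, not necessarily the one Pattern implements.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {word: weight} dict per document (each doc is a list of words)."""
    n = len(docs)
    # Document frequency: the number of documents each word occurs in.
    df = Counter(word for doc in docs for word in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append({
            # Relative term frequency times log inverse document frequency.
            word: (count / len(doc)) * math.log(n / df[word])
            for word, count in counts.items()
        })
    return vectors


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse {word: weight} vectors."""
    dot = sum(weight * b.get(word, 0.0) for word, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical sample documents, already split into words.
docs = [
    "the cat sat on the mat".split(),
    "the cat chased the mouse".split(),
    "stock prices fell sharply".split(),
]
vectors = tf_idf(docs)

# The two cat sentences share words, so they score above zero;
# the stock sentence shares none with the first, so it scores zero.
print(cosine_similarity(vectors[0], vectors[1]))
print(cosine_similarity(vectors[0], vectors[2]))
```

Words that appear in every document get a weight of zero here, which is the point of the idf factor: they say nothing about what sets one document apart from another.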

Goto: http://www.clips.ua.ac.be/pages/pattern




Orange


Open source data visualization and analysis for novices and experts. Data mining through visual programming or Python scripting. Components for machine learning. Add-ons for bioinformatics and text mining. Packed with features for data analytics.

Classify

Naive Bayes, Support Vector Machines (SVM), Logistic Regression
Majority, Classification Tree, Classification Tree Graph
Classification Tree Viewer, CN2 Rules, CN2 Rules Viewer
k-Nearest Neighbours, Nomogram, Random Forest
C4.5, Interactive Tree Builder
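
To give a sense of what one of the learners listed above does, here is a minimal sketch of k-nearest neighbours classification in plain Python. It does not use Orange's API; `knn_classify` and the toy data are hypothetical, and Euclidean distance with a simple majority vote is assumed.

```python
import math
from collections import Counter


def euclidean(p, q):
    """Straight-line distance between two equal-length feature tuples."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def knn_classify(train, query, k=3):
    """Label `query` by majority vote among its k nearest training examples.

    train: list of (features, label) pairs; query: a feature tuple.
    """
    nearest = sorted(train, key=lambda example: euclidean(example[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: two well-separated clusters labelled "a" and "b".
train = [
    ((1.0, 1.0), "a"), ((1.2, 0.8), "a"), ((0.9, 1.1), "a"),
    ((5.0, 5.0), "b"), ((5.2, 4.9), "b"), ((4.8, 5.1), "b"),
]

print(knn_classify(train, (1.1, 1.0)))
print(knn_classify(train, (5.0, 5.1)))
```

There is no training step as such: the learner just stores the examples and does all of its work at query time.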


Goto: http://orange.biolab.si/


Install 

Orange's graphical interface is built on Qt.

# sudo pip install Orange

# sudo pip install pattern