WMR - Lecture Notes Summary
Lecture Info
Data, Content and Interpretation
What is Web Mining?
What is Machine Learning?
Supervised Learning
Unsupervised Learning
Lecture Info
Machine Learning
Classification
Formalization
Classifier Functions
Regression
Formalization
Regression Functions
Model Selection
Model Family Selection
Model Parametrization
Searching for the Optimal Function
Linear Models
Probabilistic Models
Graphical Models
Weighted Grammars
Hidden Markov Model
Summary
Lecture Info
Vector Spaces
Basic Operations
Linear Independence
Basis
Dot Product
Norm
Distance
Euclidean Distance
Cosine Distance
Orthogonality
Text Classification
Vector Space Model
Task Definition
Rocchio's Algorithm
Document Vectors
Class Vectors
Example
Limitations
Memory Based Learning
kNN
Example
Algorithm
Dimensionality Reduction
Clustering
k-Means
Distance and Similarity
Lecture Info
Dimensionality Reduction
Statistical Techniques
Reconstruction Techniques (w/ Clustering)
Linear Algebra Techniques
Distance, Similarity and Clustering
Fuzzy Sets
Pearson Correlation
Jaccard Similarity
Dice Coefficient
Clustering
The Importance of Representation
Questions
Lecture Info
Text Classification
Possible Approaches
Manual Classification
Automatic Classification
Bayesian Methods
Bayes' Rule
Maximum a Posteriori Hypothesis
Naive Bayes Classifiers
Multivariate Binomial Model
Learning the Model
Example
Applying the Model
Constructing the Vocabulary
Problems with Naive Bayes
Independence Assumption
Laplace Smoothing
Underflow Prevention
Lecture Info
Multivariate Multinomial Model
Stochastic Language Models
Unigram Model
Bigram Model
N-Gram Model
Language Models and Naive Bayes
Learning the Model
Applying the Model
Time Complexity
Summary of the Two Models
Example
Feature Selection
Mutual Information
How it is Done
Evaluation
WebKB Experiment
Problems to be Solved
Most Common Category
Violation of NB Assumptions
Positive Properties of NB
References
Lecture Info
Evaluation
Types of Metrics
Evaluating Classification
Confusion Matrix
Accuracy and Error Rate
Problem with Accuracy
Precision, Recall, F-Measure
Trade-Off Between Precision and Recall
Break-Even Point (BEP)
Combining Precision and Recall
Combining Measurements for Multiple Classes
Microaverages
Parameter Tuning
Cross-Validation (Fixed-split)
N-Fold Cross Validation
Tuning a Classifier
The Complete ML Process
Example: Reuters Classification
Parameter Estimation Procedure
References
Lecture Info
Natural Language
What's in a Document?
Levels of Interpretation
Ambiguity
Information Retrieval
IR Models
IR Tasks
Learning and IR
Lecture Info
The Problem
Decision Tree (J48)
Basic Idea
Entropy
Construction
Example
Discretization of Features
Algorithm
Weka
Format ARFF
Weka Interface
Tree Visualization
Lecture Info
Ambiguity
The NLP Process
Syntactic Analysis
Parse Tree
Dependency/Constituency Relations
Dependency Parsing
Ambiguity in Syntactical Parsing
Modern Parsers
Semantic Analysis
Lecture Info
RevNLT
Semantic Analysis (cont.)
Compositionality
Towards Lambda-Calculus
Lambda-Calculus
World Model
Meaning
Lexical Semantics
WordNet
Exercises
Semantic Parsing
FrameNet
SRL Pipeline
Lecture Info
Stanford CoreNLP
How to Use it
The CoNLL Tabular Format
spaCy
Anaconda
Basic Example
Example (Wikipedia words)
Extracting Triples (Subject, Verb, Object)
Generalizing
Exercise: Q/A
Lecture Info
Outline
Linguistic Structures
Language Modelling
N-Gram Models
Stochastic Taggers/Grammars
Advantages
Markov Model
Visible Markov Model
Hidden Markov Model
Problems Solved by HMM
PoS Tagging
Lecture Info
HMM for PoS Tagging
Questions in PoS Tagging
Advantages of Using HMM
Forward Algorithm
Formal Description
Viterbi Algorithm
Formal Description
Parameter Estimation
Supervised Methods
Unsupervised Methods
Baum-Welch Method
Forward-Backward Algorithm
Expectation Step (E-step)
Maximization Step (M-step)
Example of Baum-Welch
References
Exercise
Lecture Info
Review
Types of HMM Problems
Viterbi Algorithm
Baum-Welch Method
Overall Scheme
Forward/Backward Probabilities
Updating Step
Final Considerations
Example of HMM
Use Cases for HMMs in NLP
HMM Decoding for NLP
Exercise
Lecture Info
On Learning
Training Set
Learning Class \(C\)
Version Space
PAC Learning
PAC-Learnability
Example
VC-Dimension
Axis-Aligned Rectangles
Lines
Circles
Lecture Info
Model Selection
Triple Trade-Off
Expected and Empirical Error
Learning and VC-Dimension
How to Select a Model
Example (Structural Risk Minimization)
Learning Machines
Using VC-Dimensionality
Recap
Exam Questions
MidTerm Topics
Open Questions
Example 1
Example 2
Example 3
Example 4
Example 5
Closed Questions
Lecture Info
Linear Classifiers
Perceptron
Functional Margin
Geometric Margin
On-Line Algorithm
Novikoff Theorem
Duality
Limitations of Linear Classifiers
Support Vector Machines
Maximum Margin Hyperplane
Support Vectors
How to Compute the Maximum Margin
The Lagrangian
Questions
Lecture Info
Recap
Solving the Dual Problem
Kuhn-Tucker Theorem
Dealing with Non-Linearly Separable Data
Soft-Margin SVM
Soft vs Hard Margin SVMs
Lecture Info
Clustering and Unsupervised Learning
Hierarchical Clustering
Direct Clustering
Aspects of Clustering
Agglomerative/Divisive Hierarchical Clustering
Hierarchical Agglomerative Clustering (HAC)
Chaining Effect
Complete Link Example
Computational Complexity
Non-Hierarchical Clustering
K-Means
Time Complexity
Problems
QT K-Means
Distance Metrics
Data Standardization
Interval-Scaled Attributes
Cluster Evaluation
External Criteria
Purity
Entropy
Soft Clustering
Subspace Clustering (LAC)
Example Application (Text Clustering)
Lecture Info
SVMs and Scalar Product
Kernel Function
Example
Feature Spaces
Gram-Matrix
Kernelized Perceptron
Kernel in SVMs
Finding Kernels
Kernel Examples
The Polynomial Kernel
Lecture Info
Conjunction of Features
String Kernel
Formal Definition
Exercise
Tree Kernels
Lecture Info
SVM
Example
Multiple Classes
False Positives vs False Negatives
Tree Kernel
RBF Kernel
KeLP
Lecture Info
Intro to Deep Learning
Types of Neural Networks
Dimensions of a Task
Symbols, Rules and Observations
Connectionism and Deep Learning
What We Want
History
Vector Spaces, Functions and Learning
On Representation
The Role of Depth
Multilayer Perceptron
Neural Networks
Single Neuron View
Sigmoid
Lecture Info
Keras
Batch, Steps and Epochs
API Examples
The Sequential API
The Functional API
Sequential NN Examples
Example 1
Example 2: MNIST Dataset