WMR - Lecture Notes Summary


  • Lecture Info

  • Data, Content and Interpretation

  • What is Web Mining?

  • What is Machine Learning?

    • Supervised Learning

    • Unsupervised Learning

  • Lecture Info

  • Machine Learning

  • Classification

    • Formalization

    • Classifier Functions

  • Regression

    • Formalization

    • Regression Functions

  • Model Selection

    • Model Family Selection

    • Model Parametrization

    • Searching for the Optimal Function

    • Linear Models

    • Probabilistic Models

      • Graphical models

      • Weighted Grammars

      • Hidden Markov Model

  • Summary

  • Lecture Info

  • Vector Spaces

    • Basic Operations

    • Linear Independence

    • Basis

    • Dot Product

    • Norm

    • Distance

      • Euclidean Distance

      • Cosine Distance

    • Orthogonality

  • Text Classification

    • Vector Space Model

    • Task Definition

    • Rocchio's Algorithm

      • Document Vectors

      • Class Vectors

      • Example

      • Limitations

    • Memory Based Learning

    • kNN

      • Example

      • Algorithm

  • Dimensionality Reduction

    • Clustering

      • k-Means

      • Distance and Similarity

  • Lecture Info

  • Dimensionality Reduction

    • Statistical Techniques

    • Reconstruction Techniques (w/ Clustering)

    • Linear Algebra Techniques

  • Distance, Similarity and Clustering

    • Fuzzy sets

    • Pearson Correlation

    • Jaccard Similarity

    • Dice Coefficient

    • Clustering

  • The Importance of Representation

  • Questions

  • Lecture Info

  • Text Classification

  • Possible Approaches

    • Manual Classification

    • Automatic Classification

  • Bayesian Methods

    • Bayes' Rule

    • Maximum a Posteriori Hypothesis

    • Naive Bayes Classifiers

  • Multivariate Binomial Model

    • Learning the Model

    • Example

    • Applying the Model

    • Constructing the Vocabulary

  • Problems with Naive Bayes

    • Independence Assumption

      • Laplace Smoothing

    • Underflow Prevention

  • Lecture Info

  • Multivariate Multinomial Model

  • Stochastic Language Models

    • Unigram Model

    • Bigram Model

    • N-Gram Model

  • Language Models and Naive Bayes

    • Learning the Model

    • Applying the Model

    • Time Complexity

  • Summary of the two Models

    • Example

  • Feature Selection

    • Mutual Information

    • How it is Done

  • Evaluation

    • WebKB Experiment

    • Problems to be Solved

    • Most Common Category

  • Violation of NB Assumptions

  • Positive Properties of NB

  • References

  • Lecture Info

  • Evaluation

    • Types of Metrics

  • Evaluating Classification

    • Confusion Matrix

    • Accuracy and Error Rate

      • Problem with Accuracy

    • Precision, Recall, F-Measure

      • Trade-Off Between Precision and Recall

      • Break-Even Point (BEP)

      • Combining Precision and Recall

    • Combining Measurements for Multiple Classes

      • Microaverages

  • Parameter Tuning

    • Cross-Validation (Fixed-split)

    • N-Fold Cross Validation

    • Tuning a Classifier

  • The Complete ML Process

  • Example: Reuters Classification

    • Parameter Estimation Procedure

    • References

  • Lecture Info

  • Natural Language

    • What's in a Document?

    • Levels of Interpretation

    • Ambiguity

  • Information Retrieval

    • IR Models

    • IR Tasks

    • Learning and IR

  • Lecture Info

  • The Problem

  • Decision Tree (J48)

    • Basic Idea

    • Entropy

    • Construction

      • Example

    • Discretization of Features

    • Algorithm

  • Weka

    • ARFF Format

    • Weka Interface

    • Tree Visualization

  • Lecture Info

  • Ambiguity

  • The NLP Process

    • Syntactic Analysis

      • Parse Tree

      • Dependency/Constituency Relations

      • Dependency Parsing

      • Ambiguity in Syntactical Parsing

      • Modern Parsers

    • Semantic Analysis

  • Lecture Info

  • RevNLT

  • Semantic Analysis (cont.)

    • Compositionality

    • Towards Lambda-Calculus

    • Lambda-Calculus

    • World Model

    • Meaning

    • Lexical Semantics

  • WordNet

  • Exercises

  • Semantic Parsing

    • FrameNet

    • SRL Pipeline

  • Lecture Info

  • Stanford CoreNLP

    • How to Use it

    • The CoNLL Tabular Format

  • spaCy

    • Anaconda

    • Basic Example

  • Example (Wikipedia words)

    • Extracting Triples (Subject, Verb, Object)

    • Generalizing

  • Exercise: Q/A

  • Lecture Info

  • Outline

  • Linguistic Structures

  • Language Modelling

    • N-Gram Models

    • Stochastic Taggers/Grammars

    • Advantages

  • Markov Model

    • Visible Markov Model

    • Hidden Markov Model

      • Problems solved by HMM

  • PoS Tagging

  • Lecture Info

  • HMM for PoS Tagging

    • Questions in POS tagging

    • Advantages of using HMM

  • Forward Algorithm

    • Formal Description

  • Viterbi Algorithm

    • Formal Description

  • Parameter Estimation

    • Supervised Methods

    • Unsupervised Methods

    • Baum-Welch Method

      • Forward-Backward Algorithm

      • Expectation Step (E-step)

      • Maximization Step (M-step)

      • Example of Baum-Welch

  • References

  • Exercise

  • Lecture Info

  • Review

    • Types of HMM problems

    • Viterbi Algorithm

  • Baum-Welch Method

    • Overall Scheme

    • Forward/Backward Probabilities

    • Updating Step

    • Final Considerations

  • Example of HMM

  • Use Cases for HMMs in NLP

    • HMM Decoding for NLP

  • Exercise

  • Lecture Info

  • On Learning

    • Training Set

    • Learning Class \(C\)

    • Version Space

  • PAC Learning

    • PAC-Learnability

    • Example

  • VC-Dimension

    • Axis-Aligned Rectangles

    • Lines

    • Circles

  • Lecture Info

  • Model Selection

    • Triple Trade-Off

    • Expected and Empirical Error

    • Learning and VC-Dimension

    • How to Select a Model

      • Example (Structural Risk Minimization)

  • Learning Machines

  • Using VC-Dimensionality

  • Recap

  • Exam Questions

    • MidTerm Topics

    • Open Questions

      • Example 1

      • Example 2

      • Example 3

      • Example 4

      • Example 5

    • Closed Questions

  • Lecture Info

  • Linear Classifiers

  • Perceptron

    • Functional Margin

    • Geometric Margin

    • On-Line Algorithm

    • Novikoff Theorem

    • Duality

  • Limitations of Linear Classifiers

  • Support Vector Machines

    • Maximum Margin Hyperplane

    • Support Vectors

    • How to Compute the Maximum Margin

    • The Lagrangian

  • Questions

  • Lecture Info

  • Recap

  • Solving the Dual Problem

  • Kuhn-Tucker Theorem

  • Dealing with Non-Linearly Separable Data

    • Soft-Margin SVM

  • Soft vs Hard Margin SVMs

  • Lecture Info

  • Clustering and Unsupervised Learning

    • Hierarchical Clustering

    • Direct Clustering

    • Aspects of Clustering

    • Agglomerative/Divisive Hierarchical Clustering

  • Hierarchical Agglomerative Clustering (HAC)

    • Chaining Effect

    • Complete Link Example

    • Computational Complexity

  • Non-Hierarchical Clustering

    • K-Means

      • Time Complexity

      • Problems

    • QT K-Means

  • Distance Metrics

  • Data Standardization

    • Interval-Scaled Attributes

  • Cluster Evaluation

    • External Criteria

      • Purity

      • Entropy

  • Soft Clustering

  • Subspace Clustering (LAC)

  • Example Application (Text Clustering)

  • Lecture Info

  • SVMs and Scalar Product

  • Kernel Function

    • Example

    • Feature Spaces

    • Gram-Matrix

    • Kernelized Perceptron

  • Kernel in SVMs

  • Finding Kernels

  • Kernel Examples

    • The Polynomial Kernel

  • Lecture Info

  • Conjunction of Features

  • String Kernel

    • Formal Definition

    • Exercise

  • Tree Kernels

  • Lecture Info

  • SVM

    • Example

    • Multiple Classes

    • False Positives vs False Negatives

  • Tree Kernel

  • RBF Kernel

  • KELP

  • Lecture Info

  • Intro to Deep Learning

    • Types of Neural Networks

    • Dimensions of a Task

    • Symbols, Rules and Observations

    • Connectionism and Deep Learning

    • What we want

    • History

  • Vector Spaces, Functions and Learning

    • On Representation

    • The Role of Depth

  • Multilayer Perceptron

  • Neural Networks

    • Single Neuron View

    • Sigmoid

  • Lecture Info

  • Keras

    • Batch, Steps and Epochs

    • API Examples

      • The Sequential API

      • The Functional API

      • Sequential NN examples

  • Example 1

  • Example 2: MNIST Dataset