Lecture 9 Language Model
Language models fall into two categories: probabilistic language models and structural language models.
1. N-Gram Models
- Estimate the probability of each word given the prior context (a count-based sketch follows the examples below).
- The number of parameters required grows exponentially with the number of words of prior context.
- An N-gram model uses only N-1 words of prior context
— Unigram: P(phone)
— Bigram: P(phone | cell)
— Trigram: P(phone | your cell)
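A minimal sketch of how such probabilities could be estimated from counts (maximum-likelihood bigram estimates); the toy corpus and function names are illustrative, not from the lecture:

```python
from collections import Counter

# Toy corpus; in practice counts come from a large training corpus.
corpus = "answer your cell phone please answer your phone".split()

unigram_counts = Counter(corpus)
bigram_counts = Counter(zip(corpus, corpus[1:]))

def bigram_prob(word, prev):
    """Maximum-likelihood estimate P(word | prev) = count(prev, word) / count(prev)."""
    if unigram_counts[prev] == 0:
        return 0.0
    return bigram_counts[(prev, word)] / unigram_counts[prev]

print(bigram_prob("phone", "cell"))   # P(phone | cell)
print(bigram_prob("phone", "your"))   # P(phone | your)
```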
2. Smoothing/Back-off
3. Linear Interpolation
- Linearly combine estimates of N-gram models of increasing order.
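A minimal sketch of linear interpolation over unigram, bigram, and trigram maximum-likelihood estimates; the interpolation weights are illustrative and would normally be tuned on held-out data:

```python
from collections import Counter

corpus = "answer your cell phone please answer your phone".split()

uni = Counter(corpus)
bi = Counter(zip(corpus, corpus[1:]))
tri = Counter(zip(corpus, corpus[1:], corpus[2:]))
total = len(corpus)

def p_uni(w):
    return uni[w] / total

def p_bi(w, w1):
    return bi[(w1, w)] / uni[w1] if uni[w1] else 0.0

def p_tri(w, w2, w1):
    return tri[(w2, w1, w)] / bi[(w2, w1)] if bi[(w2, w1)] else 0.0

# Interpolation weights (must sum to 1); illustrative values, normally tuned on held-out data.
lam1, lam2, lam3 = 0.1, 0.3, 0.6

def p_interp(w, w2, w1):
    """P(w | w2 w1) as a weighted sum of unigram, bigram, and trigram estimates."""
    return lam1 * p_uni(w) + lam2 * p_bi(w, w1) + lam3 * p_tri(w, w2, w1)

print(p_interp("phone", "your", "cell"))  # interpolated P(phone | your cell)
```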
Lecture 11 Part of Speech Tagging
1. Hidden Markov Model
- Sometimes it is not possible to know precisely which states the model passes through
- We may observe phenomena that occur in each state according to a probability distribution.
- The state transition process is hidden.
- The stochastic observation process is a stochastic function of the hidden state transition process.
- HMM Example
— Two observations: 'Rain' and 'Dry'
— Two hidden states: 'Low' and 'High' (low and high atmospheric pressure)
— Transition probabilities (between hidden states)
— Observation probabilities
— Initial probabilities: P('Low') = 0.4, P('High') = 0.6
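The example can be written out as concrete parameter tables. Only the initial probabilities above come from the notes; the transition and observation values below are assumed for illustration. The forward pass sketches the Evaluation problem described in the next section.

```python
import numpy as np

# The 'Low'/'High' pressure HMM above, written as parameter tables.
states = ["Low", "High"]          # hidden states
observations = ["Rain", "Dry"]    # observable symbols

pi = np.array([0.4, 0.6])                  # P(Low), P(High)  (from the notes)
A = np.array([[0.3, 0.7],                  # P(Low->Low), P(Low->High)    (assumed)
              [0.2, 0.8]])                 # P(High->Low), P(High->High)  (assumed)
B = np.array([[0.6, 0.4],                  # P(Rain|Low), P(Dry|Low)      (assumed)
              [0.4, 0.6]])                 # P(Rain|High), P(Dry|High)    (assumed)

def forward(obs_indices):
    """Forward algorithm: P(O | model), i.e. the Evaluation problem."""
    alpha = pi * B[:, obs_indices[0]]      # initialization
    for o in obs_indices[1:]:
        alpha = (alpha @ A) * B[:, o]      # induction
    return alpha.sum()                     # termination

obs = [observations.index(o) for o in ["Rain", "Rain", "Dry"]]
print(forward(obs))                        # P(Rain, Rain, Dry | model)
```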
2. Three Basic Problems of HMM
- Evaluation: Given the observation sequence O and an HMM model (A, B, π), how do we compute the probability of O given the model? (Forward-Backward algorithm)
- Decoding: Given the observation sequence and an HMM model, how do we find the state sequence that best explains the observations? (Viterbi algorithm, sketched below)
— Finds the globally optimal result by finding the optimum at each stage.
— If the best path ending in state s_j at time t goes through state s_i at time t-1, then it should coincide with the best path ending in state s_i at time t-1.
- Learning: How do we adjust the model parameters (A, B, π) to maximize P(O | model)? (Baum-Welch algorithm)
- The N-best algorithm is similar to Viterbi, but it keeps the N best paths.
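A minimal Viterbi sketch for the Decoding problem, reusing the (partly assumed) parameters from the pressure example above; it tracks back-pointers to recover the single best state sequence.

```python
import numpy as np

states = ["Low", "High"]
observations = ["Rain", "Dry"]
pi = np.array([0.4, 0.6])                      # from the notes
A = np.array([[0.3, 0.7], [0.2, 0.8]])         # assumed transition values
B = np.array([[0.6, 0.4], [0.4, 0.6]])         # assumed observation values

def viterbi(obs_indices):
    """Return the most likely hidden state sequence for the observations."""
    n = len(obs_indices)
    delta = pi * B[:, obs_indices[0]]          # best score ending in each state
    backptr = np.zeros((n, len(states)), dtype=int)
    for t in range(1, n):
        scores = delta[:, None] * A            # score of each (prev, next) state pair
        backptr[t] = scores.argmax(axis=0)     # best previous state for each next state
        delta = scores.max(axis=0) * B[:, obs_indices[t]]
    # Trace back from the best final state.
    path = [int(delta.argmax())]
    for t in range(n - 1, 0, -1):
        path.append(int(backptr[t][path[-1]]))
    return [states[s] for s in reversed(path)]

obs = [observations.index(o) for o in ["Rain", "Rain", "Dry"]]
print(viterbi(obs))   # most likely state sequence for Rain, Rain, Dry
```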
Lecture 12 Parsing
1. Parsing Approaches for Context-Free Grammar (CFG)
— Top-down Approach
— Bottom-up Approach
2. Regular Grammar
- A regular grammar is denoted as G = (V, T, P, S)
— V is a finite set of non-terminal symbols
— T is a finite set of terminal symbols
— S is the start symbol
— P is a finite set of productions, e.g. A -> aB
- Left-hand side: one non-terminal symbol
- Right-hand side: the empty string, a terminal symbol, or a non-terminal symbol following a terminal symbol
3. Top-down Parsing
- Begin with the start symbol S and expand it into the right-hand side of a rule.
- Match the left-hand side of CFG rules to non-terminals in the string, replacing them with the right-hand side of the rule.
- Continue until all the non-terminals are replaced by terminals, such that they correspond to the symbols in the sentence.
- The parse succeeds when all words in the sentence are generated.
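A minimal recursive-descent sketch of this top-down procedure; the tiny grammar and sentence are illustrative, not from the lecture.

```python
# Tiny illustrative CFG: S -> NP VP, NP -> Det N, VP -> V NP
grammar = {
    "S":   [["NP", "VP"]],
    "NP":  [["Det", "N"]],
    "VP":  [["V", "NP"]],
    "Det": [["the"]],
    "N":   [["dog"], ["cat"]],
    "V":   [["chased"]],
}

def parse(symbol, words, pos):
    """Try to derive words[pos:] from `symbol`; return the next position or None."""
    # Terminal: must match the next word of the sentence.
    if symbol not in grammar:
        return pos + 1 if pos < len(words) and words[pos] == symbol else None
    # Non-terminal: expand each right-hand side in turn (top-down, depth-first).
    for rhs in grammar[symbol]:
        next_pos = pos
        for sym in rhs:
            next_pos = parse(sym, words, next_pos)
            if next_pos is None:
                break
        else:
            return next_pos
    return None

sentence = "the dog chased the cat".split()
print(parse("S", sentence, 0) == len(sentence))   # True: all words are generated
```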
Lecture 14 Text Categorization
1. Term Selection
- Examples
— Chi Square
— Mutual Information
— Information Gain
— Information Ratio
— Odds Ratio
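As an example of one of the metrics listed above, a minimal sketch of chi-square term selection computed from a 2x2 term/category contingency table; the counts are illustrative.

```python
def chi_square(a, b, c, d):
    """
    Chi-square statistic for a term/category pair from a 2x2 contingency table:
      a = docs in the category containing the term
      b = docs outside the category containing the term
      c = docs in the category not containing the term
      d = docs outside the category not containing the term
    """
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    return n * (a * d - b * c) ** 2 / denom if denom else 0.0

# Illustrative counts: a term that appears mostly inside the category scores high.
print(chi_square(a=40, b=5, c=10, d=945))
```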
2. Feature Generation
- Latent Semantic Indexing (LSI)
- Explicit Semantic Indexing (ESI)
3. Nearest-Neighbor Learning Algorithm
- Compute similarity between x and all examples in D.
- Assign x the category of the most similar example in D.
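A minimal sketch of the nearest-neighbor rule above, using cosine similarity over toy term-count vectors; the data and labels are illustrative assumptions.

```python
import numpy as np

def cosine(u, v):
    """Cosine similarity between two vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# Toy training set D: term-count vectors and their categories (illustrative).
D = [
    (np.array([3.0, 0.0, 1.0]), "sports"),
    (np.array([0.0, 4.0, 1.0]), "finance"),
    (np.array([2.0, 1.0, 0.0]), "sports"),
]

def nearest_neighbor(x):
    """Assign x the category of the most similar example in D."""
    sims = [(cosine(x, vec), label) for vec, label in D]
    return max(sims)[1]

x = np.array([1.0, 0.0, 0.0])
print(nearest_neighbor(x))   # "sports"
```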
4. Category Scoring with Weighted Sum
- The score for a category is the sum of the similarity scores between the point to be classified and all of its k nearest neighbors that belong to the given category.
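A minimal sketch of this weighted-sum scoring over the k nearest neighbors; the cosine similarity measure and toy data are illustrative assumptions.

```python
import numpy as np
from collections import defaultdict

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# Toy labelled examples (illustrative term-count vectors).
D = [
    (np.array([3.0, 0.0, 1.0]), "sports"),
    (np.array([2.0, 1.0, 0.0]), "sports"),
    (np.array([0.0, 4.0, 1.0]), "finance"),
    (np.array([0.0, 2.0, 3.0]), "finance"),
]

def knn_weighted(x, k=3):
    """Score each category by summing the similarities of its members among the k nearest neighbors."""
    neighbors = sorted(((cosine(x, vec), label) for vec, label in D), reverse=True)[:k]
    scores = defaultdict(float)
    for sim, label in neighbors:
        scores[label] += sim
    return max(scores, key=scores.get)

x = np.array([1.0, 1.0, 0.0])
print(knn_weighted(x))   # category with the highest weighted-sum score
```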