Translation Memory Similarity Metrics
Percent Match

Weighted Percent Match
Weighted percent match (WPM) uses inverse document frequency (IDF) as a proxy for weighting words by how much value their translations are expected to provide to translators.
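As a rough illustration of IDF weighting, the sketch below scores a candidate by the share of the workload sentence's IDF mass that it covers. The exact WPM formula is not reproduced in these notes, so the function names (`idf_table`, `weighted_percent_match`), the whitespace tokenization, the `log(N / df)` form of IDF, and the `default_idf` handling of unseen words are all assumptions made for the example.

```python
import math


def idf_table(tm_sentences):
    """Map each word to log(N / df), where df counts TM sentences containing it."""
    n = len(tm_sentences)
    df = {}
    for sent in tm_sentences:
        for word in set(sent.split()):
            df[word] = df.get(word, 0) + 1
    return {word: math.log(n / count) for word, count in df.items()}


def weighted_percent_match(workload, candidate, idf, default_idf=0.0):
    """Share of the workload's IDF mass covered by the candidate (assumed form).

    default_idf is a hypothetical knob for words absent from the TM bank;
    0.0 simply ignores them.
    """
    m_words = set(workload.split())
    c_words = set(candidate.split())
    total = sum(idf.get(w, default_idf) for w in m_words)
    if total == 0:
        return 0.0
    matched = sum(idf.get(w, default_idf) for w in m_words & c_words)
    return matched / total


tm_bank = ["the cat sat", "the dog ran", "the bird flew"]
idf = idf_table(tm_bank)
# "the" occurs in every TM sentence, so its IDF is 0 and matching it earns nothing.
print(weighted_percent_match("the cat ran", "the cat sat", idf))
print(weighted_percent_match("the cat ran", "the bird flew", idf))
```

Under plain percent match the second candidate would still get credit for sharing "the"; with IDF weighting that match is worth nothing.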

Edit Distance
Thus, a TM metric that matches sentences on more than just (weighted) percentage coverage of lexical items can be expected to perform better for TM bank evaluation and retrieval.

N-Gram Precision
Although ED takes context into account, it does not emphasize local context in matching certain high-value words and phrases as much as metrics that capture n-gram precision between the MTBT workload sentence and candidate source-side sentences from the TMB.
Perhaps the most important is that TM fuzzy matching has to be able to operate at a sentence-to-sentence level, whereas automated MT evaluation metrics such as BLEU are intended to operate over a whole corpus.

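The choice of Z is a trade-off between recall and precision: setting Z to 1 tends to favor longer sentences (emphasizing recall), while setting it to 0 tends to favor shorter sentences (emphasizing precision). The authors' final implementation uses a value of 0.75.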
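To make the effect of this setting concrete, here is a minimal sketch of a single n-gram precision term. These notes do not give the normalizer's exact form, so the sketch assumes it interpolates linearly between the workload's n-gram count (weight z) and the candidate's n-gram count (weight 1 - z). The names `ngram_counts` and `ngram_precision_n` are hypothetical.

```python
from collections import Counter


def ngram_counts(tokens, n):
    """Count the n-grams of a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def ngram_precision_n(workload, candidate, n, z=0.75):
    """Clipped n-gram overlap divided by an interpolated normalizer (assumed form)."""
    m = ngram_counts(workload.split(), n)
    c = ngram_counts(candidate.split(), n)
    # Clip each workload n-gram's credit at its count in the candidate.
    overlap = sum(min(count, c[gram]) for gram, count in m.items())
    # z = 1 normalizes by the workload size (recall-like);
    # z = 0 normalizes by the candidate size (precision-like).
    norm = z * sum(m.values()) + (1 - z) * sum(c.values())
    return overlap / norm if norm else 0.0


workload = "a b c d"
short_cand = "a b"
long_cand = "a b c d e f g h"
for z in (1.0, 0.0, 0.75):
    print(z,
          ngram_precision_n(workload, short_cand, 1, z),
          ngram_precision_n(workload, long_cand, 1, z))
```

With z = 1 the long candidate scores higher because it covers more of the workload sentence. With z = 0 the short candidate scores higher because none of its words are wasted.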
Weighted N-Gram Precision

Modified Weighted N-Gram Precision
Note that in Equation 6 each wp_n contributes equally to the average. Modified Weighted N-Gram Precision (MWNGP) improves on WNGP by weighting the contribution of each wp_n so that shorter n-grams contribute more than longer n-grams. The intuition is that for TM settings, getting more high-value shorter n-gram matches at the expense of fewer longer n-gram matches will be more helpful, since translators will get relatively more assistance from seeing new high-value vocabulary.
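The difference between the two averaging schemes can be sketched as follows. The notes do not state the weights MWNGP actually uses, so the geometric decay below (each n-gram order weighted half as much as the previous one, normalized to sum to 1) is an assumption, as are the function names and the `decay` parameter.

```python
def uniform_average(wp):
    """Every wp_n contributes equally, as described for WNGP."""
    return sum(wp) / len(wp)


def short_biased_average(wp, decay=0.5):
    """Weight wp_n by decay**n so shorter n-grams count more (assumed scheme)."""
    weights = [decay ** n for n in range(1, len(wp) + 1)]
    total = sum(weights)
    return sum(w * p for w, p in zip(weights, wp)) / total


# wp[0] is the unigram score, wp[1] the bigram score, and so on.
strong_on_short = [0.8, 0.4, 0.2, 0.0]
strong_on_long = [0.0, 0.2, 0.4, 0.8]

print(uniform_average(strong_on_short), uniform_average(strong_on_long))
print(short_biased_average(strong_on_short), short_biased_average(strong_on_long))
```

The two candidates tie under the uniform average, but the short-biased average ranks the candidate with stronger unigram and bigram matches first.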
