For work, I recently implemented the commonly used string similarity algorithms in Python, thirteen in all. They are:
- Levenshtein
- NormalizedLevenshtein
- WeightedLevenshtein
- DamerauLevenshtein
- OptimalStringAlignment
- JaroWinkler
- LongestCommonSubsequence
- MetricLongestCommonSubsequence
- NGram
- QGram
- Cosine
- Jaccard
- SorensenDice
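
To give a feel for what a few of these measures compute, here is a minimal standalone sketch of three of them: Levenshtein distance, its normalized form, and Jaccard similarity over character k-grams. It is not the code from the repository. The function names (`levenshtein`, `normalized_levenshtein`, `jaccard_similarity`) and the default `k=2` are assumptions made for illustration only.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance with unit-cost insertion, deletion, and substitution."""
    if a == b:
        return 0
    if not a:
        return len(b)
    if not b:
        return len(a)
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def normalized_levenshtein(a: str, b: str) -> float:
    """Levenshtein distance divided by the longer length, in [0, 1]."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 0.0
    return levenshtein(a, b) / longest


def jaccard_similarity(a: str, b: str, k: int = 2) -> float:
    """Jaccard index of the two sets of character k-grams (hypothetical default k=2)."""
    grams_a = {a[i:i + k] for i in range(len(a) - k + 1)}
    grams_b = {b[i:i + k] for i in range(len(b) - k + 1)}
    union = grams_a | grams_b
    if not union:
        return 1.0  # neither string yields a k-gram; treat as identical
    return len(grams_a & grams_b) / len(union)


if __name__ == "__main__":
    print(levenshtein("kitten", "sitting"))
    print(normalized_levenshtein("kitten", "sitting"))
    print(jaccard_similarity("night", "nacht"))
```

The distance function keeps only two rows of the dynamic programming table, so memory grows with the length of the second string rather than with the product of both lengths.
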
For details, see GitHub: luozhouyang/python-string-similarity