KNN Study Notes
KNN (k-nearest neighbors) is a classification algorithm. It is instance-based and a lazy learner: it builds no model in advance and defers all computation until an unknown instance has to be classified.
Algorithm steps:
1. To classify an unknown instance, use all known (labeled) instances as references.
2. Choose K as the parameter, compute the distance between the unknown instance and every known instance, and select the K nearest known instances.
3. By majority vote, assign the unknown instance to the category that occurs most often among those K nearest instances.
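The three steps above can be sketched in a few lines of Python. This is only a minimal illustration, assuming Euclidean distance and plain tuples as instances; the function name knn_classify and the toy data are made up for this note, and ties in the vote are not handled specially.

```python
import math
from collections import Counter

def knn_classify(query, instances, labels, k=3):
    """Classify `query` by majority vote among its k nearest known instances."""
    # Step 2: distance from the unknown instance to every known instance.
    distances = [
        (math.dist(query, inst), label)  # math.dist is the Euclidean distance
        for inst, label in zip(instances, labels)
    ]
    # Keep only the k closest known instances.
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    # Step 3: majority vote over the labels of those k neighbours.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical toy data: two well-separated categories in 2-D.
instances = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
labels = ["A", "A", "A", "B", "B", "B"]

print(knn_classify((1.1, 1.0), instances, labels, k=3))
print(knn_classify((5.1, 5.0), instances, labels, k=3))
```

Note that nothing happens before the query arrives: the "training" is just keeping the instances around, which is what lazy learning means here.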
Details:
About distance: how is the distance between two instances measured?
Euclidean distance:
E(x, y) = sqrt((x1 - y1)^2 + (x2 - y2)^2 + ... + (xN - yN)^2)
E(x, y) is the distance between points x and y in N-dimensional space.
Other measures: cosine similarity, correlation, Manhattan distance.
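For comparison, here is a rough sketch of these measures in plain Python. The function names are my own, and turning cosine similarity and correlation into distances as "1 - similarity" is one common convention, not the only one. Vectors are assumed to have equal length and non-zero norm.

```python
import math

def euclidean(x, y):
    # Square root of the summed squared coordinate differences.
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def manhattan(x, y):
    # Sum of absolute coordinate differences.
    return sum(abs(a - b) for a, b in zip(x, y))

def cosine_distance(x, y):
    # 1 - cosine similarity; fails with ZeroDivisionError on a zero vector.
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return 1.0 - dot / (norm_x * norm_y)

def correlation_distance(x, y):
    # 1 - Pearson correlation, computed as cosine distance of mean-centred vectors.
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    return cosine_distance([a - mean_x for a in x], [b - mean_y for b in y])

print(euclidean((0, 0), (3, 4)))
print(manhattan((0, 0), (3, 4)))
print(cosine_distance((1, 0), (0, 1)))
print(correlation_distance((1, 2, 3), (3, 2, 1)))
```

Euclidean and Manhattan distance depend on the magnitude of the vectors, while the cosine and correlation variants only look at direction, so the choice can change which neighbours count as "nearest".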
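Disadvantages:
1. High memory cost: all known instances must be stored.
2. High computational complexity: classifying one instance requires computing its distance to every known instance.
3. Sensitive to class imbalance: if one category has far more instances than the others, a new instance is more likely to be assigned to that category. (Mitigation: weight each neighbor's vote by its distance, so closer neighbors count more.)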
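Weighting votes by distance can be sketched as below. This assumes inverse-distance weights (1 / distance), which is one common choice rather than the only one; the function name weighted_knn_classify, the eps parameter, and the toy data are invented for illustration.

```python
import math
from collections import defaultdict

def weighted_knn_classify(query, instances, labels, k=3, eps=1e-9):
    """Classify `query` with votes weighted by inverse Euclidean distance."""
    neighbours = sorted(
        ((math.dist(query, inst), label) for inst, label in zip(instances, labels)),
        key=lambda pair: pair[0],
    )[:k]
    scores = defaultdict(float)
    for distance, label in neighbours:
        # Closer neighbours contribute more; eps avoids division by zero.
        scores[label] += 1.0 / (distance + eps)
    return max(scores, key=scores.get)

# Hypothetical imbalanced data: one "B" instance, four "A" instances.
instances = [(0.0, 0.0), (2.0, 0.0), (2.1, 0.0), (1.9, 0.0), (2.2, 0.0)]
labels = ["B", "A", "A", "A", "A"]

# The query sits right next to the single "B" instance.
print(weighted_knn_classify((0.1, 0.0), instances, labels, k=3))
```

With k=3, a plain majority vote on this data would pick "A" (two of the three nearest neighbours), even though the query is almost on top of the "B" instance. The weighted vote picks "B" because that neighbour is much closer.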