Label propagation is a semi-supervised machine learning algorithm that assigns labels to previously unlabeled data points. At the start of the algorithm, a subset of the data points have labels. These labels are propagated to the unlabeled points over the course of the algorithm. Compared with other algorithms, label propagation has advantages in its running time and in the amount of a priori information needed about the network structure (no parameter has to be known beforehand). Its disadvantage is that it does not produce a unique solution, but an aggregate of many solutions.
Properties:
1. Transductive classification exploiting homophily
2. Uses no node features, only the structure of the graph
3. Computes a probability distribution over class labels for each unlabeled node
4. Semi-supervised machine learning algorithm
5. Iterative algorithm
6. Used for finding communities in a network.
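To illustrate properties 1 to 3, here is a minimal sketch of one common semi-supervised variant: each unlabeled node repeatedly replaces its class distribution with the average of its neighbours' distributions, while the initially labeled nodes stay fixed. The function name `propagate_distributions`, its parameters, and the fixed iteration count are assumptions made for this example and do not come from the text above.

```python
def propagate_distributions(adjacency, seeds, classes, iterations=50):
    """Spread class distributions from labeled seed nodes over a graph.

    adjacency: dict mapping each node to a list of its neighbours.
    seeds: dict mapping the initially labeled nodes to their class.
    classes: list of all class labels.
    Returns a dict mapping each node to a {class: probability} dict.
    """
    uniform = {c: 1.0 / len(classes) for c in classes}
    dist = {}
    for node in adjacency:
        if node in seeds:
            # Labeled nodes put all probability mass on their known class.
            dist[node] = {c: 1.0 if c == seeds[node] else 0.0 for c in classes}
        else:
            # Unlabeled nodes start with no preference.
            dist[node] = dict(uniform)

    for _ in range(iterations):
        new_dist = {}
        for node, neighbours in adjacency.items():
            if node in seeds or not neighbours:
                # Seeds stay fixed; isolated nodes have nothing to average.
                new_dist[node] = dist[node]
                continue
            # Homophily assumption: a node resembles its neighbours, so it
            # takes the average of their current distributions.
            new_dist[node] = {
                c: sum(dist[nb][c] for nb in neighbours) / len(neighbours)
                for c in classes
            }
        dist = new_dist
    return dist


# Usage: a path a - b - c - d with the two end nodes labeled.
path = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
result = propagate_distributions(path, {"a": "red", "d": "blue"}, ["red", "blue"])
print(result["b"], result["c"])
```

On this path, node b sits closer to the "red" seed and ends up leaning red, while node c leans blue. No node features are used at any point; only the edges and the seed labels drive the result.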
Steps:
1. Initialize the labels at all nodes in the network. For a given node x, $C_x(0) = x$.
2. Set t = 1.
3. Arrange the nodes in the network in a random order and call this ordering X.
4. For each x ∈ X, taken in that order, let $C_x(t) = f(C_{x_{i1}}(t), \ldots, C_{x_{im}}(t), C_{x_{i(m+1)}}(t-1), \ldots, C_{x_{ik}}(t-1))$. Here $x_{i1}, \ldots, x_{im}$ are the neighbours of x that have already been updated in the current iteration, $x_{i(m+1)}, \ldots, x_{ik}$ are the neighbours that have not yet been updated, and f returns the label occurring with the highest frequency among the neighbours. If several labels tie for the highest frequency, select one of them at random.
5. If every node has a label that the maximum number of its neighbours have, stop the algorithm. Otherwise, set t = t + 1 and go to (3).
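The steps above can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the function names `label_propagation` and `is_stable`, the adjacency-dict graph representation, and the `max_iterations` safety cap are assumptions added for this example.

```python
import random
from collections import Counter


def is_stable(node, adjacency, labels):
    """True if the node's label is among the most frequent neighbour labels."""
    counts = Counter(labels[nb] for nb in adjacency[node])
    if not counts:
        return True  # an isolated node has nothing to compare against
    return counts[labels[node]] == max(counts.values())


def label_propagation(adjacency, seed=None, max_iterations=100):
    """Asynchronous label propagation on an undirected graph.

    adjacency: dict mapping each node to a list of its neighbours.
    Returns a dict mapping each node to its final label.
    """
    rng = random.Random(seed)
    # Step 1: every node starts with its own unique label, C_x(0) = x.
    labels = {node: node for node in adjacency}
    nodes = list(adjacency)

    # Steps 2 and 5: the loop variable plays the role of t.
    # max_iterations is an added safety cap, not part of the steps above.
    for _ in range(max_iterations):
        # Step 3: visit the nodes in a fresh random order.
        rng.shuffle(nodes)
        # Step 4: labels are updated in place, so nodes visited later in
        # this pass already see the new labels of nodes visited earlier.
        for node in nodes:
            counts = Counter(labels[nb] for nb in adjacency[node])
            if not counts:
                continue  # isolated node keeps its own label
            best = max(counts.values())
            candidates = [lab for lab, c in counts.items() if c == best]
            labels[node] = rng.choice(candidates)  # ties broken at random
        # Step 5: stop once every node carries a most frequent label.
        if all(is_stable(node, adjacency, labels) for node in nodes):
            break
    return labels


# Usage: two triangles joined by the single edge 2 - 3.
graph = {
    0: [1, 2], 1: [0, 2], 2: [0, 1, 3],
    3: [2, 4, 5], 4: [3, 5], 5: [3, 4],
}
print(label_propagation(graph, seed=0))
```

Because the visiting order and the tie-breaking are random, different seeds can yield different partitions of the same graph, which matches the remark that the algorithm produces no unique solution.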