Semi-supervised Learning(II) Graph

Posted by Kaiyuan Chen on May 18, 2018

Semi-supervised Learning(II): Graph

Continued(Weight Matrix and Graph)

At initial condition, the nodes carry a label that denotes the community they belong to. Membership in a community changes based on the labels that the neighboring nodes possess. This change is subject to the maximum number of labels within one degree of the nodes. Every node is initialized with a unique label, then the labels diffuse through the network. Consequently, densely connected groups reach a common label quickly. When many such dense (consensus) groups are created throughout the network, they continue to expand outwards until it is impossible to do so.

Example from [1]: MNIST

First thing to note is that Euclidean distance is a good measure of similarity of graphs in this case. Adjacent pairs are similar to each other: if they are not directly connected, label can still propagate to there.

  • additional introduction about [1]: it connects Gaussian process/Gaussian Random Field to label propagation.

Graph Format

  • fully connected graph
  • sparse graph: connection between dissimilar nodes are removed; weight learning makes optimization awkward
  • e-NN graph: a ball with radius e
    • parametric and sensitive to e selected
    • may have disconnected components
  • k-NN graph: “adaptive scales”
    • not scalable
    • graph is not regular(as results may be asymmetric and irregular)

Community

A community in a network is a group of nodes that are similar to each other and dissimilar from the rest of the network. Community detection, which is similar to network partition is NP complete.

References

A good thesis that describes everything in ssl: http://pages.cs.wisc.edu/~jerryzhu/pub/thesis.pdf

A good survey about recent works http://pages.cs.wisc.edu/~jerryzhu/pub/ssl_survey.pdf