Author: Aaron Brooks / @scalefreegan
You can follow along on
http://scalefreegan.github.io/Teaching/DataIntegration
Clustering: a way to group elements that are more similar to each other than they are to everything else
What does it mean to be more similar? Need to define a metric
A nonnegative function g(x,y) describing the "distance" between neighboring points for a given set
$g(x,y) \ge 0$
$g(x,y) = g(y,x)$
$g(x,y) = 0$ iff $x = y$
$g(x,y) + g(y,z) \ge g(x,z)$
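The four conditions above can be checked numerically on a small sample of points. The sketch below is only an illustration: the helper names (`euclidean`, `squared_euclidean`, `is_metric`) and the sample points are made up for this example, and passing the check on a finite sample does not prove that a function is a metric in general.

```python
import itertools
import math


def euclidean(x, y):
    """Euclidean distance between two equal-length tuples."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def squared_euclidean(x, y):
    """Squared Euclidean distance (a common dissimilarity, but not a metric)."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


def is_metric(g, points, tol=1e-12):
    """Check the four metric conditions for g on a finite sample of points."""
    for x, y in itertools.product(points, repeat=2):
        if g(x, y) < 0:                        # g(x,y) >= 0
            return False
        if abs(g(x, y) - g(y, x)) > tol:       # g(x,y) = g(y,x)
            return False
        if (g(x, y) <= tol) != (x == y):       # g(x,y) = 0 iff x = y
            return False
    for x, y, z in itertools.product(points, repeat=3):
        if g(x, y) + g(y, z) < g(x, z) - tol:  # triangle inequality
            return False
    return True


# Hypothetical sample points, chosen so that three of them are collinear.
points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (0.0, 3.0)]
print(is_metric(euclidean, points))          # True
print(is_metric(squared_euclidean, points))  # False
```

Squared Euclidean distance fails on the collinear points (0,0), (1,0), (2,0): 1 + 1 is smaller than 4, so the triangle inequality does not hold.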
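Minimize $\sum_{k=1}^{K} \sum_{x_n \in C_k} \lVert x_n - \mu_k \rVert^2$ with respect to $C_k$ and $\mu_k$
Given $k$ and initial cluster centers, $\mu_k$
Repeat (1) and (2) until $\sum_{k=1}^{K} \lvert \mu_k^{t+1} - \mu_k^{t} \rvert \le \epsilon$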
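A minimal sketch of this iteration in plain Python follows. It assumes that steps (1) and (2) are the usual k-means steps, namely assigning each point to its nearest center and then moving each center to the mean of its assigned points. The function names, the toy data and the handling of empty clusters are choices made for this example only.

```python
import math


def dist(x, y):
    """Euclidean distance between two equal-length tuples."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def kmeans(points, centers, eps=1e-9, max_iter=100):
    """k-means from given initial centers; returns (centers, labels)."""
    centers = [tuple(c) for c in centers]
    labels = []
    for _ in range(max_iter):
        # Assumed step (1): assign each point to its nearest center.
        labels = [min(range(len(centers)), key=lambda k: dist(p, centers[k]))
                  for p in points]
        # Assumed step (2): move each center to the mean of its members.
        new_centers = []
        for k, old in enumerate(centers):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                new_centers.append(
                    tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centers.append(old)  # leave an empty cluster's center in place
        # Stopping rule: total movement of the centers is at most eps.
        shift = sum(dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift <= eps:
            break
    return centers, labels


def objective(points, centers, labels):
    """Within-cluster sum of squared distances (the quantity being minimized)."""
    return sum(dist(p, centers[k]) ** 2 for p, k in zip(points, labels))


# Toy data: two well-separated groups of three points each.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centers, labels = kmeans(points, centers=[(0, 0), (10, 10)])
print(labels)  # [0, 0, 0, 1, 1, 1]
print(centers, objective(points, centers, labels))
```

The result depends on the initial centers, so in practice the procedure is often run from several starting points and the solution with the smallest objective is kept.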
More detail at The Data Science Lab
What if your data looks like this?
Some advantages
von Luxburg (2007). A Tutorial on Spectral Clustering
1. Compute the similarity matrix, S (e.g. with a kernel function such as RBF)
2. Calculate the affinity matrix, A, from S (e.g. k-nearest neighbors algorithm)
3. Calculate a graph Laplacian, L
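$L = D - W$, where $W$ is the weighted adjacency matrix (the affinity matrix A from step 2) and $D$ is the diagonal degree matrix with $D_{ii} = \sum_j W_{ij}$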
Wikipedia: Laplacian matrix
4. Perform k-means clustering on the matrix Z whose columns are the eigenvectors for the k smallest eigenvalues of L
Eigenvalue decomposition ⇨ Fiedler vector (the eigenvector for the second-smallest eigenvalue of L)
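The four steps can be sketched for the two-cluster case, where the sign of the Fiedler vector is enough to split the data. This is a rough illustration under several assumptions: the function names, the toy points, `sigma` and `k` are invented for the example, and the eigenvector is approximated with a simple power iteration instead of a full eigenvalue decomposition (a library routine such as `numpy.linalg.eigh` would be the more usual choice). For more than two clusters, step 4 would run k-means on several eigenvectors instead.

```python
import math
import random


def rbf_similarity(points, sigma=1.0):
    """Step 1: similarity matrix S from a Gaussian (RBF) kernel."""
    n = len(points)
    return [[math.exp(-sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
                      / (2 * sigma ** 2))
             for j in range(n)] for i in range(n)]


def knn_affinity(S, k):
    """Step 2: keep S[i][j] when j is one of the k most similar points to i."""
    n = len(S)
    A = [[0.0] * n for _ in range(n)]
    for i in range(n):
        neighbors = sorted((j for j in range(n) if j != i),
                           key=lambda j: S[i][j], reverse=True)[:k]
        for j in neighbors:
            A[i][j] = A[j][i] = S[i][j]  # symmetrize the graph
    return A


def laplacian(A):
    """Step 3: unnormalized graph Laplacian, degree matrix minus affinity."""
    n = len(A)
    return [[(sum(A[i]) if i == j else 0.0) - A[i][j] for j in range(n)]
            for i in range(n)]


def center_and_normalize(v):
    """Remove the constant component of v and scale it to unit length."""
    mean = sum(v) / len(v)
    v = [x - mean for x in v]
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def fiedler_vector(L, iters=500, seed=0):
    """Approximate the eigenvector for the second-smallest eigenvalue of L.

    Power iteration on (c*I - L), projecting out the constant vector
    (the eigenvector for eigenvalue 0) at every step.
    """
    n = len(L)
    c = 2 * max(L[i][i] for i in range(n))  # upper bound on the eigenvalues of L
    rng = random.Random(seed)
    v = [rng.random() for _ in range(n)]
    for _ in range(iters):
        v = center_and_normalize(v)
        v = [c * v[i] - sum(L[i][j] * v[j] for j in range(n)) for i in range(n)]
    return center_and_normalize(v)


# Toy data: two groups of three points each.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
S = rbf_similarity(points, sigma=5.0)
A = knn_affinity(S, k=3)
L = laplacian(A)
f = fiedler_vector(L)
# Step 4, two-cluster special case: split on the sign of the Fiedler vector.
labels = [1 if x > 0 else 0 for x in f]
print(labels)
```

Which group receives label 0 or 1 is arbitrary, because an eigenvector is only defined up to sign; only the partition itself is meaningful.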
Ahn et al. (2010). Link communities reveal multiscale complexity in networks
Ahn et al. (2011). Flavor network and the principles of food pairing
Brooks and Reiss et al (2014). A system‐level model for the microbial regulatory genome
Selected from their relationship to 120 genes involved in mitosis, DNA mismatch repair, and BMP signaling
For more info: Practical 1