Saturday, January 28, 2012

Clique Relaxations in Social Network Analysis

          Given large amount of data provided by the Web 2.0, there is a pressing need to obtain new measures to better understand the networks’ structure, how their components are organized and the way they evolve over time. The representation of a social network was quite influenced by graph theory. In the social networks the set of vertices (or nodes) correspond to the “actors” (i.e.people, companies, social actors) and the set of edges corresponds to the “ties” (i.e.relationships, associations, links). 
          When the number of vertices and edges increases the visualization becomes incomprehensible. However, the visualization of a small number of vertices can be easily performed in a graph. For this a new graph mining approach based on k-cliques can be used. The concept of relaxed clique can extended to the whole graph, to achieve a general view, by covering the network with k-cliques.

          To understand a clique, we need to first learn about the subgraph. Given an undirected graph G= (V, E) where V denotes the set of vertices and E the set of edges, the graph G= (V1, E1) is called a sub-graph of G if V1 is a subset of V, E1 is a subset of E and for every edge (vi, vj) from E1 the vertices vi, vj belong to V1. A sub-graph G1 is said to be complete if there is an edge for each pair of vertices. A complete sub-graph is also called a clique.        

Cliques with 1, 2, 3, 4, 5 and 6 vertices

          The clique structure, where there must be an edge for each pair of vertices, shows many restrictions in real life modeling. So, alternative approaches were suggested in order to relax the clique concept, such as k-clique, k-clan/k-club and k-plex. 

          A k-clique is a subset of vertices C such that, for every i, j belonging to C, the distance d(i, j) <= k. The 1-clique is identical to a clique, because the distance between the vertices is one edge. The 2-clique is the maximal complete sub-graph with a path length of one or two edges. The path distance two can be exemplified by the “friend of a friend” connection in social
relationships. In social websites like the LinkedIn, each member can reach his own connections and also the two and three degrees away. The increase of the value k corresponds to a gradual relaxation of the criterion of clique membership.

k- clique examples   

           A limitation of the k-clique model is that some vertices may be distant from the group, i.e. the distance, between two nodes, may correspond to a path involving nodes that do not belong to the k-clique. To overcome this handicap diameter based models called k-club and k-clan were introduced.

k-clan and k-club
         Let's first learn few definitions,
                 d(u,v)        :  The length of the shortest path between vertices u and v in G 
                 diam(G)   :  The diameter of G max d(u, v) 
         To find all k-clan, firstly all the k-cliques Si must be found and then the restriction diam(G[S]) <= k applied to remove the undesired k-cliques. 

          In the figure to the right, the 2-clique {1,2,3,4,5} was removed because  d(4,5)=3.

          Another approach to the diameter models is the k-club which is defined as a subset of vertices S such that diam(G[S]) <= k. In the following figure two 2-cliques: {1,2,3,4,5} and {2,3,4,5,6}, one 2-clan: {2,3,4,5,6} and three 2-clubs: {1,2,3,4}, {1,2,3,5} and {2,3,4,5,6} can be found.


          An alternative way of relaxing a clique is the k-plex concept which takes into account the vertices degree. The degree of a vertex of a graph is the number of edges incident on the vertex, and is denoted by deg(v). The maximum degree of a graph G is the maximum degree of its vertices and is denoted by. Similarly, minimum degree is the minimum degree of its vertices. Asubset of vertices S is said to be a k-plex if the minimum degree in the induced subgraph >= |S| -k.
           In figure given below, the graph has 6 vertices, |S|=6, and the degree of vertices 1, 3, 4 and 5 does not exceed the value 3. So, the minimum degree in the induced sub-graph is 3. Thus, for |S|=6, k=3 is obtained.

No comments:

Post a Comment