Wednesday, October 24, 2012

Social Network Intro to Web Science 100 Points

You are to write a program in Python which uses the igraph library to read a social network given in GraphML format and display the following information: 1. A brief summary of the graph (number of nodes and edges, number of components, the diameter, density, and average path length). 2. The five friends with the highest clustering coefficients. 3. The top five friends based on betweenness centrality (node betweenness). 4. The edges that would be removed using the Girvan-Newman algorithm when splitting the given graph with N components into N+5 components. If the graph on the right were used as input, this is the output your program would produce: 8

nodes, 9 edges, undirected Number of components: 1 Diameter: 4 Density: 0.3214 Average path length: 2.0714 Top 5 clustering coefficients: 1 (0, 'A', 1.0) 2 (1, 'B', 1.0) 3 (2, 'C', 0.16666666666666666) 4 (3, 'D', 0) 5 (4, 'E', 0) Top 5 friends: 1 (2, 'C', 11.0) 2 (3, 'D', 9.0) 3 (5, 'F', 7.0) 4 (6, 'G', 3.0) 5 (0, 'A', 0.0) Girvan-Newman algorithm Number of clusters = 1 Deleting edge C - D with betweenness 10.0 Number of clusters = 1 Deleting edge F - G with betweenness 16.0 Number of clusters = 2 Deleting edge D - F with betweenness 4.0 Number of clusters = 3 Deleting edge C - G with betweenness 3.0 Number of clusters = 4 Deleting edge A - B with betweenness 1.0 Number of clusters = 4 Deleting edge A - C with betweenness 2.0 Number of clusters = 5 Deleting edge B - C with betweenness 1.0 Number of clusters = 6 A C E D H F G B Implementation Summary To display the summary discussed in part 1, there is a single igraph function which will produce all this information. Clustering Coefficients To display the clustering coefficients, there may be an igraph function which does this calculation for you, but I have not found it. You may want to post a question to the igraph-help mailing list at http://lists.nongnu.org/mailman/listinfo/igraph-help to inquire about access to this information, or you can write your own function to do the calculations. I manually computed the clustering coefficient by first getting all the friends of each node. Then I created a set and iterated through every possible friend combination and added any edge ID that I found between friends. Since the data structure I used was a set, if I added the edge ID for A-C and then tried to add it a second time for edge C-A, it would only be in the set once. I used the number of items in the set as my numerator. For my denominator, I used the equation N!/(M!(N-M)!) which is the number of ways you can pick M distinct items from a collection of N items. We want to pick M=2 items from the collection of friends of that node, so N is the number of friends. When calculating the clustering coefficient for each node, I built a list of tuples that contained the node ID,...

Website: www.harding.edu | Filesize: -
No of Page(s): 3
Download Project 3: Social Network Intro to Web Science 100 Points You are to ....pdf

No comments:

Post a Comment