Update II: It occurred to me that it would be much better for people to be able to view the entire talk in a single video, rather than having to switch between sections; therefore, I uploaded the whole thing to Vimeo.
On August 6th I gave a talk at the New York City R Meetup on how to perform social network analysis in R using the igraph package. Below are the slides I covered during the talk, and all of the code examples from the presentation are available in the ZIA Code Repository in the R folder.
Below is a video of this talk, with a link to the slides I review during the presentation. If you are interested, I suggest downloading the slides and keeping them open while you follow along with the video, as much of what is on the screen in the video is hard to read.
Andrew Little’s presentation on econometrics in R using Zelig and MatchIt is also available on YouTube starting here. I hope you enjoy the presentation, and please let me know if you have any questions or comments.
Sorry, I will miss it but thanks for posting the slides.
[Reply]
Great talk. I like the plot of eigenvector vs. betweenness. I think the talk is more about igraph than R, and igraph has bindings for Python, R and Ruby. I was wondering if there are any major drawbacks to using graphviz for visualisation with, for example, an fdp layout.
[Reply]
plotti,
True, the talk does center on igraph, but I try to show how doing the analysis under the umbrella of R makes some things easier–like the evc vs. bet plot.
I think the major drawback to graphviz is the markup language itself. While you can make some very attractive plots with graphviz, for most people the amount of effort required is just not worth it.
[Reply]
I was wondering if you happen to know how to calculate the different k-cores in igraph. I am a little bit lost here. I was looking at the approach you proposed for networkx in http://www.drewconway.com/zia/?p=345 but since the result of a g.coreness(1) call is:
[0, 1, 0, true, 1, true, true, 1, true, 1, 1, true, true, 1, true, true, 1, true, true, true, true, true, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, true, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, .... ]
I am a bit confused. According to the manual, this function should give me: “This function calculates the coreness for each vertex. Return Value: Numeric vector of integer numbers giving the coreness of each vertex.” How can “true” be a correct value? Also, the returned values never go above 1.
Maybe you have an idea
Cheers
Thomas
[Reply]
plotti,
I go over some k-core analysis in the presentation above, so check out the slides for how to do it in R with igraph.
I am not familiar with G.coreness in the Python igraph package, but in NetworkX the process is very straightforward:
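import networkx as nx

# G is an existing networkx Graph.
# core_number (find_cores in older NetworkX releases) maps each node to its core number.
cores = nx.core_number(G)
# Find the 2-core: the subgraph induced by nodes whose core number is at least 2
core2 = G.subgraph([node for node, k in cores.items() if k > 1])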
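For anyone unsure what the coreness values mean, here is a minimal pure-Python sketch of the usual "peeling" idea: repeatedly remove the node of smallest remaining degree and record the largest degree seen so far. This is only an illustration, not the igraph or NetworkX implementation; the function name core_numbers and the toy graph are made up for the example.

```python
def core_numbers(adjacency):
    """Return {node: core number} for an undirected graph given as {node: set(neighbours)}."""
    # Track remaining degrees separately so the input graph is left untouched.
    degree = {node: len(neighbours) for node, neighbours in adjacency.items()}
    remaining = set(adjacency)
    core = {}
    k = 0
    while remaining:
        # Peel the node with the smallest remaining degree.
        node = min(remaining, key=degree.__getitem__)
        # The core number never decreases as peeling proceeds.
        k = max(k, degree[node])
        core[node] = k
        remaining.remove(node)
        for neighbour in adjacency[node]:
            if neighbour in remaining:
                degree[neighbour] -= 1
    return core


# Hypothetical toy graph: a triangle (a, b, c) with a pendant node d attached to c.
toy = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}
cores = core_numbers(toy)
# The 2-core is made up of the nodes whose core number is at least 2.
two_core = sorted(node for node, k in cores.items() if k > 1)
```

In this toy graph the triangle forms the 2-core and the pendant node has core number 1, so every value is a plain integer, which is what the manual quoted above describes.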
[Reply]
Drew, great comparison and intro presentation. I had a quick comment and question.
Comment – totally agree that the integration with other techniques in the analysis phase makes R a clear winner for statisticians and analysts.
Question – have you seen any applications in R that are running on millions of nodes and 100s of millions of edges?
thanks
Nick
nick@sonamine.com
[Reply]
Nick,
I have never tried, but I am told that igraph can handle such large networks. I know that NetworkX can, so if you find igraph to be slow I would recommend moving to NX and Python.
[Reply]
[...] a bit dry, but a very interesting presentation (from a super geeky perspective) on using the open source package R and some nice libraries for doing social network analysis of data se…. (SNA definition and R stats package if you haven’t heard of them [...]
[...] 1. An excellent presentation on social network analysis (in R and Python) by Drew Conway at the New York City R Meetup. [...]
Drew,
I think the reason why NetworkX seems so much faster in the maximal clique finding test is because NetworkX doesn’t search for the maximal cliques immediately – it returns a Python generator object instead that will generate the cliques one by one as you start iterating over it. That 1.27 seconds you have seen is the time required to build up some internal data structures that will be used later during the iteration.
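The behaviour described here can be seen in a small pure-Python sketch. It does not use NetworkX; find_subsets_lazily is a hypothetical stand-in for a clique finder, and the counter only exists to show when the work actually runs.

```python
import itertools

calls = []  # records each result at the moment it is actually computed


def find_subsets_lazily(items):
    """Hypothetical stand-in for a search routine that yields results one at a time."""
    for size in range(1, len(items) + 1):
        for subset in itertools.combinations(items, size):
            calls.append(subset)  # the "work" happens here, during iteration
            yield subset


gen = find_subsets_lazily(["a", "b", "c"])
work_after_call = len(calls)   # calling the function has computed nothing yet
first = next(gen)
work_after_first = len(calls)  # exactly one result has been computed
everything = [first] + list(gen)
work_after_all = len(calls)    # all non-empty subsets have now been computed
```

Timing only the call therefore measures setup rather than the search itself; a fairer benchmark would time a full pass over the generator, for example by wrapping it in list().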
[Reply]
Tamas, you’re quite right; thank you for pointing that out!
[Reply]
Hi Drew, just a quick question: do you know whether igraph is capable of intersecting/overlaying multiple networks of a single population, as Valdis Krebs did using InFlow? This is the link to Valdis’s network maps: http://www.orgnet.com/decisions.html
Thanks.
[Reply]
Drew Conway Reply:
February 11th, 2010 at 7:36 am
Sure, you would just have to give the various edge sets a different attribute via E(G)$attr <- some.val, and then plot the graph as in http://igraph.sourceforge.net/screenshots2.html#6
[Reply]
Thanks a lot, Drew!
[Reply]
[...] analysis based on data sets collected on Twitter see for example Drew Conway’s post “SNA in R Talk, Updated with [Better] Video”; for research based on Facebook data sets see “How to split up the US” by Pete [...]
[...] for maximal independent vertex sets on the complementary graph. As Drew Conway pointed it out in one of his SNA talks, this is a terribly slow approach for sparse graphs as it takes ages to build the complementary [...]