We describe a methodology for automatically identifying communities of practice from email logs within an organization. We use a betweenness centrality algorithm that can rapidly find communities within a graph representing information flows. We apply this algorithm to an email corpus of nearly one million messages collected over a two-month span, and show that the method is effective at identifying true communities, both formal and informal, within these scale-free graphs. This approach also enables the identification of leadership roles within the communities.

Email as Spectroscopy
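To make the idea of betweenness-based community finding concrete, the following is a minimal sketch, not the authors' implementation. It assumes an edge-removal scheme in the style of Girvan and Newman: compute edge betweenness, delete the highest-scoring edge, and repeat until the graph separates. The function names (`edge_betweenness`, `components`, `split_communities`), the `target` stopping parameter, and the toy email graph are all hypothetical; the abstract does not specify a stopping rule or how the graph is built from messages.

```python
from collections import defaultdict, deque


def edge_betweenness(adj):
    """Edge betweenness for an unweighted, undirected graph (Brandes-style)."""
    score = defaultdict(float)
    for s in adj:
        # Breadth-first search from s, counting shortest paths (sigma).
        sigma = {v: 0 for v in adj}
        sigma[s] = 1
        dist = {s: 0}
        preds = defaultdict(list)
        order = []
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate dependencies back from the farthest nodes.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                c = sigma[v] / sigma[w] * (1 + delta[w])
                score[frozenset((v, w))] += c
                delta[v] += c
    # Each unordered pair of endpoints was counted from both ends.
    return {edge: total / 2 for edge, total in score.items()}


def components(adj):
    """Connected components as a list of node sets."""
    seen, comps = set(), []
    for s in adj:
        if s in seen:
            continue
        comp, stack = set(), [s]
        while stack:
            v = stack.pop()
            if v in comp:
                continue
            comp.add(v)
            stack.extend(adj[v] - comp)
        seen |= comp
        comps.append(comp)
    return comps


def split_communities(edges, target):
    """Remove highest-betweenness edges until `target` components exist.

    `target` is an assumed, simplified stopping rule for illustration.
    """
    adj = defaultdict(set)
    for a, b in edges:
        if a != b:  # ignore self-addressed mail
            adj[a].add(b)
            adj[b].add(a)
    adj = dict(adj)
    while len(components(adj)) < target:
        scores = edge_betweenness(adj)
        if not scores:
            break  # no edges left to remove
        a, b = max(scores, key=scores.get)
        adj[a].discard(b)
        adj[b].discard(a)
    return components(adj)


# Hypothetical toy graph: an edge means two people exchanged email.
emails = [
    ("alice", "bob"), ("bob", "carol"), ("alice", "carol"),
    ("dave", "erin"), ("erin", "frank"), ("dave", "frank"),
    ("carol", "dave"),  # the only link between the two groups
]
print(split_communities(emails, target=2))
```

In this toy graph every shortest path between the two triangles crosses the `carol`-`dave` link, so that edge has the highest betweenness and is removed first. Recomputing betweenness from every node after each removal is expensive on large graphs, so a corpus of the size described would likely need sampling or another shortcut.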