People have long predicted the demise of traditional news media and the rise of citizen journalists. Various initiatives have tried to create new media outlets on the Web, on blogs, and on Twitter, powered by creative swarms of hobby journalists, but none of them has been a breakthrough success so far. Well, it turns out that such a citizen journalism Web site already exists: venerable old Wikipedia!
In a series of earlier projects we analyzed collaboration among Wikipedia authors creating new Wikipedia articles, for example by studying how they collaborate as COINs (Collaborative Innovation Networks) in different cultures (http://www.ickn.org/documents/COINS2010_Nemoto_Gloor.pdf).
In our current project we are creating a map based on Who-works-with-whom-on-Wikipedia (the “W5-map”). We build a semantic network of concepts by drawing a link between two Wikipedia articles if the same author has worked on both. The W5-map shows us which kinds of articles the swarm flocks to. By repeating this process for each month from January to November 2010, we can see how the W5-map changes over time.
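The link rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the project's actual code: the function name `build_w5_edges`, the `(editor, article)` input format, the `min_shared_editors` parameter, and the sample edits are all assumptions made for the example.

```python
from collections import defaultdict
from itertools import combinations


def build_w5_edges(edits, min_shared_editors=1):
    """Return {(article_a, article_b): number of editors who edited both}.

    `edits` is an iterable of (editor, article) pairs (assumed input format).
    Pairs with fewer than `min_shared_editors` common editors are dropped.
    """
    # Collect the set of articles each editor has touched.
    articles_by_editor = defaultdict(set)
    for editor, article in edits:
        articles_by_editor[editor].add(article)

    # Every pair of articles edited by the same person gains one shared editor.
    shared = defaultdict(int)
    for articles in articles_by_editor.values():
        for a, b in combinations(sorted(articles), 2):
            shared[(a, b)] += 1

    return {pair: n for pair, n in shared.items() if n >= min_shared_editors}


# Made-up edits, for illustration only.
sample_edits = [
    ("alice", "WikiLeaks"),
    ("alice", "Julian Assange"),
    ("bob", "WikiLeaks"),
    ("bob", "Julian Assange"),
    ("bob", "2010 Asian Games"),
]
edges = build_w5_edges(sample_edits)
```

Raising `min_shared_editors` keeps only the article pairs that several editors have in common, which thins out links created by a single very active editor.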
Because Wikipedia as a whole includes millions of articles, drawing a complete map in one step is impractical. Instead we employed a “snowball sampling” method, which lets us draw a partial map from a chosen start article or editor. For our first experiment, we used the article about “Wikipedia” as the starting point. We collected the top 10 editors of this article by number of edits, then gathered the top 10 articles of each editor. We repeated these steps recursively up to 3 degrees of separation from the starting point. Restricting the analysis to a certain period of time (e.g. one month starting Jan. 1, 2010) gives us a temporal W5 map from this starting point. Applying this process repeatedly, we calculated 11 one-month snapshots, from Jan. 2010 to Nov. 2010. Each node corresponds to a Wikipedia article. We draw an edge between articles A and B if at least 2 editors made edits to both article A and article B.
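The snowball sampling step could look roughly like the sketch below. It assumes that one "degree of separation" means one round of article to top editors to their top articles, and that edit counts for the chosen month are already available. The helpers `top_editors` and `top_articles` and the `edit_counts` table are hypothetical stand-ins for whatever data source is actually queried.

```python
def snowball_sample(start_article, top_editors, top_articles, depth=3, k=10):
    """Collect the articles and editors reached within `depth` rounds.

    top_editors(article, k) -> the k most active editors of an article
    top_articles(editor, k) -> the k articles an editor edited most
    Both callables are hypothetical and assumed to be limited to one month.
    """
    seen_articles = {start_article}
    seen_editors = set()
    frontier = [start_article]
    for _ in range(depth):
        next_frontier = []
        for article in frontier:
            for editor in top_editors(article, k):
                if editor in seen_editors:
                    continue  # this editor's articles were already expanded
                seen_editors.add(editor)
                for other in top_articles(editor, k):
                    if other not in seen_articles:
                        seen_articles.add(other)
                        next_frontier.append(other)
        frontier = next_frontier
    return seen_articles, seen_editors


# Made-up edit counts for one month: (editor, article) -> number of edits.
edit_counts = {
    ("alice", "Wikipedia"): 5,
    ("bob", "Wikipedia"): 3,
    ("alice", "WikiLeaks"): 2,
    ("carol", "WikiLeaks"): 4,
    ("carol", "Julian Assange"): 1,
}


def top_editors(article, k):
    rows = [(n, e) for (e, a), n in edit_counts.items() if a == article]
    return [e for n, e in sorted(rows, reverse=True)[:k]]


def top_articles(editor, k):
    rows = [(n, a) for (e, a), n in edit_counts.items() if e == editor]
    return [a for n, a in sorted(rows, reverse=True)[:k]]


articles, editors = snowball_sample("Wikipedia", top_editors, top_articles)
```

Running the same function once per month, with helpers that only count edits made in that month, would yield one article set per snapshot.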
The pictures below show our results. Each map was drawn with Gephi, and the size of each article title was determined by its undirected PageRank score in the W5 network. The major topics (based on PageRank score) for each month are listed below. Surprisingly, they reflect the major news item of each month:
Jan. 2010: 2010 Haiti earthquake
Feb. 2010: 2010 Winter Olympics
Mar. 2010: 2010 Polish Air Force Tu-154 crash
Apr. 2010: Telephone (song)
May 2010: Gaza flotilla raid
Jun. 2010: 2010 FIFA World Cup
Jul. 2010: 2010 FIFA World Cup
Aug. 2010: 2010 Israel-Lebanon border clash
Sep. 2010: 2010 Atlantic hurricane season
Oct. 2010: Copiapo mining accident
Nov. 2010: United States diplomatic cables leak
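To make the ranking step concrete, here is a small sketch of PageRank on an undirected graph, where every edge is treated as a link in both directions. It is a plain power iteration written for illustration and is not Gephi's implementation. The damping factor of 0.85, the iteration count, and the toy edge list are assumptions and do not reproduce the real data.

```python
def undirected_pagerank(edges, damping=0.85, iterations=100):
    """Approximate PageRank scores, treating each edge as bidirectional."""
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)

    n = len(neighbors)
    rank = {node: 1.0 / n for node in neighbors}
    for _ in range(iterations):
        new_rank = {}
        for node in neighbors:
            # Each neighbor splits its current score evenly among its links.
            incoming = sum(rank[nb] / len(neighbors[nb]) for nb in neighbors[node])
            new_rank[node] = (1.0 - damping) / n + damping * incoming
        rank = new_rank
    return rank


# Toy edges for illustration only, not the actual November 2010 network.
toy_edges = [
    ("United States diplomatic cables leak", "WikiLeaks"),
    ("United States diplomatic cables leak", "Julian Assange"),
    ("United States diplomatic cables leak", "Bombardment of Yeonpyeong"),
    ("WikiLeaks", "Julian Assange"),
]
scores = undirected_pagerank(toy_edges)
top_topic = max(scores, key=scores.get)
```

In this toy network the best-connected article ends up with the highest score, which is the sense in which the top PageRank article of a month is taken as that month's major topic.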
Furthermore, we can also find clusters of articles, representing a group of similar topics (e.g. a cluster on Lady Gaga or on WikiLeaks).
This means that groups of similarly minded Wikipedians tend to aggregate around a set of articles on a topic they are most interested in.
Looking at Nov. 2010, the United States diplomatic cables leak was strongly connected to WikiLeaks and Julian Assange, which makes perfect sense because both are part of the WikiLeaks dispute. Bombardment of Yeonpyeong had many edges from the WikiLeaks cluster but none from the 2010 Asian Games cluster, which suggests that the Wikipedians working on the Bombardment of Yeonpyeong are interested in the diplomatic problem rather than in Asian topics in general.
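A comparison of this kind amounts to counting the edges between one article and the members of each cluster. The sketch below shows one way to do that. The function name `edges_to_cluster`, the cluster memberships, and the edge list are made up for illustration and are not the measured network.

```python
def edges_to_cluster(article, cluster, edges):
    """Count the edges that join `article` to any member of `cluster`."""
    members = set(cluster)
    count = 0
    for a, b in edges:
        if (a == article and b in members) or (b == article and a in members):
            count += 1
    return count


# Hypothetical clusters and edges, for illustration only.
wikileaks_cluster = {"WikiLeaks", "Julian Assange"}
asian_games_cluster = {"2010 Asian Games", "Asian Games topic X"}
toy_edges = [
    ("Bombardment of Yeonpyeong", "WikiLeaks"),
    ("Julian Assange", "Bombardment of Yeonpyeong"),
    ("WikiLeaks", "Julian Assange"),
    ("2010 Asian Games", "Asian Games topic X"),
]

to_wikileaks = edges_to_cluster("Bombardment of Yeonpyeong", wikileaks_cluster, toy_edges)
to_asian_games = edges_to_cluster("Bombardment of Yeonpyeong", asian_games_cluster, toy_edges)
```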
Our preliminary investigation suggests that looking at Wikipedia through the W5 map might be a new way to identify the latest news. We find the news of the world even when we start from a neutral article such as the one about “Wikipedia”. The swarm of Wikipedians seems to be a perfect group of coolhunters and citizen journalists for reporting the latest news on politics, celebrities, and sports.