The February 2017 Wikimedia Research Showcase:
https://www.mediawiki.org/wiki/Wikimedia_Research/Showcase#February_2017
Wikipedia and the Urban-Rural Divide
By Isaac Johnson (GroupLens/University of Minnesota)
Wikipedia articles about places, OpenStreetMap features, and other
forms of peer-produced content have become critical sources of
geographic knowledge for humans and intelligent technologies. We explore
the effectiveness of the peer production model across the rural/urban
divide, a divide that has been shown to be an important factor in many
online social systems. We find that in Wikipedia (as well as
OpenStreetMap), peer-produced content about rural areas is of
systematically lower quality, less likely to have been produced by
contributors who focus on the local area, and more likely to have been
generated by automated software agents (i.e. “bots”). We continue to
explore and codify the systemic challenges inherent to characterizing
rural phenomena through peer production as well as discuss potential
solutions. (read more in this paper)
Wikipedia Navigation Vectors
By Ellery Wulczyn
In this project, we learned embeddings for Wikipedia articles and Wikidata items by applying Word2vec
models to a corpus of reading sessions. Although Word2vec models were
developed to learn word embeddings from a corpus of sentences, they can
be applied to any kind of sequential data. The learned embeddings have
the property that items with similar neighbors in the training corpus
have similar representations (as measured by the cosine similarity,
for example). Consequently, applying Word2vec to reading sessions
results in article embeddings, where articles that tend to be read in
close succession have similar representations. Since people usually
generate sequences of semantically related articles while reading, these
embeddings also capture semantic similarity between articles. (read more...)
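To make the idea concrete, here is a minimal sketch of how reading sessions can play the role of sentences. It is not the project's code and it does not train Word2vec. It builds the (target, context) pairs that a skip-gram model would see, and then, as a simple stand-in for learned embeddings, represents each article by counts of its neighbors, so that articles with similar neighbors get a high cosine similarity. The function names, the window size, and the sample sessions are all assumptions made for illustration.

```python
import math
from typing import Dict, List, Sequence, Tuple


def skipgram_pairs(sessions: Sequence[Sequence[str]],
                   window: int = 2) -> List[Tuple[str, str]]:
    """Return (target, context) pairs for articles at most `window` apart."""
    pairs = []
    for session in sessions:
        for i, target in enumerate(session):
            lo = max(0, i - window)
            hi = min(len(session), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    pairs.append((target, session[j]))
    return pairs


def context_vectors(sessions: Sequence[Sequence[str]],
                    window: int = 2) -> Dict[str, List[float]]:
    """Count-based stand-in for embeddings: one neighbor-count vector per article."""
    vocab = sorted({article for session in sessions for article in session})
    index = {article: i for i, article in enumerate(vocab)}
    vectors = {article: [0.0] * len(vocab) for article in vocab}
    for target, context in skipgram_pairs(sessions, window):
        vectors[target][index[context]] += 1.0
    return vectors


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between u and v; 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical reading sessions: each list is one reader's article sequence.
SAMPLE_SESSIONS = [
    ["Paris", "Eiffel Tower", "Louvre"],
    ["Lyon", "Eiffel Tower", "Louvre"],
    ["Python", "Guido van Rossum"],
]

if __name__ == "__main__":
    vecs = context_vectors(SAMPLE_SESSIONS)
    # "Paris" and "Lyon" share all their neighbors; "Python" shares none.
    print(cosine_similarity(vecs["Paris"], vecs["Lyon"]))
    print(cosine_similarity(vecs["Paris"], vecs["Python"]))
```

In this toy data, "Paris" and "Lyon" never appear in the same session, yet they end up close because they are read alongside the same articles. A trained Word2vec model would replace the count vectors with dense, low-dimensional ones learned from the same pairs.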