The Method to the Madness: The 2012 Presidential Election Twitter Corpus.

Publication Type:

Conference Paper


International Conference on Social Media & Society, London, England (2016)


Social media provides a rich environment for understanding social
connections, interactions and information sharing across many
aspects of society. The relative ease of access to social media data
through provision of APIs by the companies has led to a
significant number of studies that attempt to understand how
social media fits into society and how the public uses it for
discourse and information sharing. One of the existing gaps in
these studies is the lack of extensive description of the data
collection and processing methods. These gaps exist as a result of
word limits in existing publication venues and a lack of
appropriate publication venues to share this type of fundamental
research. The following paper provides extensive detail as to how
a 52 million corpus of Twitter data on the 2012 Presidential
Election in the United States was collected, parsed and analyzed.
This level of detail is imperative in studies of social media as
small choices in what data to collect can have a material effect on
the findings. In addition to the description of the methods, the
following paper provides a contribution to knowledge in
providing basic characteristics of one of the largest research
datasets of social media activity compiled to study political