Study of Online, Collective Knowledge, and Stories (SOCKS)
Stories are an essential part of how people comprehend, explain, predict, and seek to navigate the world. What are the fundamental kinds of stories? How do stories told through social networks influence our behavior? A powerful approach to quantifying the components of stories centers on enumerating the base units of n-grams—contiguous sequences of n words, including punctuation and other text elements in a body of writing—and how usages of and interactions between n-grams unfold over time.
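To make the n-gram definition concrete, here is a minimal Python sketch of counting n-grams in a handful of short texts. The tokenizer is an assumption made for illustration (lowercased words plus individual punctuation marks) and is not the project's own parser. The function names `tokenize`, `ngrams`, and `count_ngrams` are hypothetical.

```python
import re
from collections import Counter


def tokenize(text):
    """Split text into lowercase word and punctuation tokens.

    Assumption: a token is either a run of word characters (allowing
    internal apostrophes) or a single non-space, non-word character.
    """
    return re.findall(r"\w+(?:'\w+)*|[^\w\s]", text.lower())


def ngrams(tokens, n):
    """Return every contiguous sequence of n tokens, joined by spaces."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def count_ngrams(texts, n):
    """Count n-gram occurrences across a collection of texts."""
    counts = Counter()
    for text in texts:
        counts.update(ngrams(tokenize(text), n))
    return counts


tokens = tokenize("Stories matter, stories spread!")
bigrams = ngrams(tokens, 2)
```

In this sketch punctuation is kept as its own token, so a bigram such as "matter ," is counted alongside purely lexical bigrams such as "stories spread".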

StoryWrangler is a curation of Twitter into day-scale usage ranks and frequencies of n-grams for over 100 billion tweets in 100 languages from 2008 through mid-2020. The massive sociolinguistic data set accounts for social amplification of n-grams via retweets, which can be visualized through time series contagiograms. The project is intended to enable or enhance the study of any large-scale temporal phenomena where people matter, including culture, politics, economics, linguistics, public health, conflict, climate change, and data journalism.
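As a rough illustration of what day-scale usage ranks and frequencies could look like, the following Python sketch aggregates n-gram observations by day. It is a simplified sketch under stated assumptions, not StoryWrangler's actual pipeline. The record layout `(day, ngram, is_retweet)`, the function name `daily_rank_and_frequency`, and the tie-breaking rule (alphabetical order among equal counts) are all hypothetical.

```python
from collections import Counter, defaultdict


def daily_rank_and_frequency(records, include_retweets=True):
    """Compute per-day count, relative frequency, and rank for each n-gram.

    records: iterable of (day, ngram, is_retweet) tuples, one per observed
    use of an n-gram (hypothetical layout, assumed for this sketch).
    include_retweets: if False, uses that come from retweets are dropped,
    which leaves only originally authored usage.
    """
    per_day = defaultdict(Counter)
    for day, ngram, is_retweet in records:
        if is_retweet and not include_retweets:
            continue
        per_day[day][ngram] += 1

    result = {}
    for day, counts in per_day.items():
        total = sum(counts.values())
        # Rank 1 is the most used n-gram of the day; ties are broken
        # alphabetically (an assumption made for this sketch).
        ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
        result[day] = {
            ngram: {"rank": rank, "count": count, "frequency": count / total}
            for rank, (ngram, count) in enumerate(ordered, start=1)
        }
    return result


records = [
    ("2020-01-01", "a", False),
    ("2020-01-01", "a", True),
    ("2020-01-01", "b", False),
    ("2020-01-02", "b", False),
]
with_retweets = daily_rank_and_frequency(records)
without_retweets = daily_rank_and_frequency(records, include_retweets=False)
```

Comparing the two results for the same day gives a simple view of how much of an n-gram's usage comes from retweets rather than from originally authored text.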