Platform Studies

This section lists historical resources, tools, and institutions that may be of interest to scholars studying online platforms, web services, and online projects such as Twitter, Reddit, Telegram, and Wikipedia. See also the section on digital methods.

(Historical) Data on Platforms

APIs (Application Programming Interfaces) are well suited to accessing and gathering current platform data. For historical platform data, a number of projects are dedicated to archiving and publishing it.

Reddit

Wikipedia

Although not a platform in the strict sense of the concept, Wikipedia is a vivid project and community with a built-in “history”, because MediaWiki, the software Wikipedia runs on, tracks all changes to any page on Wikipedia. The Wikimedia Foundation provides various data dumps (mostly on a monthly basis) for all language editions of Wikipedia. The Wikimedia Downloads page provides links to the most recent data dumps of active wikis as well as to data dumps of discontinued wikis.
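As a rough illustration of what this built-in history looks like in a dump, the following Python sketch lists the revisions of a page from a MediaWiki XML export. The sample document is hand-written and invented; it only mimics the general shape of an export (pages containing revisions with an id, timestamp, contributor, and text), and the helper names are hypothetical. Real dumps are far larger and usually compressed, so a streaming parser would be the more realistic choice.

```python
import xml.etree.ElementTree as ET

# Invented sample in the general shape of a MediaWiki XML export.
SAMPLE = """<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.10/">
  <page>
    <title>Example page</title>
    <revision>
      <id>1</id>
      <timestamp>2004-05-01T10:00:00Z</timestamp>
      <contributor><username>Alice</username></contributor>
      <text>First version.</text>
    </revision>
    <revision>
      <id>2</id>
      <timestamp>2005-07-12T08:30:00Z</timestamp>
      <contributor><ip>192.0.2.1</ip></contributor>
      <text>First version, expanded.</text>
    </revision>
  </page>
</mediawiki>"""


def local(tag):
    """Drop the XML namespace so the code does not depend on the schema version."""
    return tag.rsplit("}", 1)[-1]


def child_text(element, name):
    """Return the text of the first element named `name` at or below `element`."""
    for node in element.iter():
        if local(node.tag) == name:
            return node.text
    return None


def revisions(xml_string):
    """Yield one dict per revision, in document order."""
    root = ET.fromstring(xml_string)
    for page in root.iter():
        if local(page.tag) != "page":
            continue
        title = child_text(page, "title")
        for rev in page.iter():
            if local(rev.tag) != "revision":
                continue
            yield {
                "title": title,
                "rev_id": int(child_text(rev, "id")),
                "timestamp": child_text(rev, "timestamp"),
                # Registered editors carry a username, anonymous edits an IP.
                "user": child_text(rev, "username") or child_text(rev, "ip"),
                "length": len(child_text(rev, "text") or ""),
            }


for rev in revisions(SAMPLE):
    print(rev["timestamp"], rev["user"], rev["length"])
```

Each record corresponds to one saved change, so the growth of a page or the activity of its editors can be traced over time.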

Telegram

The messenger service Telegram allows the creation of groups with up to 200,000 members interacting with each other, and of channels with unlimited members for broadcasting messages to potentially very large audiences. This capability gave rise to Telegram’s growing popularity as a social media platform. As a consequence, new patterns and dynamics of communication emerge that supplement one-to-one and few-to-few communication with modes of many-to-many and one-to-many communication. The resulting publics of Telegram are situated in a grey zone of publicness: they are typically accessible without restriction, yet at the same time remain hidden to a certain extent and thus might appear closed off.

Due to its broad understanding and enforcement of freedom of speech, Telegram has become popular among actors deplatformed from other social media. However, in January 2021 Telegram confirmed that it had removed a number of channels because of threats of violence.

A number of resources, meta-services and tools exist that allow (media) scholars to engage with Telegram as a platform to study. One of the challenges is to find groups and channels of interest. There are a number of pages dedicated to this:

To retrieve data from relevant groups and channels, one can make use of the Telegram API. Once a free API key is acquired, one can interact with Telegram programmatically:
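A minimal sketch of such programmatic access in Python is given below. It assumes the third-party Telethon library, which this section does not name, and that the API key takes the form of an `api_id` and `api_hash` pair. The function names, the session name, and the channel name are hypothetical. The first run of a Telethon session asks interactively for a login.

```python
def message_record(message):
    """Reduce a message object to a few plain fields for later analysis."""
    return {
        "id": message.id,
        "date": message.date.isoformat(),
        "text": message.text or "",  # media-only messages carry no text
    }


def collect_messages(client, channel, limit=100):
    """Collect up to `limit` messages from a channel via a connected client."""
    return [message_record(m) for m in client.iter_messages(channel, limit=limit)]


def fetch_channel(api_id, api_hash, channel, limit=100):
    """Open a Telethon session and collect messages (requires network access)."""
    # Imported lazily so the helpers above work without Telethon installed.
    from telethon.sync import TelegramClient

    # "research_session" is an arbitrary name for the local session file.
    with TelegramClient("research_session", api_id, api_hash) as client:
        return collect_messages(client, channel, limit)


# Hypothetical usage; the credentials and channel name are placeholders:
# records = fetch_channel(12345, "0123456789abcdef", "some_public_channel", limit=50)
```

Keeping the collection logic separate from the connection makes it possible to try the helpers on stand-in objects before touching the live service.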

GitHub

GitHub is a platform for developing, managing and publishing software projects. Since GitHub provides an infrastructure for open as well as closed distributed collaboration, it is used for other, non-software-related purposes as well. Following the platform’s own branding, GitHub is “[w]here the world builds software” (GitHub n.d.).

Retrieving Platform Data

The main resource for collecting data from this platform is the official GitHub REST API. To retrieve data from the API, one has to create a GitHub user account and an API access token.

  • The Digital Methods Initiative has published a number of tools on its website that provide an easy-to-use interface for collecting selected data via the API.
  • A Google Colaboratory-based tool/interface for easily collecting a broader variety of platform data via the GitHub API and storing it in one’s own Google Drive can be found here
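
For those who prefer to query the REST API directly, the following Python sketch shows how an access token is attached to a request using only the standard library. The helper names and the repository path in the usage comment are hypothetical placeholders, and the sketch ignores pagination and rate limits, which any larger collection would have to handle.

```python
import json
import urllib.request

API_ROOT = "https://api.github.com"


def build_request(path, token):
    """Build an authenticated request for a GitHub REST API path."""
    return urllib.request.Request(
        API_ROOT + path,
        headers={
            "Accept": "application/vnd.github+json",
            "Authorization": "Bearer " + token,
        },
    )


def get_json(path, token):
    """Send the request and decode the JSON response (requires network access)."""
    with urllib.request.urlopen(build_request(path, token)) as response:
        return json.load(response)


# Hypothetical usage; OWNER, REPO and the token are placeholders:
# repo = get_json("/repos/OWNER/REPO", "YOUR_ACCESS_TOKEN")
# print(repo["full_name"], repo["stargazers_count"])
```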

Besides the official API, a number of other sources for GitHub data exist:

  • GitHub Archive collects data of platform events and publishes them in JSON format.
    • This dataset is vast and using it requires rather extensive computing capabilities. However, the dataset is accessible via Google BigQuery as well.
  • GHTorrent is “an effort to create a scalable, queriable, offline mirror of data offered through the Github REST API”. The project publishes MySQL as well as MongoDB database dumps of platform events.
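
To give an impression of working with event data of the kind GitHub Archive publishes, the sketch below counts event types in a single archive file. It assumes that a file is gzip-compressed and holds one JSON event per line, with a `type` field as in GitHub’s event format. The sample data is invented and stands in for a downloaded file, and the function names are hypothetical.

```python
import gzip
import io
import json
from collections import Counter


def read_events(fileobj):
    """Yield events from a gzip-compressed file with one JSON object per line."""
    with gzip.open(fileobj, mode="rt", encoding="utf-8") as lines:
        for line in lines:
            if line.strip():  # skip blank lines
                yield json.loads(line)


def count_event_types(events):
    """Count how often each event type occurs."""
    return Counter(event["type"] for event in events)


# Invented sample standing in for one downloaded archive file.
sample_events = [
    {"type": "PushEvent", "repo": {"name": "alice/project"}},
    {"type": "WatchEvent", "repo": {"name": "bob/tool"}},
    {"type": "PushEvent", "repo": {"name": "bob/tool"}},
]
raw = "\n".join(json.dumps(e) for e in sample_events).encode("utf-8")
sample_archive = io.BytesIO(gzip.compress(raw))

counts = count_event_types(read_events(sample_archive))
print(counts.most_common())
```

Because the events are read one line at a time, the same approach works on a single file without loading it fully into memory. For the full dataset, the BigQuery route mentioned above is likely the more practical option.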

Analyzing Platform Data