As many anticipated, Twitter stopped offering its (limited!) API for free [1].
Now, what options do you have to programmatically access public content for free?
In this context, it is worth mentioning the library snscrape, a tool (well maintained as of now) for extracting content from social media services such as Facebook, Instagram or Twitter [2]. I have just given it a go as part of the research project I am working on, and would love to share some thoughts and code.
The basic usage is pretty simple, but I added multithreading to improve speed by executing queries in parallel (an established way of handling I/O-bound operations). I also prefer a functional/pipeline style of composing Python commands, using generators, filter and map. The code snippet below (see also the Colab notebook) shows how to extract tweets of top futurists. Enjoy!
# install the social media scraper: !pip3 install snscrape
import snscrape.modules.twitter as sntwitter
import itertools
import multiprocessing.dummy as mp  # thread-based Pool, for multithreading
import datetime
import pandas as pd

start_date = datetime.datetime(2018, 1, 1, tzinfo=datetime.timezone.utc)  # from when
attributes = ('date', 'url', 'rawContent')  # what attributes to keep

def get_tweets(username, n_tweets=5000, attributes=attributes):
    tweets = itertools.islice(sntwitter.TwitterSearchScraper(f'from:{username}').get_items(), n_tweets)  # invoke the scraper
    tweets = filter(lambda t: t.date >= start_date, tweets)  # drop tweets older than start_date
    tweets = map(lambda t: (username,) + tuple(getattr(t, a) for a in attributes), tweets)  # keep only attributes needed
    tweets = list(tweets)  # materialise the lazy pipeline inside the worker (a generator is not pickle'able)
    return tweets

# a list of accounts to scrape
user_names = ['kevin2kelly', 'briansolis', 'PeterDiamandis', 'michiokaku']

# parallelise queries for speed!
with mp.Pool(4) as p:
    results = p.map(get_tweets, user_names)

# combine the per-user lists into one flat list of tuples
results = list(itertools.chain(*results))
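The snippet imports pandas but stops at a flat list of tuples, so here is a minimal sketch of one possible next step: loading the rows into a DataFrame. It assumes each tuple follows the (username, date, url, rawContent) layout produced by get_tweets above. The sample rows, the example.com URLs and the names sample_results, columns and tweets_per_user are made up for illustration, so that the block runs without touching the network.

```python
import datetime

import pandas as pd

utc = datetime.timezone.utc

# Hypothetical stand-in for `results`: same tuple layout, invented content.
sample_results = [
    ("kevin2kelly", datetime.datetime(2022, 5, 1, tzinfo=utc), "https://example.com/2", "Second made-up tweet"),
    ("kevin2kelly", datetime.datetime(2021, 3, 9, tzinfo=utc), "https://example.com/1", "First made-up tweet"),
    ("briansolis", datetime.datetime(2020, 7, 4, tzinfo=utc), "https://example.com/3", "Another made-up tweet"),
]

# Column names: the username followed by the scraped attributes.
columns = ["username", "date", "url", "rawContent"]

df = pd.DataFrame(sample_results, columns=columns)

# Order chronologically within each account and renumber the rows.
df = df.sort_values(["username", "date"]).reset_index(drop=True)

# Quick sanity check: how many tweets were collected per account.
tweets_per_user = df.groupby("username").size()

print(df)
print(tweets_per_user)
```

With real data, replacing sample_results by results should be enough, provided the attributes tuple has not been changed.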
- 1. @TwitterDev. Twitter announces stopping free access to its API. Twitter Dev Team. Published February 3, 2023. Accessed February 15, 2023. https://twitter.com/TwitterDev/status/1621026986784337922?s=20
- 2. snscrape. snscrape. GitHub repository. Accessed February 15, 2023. https://github.com/JustAnotherArchivist/snscrape