I wanted to know how the use of distressing language is being perceived and maybe internalised by the readers of headlines.
Using a module called snscrape, I collected data about the frequency of posts containing words like 'depressed' and 'anxious' on Christmas Day each year from 2010 to 2021. This is the function that I used to scrape tweets. It takes a term to search for and a date to search in, searches for the term from that date up to one day past it, and returns a pandas DataFrame with the tweets (at most 10,001 per search).
import pandas as pd
import snscrape.modules.twitter as sntwitter

def get_tweets(term, day, month, year):
    # Creating list to append tweet data to
    tweets_list2 = []
    # Date window: from the given day up to the following day
    since = year + "-" + month + "-" + day
    until = year + "-" + month + "-" + str(int(day) + 1)
    # Using TwitterSearchScraper to scrape data and append tweets to list
    for i, tweet in enumerate(
            sntwitter.TwitterSearchScraper('{} since:{} until:{}'.format(
                term, since, until)).get_items()):
        if i > 10000:  # stop once 10,001 tweets have been collected
            break
        tweets_list2.append(
            [tweet.date, tweet.id, tweet.content, tweet.user.username,
             tweet.replyCount, tweet.retweetCount,
             tweet.likeCount])
    # Creating a dataframe from the tweets list above
    df = pd.DataFrame(tweets_list2,
                      columns=['datetime', 'tweet Id', 'text', 'username',
                               "replies", "retweets", "likes"])
    return df
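The `until` value above is built by adding one to the day number, which works for 25 December but would give an invalid date on the last day of a month. As a minimal sketch (not part of the original script), the same one-day window could be built with the standard `datetime` module. `build_query` is a hypothetical helper name, and the `since:` and `until:` operators are the ones already used in the search string above.

```python
import datetime


def build_query(term, day, month, year):
    """Return a search string covering one calendar day.

    Hypothetical helper, not part of the original script: it takes the same
    string arguments as get_tweets and lets datetime handle month/year ends.
    """
    start = datetime.date(int(year), int(month), int(day))
    end = start + datetime.timedelta(days=1)  # 'until' is the day after 'since'
    return '{} since:{} until:{}'.format(term, start.isoformat(), end.isoformat())


print(build_query("anxious", "25", "12", "2010"))
print(build_query("anxious", "31", "12", "2020"))
```

For Christmas Day this produces the same `since` and `until` strings as the function above. For 31 December 2020 the `until` date rolls over to 2021-01-01 instead of becoming "2020-12-32".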
I then used my new function to scrape the terms 'depressed' and 'anxious', and found the count for each term and year:
datasets_anxiety = {}
datasets_depression = {}
for i in range(12):  # get all years for Christmas, 2010 to 2021
    datasets_anxiety[str(i+10)] = get_tweets("anxious", "25", "12",
                                             "20" + str(i+10))
    datasets_depression[str(i+10)] = get_tweets("depressed", "25", "12",
                                                "20" + str(i+10))
# Count the scraped tweets for each year (keys are "10" to "21")
count_per_year_anxiety = []
count_per_year_depression = []
for i in datasets_anxiety:
    count_per_year_anxiety.append(datasets_anxiety[i].count()["text"])
    count_per_year_depression.append(datasets_depression[i].count()["text"])
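Because `get_tweets` stops once `i > 10000`, no single search can return more than 10,001 tweets, so a count sitting at that ceiling is a lower bound rather than a true daily total. A minimal sketch of a check for this, assuming the counts are plain integers in year order. `capped_years` is a hypothetical helper, and the numbers in the example are made up for illustration, not scraped results.

```python
SCRAPE_CAP = 10001  # get_tweets breaks when i > 10000, so at most 10,001 rows


def capped_years(years, counts, cap=SCRAPE_CAP):
    """Return the years whose tweet count reached the scrape cap."""
    return [year for year, count in zip(years, counts) if count >= cap]


# Illustrative values only (not real results)
example_years = ["2010", "2011", "2012"]
example_counts = [3500, 10001, 9800]
print(capped_years(example_years, example_counts))  # ['2011']
```

Any year this flags is best read as "at least 10,001 tweets" when comparing it with other years.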
Plotting these against an array of years with the following code gave this graph (note that this is an interactive copy, and you can use the control icons in the bottom left of each graph to navigate them):
import matplotlib.pyplot as plt

years = ["20" + i for i in datasets_anxiety.keys()]
fig, ax = plt.subplots(2, 1, sharex='col', sharey='row')
ax[0].plot(years, count_per_year_anxiety, label="anxious", color="red")
ax[0].plot(years, count_per_year_depression, label="depressed", color="blue")
# Attach the legend and y label to the top axes, where the lines are drawn
ax[0].legend()
ax[0].set_ylabel("No. of tweets")
# The x axis is shared, so the label goes on the bottom axes
ax[1].set_xlabel("Year")
plt.show()