The #MeToo movement:
A Temporal, Gender-Based and Sentiment Perspective
The aim of this project is to use a dataset of tweets containing #metoo and associated hashtags, in order to analyze this movement and better understand it.
The #MeToo movement is a global movement against sexual abuse. It spread virally in October 2017 as a hashtag used on social media to show how widespread sexual harassment is, particularly in the workplace. The idea is to talk fearlessly about any sexual abuse one might have experienced and to rebel against the culture of staying silent after going through such traumatising experiences. Many celebrities took part in the movement by sharing their stories on social media, including renowned actresses, public figures, and politicians, and so did people from ordinary backgrounds. As a team, we are motivated to understand the movement because of its intensity and the controversy around it. It is a challenge to put aside our personal opinions as individuals and consider only the data around this theme to arrive at an interesting yet objective data story.
Explore the movement timeline
Let's begin by exploring the movement's activity by hashtag and try to relate the activity peaks to known major events. To do so, we computed, for each day, the number of tweets containing a given hashtag. We assume that the distribution of our dataset over time is representative of the real distribution on Twitter, which is a reasonable assumption since the tweets were randomly extracted from Twitter.
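The daily count described above can be sketched in a few lines of Python. This is only an illustration, not the code used for the figure: the input shape (pairs of timestamp and tweet text), the function name `daily_hashtag_counts` and the sample tweets are all assumptions made for the example.

```python
import re
from collections import Counter, defaultdict
from datetime import datetime

HASHTAG_RE = re.compile(r"#(\w+)")

def daily_hashtag_counts(tweets):
    """Return {hashtag: Counter({date: number_of_tweets})}.

    `tweets` is an iterable of (timestamp, text) pairs (assumed input shape).
    """
    counts = defaultdict(Counter)
    for timestamp, text in tweets:
        day = timestamp.date()
        # Use a set so a tweet repeating a hashtag is counted once for that day.
        for tag in {t.lower() for t in HASHTAG_RE.findall(text)}:
            counts[tag][day] += 1
    return counts

# Made-up sample tweets, for illustration only.
sample = [
    (datetime(2018, 1, 8, 3, 15), "Their time is up. #TimesUp #GoldenGlobes"),
    (datetime(2018, 1, 8, 4, 2), "#timesup #TimesUp #MeToo"),
    (datetime(2018, 1, 9, 9, 30), "Still thinking about that speech #MeToo"),
]
counts = daily_hashtag_counts(sample)
```

Hashtags are lower-cased so that #TimesUp and #timesup fall into the same daily series.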
We chose the hashtag list ourselves from among the most frequent hashtags in our dataset, based on relevance and interest. We then looked for the major events of the movement within the time period covered by our dataset. Here is the resulting event list:
- 9 November 2017: Accusation of Roy Moore. In November 2017, nine women accused Roy Moore (a US Senate candidate and a former Chief Justice of the Supreme Court of Alabama) of sexual misconduct.
- 12 November 2017: Hollywood march. Thousands of people marched in Los Angeles on Sunday in support of victims of sexual assault and harassment.
- 15 November 2017: METOO Congress Act. The METOO Congress Act (USA) opens the process to non-staffers and members, makes the counseling and mediation process optional, and gives accusers 180 days after the alleged incident to file a complaint.
- 23 November 2017: Uma Thurman accuses Harvey Weinstein. "I tried to say no, I cried, I did everything I could do. He told me the door was locked but I never ran over and tried the knob", Uma Thurman said.
- 6 December 2017: Time's Person of the Year. Time’s 2017 Person of the Year is the "Silence Breakers", i.e. the women and men of the #MeToo movement (Time is one of the main magazines in the USA).
- 12 December 2017: Senate election in Alabama. This election pitted the Democrat Doug Jones against the Republican Roy Moore (accused of sexual misconduct) for a US Senate seat in Alabama. Doug Jones won.
- 17 December 2017: Meryl Streep's silence is pointed out. "Actresses, like Meryl Streep, who happily worked for The Pig Monster, are wearing black @goldenglobes in a silent protest. YOUR SILENCE is THE problem. You’ll accept a fake award breathlessly & affect no real change. I despise your hypocrisy.", Rose McGowan said.
- 8 January 2018: Golden Globes. "For too long women have not been heard or believed if they dare speak the truth to the power of those men. But their time is up. Their time is up.", said Oprah Winfrey during her speech at the Golden Globes.
- 21 January 2018: Women's March (USA). This march aims to gather the political power of diverse women and their communities to create change in society. They strive to break down systems of oppression by means of nonviolent action led by morality and reverence. The #MeToo movement had become "a galvanizing force at many of the rallies".
In the interactive figure below, you can select a hashtag to visualize the associated activity on Twitter. After you select a hashtag, black arrows mark the events listed above at their corresponding dates.
In this timeline, we can clearly relate some events to activity peaks. For example, unsurprisingly, the peak of activity for the hashtag #goldenglobes falls on the exact day of the 2018 Golden Globes. Likewise, the activity associated with #roymoore begins just after the accusation of Roy Moore on 9 November 2017. More interestingly, we can see that the very popular #timesup is correlated with the Golden Globes: it corresponds to the speech of Oprah Winfrey, who said "their time is up" during the ceremony. The hashtag #sheknew, a bit enigmatic, is also easier to understand once we realize that it appears just after Meryl Streep's silence about Harvey Weinstein's misconduct is pointed out. It refers to that silence ("she knew"); there were even posters in the street denouncing it.
Tweeting is sharing
An important thing to notice is that, as in many movements on Twitter, most of these tweets are actually retweets: this is the case for almost 3 tweets out of 4. We therefore cannot speak of isolated testimonies. It shows that this mobilization is, overall, a process of sharing the stories of others to show support or disagreement. Twitter hence enables people worldwide to connect, develop a common battle and raise awareness for the cause. This group phenomenon can also be criticized: it raises the question of slacktivism, which happens when people show support for a cause with the main purpose of boosting the egos of participants in the movement, by simple actions ("like", "retweet"...) that require very little thought or effort.
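A retweet share like the one quoted above can be estimated with a very simple rule. The sketch below assumes that retweets are stored with the classic "RT @user:" text prefix; the function name `retweet_share` and the sample texts are made up for the illustration, and a dataset that flags retweets in a separate field would need a different test.

```python
def retweet_share(texts):
    """Fraction of tweets whose text starts with the 'RT @' retweet prefix."""
    texts = list(texts)
    if not texts:
        return 0.0
    n_retweets = sum(1 for text in texts if text.startswith("RT @"))
    return n_retweets / len(texts)

# Made-up sample: three retweets and one original tweet.
sample = [
    "RT @user_a: sharing this story #MeToo",
    "RT @user_b: please read #TimesUp",
    "My own story, for the first time. #MeToo",
    "RT @user_a: sharing this story #MeToo",
]
share = retweet_share(sample)
```

On this sample the share is 0.75, which mirrors the "3 tweets out of 4" order of magnitude reported for the dataset.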
Below, you can find the tweets in our dataset that have been the most retweeted. They are a good representation of the movement in the sense that they drove its activity.
The 16 women who accused Trump of sexual assault are telling their story in one video-please share this far & wide. RT if you agree it’s time for Trump to be held accountable for his sexual misconduct. #TrumpSexPredator #AMJoy pic.twitter.com/hNIqZEI54G
— Scott Dworkin (@funder) 20 November 2017
This first tweet is from Scott Dworkin, who is part of The Democratic Coalition, an opposition movement against Donald Trump. In his tweet, he denounces the sexual misconduct of Donald Trump with a video (trailer) describing the testimonies of 16 women.
I mean, what world are we living in that an accused sexual abuser is allowed to be our President and an accused pedophile is allowed to run for senate? These two things have many things in common - one of which - is the Republican National Committee. #MeToo
— Alyssa Milano (@Alyssa_Milano) 5 December 2017
This one is from Alyssa Milano, one of the people at the origin of the movement. Here again, the sexual misconduct of Donald Trump is denounced, as well as (implicitly) that of Roy Moore.
Just reported @Rosie for targeted harassment, mainly to see if Twitter does indeed have a double standard. Everyone knows if Rosie were conservative, Twitter would suspend her in a hot second. So, Twitter, put your money where your mouth is. #MeToo
— Ben Shapiro (@benshapiro) 22 December 2017
This tweet from Ben Shapiro is interesting in the sense that it raises the question of a "double standard": is it accepted (and trusted) when a man denounces a woman for sexual harassment?
I talked to a girl who says she went on a date with @azizansari in an exclusive for @babedotnet. She told me, "It was by far the worst experience with a man I’ve ever had." I believe her. #TimesUp #MeToo #AzizAnsari https://t.co/p7q0fjSsh0
— Katie Way (@k80way) 13 January 2018
Finally, this tweet from Katie Way denounces Aziz Ansari, a famous American actor. Not surprisingly, all these popular tweets have in common that they denounce someone or something (Trump, Roy Moore, the double standard, Aziz Ansari). Moreover, these popular tweets are also at the origin of activity peaks: their authors have a large number of followers, so each tweet is quickly amplified through successive retweets.
Topic clustering
In this section, we identify the topics discussed in the movement through topic modelling, and more specifically Latent Dirichlet Allocation (LDA). LDA is an unsupervised machine learning method that discovers hidden semantic structure: it learns topic representations from the tweets, which we then use to identify topics. In this interactive figure, you can visualize the main topics discussed within the movement. Click on "next topic" to step through all the topics. You can also click on a word on the right to see in which cluster(s) it is most frequent.
This model lets us clearly identify topics such as topic #10, which seems to be about the accusations against Harvey Weinstein, or topic #4, about the accusations against Donald Trump. They are of course consistent with the popular hashtags (#trumpsexprobe, #weinstein...). The method shows the diversity of topics among the tweets and within the movement, but the main keywords remain present in every topic.
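To give an idea of what LDA does under the hood, here is a minimal sketch that fits the model with collapsed Gibbs sampling in pure Python. It is not the implementation behind the figure (a dedicated library would normally be used, and it may rely on a different inference method); the function name `lda_gibbs`, the hyperparameter values and the toy token lists are assumptions chosen for the example.

```python
import random
from collections import Counter

def lda_gibbs(docs, n_topics, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA by collapsed Gibbs sampling; `docs` is a list of token lists."""
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    doc_topic = [[0] * n_topics for _ in docs]         # tokens of doc d in topic k
    topic_word = [Counter() for _ in range(n_topics)]  # count of word w in topic k
    topic_total = [0] * n_topics                       # all tokens in topic k
    assignments = []

    # Start from a random topic for every token.
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            k = rng.randrange(n_topics)
            z_doc.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        assignments.append(z_doc)

    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                k = assignments[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                # Unnormalized probability of each topic given all other tokens.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + beta * vocab_size)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1
    return topic_word, doc_topic

# Toy, hand-made token lists (not real tweets).
docs = [
    ["weinstein", "accusation", "hollywood", "weinstein"],
    ["trump", "accusation", "women", "trump"],
    ["hollywood", "weinstein", "actress"],
    ["trump", "women", "video"],
]
topic_word, doc_topic = lda_gibbs(docs, n_topics=2)
# Most frequent words of each topic, ignoring words whose count fell to zero.
top_words = [[w for w, c in tw.most_common(3) if c > 0] for tw in topic_word]
```

Each topic is summarized by its most frequent words, which is also what the interactive figure displays. On a corpus this small the topics are not meaningful; the point is only to show the counting and resampling steps.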
Sentiment analysis, all tweets
Given all the tweets sent out with #meToo, one can wonder what their overall sentiment was. We performed the sentiment analysis with LIWC (Linguistic Inquiry and Word Count), a text analysis application that offers an efficient and effective method for studying the various emotional, cognitive, and structural components present in individuals’ verbal and written speech samples. [1]
We applied the analysis at three levels: on the overall data, on the 30% of the data for which a gender could be identified (split by gender), and on the data split day by day over the entire time period.
For the initial analysis conducted on the overall data, all LIWC categories of the analysis are presented above each graph using icons. We invite you to hover over the icons to learn what these categories are and to see examples of the words that LIWC uses to identify them.
Regarding the percentages and numbers presented: all percentages were rounded for a cleaner presentation. The numbers should therefore not be read as exact values; they serve to give you a feel for the dataset. The most pertinent way to read a category's percentage is relative to the percentages of the other categories in the same analysis.
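The idea behind these category percentages is plain dictionary-based word counting: each category is a list of words or word stems, and the score is the share of words in the text that match. The sketch below illustrates this with a tiny made-up dictionary; the real LIWC2015 dictionary is proprietary and far larger, and the names `CATEGORIES`, `matches` and `category_percentages` are hypothetical.

```python
import re

# Hypothetical mini-dictionary, for illustration only.
# A trailing "*" means "any word starting with this stem".
CATEGORIES = {
    "social": ["woman", "share*", "story", "stories"],
    "leisure": ["movie*", "actor*"],
}

def matches(word, pattern):
    """True if `word` matches a dictionary entry, with '*' as a stem wildcard."""
    if pattern.endswith("*"):
        return word.startswith(pattern[:-1])
    return word == pattern

def category_percentages(text, categories=CATEGORIES):
    """Percentage of words in `text` that fall into each category."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return {name: 0.0 for name in categories}
    result = {}
    for name, patterns in categories.items():
        hits = sum(1 for w in words if any(matches(w, p) for p in patterns))
        result[name] = 100.0 * hits / len(words)
    return result

pct = category_percentages("The actors share a story about the movie")
```

Because every score is a share of the same word total, the numbers are best compared with each other within one analysis, as noted above.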
Thematic analysis, all tweets
“Biology” includes all biological processes, ranging from eating to sexual terms such as “sex.” LIWC’s “Sexual” category, on the other hand, is defined by much more graphic and precise sexual terms, which explains why none of that language appears in our tweets.
Hence “Biology” covers the most recurrent terms that we also see in the Topic Clustering, such as “sexuality.”
The next most recurrent themes are the body and health, which echo Topic Clustering terms such as “assault” and “violence.” The “social” theme of the pie chart can be linked to Topic Clustering terms such as “share” and “story,” indicating social communication on the #meToo matter.
Context analysis, all tweets
We see that the #meToo phenomenon “hits close to home.” Family and friends are recurrent contexts in the language of the tweets. People seem to be affected in their personal lives, and they either show support to, or gain support from, their friends and family.
Temporal speech
People tend to speak in the past or future tenses. #meToo is not a spur-of-the-moment occurrence expressed in the present tense. It either reaches back to the past, like the many sexual abuse narratives resurfacing after decades, or latches on to the future, aiming to change mentalities as each user expresses an opinion on the stories being told.
Implication, group dynamics and certitude analysis, all tweets
We also used LIWC to investigate the implication and group dynamics of the movement.
The pie chart shows how people implicate themselves. Hearing is a recurrent mode. Indeed, Twitter is all about telling your story and having people listen to and read it, and #meToo is all about spreading the word. Insight and Cognitive Mechanisms indicate opinions and explanations: people are either commenting or reasoning on the stories being told, so stories are not left as such but are interpreted and processed by the readers. At the opposite end of the spectrum, motion accounts for only a small percentage, so the #meToo presented in these tweets seems to be less a call to action than a form of storytelling.
For group dynamics, we counted “we” and all other plural pronouns as a group dynamic, as opposed to singular pronouns such as “you” and “I” for an individualist dynamic. Was the implication personal or impersonal? For the “personal” implication, we considered pronouns that include the speaker, such as “We” and “I”, while “impersonal” was defined as pronouns that exclude the speaker, such as “You” and “They.”
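The two pronoun-based measures can be illustrated with a small counter. This sketch is a simplification, not LIWC itself: it only looks at six subject pronouns, treats "you" as singular as in the description above, and the names `PRONOUNS` and `pronoun_profile` are made up for the example.

```python
import re
from collections import Counter

# Each pronoun maps to (group dynamic, implication).
# Simplifying assumption: only these subject pronouns are considered.
PRONOUNS = {
    "i": ("solo", "personal"),
    "we": ("group", "personal"),
    "you": ("solo", "impersonal"),
    "he": ("solo", "impersonal"),
    "she": ("solo", "impersonal"),
    "they": ("group", "impersonal"),
}

def pronoun_profile(text):
    """Count solo/group and personal/impersonal pronouns in `text`."""
    counts = Counter()
    for word in re.findall(r"[a-z]+", text.lower()):
        if word in PRONOUNS:
            dynamic, implication = PRONOUNS[word]
            counts[dynamic] += 1
            counts[implication] += 1
    return counts

profile = pronoun_profile("She told me her story and I believe her. We listen.")
```

Here "She" counts as solo and impersonal, "I" as solo and personal, and "We" as group and personal; object forms such as "me" and "her" are ignored in this simplified version.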
We also assessed the speakers' level of certainty. Those using #meToo use a language of certainty, which could be due to the speakers' resolve to speak out and their strong opinions on the matter.
We see that people tend to "hear" stories in an impersonal and individualist yet assertive manner. This supports the notion that #metoo is about storytelling and broadcasting these stories to the world.
Positive or negative emotional analysis, all tweets
Since #meToo is a highly personal movement of individuals telling their stories, it was essential to get a sense of the overall feelings expressed in the text of the tweets. We observe strong peaks of both positive and negative emotions and attitudes. The “Assent” associated with #meToo can be seen as approval and encouragement of those telling their stories. The “Negative Emotions” are inevitable as well, given the subject of sexual harassment and abuse surrounding #meToo. However, we cannot tell whether some of these negative emotions are directed against the movement itself. We are aware that some tweets in our dataset are openly against the #meToo movement.
One example is the famously controversial tweet from DanBilzerian, subsequently deleted from Twitter: "RT @DanBilzerian: This #metoo shit is getting out of control, guys getting their lives ruined over touching a girl's back or hitting on some."
Sentiment analysis by gender
Using the GenderPerformr program [2], we were able to assign a gender (Female, Male or Neutral) to 30% of our tweets. Because this percentage is small (and even smaller if we consider only Male and Female), we urge the reader to keep in mind that the gender analysis does NOT apply to the entire dataset.
For the users identified with certainty as male or female, we performed a LIWC sentiment analysis. The results are surprisingly similar for both genders, suggesting that #meToo is a human battle and not a battle of gender. The only notable differences are that women appear more certain than men, while being more individualist and impersonal in their manner of speech. The last two aspects could be associated with a protection mechanism due to the heavy psychological burden of sexual harassment.
Sentiment analysis over time
By running the sentiment analysis over time, we wanted to see whether the themes, emotions, implications, etc. evolved as the #metoo movement developed, and whether particular tendencies could be tied to certain events.
Note: to explore the following graphs, you can double-click on the legend to isolate one trace (for example, double-click on 'Body' to see only the dots corresponding to this category).
Thematics & context
The thematic analysis doesn't show any strong trends. We can still notice an increase in the "social" category up to 21 January, the date of the 2018 Women's March. Indeed, the "social" category of LIWC includes the word "woman".
The context analysis, on the other hand, shows some interesting tendencies. For example, the leisure category includes "movie*" and "actor*", and its evolution can be associated with events in the Hollywood world. Indeed, a ceremony like the Golden Globes or the announcement of Time's Person of the Year coincides, within a few days, with an increase in this category.
Emotions & implications
We can observe that the spikes of negative emotions almost always coincide with a spike in the impersonal ("you", "he"/"she", "they") implication and a spike in the solo ("I", "you", "he"/"she") dynamic. One interpretation is that most of the solo and impersonal tweets are those containing "he" or "she", and are therefore about sharing another person's abuse story or accusing someone.
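One way to check such a co-occurrence of spikes beyond visual inspection is to compute the Pearson correlation between two daily series. The sketch below is an optional illustration, not part of the original analysis: the function name `pearson` and the daily percentages are invented for the example.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)

# Invented daily percentages: both series spike on days 3 and 5.
negative_emotions = [2.0, 2.1, 4.0, 2.2, 3.9]
impersonal = [5.0, 5.2, 8.1, 5.1, 7.8]
r = pearson(negative_emotions, impersonal)
```

A coefficient close to 1 would indicate that the two categories rise and fall on the same days; it would not, by itself, say why.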
What does this all add up to?
The certainty in the language of our tweets suggests how the technology empowered women, and people in general, to speak out against sexual harassment and sexual abuse. Twitter was an essential tool for this wildfire movement: each spike in the data follows a major event in the world almost instantly, which suggests the movement could not have spread this way without social media.
Surprisingly, in this data analysis, gender does not play as big a role as one might have expected. The implications and emotions expressed by each gender were almost identical.
It all boils down to someone, somewhere, telling their stories and others listening and sharing.
References:
[1] J. W. Pennebaker, R. L. Boyd, K. Jordan, and K. Blackburn, “The Development and Psychometric Properties of LIWC2015,” University of Texas at Austin, 2015.
[2] Z. Wang and D. Jurgens, “It’s going to be okay: Measuring Access to Support in Online Communities,” in Proceedings of EMNLP, 2018.