Tweets Reporting Abuse Classification Task

The total number of users of social media continues to grow worldwide, resulting in the generation of vast amounts of data. Popular social networking sites such as Facebook, Twitter and Instagram dominate this sphere. According to estimates, 500 million tweets and 4.3 billion Facebook messages are posted every day. According to the latest Pew Research Report, nearly half of adults worldwide and two-thirds of all American adults (65%) use social networking.

In recent years we have noticed a considerable increase in reports and confession posts by abuse victims on Twitter. Most of the time, victims do not report the abuse to their guardians or to the concerned authorities. Teenagers and minorities are the groups most affected by abuse. Some of these victims tweet about their experience to let go of pain and suffering, or as a cry for help. Identifying such reports is challenging because of the unavailability of annotated training data and a high degree of data sparsity. To address this, we are hosting TRACT on Kaggle.

Task

This new multi-class classification task involves distinguishing three classes of tweets that mention abuse: “report” (annotated as 1), “empathy” (annotated as 2), and “general” (annotated as 3).

  1. Automatic classification of tweets reporting abuse
    • F1-score for each class
    • Micro-averaged F1 score
  2. Exploratory Analysis
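
The two metrics listed under the classification subtask can be sketched as follows. This is a minimal illustration, not the official scoring script: the function name `f1_scores` and the toy label lists are hypothetical, and only the label encoding (1 = report, 2 = empathy, 3 = general) comes from the task description. In single-label multi-class classification, the micro-averaged F1 computed over all classes equals plain accuracy.

```python
from collections import Counter

# Label encoding taken from the task description.
LABELS = {1: "report", 2: "empathy", 3: "general"}


def f1_scores(y_true, y_pred, labels=(1, 2, 3)):
    """Return (per-class F1 dict, micro-averaged F1).

    Hypothetical helper for illustration; not the official scorer.
    """
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative

    # Per-class F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 if a class never occurs.
    per_class = {}
    for c in labels:
        denom = 2 * tp[c] + fp[c] + fn[c]
        per_class[c] = 2 * tp[c] / denom if denom else 0.0

    # Micro-averaged F1 pools the counts over all classes before computing F1.
    total_tp = sum(tp[c] for c in labels)
    total_fp = sum(fp[c] for c in labels)
    total_fn = sum(fn[c] for c in labels)
    micro_denom = 2 * total_tp + total_fp + total_fn
    micro = 2 * total_tp / micro_denom if micro_denom else 0.0
    return per_class, micro


# Toy gold labels and predictions (made up for this example).
y_true = [1, 1, 2, 2, 3, 3]
y_pred = [1, 2, 2, 2, 3, 1]

per_class, micro = f1_scores(y_true, y_pred)
for label, score in per_class.items():
    print(f"{LABELS[label]:>8}: F1 = {score:.3f}")
print(f"micro-averaged F1 = {micro:.3f}")
```

Reporting the per-class scores alongside the micro average is useful here because the micro average is dominated by the most frequent class, while the per-class scores show how well the rarer classes are recognized.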

Contributors

  1. Saichethan Miriyala Reddy
  2. Kanishk Tyagi
  3. Abhay Anand Tripathi

Contact

For more information regarding the dataset and scripts, contact

Acknowledgment

We would like to thank Dr. Ambika Vishal Pawar (associate professor) and Dr. Ketan Kotecha (director) at Symbiosis Institute of Technology, Pune for their help and support.

References

@misc{https://doi.org/10.17632/my2vkfyffd.1,
  doi = {10.17632/MY2VKFYFFD.1},
  url = {https://data.mendeley.com/datasets/my2vkfyffd/1},
  author = {Miriyala Reddy, Saichethan},
  keywords = {Data Mining, Social Media, Domestic Abuse, Twitter},
  title = {TRACT: Tweets Reporting Abuse Classification Task Corpus},
  publisher = {Mendeley},
  year = {2020}
}