antrixsh/Toxic-Comment-Classification

Toxic-Comment-Classification

Introduction:

Being anonymous over the internet can sometimes make people say nasty things that they normally would not in real life. Let's filter out the hate from our platforms one comment at a time.

Objective:

Create an EDA and feature-engineering starter notebook for toxic comment classification.

Data Overview:

The dataset comes from a wiki corpus that human raters labeled for toxicity. The corpus contains 63M comments from discussions relating to user pages and articles, dating from 2004-2015.

Different platforms/sites can have different standards for their toxic screening process. Hence the comments are tagged in the following six categories:

toxic
severe_toxic
obscene
threat
insult
identity_hate

The tagging was done via crowdsourcing, so different people rated different comments and the labels may not be 100% accurate.
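A first EDA step on multi-label data like this is to count how often each tag occurs and how many tags each comment carries. The sketch below is only an illustration. It assumes each category is stored as a 0/1 column named after the tag, alongside hypothetical `id` and `comment_text` columns. The rows in `SAMPLE_CSV` are made-up placeholders, not real data, and `summarize_labels` is a helper name introduced here.

```python
import csv
import io
from collections import Counter

# The six tag columns named in this README (0/1 encoding is an assumption).
LABELS = ["toxic", "severe_toxic", "obscene", "threat", "insult", "identity_hate"]

# Hypothetical placeholder rows standing in for the real file.
SAMPLE_CSV = """id,comment_text,toxic,severe_toxic,obscene,threat,insult,identity_hate
a1,placeholder comment one,0,0,0,0,0,0
a2,placeholder comment two,1,0,1,0,1,0
a3,placeholder comment three,1,0,0,0,0,0
a4,placeholder comment four,0,0,0,0,0,0
"""


def summarize_labels(rows):
    """Return (count per tag, count of comments by number of tags)."""
    per_label = Counter()
    tags_per_comment = Counter()
    for row in rows:
        flags = [int(row[label]) for label in LABELS]
        for label, flag in zip(LABELS, flags):
            per_label[label] += flag
        # A comment can carry several tags at once, or none at all.
        tags_per_comment[sum(flags)] += 1
    return per_label, tags_per_comment


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
per_label, tags_per_comment = summarize_labels(rows)

for label in LABELS:
    print(f"{label}: {per_label[label]}")
print("comments with no tag:", tags_per_comment[0])
```

To run this on a real file, replace `io.StringIO(SAMPLE_CSV)` with an open file handle. Comparing the per-tag counts with the number of untagged comments gives a quick read on class imbalance before any modeling.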
