Being anonymous over the internet can sometimes make people say nasty things that they normally would not in real life. Let's filter out the hate from our platforms one comment at a time.
The goal of this notebook is to build an EDA and feature-engineering starter for toxic comment classification.
The dataset here comes from a Wikipedia talk page corpus that was rated by human raters for toxicity. The corpus contains 63M comments from discussions relating to user pages and articles, dating from 2004 to 2015.
Different platforms/sites can have different standards for their toxicity screening process. Hence the comments are tagged in the following six categories:
toxic
severe_toxic
obscene
threat
insult
identity_hate
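Because a single comment can carry several of these tags at once, this is a multi-label problem rather than a multi-class one. A minimal sketch of how the labels can be inspected, using the six column names above and a tiny made-up sample in place of the real training CSV:

```python
import pandas as pd

# The six toxicity tags (column names as used in the dataset)
LABELS = ["toxic", "severe_toxic", "obscene", "threat", "insult", "identity_hate"]

# Tiny illustrative sample; in practice you would load the real training file,
# e.g. df = pd.read_csv("train.csv")
df = pd.DataFrame({
    "comment_text": [
        "You are a wonderful person.",
        "I will find you and hurt you.",
        "What an idiot.",
    ],
    "toxic":         [0, 1, 1],
    "severe_toxic":  [0, 0, 0],
    "obscene":       [0, 0, 0],
    "threat":        [0, 1, 0],
    "insult":        [0, 0, 1],
    "identity_hate": [0, 0, 0],
})

# How often each tag appears across the corpus
label_counts = df[LABELS].sum()

# How many tags each comment carries -- values above 1 confirm
# the labels overlap (multi-label)
tags_per_comment = df[LABELS].sum(axis=1)

print(label_counts)
print(tags_per_comment.tolist())
```

The same two aggregations on the full dataset give a quick picture of class imbalance (most comments carry no tag at all) and of label co-occurrence.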
The tagging was done via crowdsourcing, which means that each comment was rated by several different people, so the labels may not be 100% accurate either.