Cyberbullying Tweet Analysis Project

This project analyzes tweets labeled by cyberbullying type. The dataset contains the cleaned text, T-SNE plot coordinates, and the cyberbullying type of each tweet. The main goal is to predict the cyberbullying type of new tweets using traditional machine learning models and a Convolutional Neural Network (CNN).

Dataset

The dataset includes:

Cleaned Text: Preprocessed text data.
T-SNE Plot Coordinates: Coordinates for T-SNE visualization.
Cyberbullying Types: Categorized types for each tweet.

Because of GitHub's file size limit (25 MB), the vectorized words are not included in the repository: the file is 85 MB even when compressed.

Accessing Necessary Files

Access the necessary files for the app's functionality via this Google Drive link:

Models: Random Forest, Support Vector Machine, Logistic Regression, Convolutional Neural Network.
Scaler: Necessary for model predictions.
Vectorized Texts: Preprocessed and vectorized text data.

Implemented Models

Implemented Machine Learning Models include:

Random Forest
Support Vector Machine (SVM)
Logistic Regression
Convolutional Neural Network (CNN)
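
For a single tweet, each of these models can report one probability per cyberbullying category. The sketch below shows one way such output might be paired with category labels and ranked. It assumes a scikit-learn style classifier that exposes predict_proba and classes_ (an SVM needs probability=True for this, and a Keras CNN would use predict instead). StubModel, its labels, and its numbers are made up so the example runs without trained files; how project.py actually does this is not stated here.

```python
def rank_probabilities(model, features):
    """Pair each class label with its predicted probability for one sample, highest first."""
    # predict_proba expects a batch, so wrap the single sample in a list.
    probabilities = model.predict_proba([features])[0]
    return sorted(zip(model.classes_, probabilities), key=lambda pair: pair[1], reverse=True)


class StubModel:
    """Stand-in for a fitted classifier; returns fixed, made-up probabilities."""

    classes_ = ["type_a", "type_b", "type_c"]  # hypothetical category labels

    def predict_proba(self, X):
        return [[0.2, 0.7, 0.1] for _ in X]


ranking = rank_probabilities(StubModel(), features=[0.0, 1.0])
for label, probability in ranking:
    print(f"{label}: {probability:.2f}")
```

With a real model, features would be the vectorized and scaled representation of the tweet rather than the placeholder list used here.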

Functionality

The main Python file, project.py, uses Streamlit for data visualization and machine learning predictions. It offers the following functionality:

Visualization: Shows word clouds, bar charts of the 15 most common n-grams for the selected cyberbullying types, and a T-SNE plot.
Machine Learning Predictions: Given a tweet, the models generate a probability for each cyberbullying category, and the app displays these probabilities to the user.
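
The n-gram counts behind the bar charts can be illustrated with the standard library alone. This is a minimal sketch, assuming the cleaned text can be tokenized by splitting on whitespace; the function name top_ngrams, the sample rows, and the labels type_a and type_b are hypothetical and not taken from project.py.

```python
from collections import Counter


def top_ngrams(texts, n=2, k=15):
    """Return the k most common word n-grams across an iterable of cleaned texts."""
    counts = Counter()
    for text in texts:
        tokens = text.split()  # assumes text is already cleaned
        # Zip n shifted copies of the token list to slide a window of n tokens.
        counts.update(zip(*(tokens[i:] for i in range(n))))
    return [(" ".join(gram), count) for gram, count in counts.most_common(k)]


# Made-up rows standing in for (cleaned text, cyberbullying type) pairs.
rows = [
    ("you are so dumb", "type_a"),
    ("you are so mean", "type_a"),
    ("nice weather today", "type_b"),
]
selected = [text for text, label in rows if label == "type_a"]
top_bigrams = top_ngrams(selected, n=2, k=15)
print(top_bigrams)
```

Filtering by label before counting mirrors the "selected cyberbullying types" behavior described above; the resulting (n-gram, count) pairs are what a bar chart would plot.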

Usage

To run the project locally:

Ensure Python and necessary libraries are installed.
Clone the repository.
Access the models, scaler, and vectorized texts from the provided Google Drive link.
Place the files in the appropriate directory.
Modify the path variable to point to the appropriate directory.
Execute project.py using Streamlit.
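
Before launching the app, it can help to confirm that the downloaded files are where the path variable points. The check below is only a sketch: every file name in REQUIRED_FILES is a placeholder and must be replaced with the actual names of the files from Google Drive.

```python
from pathlib import Path

# Placeholder names (assumptions): substitute the real downloaded file names.
REQUIRED_FILES = [
    "random_forest.joblib",
    "svm.joblib",
    "scaler.joblib",
    "cnn_model.h5",
    "vectorized_texts.npy",
]


def missing_files(path, required=REQUIRED_FILES):
    """Return the required file names that are not present in the directory `path`."""
    directory = Path(path)
    return [name for name in required if not (directory / name).is_file()]


path = "."  # should match the directory the path variable is set to
missing = missing_files(path)
if missing:
    print("Missing files:", ", ".join(missing))
else:
    print("All files found; start the app with: streamlit run project.py")
```

Once nothing is reported missing, the app is started from the repository directory with the command streamlit run project.py.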

Contributors

License

This project is licensed under the MIT License - see the LICENSE file for details.
