Data Science: Fake News Prediction
Fake news refers to news items that may be hoaxes and that generally spread through social media and other online media. It is often circulated to promote or impose certain ideas, frequently in service of a political agenda. Such news items may contain false or exaggerated claims, may be amplified by algorithms until they go viral, and may leave users stuck in a filter bubble.
Let's see the code and go step by step:
First, we need to import the libraries that are needed for the following steps:
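A minimal sketch of this import step, assuming the usual pandas and NumPy aliases (the exact set of libraries is an assumption):

```python
# Assumed imports for loading and inspecting the data; the aliases are conventional.
import numpy as np
import pandas as pd

# Quick sanity check that both libraries can be imported.
print("pandas", pd.__version__, "| numpy", np.__version__)
```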
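Next, we import the other libraries needed for model building. Here we use the TfidfVectorizer.
TfidfVectorizer turns raw text into TF-IDF features. TF-IDF stands for term frequency-inverse document frequency: a word's weight grows with how often it appears in a document (term frequency) and shrinks with how many documents contain it (inverse document frequency).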
Read the dataset with the pandas library.
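A sketch of the loading step. The file name and columns of the real dataset are not given here, so this example reads a small made-up CSV from memory; the columns title, text and label are assumptions:

```python
import io

import pandas as pd

# Hypothetical stand-in for the real CSV file. With a file on disk you would
# pass its path to pd.read_csv instead of this in-memory buffer.
csv_text = """title,text,label
Budget vote,The senate passed the budget bill on Tuesday,REAL
Miracle cure,Doctors hate this one weird trick,FAKE
Rate decision,The central bank kept interest rates unchanged,REAL
"""

data = pd.read_csv(io.StringIO(csv_text))
print(data)
```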
Get the first 5 rows of the dataset with the head() method.
Get the last 5 rows of the dataset with the tail() method.
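Both calls can be sketched on a toy frame. The 8-row DataFrame below is hypothetical and only stands in for the news dataset:

```python
import pandas as pd

# Hypothetical toy frame with 8 rows.
data = pd.DataFrame({
    "text": [f"article {i}" for i in range(8)],
    "label": ["REAL", "FAKE"] * 4,
})

first_rows = data.head()  # first 5 rows by default
last_rows = data.tail()   # last 5 rows by default
print(first_rows)
print(last_rows)
```

Passing a number, for example data.head(3), changes how many rows are returned.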
Get more information about the dataset, such as column names, non-null counts and data types, with the data.info() method.
Next, to get summary statistics such as the count, mean and standard deviation of the numeric columns, we use the describe() method.
To get the number of rows and columns, we use the shape attribute.
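These three inspection steps might look like the sketch below. The toy DataFrame and its id column are assumptions made for illustration:

```python
import pandas as pd

# Hypothetical toy frame with one numeric and two text columns.
data = pd.DataFrame({
    "id": [10, 11, 12, 13],
    "text": ["a b", "c d", "e f", "g h"],
    "label": ["REAL", "FAKE", "REAL", "FAKE"],
})

data.info()  # prints column names, non-null counts and dtypes

summary = data.describe()  # covers numeric columns only by default
print(summary)

n_rows, n_cols = data.shape  # shape is an attribute, not a method
print(n_rows, n_cols)
```

Note that describe() skips the text columns unless you pass include="all" or include="object".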
Now we separate out the label column, which is the target for the fake news prediction.
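A sketch of this step, assuming the target column is called label and holds the values REAL and FAKE (both are assumptions):

```python
import pandas as pd

# Hypothetical toy frame; the column name "label" is assumed.
data = pd.DataFrame({
    "text": ["senate passes bill", "miracle cure found", "rates unchanged"],
    "label": ["REAL", "FAKE", "REAL"],
})

labels = data["label"]  # the target we want to predict
print(labels.value_counts())
```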
Now we split the data into x_train, x_test, y_train and y_test. For that, train_test_split takes the data, the labels, test_size and random_state.
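The split can be sketched as follows. The values test_size=0.2 and random_state=7 are illustrative assumptions, as is the toy data:

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical toy frame with 10 rows.
data = pd.DataFrame({
    "text": [f"article number {i}" for i in range(10)],
    "label": ["REAL", "FAKE"] * 5,
})
labels = data["label"]

# Hold out 20% of the rows for testing; random_state makes the split repeatable.
x_train, x_test, y_train, y_test = train_test_split(
    data["text"], labels, test_size=0.2, random_state=7
)
print(len(x_train), len(x_test))
```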
Now we set up the TfidfVectorizer for the model building.
We fit and transform the vectorizer on the train set, and only transform the test set, so the test data does not influence the learned vocabulary.
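A minimal sketch of the fit and transform step, using default vectorizer settings and made-up sentences (both are assumptions):

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical train and test texts.
x_train = [
    "senate passes budget bill",
    "aliens endorse miracle diet",
    "bank holds interest rates",
]
x_test = ["senate debates miracle budget"]

tfidf_vectorizer = TfidfVectorizer()

# Learn the vocabulary and IDF weights from the training texts only.
tfidf_train = tfidf_vectorizer.fit_transform(x_train)

# Reuse that vocabulary for the test texts; unseen words are ignored.
tfidf_test = tfidf_vectorizer.transform(x_test)

print(tfidf_train.shape, tfidf_test.shape)
```

Both matrices end up with the same number of columns, which is what lets a model trained on one be applied to the other.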
Next, we build the model with the PassiveAggressiveClassifier and fit it on the training data.
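Building and fitting the model might look like this sketch. The toy texts, max_iter=50 and random_state=0 are assumptions chosen for illustration:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import PassiveAggressiveClassifier

# Hypothetical training data.
x_train = [
    "senate passes budget bill",
    "central bank holds interest rates",
    "aliens endorse miracle diet",
    "shocking miracle cure doctors hate",
]
y_train = ["REAL", "REAL", "FAKE", "FAKE"]

tfidf_vectorizer = TfidfVectorizer()
tfidf_train = tfidf_vectorizer.fit_transform(x_train)

# Linear classifier that updates its weights only on margin violations.
pac = PassiveAggressiveClassifier(max_iter=50, random_state=0)
pac.fit(tfidf_train, y_train)

print(pac.classes_)
```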
Next, we use the fitted model to predict labels for the test data, which the model has not seen during training.
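A self-contained sketch of the prediction step. The data, the hyperparameters and the use of accuracy_score are assumptions for illustration:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import PassiveAggressiveClassifier
from sklearn.metrics import accuracy_score

# Hypothetical train and test data.
x_train = [
    "senate passes budget bill",
    "central bank holds interest rates",
    "aliens endorse miracle diet",
    "shocking miracle cure doctors hate",
]
y_train = ["REAL", "REAL", "FAKE", "FAKE"]
x_test = ["senate holds budget vote", "doctors endorse miracle diet"]
y_test = ["REAL", "FAKE"]

tfidf_vectorizer = TfidfVectorizer()
tfidf_train = tfidf_vectorizer.fit_transform(x_train)
tfidf_test = tfidf_vectorizer.transform(x_test)

pac = PassiveAggressiveClassifier(max_iter=50, random_state=0)
pac.fit(tfidf_train, y_train)

# Predict one label per test document and compare with the true labels.
y_pred = pac.predict(tfidf_test)
score = accuracy_score(y_test, y_pred)
print(y_pred, f"accuracy={score:.2f}")
```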
Now we compute the confusion matrix, a commonly used tool in machine learning for evaluating the performance of a classification model. It provides a detailed breakdown of the model's predictions and helps you understand how well the model is performing. The confusion matrix is particularly useful because it goes beyond simple accuracy and provides insights into the kinds of errors the model makes.
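Here is a small sketch of how a confusion matrix is computed with scikit-learn. The six true and predicted labels are made up, and the label names FAKE and REAL are assumptions:

```python
from sklearn.metrics import confusion_matrix

# Hypothetical true and predicted labels.
y_test = ["FAKE", "FAKE", "FAKE", "REAL", "REAL", "REAL"]
y_pred = ["FAKE", "FAKE", "REAL", "REAL", "REAL", "FAKE"]

# Rows are the true labels and columns are the predicted labels,
# both in the order given by `labels`.
cm = confusion_matrix(y_test, y_pred, labels=["FAKE", "REAL"])
print(cm)
```

The diagonal holds the correct predictions, and the off-diagonal cells hold the two kinds of mistakes.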
Here you can see that there are 588 true positives, 50 false positives, 43 false negatives and 586 true negatives.
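From these four counts you can derive the usual summary metrics. The sketch below takes the counts exactly as reported; which class is treated as "positive" is not stated, so precision and recall refer to whichever class that is:

```python
# Counts as reported above.
tp, fp, fn, tn = 588, 50, 43, 586

total = tp + fp + fn + tn
accuracy = (tp + tn) / total   # share of all predictions that are correct
precision = tp / (tp + fp)     # share of predicted positives that are correct
recall = tp / (tp + fn)        # share of actual positives that are found

print(f"total={total}")
print(f"accuracy={accuracy:.4f} precision={precision:.4f} recall={recall:.4f}")
```

With 1,267 test samples in total, the accuracy works out to roughly 92.7%.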
Thanks for Reading,
Mohammed Muqafamuddin.