SENTIMENT-ANALYSIS-OF-AMAZON-FOOD-REVIEWS

Introduction

The Amazon Food Reviews dataset consists of 568,454 food reviews in a single CSV file, Reviews.csv. Two models are trained on it, Naive Bayes and Logistic Regression with L1 regularization, and the model with the higher accuracy is preferred.

Data Set
Click here to get the dataset.

Reviews.csv - 251MB

Dataset statistics

    Number of reviews     568,454
    Number of users     256,059
    Number of products     74,258
    Users with > 50 reviews     260
    Median no. of words per review     56
    Timespan     Oct 1999 - Oct 2012

Data Fields Explanation

    Id - unique row number
    ProductId - unique identifier for the product
    UserId - unique identifier for the user
    ProfileName
    HelpfulnessNumerator - number of users who found the review helpful
    HelpfulnessDenominator - number of users who indicated whether they found the review helpful
    Score - rating between 1 and 5
    Time - timestamp for the review
    Summary - brief summary of the review
    Text - text of the review

EDA Objective

Analyse the data and plot the graphs required to show that the following conclusions hold:

a. Positive reviews are very common.
b. Positive reviews are shorter.
c. Longer reviews are more helpful.
d. Despite being more common and shorter, positive reviews are found more helpful.
e. Frequent reviewers are more discerning in their ratings, write longer reviews, and write more helpful reviews.
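
The quantities behind conclusions a to d can be derived from the columns listed under Data Fields Explanation. The snippet below is a minimal sketch, not the notebook's actual code. It assumes that a review counts as positive when Score > 3 and that helpfulness is HelpfulnessNumerator / HelpfulnessDenominator, neither of which is stated in this README. The six rows are made up for illustration; on the real data the frame would come from `pd.read_csv("Reviews.csv")`.

```python
import pandas as pd

# Made-up sample using the column names of Reviews.csv (illustrative only).
df = pd.DataFrame({
    "UserId": ["u1", "u1", "u2", "u3", "u3", "u4"],
    "Score": [5, 4, 1, 5, 2, 5],
    "HelpfulnessNumerator": [3, 0, 4, 1, 2, 0],
    "HelpfulnessDenominator": [4, 0, 5, 1, 6, 0],
    "Text": [
        "Great taste would buy again",
        "Good value",
        "Arrived stale and the bag was torn open so I returned it",
        "Delicious",
        "Too salty for me and far too expensive",
        "Love it",
    ],
})

# Assumption: scores of 4 and 5 are treated as positive.
df["positive"] = df["Score"] > 3

# Review length in whitespace-separated words.
df["n_words"] = df["Text"].str.split().str.len()

# Helpfulness ratio, left as NaN when nobody voted on the review.
voted = df["HelpfulnessDenominator"] > 0
df["helpful_ratio"] = (
    df["HelpfulnessNumerator"] / df["HelpfulnessDenominator"]
).where(voted)

# One row per sentiment class: share of reviews, typical length, mean helpfulness.
summary = df.groupby("positive").agg(
    share=("Score", "size"),
    median_words=("n_words", "median"),
    mean_helpful=("helpful_ratio", "mean"),
)
summary["share"] = summary["share"] / len(df)
print(summary)
```

For conclusion e, the same aggregation can be grouped by a per-user review count (for example `df.groupby("UserId")["Score"].transform("size")`) instead of by `positive`. The cut-off for a "frequent" reviewer is a choice this README does not specify.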

Model Building

    STEP-1: Load the data into a Pandas DataFrame and drop unwanted columns.
    STEP-2: Text Preprocessing.
            a. Converting to lower-case.
            b. Removing HTML Tags.
            c. Removing Special Characters.
            d. Removing Stop Words.
            e. Stemming (Snowball Stemming)
    STEP-3: Vectorizing our Data Set
    STEP-4: Building and evaluating the model
            a. Naive Bayes
            b. Logistic Regression - with L1 regularization
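
The text preprocessing in STEP-2 could look roughly like the sketch below. It is an illustration under assumptions, not the repository's code. The helper name `preprocess`, the regular expressions, and the tiny `STOP_WORDS` set are hypothetical; a real run would more likely use a full stop-word list. Stemming uses NLTK's `SnowballStemmer` when NLTK is installed and is skipped otherwise.

```python
import re

try:
    # Snowball stemmer from NLTK; needs no extra corpus download.
    from nltk.stem.snowball import SnowballStemmer
    _stem = SnowballStemmer("english").stem
except ImportError:
    # Fallback so the sketch still runs without NLTK: no stemming at all.
    def _stem(word):
        return word

# Hypothetical, deliberately small stop-word set (illustrative only).
STOP_WORDS = {"a", "an", "the", "is", "it", "this", "and", "i", "was", "to", "of"}

TAG_RE = re.compile(r"<[^>]+>")          # HTML tags such as <br />
NON_ALPHA_RE = re.compile(r"[^a-z\s]")   # anything that is not a lowercase letter or whitespace


def preprocess(text):
    """Apply steps a to e of STEP-2 to one review and return the cleaned string."""
    text = text.lower()                      # a. lower-case
    text = TAG_RE.sub(" ", text)             # b. remove HTML tags
    text = NON_ALPHA_RE.sub(" ", text)       # c. remove special characters
    tokens = [t for t in text.split() if t not in STOP_WORDS]  # d. remove stop words
    return " ".join(_stem(t) for t in tokens)                  # e. stemming


cleaned = preprocess("This is <br />the BEST coffee!!")
print(cleaned)
```

With a DataFrame, the function would be applied per row, for example `df["Text"].apply(preprocess)`. Negations such as "not" are left out of the stop-word set here on purpose, since removing them can flip the sentiment of a review.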

Output

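The accuracy of the Naive Bayes model is shown below.

[Figure: Naive Bayes accuracy output]

The accuracy of the Logistic Regression model with L1 regularization is shown below.

[Figure: Logistic Regression (L1) accuracy output]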
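
To reproduce this kind of comparison, STEP-3 and STEP-4 can be sketched with scikit-learn as follows. This is a hedged example on a made-up twelve-review corpus, so the numbers it prints are not the project's results. The choice of `CountVectorizer` (bag of words), the 75/25 split, and `C=1.0` are assumptions; the README does not say which vectorizer or hyperparameters were used.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB

# Made-up, already preprocessed reviews (1 = positive, 0 = negative).
texts = [
    "great taste love", "best coffee ever", "delicious fresh snack",
    "good value tasty", "love flavor great", "tasty fresh good",
    "stale bad taste", "terrible waste money", "awful smell bad",
    "bad flavor stale", "horrible never again", "waste terrible awful",
]
labels = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0]

# Hold out a stratified test set so both classes appear in it.
X_train_txt, X_test_txt, y_train, y_test = train_test_split(
    texts, labels, test_size=0.25, stratify=labels, random_state=0
)

# STEP-3: fit the vocabulary on the training reviews only, then reuse it.
vectorizer = CountVectorizer()
X_train = vectorizer.fit_transform(X_train_txt)
X_test = vectorizer.transform(X_test_txt)

# STEP-4: the two models named in this README.
models = {
    "Naive Bayes": MultinomialNB(),
    # liblinear is one of the solvers that supports the L1 penalty.
    "Logistic Regression (L1)": LogisticRegression(
        penalty="l1", solver="liblinear", C=1.0
    ),
}

accuracies = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    accuracies[name] = accuracy_score(y_test, model.predict(X_test))

# Prefer the model with the higher test accuracy.
best = max(accuracies, key=accuracies.get)
print(accuracies)
print("Preferred model:", best)
```

Because positive reviews dominate the real dataset, accuracy alone can look high for a model that mostly predicts "positive". Checking a confusion matrix alongside it is a reasonable extra step.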
