- A machine learning model that classifies email messages as spam or ham (legitimate mail). The dataset consisted of email messages and their labels (0 for ham, 1 for spam). I used a TF-IDF vectorizer, which gives more weight to words that are distinctive for a message than to words that are common everywhere, and this helps the model's accuracy.
- A model that detects the language of a given text. The dataset covered 17 different languages. The text was vectorized with a TF-IDF vectorizer and then used to fit a Logistic Regression model for classification.
- A program that recommends movies based on the similarity between their tags. Each movie's tags are built by combining its cast, genre, and synopsis data. Given a selected movie, the program finds the 5 most similar movies in the database using the k-nearest neighbours technique. The dataset used in this project was downloaded from Kaggle.com.
- A deep learning model that classifies a dog's breed by just “looking” at its image. The dataset consisted of labelled pictures of different dog breeds. I used Fast.ai's vision learner to train the model and then deployed it on the web using Gradio.
- A machine learning model that predicts diabetes from health data. I preprocessed the data, applied feature scaling, and trained a Support Vector Classifier (SVC). I also tried logistic regression for comparison, but the SVC performed better. The model’s accuracy was evaluated on both the training and test datasets.
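
To illustrate the idea behind the tag-based movie recommender and the TF-IDF weighting used in the text projects, here is a minimal from-scratch sketch in plain Python. It is not the code from these projects: the function names (`tfidf_vectors`, `cosine`, `most_similar`), the smoothed IDF formula, the use of cosine similarity as the nearest-neighbour metric, and the toy titles and tags are all assumptions made for illustration.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one {term: weight} dict per document (smoothed TF-IDF)."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: the number of documents each term appears in.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            # Term frequency times a smoothed inverse document frequency.
            term: (count / len(tokens))
            * (math.log((1 + n_docs) / (1 + df[term])) + 1)
            for term, count in counts.items()
        })
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def most_similar(titles, tags, query_title, k=5):
    """Return the k titles whose tags are closest to the query's tags."""
    vectors = tfidf_vectors(tags)
    q = titles.index(query_title)
    scored = [
        (cosine(vectors[q], vectors[i]), titles[i])
        for i in range(len(titles))
        if i != q  # never recommend the selected movie itself
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [title for _, title in scored[:k]]


# Hypothetical toy data; the real project builds tags from cast, genre, and synopsis.
titles = ["Space Quest", "Star Voyage", "Love Letters", "Galaxy War"]
tags = [
    "space adventure alien crew",
    "space alien voyage crew",
    "romance letters paris",
    "space war alien battle",
]

print(most_similar(titles, tags, "Space Quest", k=2))
```

A brute-force scan like this is fine for a small catalogue; for a dataset the size of a typical Kaggle movie dump, a library implementation of TF-IDF and nearest-neighbour search would likely be the more practical choice.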