GitHub - Stefanstud/Recommender-Systems

Overview

This repository contains the following components:

Root

model.pt and submission.csv, located in the project root, hold the best model's weights and its predictions, respectively.

Data

The data/ folder includes the datasets and embeddings used in the project:

train.csv: Training dataset provided.
test.csv: Test dataset provided.
books.csv: Original books dataset.
books_fixed.csv: Fixed version of the books dataset, addressing issues announced on the Ed forum.
extended_books_google.csv: Books dataset augmented with metadata from the Google Books API.
book_text_embeddings_bge.pt: Textual metadata embeddings generated using the BGE embedding model.
category_embeddings.npy: Category embeddings created using GloVe word embeddings.
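
The list above says only that category embeddings were created from GloVe word embeddings, not how the word vectors were combined. One common choice is mean pooling over the words of a category label. The sketch below illustrates that idea under this assumption; `embed_category` and the 3-dimensional `toy_vectors` table are hypothetical stand-ins, not code or data from this repository.

```python
def embed_category(category, word_vectors, dim):
    """Mean-pool the vectors of the known words in a category label.

    Assumption: simple averaging; the repository may combine vectors differently.
    """
    tokens = category.lower().replace("&", " ").split()
    known = [word_vectors[t] for t in tokens if t in word_vectors]
    if not known:
        # No word of the label is in the vocabulary: fall back to a zero vector.
        return [0.0] * dim
    return [sum(vec[i] for vec in known) / len(known) for i in range(dim)]


# Toy stand-in for a GloVe lookup table (real GloVe vectors have 50-300 dimensions).
toy_vectors = {
    "science": [1.0, 0.0, 0.0],
    "fiction": [0.0, 1.0, 0.0],
}

print(embed_category("Science Fiction", toy_vectors, dim=3))
```

In practice the lookup table would be loaded from a GloVe file, and the resulting vectors stacked into one array before saving.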

Notebooks

The notebooks/ folder contains notebooks for data preprocessing and model experimentation:

fetch_book_metadata.ipynb: Extracts metadata using the Google Books API (alternative options for Open Book Library API included as comments).
build_embeddings.ipynb: Generates book and category embeddings using BGE and GloVe models.
fix_isbn.ipynb: Cleans and fixes the dataset, producing books_fixed.csv.
item_collaborative.ipynb: Implements item-based collaborative filtering.
user_collaborative.ipynb: Implements user-based collaborative filtering.
matrix_factorization_best_approach.ipynb: Implements the final matrix factorization model with enhancements. This is the best model.
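
For readers unfamiliar with matrix factorization, the following is a minimal sketch of the basic technique: each rating is approximated as a global mean plus user and item biases plus the dot product of two latent vectors, trained with stochastic gradient descent. It is not the notebook's model; the enhancements mentioned above are not described in this README, and the function name, hyperparameters, and toy ratings here are all assumptions for illustration.

```python
import random


def train_mf(ratings, n_users, n_items, k=2, lr=0.05, reg=0.02, epochs=500, seed=0):
    """Fit a biased matrix factorization model with SGD.

    ratings: list of (user_index, item_index, rating) triples.
    Returns a predict(user, item) function.
    """
    rng = random.Random(seed)
    mu = sum(r for _, _, r in ratings) / len(ratings)  # global mean rating
    P = [[rng.gauss(0, 0.1) for _ in range(k)] for _ in range(n_users)]  # user factors
    Q = [[rng.gauss(0, 0.1) for _ in range(k)] for _ in range(n_items)]  # item factors
    bu = [0.0] * n_users  # user biases
    bi = [0.0] * n_items  # item biases

    def predict(u, i):
        return mu + bu[u] + bi[i] + sum(p * q for p, q in zip(P[u], Q[i]))

    data = list(ratings)
    for _ in range(epochs):
        rng.shuffle(data)
        for u, i, r in data:
            err = r - predict(u, i)
            # Gradient steps with L2 regularization on biases and factors.
            bu[u] += lr * (err - reg * bu[u])
            bi[i] += lr * (err - reg * bi[i])
            for f in range(k):
                pu, qi = P[u][f], Q[i][f]
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)
    return predict


# Hypothetical toy data: (user, item, rating).
toy_ratings = [(0, 0, 5), (0, 1, 3), (1, 0, 4), (1, 2, 1), (2, 1, 2), (2, 2, 5)]
predict = train_mf(toy_ratings, n_users=3, n_items=3)
rmse = (sum((r - predict(u, i)) ** 2 for u, i, r in toy_ratings) / len(toy_ratings)) ** 0.5
print(f"training RMSE: {rmse:.3f}")
```

A real run would evaluate on held-out ratings rather than the training set, since a low training error on so few points says nothing about generalization.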

Source Code

The src/ folder contains modular implementations for various methods:

collaborative/: Scripts for user-based and item-based collaborative filtering.
content_based/: Code for content-based filtering using metadata and embeddings.
gnn/: Experiments with graph neural networks.
other/: Additional exploratory scripts.

Note: While these methods were part of our experimentation, they did not contribute to the final model used for predictions. They are included to provide insight into the approaches we explored during the project.
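
As a rough illustration of the item-based collaborative filtering approach explored here, the sketch below predicts a rating as a similarity-weighted average of the user's ratings on other items, using cosine similarity between item rating vectors. The function names and toy ratings are hypothetical, and the scripts in src/collaborative/ may differ in similarity measure, neighborhood size, and normalization.

```python
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two {user: rating} dicts.

    The dot product runs over users who rated both items;
    each norm runs over all ratings of that item.
    """
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[u] * b[u] for u in shared)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b)


def predict_rating(user, item, item_ratings):
    """Similarity-weighted average of the user's ratings on other items."""
    num = den = 0.0
    for other, ratings in item_ratings.items():
        if other == item or user not in ratings:
            continue
        sim = cosine(item_ratings[item], ratings)
        if sim > 0:
            num += sim * ratings[user]
            den += sim
    # None signals that no similar item rated by this user was found.
    return num / den if den else None


# Hypothetical toy data: item -> {user: rating}.
toy_item_ratings = {
    "A": {"u1": 5, "u2": 3},
    "B": {"u1": 4, "u2": 2, "u3": 5},
    "C": {"u2": 5, "u3": 1},
}

print(predict_rating("u3", "A", toy_item_ratings))
```

User-based filtering follows the same pattern with the roles of users and items swapped.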

Getting Started

Ensure you are using Python 3.10, as some dependencies may not be compatible with newer versions of Python. You can set up a Conda environment and install the required packages as follows:

# Create and activate a Conda environment
conda create -n recommender_env python=3.10 -y
conda activate recommender_env

# Install dependencies
pip install -r requirements.txt
