Stefanstud/Recommender-Systems

Overview

This repository contains the following components:

Root

  • model.pt and submission.csv, in the project root, contain the best model's weights and its predictions, respectively.

Data

The data/ folder includes the datasets and embeddings used in the project:

  • train.csv: Training dataset provided.
  • test.csv: Test dataset provided.
  • books.csv: Original books dataset.
  • books_fixed.csv: Fixed version of the books dataset, addressing issues announced on the Ed forum.
  • extended_books_google.csv: Books dataset augmented with metadata from the Google Books API.
  • book_text_embeddings_bge.pt: Textual metadata embeddings generated using the BGE embedding model.
  • category_embeddings.npy: Category embeddings created using GloVe word embeddings.
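The column layout of the CSV files is not documented here, so a quick way to get oriented is to look at the header and the first few rows. The following is a minimal sketch using only the standard library; the helper name `peek_csv` is hypothetical, and the path in the usage comment assumes the script runs from the project root.

```python
import csv
from itertools import islice


def peek_csv(path, n_rows=5):
    """Return (header, first n_rows rows) of a CSV file without reading it all."""
    # errors="replace" keeps the preview working even if the file has odd bytes.
    with open(path, newline="", encoding="utf-8", errors="replace") as handle:
        reader = csv.reader(handle)
        header = next(reader, [])           # empty list if the file is empty
        rows = list(islice(reader, n_rows))
    return header, rows


# Example usage (assumed path, adjust to your working directory):
#   header, rows = peek_csv("data/train.csv")
#   print(header)
```

The same helper can be pointed at books.csv and books_fixed.csv to compare the two versions side by side.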

Notebooks

The notebooks/ folder contains notebooks for data preprocessing and model experimentation:

  1. fetch_book_metadata.ipynb: Fetches book metadata from the Google Books API (alternative code for the Open Library API is included as comments).
  2. build_embeddings.ipynb: Generates book and category embeddings using BGE and GloVe models.
  3. fix_isbn.ipynb: Cleans and fixes the dataset, producing books_fixed.csv.
  4. item_collaborative.ipynb: Implements item-based collaborative filtering.
  5. user_collaborative.ipynb: Implements user-based collaborative filtering.
  6. matrix_factorization_best_approach.ipynb: Implements the final matrix factorization model with enhancements. This is the best model.
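For readers unfamiliar with matrix factorization, the textbook formulation predicts a rating as a global mean plus user and item biases plus the dot product of two latent vectors. The sketch below trains such a model with plain SGD in pure Python. It is an illustration only, not the notebook's model: the enhancements used there are not described here, and all names and hyperparameters (`train_mf`, `n_factors`, `lr`, `reg`) are assumptions.

```python
import random


def train_mf(ratings, n_users, n_items, n_factors=8, lr=0.01, reg=0.05,
             n_epochs=50, seed=0):
    """Fit a biased matrix factorization model with SGD.

    ratings: list of (user_index, item_index, rating) triples.
    Returns (mu, user_bias, item_bias, P, Q).
    """
    rng = random.Random(seed)
    mu = sum(r for _, _, r in ratings) / len(ratings)   # global mean rating
    bu = [0.0] * n_users
    bi = [0.0] * n_items
    # Small random latent factors for users (P) and items (Q).
    P = [[rng.gauss(0, 0.1) for _ in range(n_factors)] for _ in range(n_users)]
    Q = [[rng.gauss(0, 0.1) for _ in range(n_factors)] for _ in range(n_items)]

    data = list(ratings)
    for _ in range(n_epochs):
        rng.shuffle(data)
        for u, i, r in data:
            pred = mu + bu[u] + bi[i] + sum(p * q for p, q in zip(P[u], Q[i]))
            err = r - pred
            # Gradient steps with L2 regularization on biases and factors.
            bu[u] += lr * (err - reg * bu[u])
            bi[i] += lr * (err - reg * bi[i])
            for f in range(n_factors):
                pu, qi = P[u][f], Q[i][f]
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)
    return mu, bu, bi, P, Q


def predict(model, u, i):
    """Predicted rating of user u for item i."""
    mu, bu, bi, P, Q = model
    return mu + bu[u] + bi[i] + sum(p * q for p, q in zip(P[u], Q[i]))


def rmse(model, ratings):
    """Root mean squared error of the model on the given triples."""
    se = sum((r - predict(model, u, i)) ** 2 for u, i, r in ratings)
    return (se / len(ratings)) ** 0.5
```

In practice the same model is usually written with a tensor library so that training can run in batches on a GPU; the pure-Python version is only meant to make the update rule explicit.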

Source Code

The src/ folder contains modular implementations for various methods:

  • collaborative/: Scripts for user-based and item-based collaborative filtering.
  • content_based/: Code for content-based filtering using metadata and embeddings.
  • gnn/: Experiments with graph neural networks.
  • other/: Additional exploratory scripts.

Note: While these methods were part of our experimentation, they did not contribute to the final model used for predictions. They are included to provide insight into the approaches we explored during the project.
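As background on one of the explored approaches, item-based collaborative filtering scores an unseen item for a user by averaging that user's ratings on similar items, weighted by item-to-item similarity. The sketch below uses cosine similarity in pure Python. It is a generic illustration under assumed names (`item_similarities`, `predict_item_based`, neighbourhood size `k`), not the code in collaborative/.

```python
from collections import defaultdict
from math import sqrt


def item_similarities(ratings):
    """Cosine similarity between items, from (user, item, rating) triples."""
    by_item = defaultdict(dict)             # item -> {user: rating}
    for user, item, rating in ratings:
        by_item[item][user] = rating

    norms = {i: sqrt(sum(r * r for r in users.values()))
             for i, users in by_item.items()}
    sims = defaultdict(dict)
    items = list(by_item)
    for a_idx, a in enumerate(items):
        for b in items[a_idx + 1:]:
            common = by_item[a].keys() & by_item[b].keys()
            if not common or norms[a] == 0 or norms[b] == 0:
                continue                    # no shared raters, skip the pair
            dot = sum(by_item[a][u] * by_item[b][u] for u in common)
            sims[a][b] = sims[b][a] = dot / (norms[a] * norms[b])
    return sims


def predict_item_based(ratings, sims, user, item, k=10):
    """Similarity-weighted average of the user's ratings on the k nearest items."""
    neighbours = sims.get(item, {})
    rated = [(neighbours[j], r) for u, j, r in ratings
             if u == user and j in neighbours]
    top = sorted(rated, reverse=True)[:k]
    weight = sum(abs(s) for s, _ in top)
    if weight == 0:
        return None                         # no overlapping neighbours
    return sum(s * r for s, r in top) / weight
```

User-based filtering follows the same pattern with the roles of users and items swapped.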


Getting Started

Ensure you are using Python 3.10, as some dependencies may not be compatible with newer versions of Python. You can set up a Conda environment and install the required packages as follows:

# Create and activate a Conda environment
conda create -n recommender_env python=3.10 -y
conda activate recommender_env

# Install dependencies
pip install -r requirements.txt
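Once the environment is active, it can be worth confirming that the interpreter really is Python 3.10 before running the notebooks. A small sketch of such a check follows; the function name `is_supported_python` is hypothetical and not part of the repository.

```python
import sys


def is_supported_python(version_info=sys.version_info, required=(3, 10)):
    """Return True when the interpreter's major.minor version matches `required`."""
    return tuple(version_info[:2]) == required


if not is_supported_python():
    # Only warns; it does not stop execution.
    print(f"Warning: expected Python 3.10, found {sys.version.split()[0]}")
```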
