PuchatekwSzortach/combination_of_multiple_global_descriptors_for_image_retrieval

Combination of Multiple Global Descriptors for Image Retrieval

This repository contains a TensorFlow 2.2 implementation of an image ranking model based on the following research papers:

Two models are provided:

Two types of losses are provided:

Implementation details

There are some differences between the official CGD architecture and our implementation. The authors of the CGD paper modify the backbone ResNet-50 network so that there is no downsampling between stage 3 and stage 4, resulting in higher-resolution outputs from the base network. We don't modify the base network in any way.

We include a script for training and evaluation on Stanford University's Cars 196 Dataset. While the CGD paper crops the exact car locations from raw images and then resizes the results to a fixed size, we instead first pad raw images to squares and then resize them to a fixed size.
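The pad-to-square step can be illustrated with a minimal pure-Python sketch. This is not the repository's code: the function names, the centered placement, and the zero fill value are assumptions made for illustration, and the subsequent resize is left to whatever image library you use.

```python
def square_padding(height, width):
    """Return (top, bottom, left, right) padding that makes an image square.

    Assumption: the image is centered; an odd leftover pixel goes to the
    bottom or right edge.
    """
    side = max(height, width)
    pad_vertical = side - height
    pad_horizontal = side - width
    top = pad_vertical // 2
    left = pad_horizontal // 2
    return top, pad_vertical - top, left, pad_horizontal - left


def pad_to_square(image, fill=0):
    """Pad a 2D image, given as a list of rows, to a square with `fill` values."""
    height, width = len(image), len(image[0])
    top, bottom, left, right = square_padding(height, width)
    side = max(height, width)
    # Pad each existing row on the left and right first
    padded_rows = [[fill] * left + list(row) + [fill] * right for row in image]
    # Then add full-width rows above and below
    top_rows = [[fill] * side for _ in range(top)]
    bottom_rows = [[fill] * side for _ in range(bottom)]
    return top_rows + padded_rows + bottom_rows
```

Padding before resizing preserves the aspect ratio of the car, at the cost of spending some of the fixed-size input on empty borders.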

Results

Results are based on Stanford University's Cars 196 Dataset.

Somewhat differently from the results reported in Combination of Multiple Global Descriptors for Image Retrieval, the best results were obtained using a model with a SPoC (sum-pooled convolutional features) head only. Adding MAC and GeM heads brings accuracy down by ~5%.
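For readers unfamiliar with the three descriptor types, the sketch below shows how they relate: each one pools every channel of the final feature map to a single number, and SPoC and MAC are the two extremes of generalized-mean (GeM) pooling. It is a plain-Python illustration, not the repository's TensorFlow heads; the function names and the representation of a feature map as a list of per-channel activation lists are assumptions, and the dimensionality reduction and concatenation used in CGD are omitted.

```python
import math


def gem(channel, p):
    """Generalized-mean pooling of one channel (activations assumed non-negative)."""
    return (sum(x ** p for x in channel) / len(channel)) ** (1.0 / p)


def spoc(channel):
    """SPoC: average pooling, equivalent to GeM with p = 1."""
    return sum(channel) / len(channel)


def mac(channel):
    """MAC: max pooling, the limit of GeM as p grows to infinity."""
    return max(channel)


def l2_normalize(vector):
    """Scale a vector to unit Euclidean length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector] if norm > 0 else list(vector)


def global_descriptor(feature_map, pool):
    """Pool every channel to a scalar with `pool`, then L2-normalize the result."""
    return l2_normalize([pool(channel) for channel in feature_map])


# Toy feature map: 2 channels, 4 spatial positions each
feature_map = [[0.0, 1.0, 2.0, 3.0], [4.0, 4.0, 4.0, 4.0]]
spoc_descriptor = global_descriptor(feature_map, spoc)
mac_descriptor = global_descriptor(feature_map, mac)
gem_descriptor = global_descriptor(feature_map, lambda channel: gem(channel, 3))
```

For any channel of non-negative activations, the GeM value with p > 1 lies between the SPoC and MAC values, which is why GeM is often described as a tunable compromise between the two.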

k Recall at k
1 0.717
2 0.814
4 0.880
8 0.931
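As a reminder of what the metric measures, here is a minimal sketch of Recall at k under its usual image-retrieval definition: the fraction of queries for which at least one of the top k ranked images shares the query's category. It is assumed, not confirmed, that the repository's analysis code uses this definition; the function name and argument layout are hypothetical, and each ranking is assumed to exclude the query image itself.

```python
def recall_at_k(query_labels, ranked_labels, k):
    """Fraction of queries with at least one same-label image among their top k results.

    query_labels:  one category label per query
    ranked_labels: for each query, labels of retrieved images, best match first
    """
    hits = sum(
        1
        for label, ranking in zip(query_labels, ranked_labels)
        if label in ranking[:k]
    )
    return hits / len(query_labels)


# Toy example with three queries
query_labels = ["a", "b", "c"]
ranked_labels = [["a", "b"], ["c", "b"], ["a", "b"]]
recall_1 = recall_at_k(query_labels, ranked_labels, 1)
recall_2 = recall_at_k(query_labels, ranked_labels, 2)
```

Because a query counts as a hit as soon as one correct image appears, Recall at k can only stay the same or increase as k grows, which matches the trend in the table.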

The image below shows representative ranking performance. Each row starts with a query image, marked with a blue dot, followed by the top 8 ranked images for that query. Images with the same category as the query image are marked with a green dot. The validation set contains about 8,000 images, with, on average, about 80 images per category.

[Image: ranking results]

How to run

This project can be run in a docker container. Building the docker container is a two-stage process:

  • building the app_base container (./docker/app_base.Dockerfile)
  • building the app container (./docker/app.Dockerfile)

You can build containers manually with docker, but there are also invoke tasks provided:

  • invoke docker.build-app-base-container - builds the base container. It is based on the tensorflow/tensorflow:2.2.0-gpu image, downloads weights for the base network, and installs Python requirements
  • invoke docker.build-app-container - builds on the app-base container created with the command above; creates the user, paths, and environment variables, and mounts the code

You can then start the container manually, or with the provided invoke docker.run command, which takes care of mounting paths for the data volume. invoke docker.run asks for a password to execute sudo chmod on the data volume path, so that the docker container has write permissions on the host system. This is necessary because the user inside the docker container isn't root. It's easy to modify the code to change this behaviour if you don't need to access any outputs on the host system.

Once inside the container, the following key invoke commands are available:

  • analysis.analyze-model-performance - Analyze model performance
  • ml.train - Train model
  • visualize.visualize-data - Visualize data
  • visualize.visualize-predictions-on-batches - Visualize image similarity ranking predictions on a few batches of data
  • visualize.visualize-predictions-on-dataset - Visualize image similarity ranking predictions on a few

Most commands accept a --config-path argument that takes the path to a configuration file. A sample configuration file is provided at ./config.yaml.

How to extend

Should you want to use this code to train and predict on data other than Cars 196, you would need to:

  • provide your own data loaders for training and analyzing - please refer to net.data.Cars196TrainingLoopDataLoader and net.data.Cars196AnalysisDataLoader for sample implementations
  • provide a yaml configuration file pointing to paths with your data - please refer to config.yaml for the expected format

Honorable mentions

In addition to the research papers listed above, the following works were consulted during the making of this project:

  • Olivier Moindrot for his post on triplet loss, which helped me troubleshoot a problem with training divergence when the distance between two embeddings was 0
  • leftthomas for his PyTorch implementation of CGD, which I consulted when implementing the descriptors
