Skip to content

Latest commit

 

History

History
18 lines (13 loc) · 790 Bytes

README.md

File metadata and controls

18 lines (13 loc) · 790 Bytes

vit_cluster

# Image clustering using pre-trained ViT

Use a Vision Transformer pre-trained through the DINO protocol to extract features from images. Reduce dimensionality and visualize using t-SNE and cluster the images with HDBSCAN. Can be run as-is or deployed using a docker image. Developed to cluster large underwater datasets captured by marine robots. cluster visualization

The images can be overlaid on their embeddings to better understand the latent space.

image overlay

Usage

  • Build the docker image docker build -t name_of_image:tag .
  • In run_docker.sh Change the file path of the mounted volume to your directory containing the images to be clustered.
  • Run the container: ./run_docker.sh