vit_cluster

# Image clustering using pre-trained ViT

Use a Vision Transformer (ViT) pre-trained with the DINO self-supervised method to extract features from images. Reduce dimensionality and visualize with t-SNE, and cluster the images with HDBSCAN. The code can be run as-is or deployed using a Docker image. It was developed to cluster large underwater datasets captured by marine robots.
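The pipeline described above can be sketched roughly as follows. This is a minimal illustration, not the contents of `main.py`: the function names (`list_images`, `extract_features`, `embed_and_cluster`) are hypothetical, the weights are assumed to be loaded through `torch.hub` rather than from the bundled `.pth` file, and HDBSCAN is assumed to run on the 2-D t-SNE coordinates (it could equally be run on the raw features). It assumes `torch`, `torchvision`, `Pillow` and scikit-learn 1.3 or later are installed.

```python
from pathlib import Path

IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def list_images(folder):
    """Return sorted paths of image files directly inside `folder`."""
    return sorted(
        p for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_EXTENSIONS
    )


def extract_features(paths, batch_size=32):
    """Embed images with a DINO ViT-S/16 backbone; returns an (N, 384) array."""
    # Heavy dependencies are imported here so the helper above stays lightweight.
    import numpy as np
    import torch
    from PIL import Image
    from torchvision import transforms

    # Assumption: weights are fetched via torch.hub (needs network access).
    model = torch.hub.load("facebookresearch/dino:main", "dino_vits16")
    model.eval()
    preprocess = transforms.Compose([
        transforms.Resize(256),
        transforms.CenterCrop(224),
        transforms.ToTensor(),
        # Standard ImageNet mean and standard deviation.
        transforms.Normalize((0.485, 0.456, 0.406), (0.229, 0.224, 0.225)),
    ])
    chunks = []
    with torch.no_grad():
        for start in range(0, len(paths), batch_size):
            batch = torch.stack([
                preprocess(Image.open(p).convert("RGB"))
                for p in paths[start:start + batch_size]
            ])
            chunks.append(model(batch).cpu().numpy())
    return np.concatenate(chunks)


def embed_and_cluster(features, min_cluster_size=10):
    """Project features to 2-D with t-SNE, then cluster the 2-D points."""
    from sklearn.cluster import HDBSCAN  # available in scikit-learn >= 1.3
    from sklearn.manifold import TSNE

    # t-SNE requires perplexity to be smaller than the number of samples.
    perplexity = min(30, len(features) - 1)
    coords = TSNE(
        n_components=2, init="pca", perplexity=perplexity, random_state=0
    ).fit_transform(features)
    # Assumption: clustering runs on the t-SNE output; label -1 marks noise.
    labels = HDBSCAN(min_cluster_size=min_cluster_size).fit_predict(coords)
    return coords, labels
```

A typical call chain would be `paths = list_images("images/")`, then `coords, labels = embed_and_cluster(extract_features(paths))`, where `"images/"` is a placeholder directory.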

The images can be overlaid on their embeddings to better understand the latent space.
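One way to produce such an overlay is to paste a thumbnail of each image at its 2-D embedding position on a blank canvas. The sketch below is an assumption about how this could be done with Pillow, not the repository's plotting code; `to_canvas_positions`, `overlay_thumbnails`, and the default canvas and thumbnail sizes are hypothetical.

```python
def to_canvas_positions(coords, canvas_size=2000, thumb_size=64):
    """Map 2-D embedding coordinates to top-left pixel positions on a square canvas."""
    xs = [float(x) for x, _ in coords]
    ys = [float(y) for _, y in coords]
    x_min, y_min = min(xs), min(ys)
    # Guard against a zero range when all points share a coordinate.
    x_span = (max(xs) - x_min) or 1.0
    y_span = (max(ys) - y_min) or 1.0
    # Leave room so thumbnails at the far edges stay fully on the canvas.
    usable = canvas_size - thumb_size
    return [
        (
            round((x - x_min) / x_span * usable),
            # Flip y so larger embedding values appear higher, as in a scatter plot.
            round((1.0 - (y - y_min) / y_span) * usable),
        )
        for x, y in zip(xs, ys)
    ]


def overlay_thumbnails(paths, coords, canvas_size=2000, thumb_size=64):
    """Return a Pillow image with each thumbnail pasted at its embedding position."""
    from PIL import Image  # Pillow

    canvas = Image.new("RGB", (canvas_size, canvas_size), "white")
    positions = to_canvas_positions(coords, canvas_size, thumb_size)
    for path, position in zip(paths, positions):
        thumb = Image.open(path).convert("RGB")
        thumb.thumbnail((thumb_size, thumb_size))  # resizes in place, keeps aspect ratio
        canvas.paste(thumb, position)
    return canvas
```

The returned canvas can be written to disk with `canvas.save("overlay.png")`, where the file name is a placeholder. Later thumbnails cover earlier ones where points overlap, so dense regions show only a subset of their images.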

## Usage

1. Build the Docker image: `docker build -t name_of_image:tag .`
2. In `run_docker.sh`, change the file path of the mounted volume to the directory containing the images to be clustered.
3. Run the container: `./run_docker.sh`

surajbijjahalli/vit_cluster

Folders and files

Latest commit

History

Repository files navigation

vit_cluster

# Image clustering using pre-trained ViT

Usage

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages