Open vocabulary interactions with remote sensing images.

eceo-epfl/RS-OVSS

🌮 TACOSS: Learning transferable land cover semantics for open vocabulary interactions with remote sensing images.

Valerie Zermatten, Javiera Castillo-Navarro, Diego Marcos, Devis Tuia

Overview

This repository proposes Text As supervision for COntrastive Semantic Segmentation (TACOSS), an open vocabulary semantic segmentation model for remote sensing images. TACOSS leverages the common sense knowledge captured by language models and is capable of interpreting the image at the pixel level, attributing semantics to each pixel and removing the constraints of a fixed set of land cover labels.
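To make the idea of pixel-level open vocabulary prediction concrete, here is a minimal sketch, assuming that a visual encoder yields one embedding per pixel and a text encoder yields one embedding per class name or description, both in the same space. The function name `segment_with_text` and the toy values are hypothetical and are not taken from the TACOSS code; the sketch only illustrates the generic cosine-similarity matching that such models rely on.

```python
import numpy as np


def segment_with_text(pixel_emb: np.ndarray, text_emb: np.ndarray) -> np.ndarray:
    """Assign each pixel the index of its most similar text embedding.

    pixel_emb: (H, W, D) per-pixel embeddings from a visual encoder.
    text_emb:  (K, D) embeddings of K free-form class names or descriptions.
    Returns an (H, W) integer map of class indices.
    """
    # L2-normalise so that the dot product equals the cosine similarity.
    p = pixel_emb / np.linalg.norm(pixel_emb, axis=-1, keepdims=True)
    t = text_emb / np.linalg.norm(text_emb, axis=-1, keepdims=True)
    similarity = p @ t.T  # shape (H, W, K)
    return similarity.argmax(axis=-1)


# Toy example with hand-made 2-D embeddings (hypothetical values).
text_emb = np.array([[1.0, 0.0],    # e.g. "water"
                     [0.0, 1.0]])   # e.g. "forest"
pixel_emb = np.array([[[0.9, 0.1], [0.2, 0.8]],
                      [[0.7, 0.3], [0.1, 0.9]]])
prediction = segment_with_text(pixel_emb, text_emb)
print(prediction)
```

Because the classes are defined only by their text embeddings, changing the vocabulary at inference time amounts to passing a different `text_emb` matrix; no output layer has to be retrained.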

This project intends to not only simplify the map creation process but also bridge the gap between complex remote sensing technology and user-friendly applications, eventually making advanced mapping tools accessible to everyone.

Code usage

Interested in trying it out?

First install the necessary dependencies and download the models/data.

Code requirements (installation)

The required Python packages are listed in environment.yml, which can be used to build a conda environment:

conda env create --file environment.yml
conda activate tacoss

Alternatively, use the provided Dockerfile.

Data requirements

To try out TACOSS, download the model weights and embeddings available on Zenodo: DOI.

First, clone this repository. Then copy the model weight files into the /output folder and the label embeddings into the /data folder.

  • The FLAIR dataset is available on the IGN website FLAIR challenge.

  • The TLM aerial images can be downloaded from the swissIMAGE 10cm website

  • The TLM annotations can be downloaded as the shapefile 'Bodenbedeckung' from the swissTLM3D website.

    • The TLM dataset as used in this repository can be provided on request by contacting the authors.
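Before launching anything, it can help to check that the downloaded files ended up where the code expects them. The following is a minimal sketch, assuming that /output and /data refer to folders at the root of the cloned repository; the helper `missing_inputs` and the file name `weights.ckpt` are hypothetical and not part of the repository.

```python
import tempfile
from pathlib import Path


def missing_inputs(repo_root, folders=("output", "data")):
    """Return the folders under repo_root that are absent or empty."""
    root = Path(repo_root)
    missing = []
    for name in folders:
        folder = root / name
        if not folder.is_dir() or not any(folder.iterdir()):
            missing.append(name)
    return missing


# Demo on a throwaway directory that mimics a partially prepared clone.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "output").mkdir()
    (Path(tmp) / "output" / "weights.ckpt").touch()  # hypothetical file name
    (Path(tmp) / "data").mkdir()                     # left empty on purpose
    demo_result = missing_inputs(tmp)

print("Missing or empty folders:", demo_result)
```

On a real clone, calling `missing_inputs(".")` from the repository root should return an empty list once the weights and embeddings are in place.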

Model training

Several configuration files are provided in the config folder. To launch an experiment from an existing configuration file, use the following command:

python main.py --cfg <config_name>

# Train the SegFormer baseline model:
python main.py --cfg segformer-base

# Train the DeepLabv3+ baseline model:
python main.py --cfg dlv-base

# Train TACOSS with the SegFormer visual backbone and the SentenceBERT text encoder:
python main.py --cfg segformer-bcos-sbert-des-eda

# Train TACOSS with the DeepLabv3+ backbone and the CLIP text encoder:
python main.py --cfg dlv-bcos-clip-name
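To queue several of these experiments one after another, a small launcher can wrap the same command. This is a sketch only: it assumes that it is run from the repository root and that main.py accepts nothing but the --cfg flag shown above; the helper names `build_command` and `run_all` are hypothetical.

```python
import subprocess
import sys

# Configuration names taken from the commands listed above.
CONFIGS = [
    "segformer-base",
    "dlv-base",
    "segformer-bcos-sbert-des-eda",
    "dlv-bcos-clip-name",
]


def build_command(config_name):
    """Build the argument list for one training run."""
    return [sys.executable, "main.py", "--cfg", config_name]


def run_all(configs=CONFIGS, dry_run=True):
    """Run (or, with dry_run=True, only print) one training command per config."""
    commands = [build_command(name) for name in configs]
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            # check=True stops the queue at the first failing run.
            subprocess.run(cmd, check=True)
    return commands


commands = run_all(dry_run=True)
```

The dry run only prints the commands; switch to `run_all(dry_run=False)` to actually start training sequentially.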

Experiments with the CLIPSeg model require a dedicated dataset class for training and inference, because CLIPSeg is trained on a binary segmentation task with a binary cross-entropy loss. To train and evaluate CLIPSeg, use the CLIPSeg folder and the CLIPSegFinetune.py script.
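The difference from multi-class training can be illustrated with a short sketch: a class-index mask is split into one binary mask per class, and each binary mask is scored with a binary cross-entropy loss. This is an assumption-laden illustration in NumPy, not the code of CLIPSegFinetune.py; the function names `binary_targets` and `bce_with_logits` are hypothetical.

```python
import numpy as np


def binary_targets(mask, num_classes):
    """Turn an (H, W) class-index mask into (K, H, W) binary masks, one per class."""
    classes = np.arange(num_classes).reshape(-1, 1, 1)
    return (mask[None, :, :] == classes).astype(np.float32)


def bce_with_logits(logits, targets):
    """Mean binary cross-entropy from raw logits, in a numerically stable form."""
    # Stable formulation: max(x, 0) - x * y + log(1 + exp(-|x|))
    loss = np.maximum(logits, 0) - logits * targets + np.log1p(np.exp(-np.abs(logits)))
    return float(loss.mean())


mask = np.array([[0, 1],
                 [2, 1]])
targets = binary_targets(mask, num_classes=3)  # shape (3, 2, 2)
zero_logits = np.zeros_like(targets)           # sigmoid(0) = 0.5 for every pixel
loss = bce_with_logits(zero_logits, targets)
print(targets.shape, round(loss, 4))
```

With all logits at zero, every pixel is predicted with probability 0.5, so the loss equals ln 2 regardless of the targets, which makes a convenient sanity check.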

Results

Qualitative performance of TACOSS on the FLAIR dataset:

[Figure: interactions]

Qualitative performance of TACOSS on the TLM dataset (in a transfer setting):

Fig. 1: Aerial view

[Figure: AerialView]

Fig. 2: TLM labels

[Figure: Labels]

Fig. 3: TACOSS predictions

[Figure: Predictions]

More examples can be found in the associated publication [under review].

Future directions

This project proposes the development of remote sensing-specific vision-language models to facilitate interactions with remote sensing (RS) images. Our work is a proof of principle.

To become usable in practice, TACOSS requires several improvements:

  • Extend TACOSS to more geographical regions, sensors and spatial resolutions. Currently, the model is trained only on high-resolution (30 cm) images with RGB bands.
  • Improve fine-tuning of TACOSS from a few land cover labels to a larger set of labels and a more diverse description of land cover.
  • Improve open-vocabulary capabilities of TACOSS.

Contributing

If you are interested in contributing to one of the aforementioned points or working on a similar project and wish to collaborate, please reach out to ECEO.

For code-related contributions, suggestions or inquiries, please open a GitHub issue.

Code acknowledgments

We acknowledge the following code repositories that helped to build the TACOSS repository:

Thank you! Other smaller sources are mentioned in the relevant code sections.
