🌮 TACOSS: Learning transferable land cover semantics for open-vocabulary interactions with remote sensing images.
Valerie Zermatten, Javiera Castillo-Navarro, Diego Marcos, Devis Tuia
This repository proposes Text As supervision for COntrastive Semantic Segmentation (TACOSS), an open-vocabulary semantic segmentation model for remote sensing images. TACOSS leverages the common-sense knowledge captured by language models and interprets images at the pixel level, attributing semantics to each pixel and removing the constraint of a fixed set of land cover labels.
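To make the mapping from pixels to free-form labels concrete, here is a minimal sketch of open-vocabulary pixel classification with text embeddings. It illustrates the general idea only, not TACOSS's actual implementation: the pixel features below are random placeholders, and the SentenceBERT checkpoint name is just an example.

```python
# Sketch: score every pixel embedding against free-form label texts and take
# the argmax. Pixel features here are random stand-ins for a visual backbone.
import torch
import torch.nn.functional as F
from sentence_transformers import SentenceTransformer

labels = ["building", "deciduous forest", "water surface", "agricultural land"]

# 1. Embed the label texts with a (frozen) language model.
text_encoder = SentenceTransformer("all-MiniLM-L6-v2")    # example checkpoint
text_emb = torch.tensor(text_encoder.encode(labels))      # (K, D)
text_emb = F.normalize(text_emb, dim=-1)

# 2. Placeholder pixel embeddings projected into the same D-dimensional space.
H, W, D = 64, 64, text_emb.shape[-1]
pixel_emb = F.normalize(torch.randn(H, W, D), dim=-1)

# 3. Cosine similarity between every pixel and every label, then argmax:
#    swapping or extending `labels` changes the output classes, no retraining.
similarity = torch.einsum("hwd,kd->hwk", pixel_emb, text_emb)  # (H, W, K)
prediction = similarity.argmax(dim=-1)                         # (H, W)
print(labels[prediction[0, 0]])   # label assigned to the top-left pixel
```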
This project aims not only to simplify the map creation process but also to bridge the gap between complex remote sensing technology and user-friendly applications, eventually making advanced mapping tools accessible to everyone.
Interested in trying it out?
First, install the necessary dependencies and download the models and data.
The required Python packages are listed in `environment.yml`, which can be used to build a conda environment.
conda env create --file environment.yml
conda activate tacoss
Alternatively, use the provided `Dockerfile`.
To try out TACOSS, please download the model weights and label embeddings available on Zenodo.
First, clone this repository, then copy the model weight files into the `/output` folder and the label embeddings into the `/data` folder.
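After these steps, the working tree should look roughly as follows (the repository root name is illustrative; only `output` and `data` are named in the instructions above):

```
tacoss/
├── output/    # model weight files from Zenodo
├── data/      # label embeddings from Zenodo
└── config/    # experiment configuration files (see below)
```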
- The FLAIR dataset is available on the IGN website (FLAIR challenge).
- The TLM aerial images can be downloaded from the swissIMAGE 10cm website.
- The TLM annotations can be downloaded as the 'Bodenbedeckung' shapefile from the swissTLM3D website.
- The TLM dataset as used in this repository can be provided on request by contacting the authors.
Several configuration files are provided in the `config` folder.
To launch experiments based on the existing configuration files, use the following command line:
python main.py --cfg <config_name>
# Train the SegFormer baseline model:
python main.py --cfg segformer-base
# Train the DeepLabv3+ baseline model:
python main.py --cfg dlv-base
# Train TACOSS with the SegFormer visual backbone and the SentenceBERT text encoder:
python main.py --cfg segformer-bcos-sbert-des-eda
# Train TACOSS with the DeepLabv3+ backbone and the CLIP text encoder:
python main.py --cfg dlv-bcos-clip-name
Experiments with the CLIPSeg model require a specific dataset class for training and inference, since CLIPSeg is trained as a binary segmentation task with a binary cross-entropy loss. To train and evaluate CLIPSeg, use the `CLIPSeg` folder and the `CLIPSegFinetune.py` script.
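As a rough illustration of that binary formulation (a minimal sketch with assumed shapes and names, not the repository's actual training loop):

```python
# Sketch: each text prompt yields one foreground/background mask that is
# supervised with binary cross-entropy, rather than one multi-class map.
import torch
import torch.nn.functional as F

batch, h, w = 4, 64, 64
logits = torch.randn(batch, 1, h, w)                    # per-prompt mask logits
target = torch.randint(0, 2, (batch, 1, h, w)).float()  # 1 = pixel matches prompt
loss = F.binary_cross_entropy_with_logits(logits, target)
print(loss.item())
```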
Qualitative performance of TACOSS on the FLAIR dataset:
Qualitative performance of TACOSS on the TLM dataset (in a transfer setting):
Fig. 1: Aerial view · Fig. 2: TLM labels · Fig. 3: TACOSS predictions

More examples can be found in the associated publication [under review].
This project proposes the development of remote-sensing-specific vision-language models to facilitate interactions with RS images. Our work is a proof of principle; to become more broadly usable, TACOSS requires several improvements:
- Extend TACOSS to more geographical regions, sensors, and spatial resolutions. Currently, the model is trained only on high-resolution (30cm) RGB images.
- Improve fine-tuning of TACOSS from a few land cover labels to a larger label set and more diverse descriptions of land cover.
- Improve open-vocabulary capabilities of TACOSS.
If you are interested in contributing to one of the aforementioned points or working on a similar project and wish to collaborate, please reach out to ECEO.
For code-related contributions, suggestions or inquiries, please open a GitHub issue.
We acknowledge the following code repositories that helped in building TACOSS:
Thank you! Other smaller sources are mentioned in the relevant code sections.