Based on sign/translate#114.
People without previous SignWriting experience have a hard time understanding SignWriting notation.
This project aims to provide an alternative view of SignWriting, using computer-generated illustrations of the signs.
We use multiple data sources that include both SignWriting and illustrations:
- Vokabeltrainer - Swiss-German lexicon
- SignPuddle LSF-CH - Swiss-French lexicon
The illustrations depict different people, usually in grayscale. We use ChatGPT to generate a prompt describing every illustration.
All images are then created at 512x512. For example:

> An illustration of a woman with short hair, with orange arrows. The background is white and there is a watermark text '@signecriture.org'.
(Example pair: SignWriting control image and the corresponding illustration.)
The prompt should include whether this is an image or an illustration, whether it is colored or black and white, whether the subject is a man or a woman, the hair style, and the watermark (see `train/prompt.json` for values).
- `create_images.py` - Generate parallel images - we create parallel files with the same name in directories `train/A` and `train/B` to include the SignWriting (B) and illustration (A) in the same resolution (512x512) (see the first sketch after this list).
- `create_prompts.py` - Generate prompts - we use ChatGPT to generate the prompt for every illustration. All of the prompts are then stored in `train/prompt.json` (a `JSONL` file with `{source: ..., target: ..., prompt: ...}`; see the second sketch after this list). The cost is about $5 per 1000 illustrations.
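A minimal sketch of the pairing step performed by `create_images.py` (the file names, the white-canvas padding, and the resampling filter are assumptions, not the actual implementation):

```python
import os
from PIL import Image

def save_pair(name: str, illustration_path: str, signwriting_path: str, size=(512, 512)):
    """Save an illustration/SignWriting pair under the same file name in train/A and train/B."""
    for directory, path in (("train/A", illustration_path), ("train/B", signwriting_path)):
        os.makedirs(directory, exist_ok=True)
        image = Image.open(path).convert("RGB")
        image.thumbnail(size, Image.LANCZOS)       # fit into 512x512, keeping the aspect ratio
        canvas = Image.new("RGB", size, "white")   # pad on a white background (assumption)
        canvas.paste(image, ((size[0] - image.width) // 2, (size[1] - image.height) // 2))
        canvas.save(os.path.join(directory, f"{name}.png"))

# Hypothetical source paths, for illustration only.
save_pair("00001", "raw/illustrations/00001.png", "raw/signwriting/00001.png")
```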
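And a sketch of the `JSONL` records that `create_prompts.py` writes to `train/prompt.json`. Only the `{source, target, prompt}` fields come from the description above; the model name, the exact instructions, and whether the illustration image is sent to ChatGPT are assumptions:

```python
import base64
import json
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment

def describe_illustration(path: str) -> str:
    """Ask ChatGPT to describe one illustration (model and wording are assumptions)."""
    with open(path, "rb") as f:
        image_b64 = base64.b64encode(f.read()).decode()
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{
            "role": "user",
            "content": [
                {"type": "text",
                 "text": "Describe this illustration: image or illustration, colored or "
                         "black and white, man or woman, hair style, and any watermark."},
                {"type": "image_url", "image_url": {"url": f"data:image/png;base64,{image_b64}"}},
            ],
        }],
    )
    return response.choices[0].message.content

# Append one JSON object per line (hypothetical file names).
with open("train/prompt.json", "a", encoding="utf-8") as f:
    record = {
        "source": "00001.png",  # SignWriting image in train/B (assumption)
        "target": "00001.png",  # illustration in train/A (assumption)
        "prompt": describe_illustration("train/A/00001.png"),
    }
    f.write(json.dumps(record) + "\n")
```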
We train a ControlNet model to control Stable Diffusion: given the prompt and the SignWriting image, it generates the relevant illustration. This process benefits from the pretrained generative image diffusion model.
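A rough sketch of how the paired data above could be loaded for ControlNet training (a hypothetical PyTorch dataset; the field names are illustrative, the source/target mapping is an assumption, and the actual training loop is not shown):

```python
import json
from pathlib import Path

from PIL import Image
from torch.utils.data import Dataset
from torchvision import transforms

class SignWritingIllustrationDataset(Dataset):
    """Yields (SignWriting control, illustration target, prompt) triplets from train/."""

    def __init__(self, root: str = "train"):
        self.root = Path(root)
        with open(self.root / "prompt.json", encoding="utf-8") as f:
            self.records = [json.loads(line) for line in f if line.strip()]
        self.to_tensor = transforms.ToTensor()

    def __len__(self):
        return len(self.records)

    def __getitem__(self, idx):
        record = self.records[idx]
        control = Image.open(self.root / "B" / Path(record["source"]).name).convert("RGB")
        target = Image.open(self.root / "A" / Path(record["target"]).name).convert("RGB")
        return {
            "conditioning_pixel_values": self.to_tensor(control),  # SignWriting (control)
            "pixel_values": self.to_tensor(target),                # illustration (target)
            "caption": record["prompt"],
        }
```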
At inference time, we still provide the control image of the new SignWriting, but we can control the prompt. For example, we can always say "An illustration of a man with short hair." for consistency of character. This also removes any watermarks from the output, since in the training data, watermarked illustrations are prompted with the watermark.
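A minimal inference sketch using the `diffusers` library; the checkpoint paths, base model, and sampler settings are placeholders rather than the project's actual configuration:

```python
import torch
from PIL import Image
from diffusers import ControlNetModel, StableDiffusionControlNetPipeline

# Load the trained ControlNet and attach it to a Stable Diffusion base model (paths are placeholders).
controlnet = ControlNetModel.from_pretrained("path/to/signwriting-controlnet", torch_dtype=torch.float16)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnet, torch_dtype=torch.float16
).to("cuda")

# The new SignWriting image is the control; a fixed prompt keeps the character consistent.
control = Image.open("sign.png").convert("RGB").resize((512, 512))
prompt = "An illustration of a man with short hair."
illustration = pipe(prompt, image=control, num_inference_steps=30).images[0]
illustration.save("illustration.png")
```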
As diffusion models struggle to generate illustrations, we use the image-to-image pipeline with an initial white image. Unfortunately, while the model generates illustrations, they do not follow the SignWriting.
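A sketch of that image-to-image variant, assuming the conditioning is still applied through `StableDiffusionControlNetImg2ImgPipeline` (whether ControlNet is kept in this experiment, and the `strength` value, are assumptions):

```python
import torch
from PIL import Image
from diffusers import ControlNetModel, StableDiffusionControlNetImg2ImgPipeline

controlnet = ControlNetModel.from_pretrained("path/to/signwriting-controlnet", torch_dtype=torch.float16)
pipe = StableDiffusionControlNetImg2ImgPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnet, torch_dtype=torch.float16
).to("cuda")

white = Image.new("RGB", (512, 512), "white")                       # initial image: plain white
control = Image.open("sign.png").convert("RGB").resize((512, 512))  # SignWriting control image
result = pipe(
    "An illustration of a man with short hair.",
    image=white,            # image-to-image starting point
    control_image=control,  # SignWriting conditioning
    strength=0.9,           # how far to deviate from the white image (assumption)
).images[0]
result.save("illustration_img2img.png")
```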
Here is a comparison of the results: