In this repository, we build an AI model that performs semantic segmentation of anime portraits.
First, we need training data: original images paired with their corresponding mask images. We annotated 200 anime portraits (512x512) from the "Danbooru2019-Portraits Dataset".
Note: There are 7 tags in total to annotate (background, skin, face, cloth, eye, mouth, hair).
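
For training, each annotated mask is usually stored as one integer class index per pixel. The following is a minimal sketch of that conversion, not the annotation tooling used in this repository: the index order, the `PALETTE` colors, and the helper `color_mask_to_index` are all hypothetical and only illustrate the idea.

```python
import numpy as np

# The seven tags from the note above; the index order is an assumption.
CLASSES = ["background", "skin", "face", "cloth", "eye", "mouth", "hair"]

# Hypothetical annotation colors (RGB), one per tag, in the same order.
PALETTE = np.array([
    [0, 0, 0],        # background
    [255, 224, 189],  # skin
    [255, 0, 0],      # face
    [0, 255, 0],      # cloth
    [0, 0, 255],      # eye
    [255, 0, 255],    # mouth
    [255, 255, 0],    # hair
], dtype=np.uint8)


def color_mask_to_index(mask_rgb: np.ndarray) -> np.ndarray:
    """Convert an (H, W, 3) color mask into an (H, W) array of class indices."""
    index = np.zeros(mask_rgb.shape[:2], dtype=np.int64)
    for class_id, color in enumerate(PALETTE):
        # Pixels whose three channels all match this color get this class id.
        index[np.all(mask_rgb == color, axis=-1)] = class_id
    # Pixels matching no palette color keep the initial value 0 (background).
    return index


# Tiny 2x2 example: hair and skin on the top row, eye and background below.
demo = np.array([[PALETTE[6], PALETTE[1]],
                 [PALETTE[4], PALETTE[0]]], dtype=np.uint8)
demo_index = color_mask_to_index(demo)
```

An index mask in this form can be passed directly to a per-pixel classification loss, with one output channel per tag.
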
Because 200 images are not enough to train a model that predicts semantic masks, we use the following data augmentations to generate additional training samples:
- horizontal and vertical flip
- GridDistortion
- RandomBrightnessContrast
- GaussNoise
- Rotation
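
The key point when augmenting segmentation data is that geometric transforms (flips, rotation, distortion) must be applied identically to the image and its mask, while photometric transforms (brightness, contrast, noise) touch the image only. Below is a simplified NumPy sketch of that idea, not the pipeline used in this repository: it omits GridDistortion, restricts rotation to multiples of 90 degrees, and the probabilities and value ranges are assumptions. The function name `augment_pair` is hypothetical.

```python
import numpy as np


def augment_pair(image, mask, rng):
    """Return one randomly augmented (image, mask) pair.

    image: (H, W, 3) uint8 array, mask: (H, W) array of class indices.
    """
    # Geometric transforms: the same operation goes to image and mask.
    if rng.random() < 0.5:                       # horizontal flip
        image, mask = image[:, ::-1], mask[:, ::-1]
    if rng.random() < 0.5:                       # vertical flip
        image, mask = image[::-1], mask[::-1]
    k = int(rng.integers(0, 4))                  # rotate by k * 90 degrees
    image, mask = np.rot90(image, k), np.rot90(mask, k)

    # Photometric transforms: image only, the mask labels stay unchanged.
    img = image.astype(np.float32)
    contrast = rng.uniform(0.8, 1.2)             # assumed range
    brightness = rng.uniform(-25.0, 25.0)        # assumed range
    img = (img - 128.0) * contrast + 128.0 + brightness
    img = img + rng.normal(0.0, 5.0, size=img.shape)  # Gaussian noise
    image = np.clip(img, 0, 255).astype(np.uint8)

    return image, np.ascontiguousarray(mask)


# Random stand-ins for one 512x512 portrait and its 7-class mask.
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(512, 512, 3), dtype=np.uint8)
mask = rng.integers(0, 7, size=(512, 512))
aug_image, aug_mask = augment_pair(image, mask, rng)
```

Calling `augment_pair` several times per annotated portrait yields multiple distinct training pairs from each original.
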
After data augmentation, we have roughly 3000 image/mask pairs to train our semantic segmentation model.
For the model, we use MobileNetV3 as the encoder and U-Net as the decoder. This is easy to do with the segmentation_models_pytorch library (imported as `smp`):
model = smp.Unet('timm-mobilenetv3_large_100', encoder_weights='imagenet', classes=class_num, activation=None, encoder_depth=5, decoder_channels=[256, 128, 64, 32, 16]).to("cuda")
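
Because the model is created with `activation=None`, its output is raw per-class scores (logits) with one channel per tag. A minimal NumPy sketch of turning such an output into a predicted mask follows; the random `logits` array is only a stand-in for a real model output, and the helper names are hypothetical.

```python
import numpy as np

NUM_CLASSES = 7  # background, skin, face, cloth, eye, mouth, hair


def logits_to_mask(logits):
    """Turn (num_classes, H, W) raw scores into an (H, W) class-index mask."""
    return np.argmax(logits, axis=0)


def logits_to_probs(logits):
    """Numerically stable softmax over the class axis."""
    shifted = logits - logits.max(axis=0, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=0, keepdims=True)


# Stand-in for one model output: random numbers, not real predictions.
rng = np.random.default_rng(0)
logits = rng.normal(size=(NUM_CLASSES, 512, 512))
pred_mask = logits_to_mask(logits)   # predicted tag index per pixel
probs = logits_to_probs(logits)      # per-pixel class probabilities
```

The argmax gives the final segmentation, while the softmax probabilities are useful when a per-pixel confidence is needed.
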
details | |
---|---|
OS | Windows 10 |
CPU | AMD |
GPU | NVIDIA RTX 2060 6GB |
language | Python |
framework | PyTorch |
[1] Deep Learning Project — Drawing Anime Face with Simple Segmentation Mask