- 📢 News!
- My column
- Introduction
- All task training results
- Environments
- Download my pretrained models and experiments records
- Prepare datasets
- How to train or test a model
- How to use gradio demo
- Reference
- Citation
- 2025/01/18: train light segment-anything model with bf16.
https://www.zhihu.com/column/c_1692623656205897728
This repository provides simple training and testing examples for the following tasks:
task | supported datasets | supported networks |
---|---|---|
Image classification task | CIFAR100, ImageNet1K(ILSVRC2012), ImageNet21K(Winter 2021 release) | Convformer, DarkNet, ResNet, VAN, ViT |
Knowledge distillation task | ImageNet1K(ILSVRC2012) | DML loss(ResNet), KD loss(ResNet) |
Masked image modeling task | ImageNet1K(ILSVRC2012) | MAE(ViT) |
Object detection task | COCO2017, Objects365(v2,2020), VOC2007 and VOC2012 | DETR, DINO-DETR, RetinaNet, FCOS |
Semantic segmentation task | ADE20K, COCO2017 | DeepLabv3+ |
Instance segmentation task | COCO2017 | SOLOv2, YOLACT |
Salient object detection task | combined dataset | pfan-segmentation |
Human matting task | combined dataset | pfan-matting |
OCR text detection task | combined dataset | DBNet |
OCR text recognition task | combined dataset | CTC Model |
Face detection task | combined dataset | RetinaFace |
Face parsing task | FaceSynthetics, CelebAMask-HQ | pfan-face-parsing, sapiens_face_parsing |
Human parsing task | LIP, CIHP | pfan-human-parsing, sapiens_human_parsing |
Interactive segmentation task | combined dataset | SAM(segment-anything), light_sam, light_sam_matting |
Diffusion model task | CelebA-HQ, CIFAR10, CIFAR100, FFHQ | DDPM, DDIM |
Most experiments were trained on 2-8 RTX4090D GPUs with PyTorch 2.3 on Ubuntu 22.04.
See all task training results in results.md.
1. This repository only supports running on Ubuntu (version >= 22.04 LTS).
2. This repository only supports single-node training with PyTorch DDP, on either one GPU or multiple GPUs.
3. Please make sure your Python version is >= 3.9 and your PyTorch version is >= 2.0.
4. If you want to use the torch.compile() function, use PyTorch 2.0/2.2/2.3; do not use PyTorch 2.1.
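The version constraints above can be checked before starting a long training run. The following is a minimal sketch, not part of this repository: the helper names `parse_major_minor` and `check_environment` are hypothetical, and the check only looks at the major and minor version numbers.

```python
import sys


def parse_major_minor(version: str) -> tuple:
    """Turn a version string such as '2.3.0+cu121' into (2, 3)."""
    core = version.split("+")[0]  # drop a local build suffix like '+cu121'
    major, minor = core.split(".")[:2]
    return int(major), int(minor)


def check_environment(python_version: tuple, torch_version: str) -> list:
    """Return a list of problems; an empty list means the constraints hold."""
    problems = []
    if tuple(python_version[:2]) < (3, 9):
        problems.append("Python >= 3.9 is required")
    torch_mm = parse_major_minor(torch_version)
    if torch_mm < (2, 0):
        problems.append("PyTorch >= 2.0 is required")
    if torch_mm == (2, 1):
        problems.append("torch.compile() should not be used with PyTorch 2.1")
    return problems


if __name__ == "__main__":
    try:
        import torch
        torch_version = torch.__version__
    except ImportError:
        torch_version = "0.0"  # treated as "PyTorch missing or too old"
    for problem in check_environment(sys.version_info, torch_version):
        print("WARNING:", problem)
```

The operating system requirement (Ubuntu >= 22.04) is not covered by this sketch.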
Use pip or conda to install the following packages in your Python environment:
torch
torchvision
pillow
numpy
Cython
pycocotools
opencv-python
scipy
einops
scikit-image
pyclipper
shapely
imagesize
nltk
tqdm
yapf
onnx
onnxruntime
onnxsim
thop==0.1.1.post2209072238
gradio==3.50.0
transformers==4.41.2
open-clip-torch==2.24.0
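To see which of the packages listed above are still missing, a small importability check can help. This is a sketch under assumptions: the function name `missing_packages` is hypothetical, the pip-to-import name mapping covers only the packages whose two names differ, and pinned versions are not verified.

```python
import importlib.util

# pip name -> import name, for packages whose two names differ.
# All other packages are assumed to import under their pip name.
IMPORT_NAMES = {
    "pillow": "PIL",
    "opencv-python": "cv2",
    "scikit-image": "skimage",
    "open-clip-torch": "open_clip",
}

REQUIRED = [
    "torch", "torchvision", "pillow", "numpy", "Cython", "pycocotools",
    "opencv-python", "scipy", "einops", "scikit-image", "pyclipper",
    "shapely", "imagesize", "nltk", "tqdm", "yapf", "onnx", "onnxruntime",
    "onnxsim", "thop", "gradio", "transformers", "open-clip-torch",
]


def missing_packages(pip_names, import_names=IMPORT_NAMES):
    """Return the pip names whose top-level module cannot be found."""
    missing = []
    for pip_name in pip_names:
        module = import_names.get(pip_name, pip_name)
        # find_spec returns None when a top-level module is not installed.
        if importlib.util.find_spec(module) is None:
            missing.append(pip_name)
    return missing


if __name__ == "__main__":
    for name in missing_packages(REQUIRED):
        print("missing:", name)
```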
If you want to use xformers, install the xformers package from the official GitHub repository:
https://github.com/facebookresearch/xformers
If you want to use the DINO-DETR model, install the MultiScaleDeformableAttention package in your Python environment:
cd to simpleAICV/detection/compile_multiscale_deformable_attention, then run the following commands:
chmod +x make.sh
./make.sh
You can download all my pretrained models and experiment records/checkpoints from Hugging Face or Baidu Netdisk.
If you only want the pretrained model weights (saved with model.state_dict()), download the pretrained_models folder.
# huggingface
https://huggingface.co/zgcr654321/0.classification_training/tree/main
https://huggingface.co/zgcr654321/1.distillation_training/tree/main
https://huggingface.co/zgcr654321/2.masked_image_modeling_training/tree/main
https://huggingface.co/zgcr654321/3.detection_training/tree/main
https://huggingface.co/zgcr654321/4.semantic_segmentation_training/tree/main
https://huggingface.co/zgcr654321/5.instance_segmentation_training/tree/main
https://huggingface.co/zgcr654321/6.salient_object_detection_training/tree/main
https://huggingface.co/zgcr654321/7.human_matting_training/tree/main
https://huggingface.co/zgcr654321/8.ocr_text_detection_training/tree/main
https://huggingface.co/zgcr654321/9.ocr_text_recognition_training/tree/main
https://huggingface.co/zgcr654321/10.face_detection_training/tree/main
https://huggingface.co/zgcr654321/11.face_parsing_training/tree/main
https://huggingface.co/zgcr654321/12.human_parsing_training/tree/main
https://huggingface.co/zgcr654321/13.interactive_segmentation_training/tree/main
https://huggingface.co/zgcr654321/20.diffusion_model_training/tree/main
https://huggingface.co/zgcr654321/pretrained_models/tree/main
# Baidu-Netdisk
Link: https://pan.baidu.com/s/1yhEwaZhrb2NZRpJ5eEqHBw
Extraction code: rgdo
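The Hugging Face repositories listed above can also be fetched from a script rather than a browser. The sketch below assumes the `huggingface_hub` package is installed (it is not in this README's package list, although it is normally pulled in with transformers); `repo_id_from_url` and `download_repo` are hypothetical helper names, and the local directory is whatever you choose.

```python
from urllib.parse import urlparse


def repo_id_from_url(url: str) -> str:
    """'https://huggingface.co/<user>/<repo>/tree/main' -> '<user>/<repo>'."""
    parts = urlparse(url).path.strip("/").split("/")
    if len(parts) < 2:
        raise ValueError(f"not a repository URL: {url}")
    return "/".join(parts[:2])


def download_repo(url: str, local_dir: str) -> str:
    """Download every file of the repository behind `url` into `local_dir`."""
    # Imported lazily so the URL helper works without huggingface_hub installed.
    from huggingface_hub import snapshot_download
    return snapshot_download(repo_id=repo_id_from_url(url), local_dir=local_dir)
```

For example, `download_repo("https://huggingface.co/zgcr654321/pretrained_models/tree/main", "./pretrained_models")` would fetch only the pretrained weights folder.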
Make sure the folder structure is as follows:
CIFAR10
|
|-----batches.meta unzip from cifar-10-python.tar.gz
|-----data_batch_1 unzip from cifar-10-python.tar.gz
|-----data_batch_2 unzip from cifar-10-python.tar.gz
|-----data_batch_3 unzip from cifar-10-python.tar.gz
|-----data_batch_4 unzip from cifar-10-python.tar.gz
|-----data_batch_5 unzip from cifar-10-python.tar.gz
|-----readme.html unzip from cifar-10-python.tar.gz
|-----test_batch unzip from cifar-10-python.tar.gz
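As a quick sanity check of the extracted CIFAR10 files, one batch can be read directly with the standard library. This is a sketch, not the dataset loader used by this repository; `load_cifar10_batch` is a hypothetical name. Each batch file of the python version is a pickled dict, and loading the real files also requires numpy because the image data is stored as a numpy array.

```python
import pickle


def load_cifar10_batch(path):
    """Read one CIFAR10 batch file and return (data, labels).

    The python-version batches are pickled dicts with byte-string keys.
    Each row of `data` holds 3072 values: 1024 red, then 1024 green,
    then 1024 blue values of one 32x32 image.
    """
    with open(path, "rb") as f:
        batch = pickle.load(f, encoding="bytes")
    return batch[b"data"], batch[b"labels"]
```

Calling `load_cifar10_batch("CIFAR10/data_batch_1")` on the real dataset should return 10000 rows and 10000 labels.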
Make sure the folder structure is as follows:
CIFAR100
|
|-----train unzip from cifar-100-python.tar.gz
|-----test unzip from cifar-100-python.tar.gz
|-----meta unzip from cifar-100-python.tar.gz
Make sure the folder structure is as follows:
ILSVRC2012
|
|-----train----1000 sub classes folders
|-----val------1000 sub classes folders
Please make sure each class uses the same folder name in both the train and val folders.
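The requirement that train and val share the same class folder names can be verified with a few lines of Python. This is a minimal sketch with hypothetical helper names (`class_folders`, `compare_splits`); it only compares folder names and does not inspect the images.

```python
from pathlib import Path


def class_folders(split_dir):
    """Return the set of sub-folder names directly under `split_dir`."""
    return {p.name for p in Path(split_dir).iterdir() if p.is_dir()}


def compare_splits(root):
    """Return (classes only in train, classes only in val), both sorted."""
    root = Path(root)
    train = class_folders(root / "train")
    val = class_folders(root / "val")
    return sorted(train - val), sorted(val - train)
```

For a correctly prepared ILSVRC2012 folder, `compare_splits("ILSVRC2012")` should return two empty lists.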
Make sure the folder structure is as follows:
ImageNet21K
|
|-----train-----------10450 sub classes folders
|-----val-------------10450 sub classes folders
|-----small_classes---10450 sub classes folders
|-----imagenet21k_miil_tree.pth
Please make sure each class uses the same folder name in both the train and val folders.
Make sure the folder structure is as follows:
ACCV2022
|
|-----train-------------5000 sub classes folders
|-----testa-------------60000 images
|-----accv2022_broken_list.json
Make sure the folder structure is as follows:
VOCdataset
|                 |----Annotations
|                 |----ImageSets
|----VOC2007------|----JPEGImages
|                 |----SegmentationClass
|                 |----SegmentationObject
|
|                 |----Annotations
|                 |----ImageSets
|----VOC2012------|----JPEGImages
|                 |----SegmentationClass
|                 |----SegmentationObject
Make sure the folder structure is as follows:
COCO2017
|                |----captions_train2017.json
|                |----captions_val2017.json
|--annotations---|----instances_train2017.json
|                |----instances_val2017.json
|                |----person_keypoints_train2017.json
|                |----person_keypoints_val2017.json
|
|                |----train2017
|----images------|----val2017
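Once the COCO2017 folder is in place, a short summary of an annotation file is a cheap way to confirm that the download and extraction worked. The sketch below relies only on the standard COCO JSON keys `images`, `annotations`, and `categories`; the function name `summarize_coco` is hypothetical and this is not how the repository itself loads COCO.

```python
import json


def summarize_coco(annotation_file):
    """Count images, annotations and categories in a COCO-format JSON file."""
    with open(annotation_file, "r", encoding="utf-8") as f:
        data = json.load(f)
    images = data.get("images", [])
    annotations = data.get("annotations", [])
    # Image ids that are referenced by at least one annotation.
    annotated_ids = {ann["image_id"] for ann in annotations}
    return {
        "images": len(images),
        "annotations": len(annotations),
        "categories": len(data.get("categories", [])),
        "images_without_annotations": sum(
            1 for img in images if img["id"] not in annotated_ids
        ),
    }
```

A typical call would be `summarize_coco("COCO2017/annotations/instances_val2017.json")`.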
Make sure the folder structure is as follows:
SAMA-COCO
|                |----sama_coco_train.json
|                |----sama_coco_validation.json
|--annotations---|----train_labels.json
|                |----validation_labels.json
|                |----test_labels.json
|                |----image_info_test2017.json
|                |----image_info_test-dev2017.json
|
|                |----train
|----images------|----validation
Make sure the folder structure is as follows:
objects365_2020
|
|                |----zhiyuan_objv2_train.json
|--annotations---|----zhiyuan_objv2_val.json
|                |----sample_2020.json
|
|                |----train    all train patch folders
|----images------|----val      all val patch folders
                 |----test     all test patch folders
Make sure the folder structure is as follows:
ADE20K
|                 |----training
|---images--------|----validation
|                 |----testing
|
|                 |----training
|---annotations---|----validation
Make sure the folder structure is as follows:
CelebA-HQ
|                 |----female
|---train---------|----male
|
|                 |----female
|---val-----------|----male
Make sure the folder structure is as follows:
FFHQ
|
|---images
|---ffhq-dataset-v1.json
|---ffhq-dataset-v2.json
To train or test a model, enter the corresponding training experiment folder, then run train.sh or test.sh.
For example, enter the folder classification_training/imagenet/resnet50.
To retrain this model from scratch, delete the checkpoints and log folders first, then run train.sh:
CUDA_VISIBLE_DEVICES=0,1 python -m torch.distributed.run --nproc_per_node=2 --master_addr 127.0.1.0 --master_port 10000 ../../../tools/train_classification_model.py --work-dir ./
To test this model, you need a pretrained model first. Set trained_model_path in test_config.py to its path, then run test.sh:
CUDA_VISIBLE_DEVICES=0 python -m torch.distributed.run --nproc_per_node=1 --master_addr 127.0.1.0 --master_port 10000 ../../../tools/test_classification_model.py --work-dir ./
CUDA_VISIBLE_DEVICES specifies the GPU ids used for this run. Make sure nproc_per_node equals the number of GPUs in use, and that master_addr/master_port are unique for each training run.
Checkpoints and log folders are saved in the experiment folder from which you run training/testing.
You can also modify hyperparameters in train_config.py/test_config.py.
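The rules for the launch command (one process per visible GPU, a unique master port per run) can be enforced by building the command programmatically. This is a sketch and not part of the repository: `find_free_port` and `build_launch` are hypothetical helpers, and the only flags used are the ones that already appear in the train.sh command shown earlier.

```python
import socket


def find_free_port() -> int:
    """Ask the OS for a currently unused TCP port on the loopback interface."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
        return s.getsockname()[1]


def build_launch(gpu_ids, script, work_dir="./",
                 master_addr="127.0.1.0", master_port=None):
    """Return (extra environment variables, command list) for one run.

    nproc_per_node is derived from gpu_ids, so the two cannot disagree.
    """
    if not gpu_ids:
        raise ValueError("at least one GPU id is required")
    if master_port is None:
        master_port = find_free_port()
    env = {"CUDA_VISIBLE_DEVICES": ",".join(str(i) for i in gpu_ids)}
    cmd = [
        "python", "-m", "torch.distributed.run",
        f"--nproc_per_node={len(gpu_ids)}",
        "--master_addr", master_addr,
        "--master_port", str(master_port),
        script,
        "--work-dir", work_dir,
    ]
    return env, cmd
```

The result can be passed to `subprocess.run(cmd, env={**os.environ, **env})` from inside the experiment folder. Note that a port reported as free may be taken by another process before training starts, so passing an explicit `master_port` is still the safer choice on shared machines.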
cd to gradio_demo, which contains the following demos:
classification demo
detection demo
semantic_segmentation demo
instance_segmentation demo
salient_object_detection demo
human_matting demo
text_detection demo
text_recognition demo
face_detection demo
face_parsing demo
human_parsing demo
point target segment_anything demo
circle target segment_anything demo
For example, to run the detection gradio demo (prepare the trained model weights and update the weight load path first):
python gradio_detect_single_image.py
https://github.com/facebookresearch/segment-anything
https://github.com/facebookresearch/sam2
If you find my work useful in your research, please consider citing:
@inproceedings{zgcr,
title={SimpleAICV-pytorch-training-examples},
author={zgcr},
year={2020-2024}
}