Image preprocessing package for automatic face alignment and cropping with additional features. It provides the following functionality:
- Face cropping - face alignment and center-cropping using facial landmarks. Landmarks can be predicted automatically or, if they are already known, supplied through a separate file. It is possible to specify the face factor, i.e., the face area relative to the cropped image, and the face extraction strategy, e.g., all faces or the largest face per image.
- Face enhancement - face image quality enhancement. If images are blurry or contain many small faces, a quality enhancement model can be used to make the images clearer. Images are automatically checked for small faces and enhanced if desired.
- Face parsing - face attribute parsing and cropped image grouping to sub-directories. Face images can be grouped according to some facial attributes or some combination, such as glasses, earrings and necklace, hats. It is also possible to generate masks for facial attributes or some combination of them, for instance, glasses, nose, nose and eyes.
Please see the References section for more details about which models are used for each feature.
Note: each feature can be used separately, e.g., if you just need to enhance the quality of blurry photos, or if you just need to generate attribute masks (like hats, glasses, face parts).
The package requires at least Python 3.10. You may also want to set up PyTorch in advance by following its official installation instructions.
To install the package simply run:
pip install face-crop-plus
Or, to install it from source, run:
git clone https://github.com/mantasu/face-crop-plus
cd face-crop-plus && pip install .
You can run the package from the command line:
face-crop-plus -i path/to/images
You can also use it in a Python script:
from face_crop_plus import Cropper
cropper = Cropper(face_factor=0.7, strategy="largest")
cropper.process_dir(input_dir="path/to/images")
For a quick demo, you can experiment with the demo.py file:
git clone https://github.com/mantasu/face-crop-plus
cd face-crop-plus/demo
python demo.py
For more examples, see the Examples section.
The main arguments that control the behavior of each feature are described here. They can be specified via the command line or when initializing the Cropper class. For further details about how the Cropper class works, please refer to the documentation.
The main feature is face alignment and cropping. The main arguments that control this feature are:
- landmarks - if you don't want automatic landmark prediction and already have face landmark coordinates in a separate file, you can specify the path to it. The expected file formats are:
  - .json - expects a dictionary with entries of the form 'image.jpg': [x1, y1, ...], i.e., keys are image file names and values are flattened arrays of face landmark coordinates.
  - .csv - expects comma-separated values where each line is of the form image.jpg,x1,y1,... Note that the first line is expected to be a header.
  - .txt and other - similar to the CSV file, but each line is expected to have space-separated values of the form image.jpg x1 y1 ... No header is expected.
- output_size - the output size of the cropped face images. Can be either a tuple of 2 values (width, height) or a single value indicating square dimensions. Example sizes: 200 × 200, 300 × 300, 300 × 200, 200 × 300.
- face_factor - the fraction of the face area relative to the output image. The value is between 0 and 1; the larger the value, the larger the face is in the output image. Example values: 0.4, 0.55, 0.7, 0.85.
- padding - the type of padding (border mode) to apply after cropping the images. If faces are near edges, the empty areas left after aligning those faces will be filled with some values. This could be constant (leave black), replicate (repeat the last value of the edge in the original image), or reflect (mirror the values before the edge). Example modes: constant, replicate, reflect, wrap.
- det_threshold - if automatic detection is desired, a detection threshold between 0 and 1 can be specified to indicate when a detected face should be considered an actual face. Lower values allow more faces to be extracted, however they can be blurry and non-primary, e.g., the ones in the background. Higher values only allow clear faces to be considered. It only makes sense to tune this parameter when strategy is set to return more than one face, e.g., all. For example, if it is 0.999, the blurry face in the background of the example images is not detected, however if the threshold is 0.998, the face is still detected. For blurrier images, thresholds may differ.
- strategy - the strategy to apply for cropping out faces. This can be set to all, if all faces should be extracted from each image (suffixes will be added to each file name), largest, if only the largest face per image should be considered (slowest), or best, if only the first face (the one with the best confidence score) per image should be considered.
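The three landmark file formats described above can be produced with a few lines of standard-library Python. This is a minimal sketch: the file names, the coordinate values, and the CSV header column names are made up for illustration (the text only says that a header line is expected), and the save_landmarks helper is hypothetical, not part of the package.

```python
import csv
import json
import tempfile
from pathlib import Path

# Hypothetical landmarks: image file name -> flattened [x1, y1, x2, y2, ...]
landmarks = {
    "image1.jpg": [38.0, 51.5, 73.2, 51.1, 56.0, 71.4, 41.9, 92.3, 70.6, 92.0],
    "image2.jpg": [40.1, 50.0, 75.0, 49.8, 58.3, 70.2, 43.0, 90.1, 72.4, 89.7],
}


def save_landmarks(landmarks: dict[str, list[float]], path: Path) -> None:
    """Write landmarks in the layout implied by the file extension."""
    suffix = path.suffix.lower()
    if suffix == ".json":
        # Dictionary of file name -> flattened coordinates
        path.write_text(json.dumps(landmarks))
    elif suffix == ".csv":
        # Header line first (column names are an assumption), then one row per image
        num_points = len(next(iter(landmarks.values()), [])) // 2
        header = ["image"] + [f"{axis}{i}" for i in range(1, num_points + 1) for axis in "xy"]
        with open(path, "w", newline="") as f:
            writer = csv.writer(f)
            writer.writerow(header)
            for name, coords in landmarks.items():
                writer.writerow([name, *coords])
    else:
        # .txt and other: space-separated values, no header
        with open(path, "w") as f:
            for name, coords in landmarks.items():
                f.write(" ".join([name, *map(str, coords)]) + "\n")


out_dir = Path(tempfile.mkdtemp())
for ext in (".json", ".csv", ".txt"):
    save_landmarks(landmarks, out_dir / f"landmarks{ext}")
```

The resulting file path can then be passed as the landmarks argument, e.g., Cropper(landmarks="landmarks.json").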
The quality enhancement feature restores blurry faces. It has one main argument:
- enh_threshold - quality enhancement threshold that tells when the image quality should be enhanced. It is the minimum average face factor, i.e., face area relative to the image, below which the whole image is enhanced. Note that quality enhancement is an expensive operation, so set this to a low value, like 0.01, to only enhance images where faces are actually small. If your images are of reasonable quality and don't contain many tiny faces, you may want to set this to None (or to a negative value if using the command line) to disable the model. The example images show four extracted faces (Face 1 to Face 4) before and after enhancing the image.
Quality enhancement can be used as a separate feature to enhance images that contain faces. For an end user, it is a useful way to boost the quality of photos. Enhancing ultra-high-resolution images (larger than 2000×2000) is not recommended because it will likely exhaust your GPU memory. See the Pure Enhancement/Parsing section on how to run it as a stand-alone feature.
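To make the enh_threshold rule more concrete, here is a rough sketch of the decision it describes. It assumes that the face factor of an image is the mean bounding-box area of its detected faces divided by the image area; the function names and the box format (x1, y1, x2, y2) are hypothetical, and the package may compute this differently. The sketch only covers images in which at least one face was detected.

```python
def average_face_factor(face_boxes, image_size):
    """Mean face box area relative to the image area (assumed definition)."""
    if not face_boxes:
        raise ValueError("at least one face box is required")
    width, height = image_size
    areas = [(x2 - x1) * (y2 - y1) for x1, y1, x2, y2 in face_boxes]
    return sum(areas) / len(areas) / (width * height)


def should_enhance(face_boxes, image_size, enh_threshold):
    """Enhance only if enabled and the faces are small relative to the image."""
    if enh_threshold is None or enh_threshold < 0:
        return False  # None (or a negative value) disables enhancement
    return average_face_factor(face_boxes, image_size) < enh_threshold


# Tiny 50x50 face in a 1000x1000 image: factor 0.0025, below 0.01
small = should_enhance([(0, 0, 50, 50)], (1000, 1000), enh_threshold=0.01)
# Large 400x400 face in the same image: factor 0.16, above 0.01
large = should_enhance([(0, 0, 400, 400)], (1000, 1000), enh_threshold=0.01)
```

Under this reading, a threshold of 1 enhances practically every image with detected faces, while a small threshold such as 0.01 reserves the expensive model for images with tiny faces.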
Face parsing into attributes allows grouping output images by category and generating attribute masks for that category. Categorized images are put into their corresponding sub-folders in the output directory.
- attr_groups - dictionary specifying attribute groups, based on which the face images should be grouped. Each key represents an attribute group name, e.g., glasses, earrings and necklace, no accessories, and each value represents attribute indices, e.g., [6], [9, 15], [-6, -9, -15, -18], each index mapping to some attribute. Since this model labels face image pixels, if there are enough pixels with the specified values in the list, the whole face image will be put into that attribute category. For negative values, it will be checked that the labeled face image does not contain those (absolute) values. If it is None, then there will be no grouping according to attributes. Some group examples:
  - Glasses: [6]
  - Earrings and necklace: [9, 15]
  - Hats, no glasses: [18, -6]
  - No accessories: [-6, -9, -15, -18]
- mask_groups - dictionary specifying mask groups, based on which the face images and their masks should be grouped. Each key represents a mask group name, e.g., nose, eyes and eyebrows, and each value represents attribute indices, e.g., [10], [2, 3, 4, 5], each index mapping to some attribute. Since this model labels face image pixels, a mask will be created with 255 (white) at pixels that match the specified attributes and zeros (black) elsewhere. Note that negative values make no sense here and having them would cause an error. Images are saved to sub-directories named by the mask group and masks are saved to sub-directories under the same name, except with a _mask suffix. If it is None, then there will be no grouping according to mask groups. Some group examples (for consistency, with the same images as before):
  - Glasses: [6]
  - Earrings and necklace: [9, 15]
  - Nose: [10]
  - Eyes and eyebrows: [2, 3, 4, 5]
If both attr_groups and mask_groups are specified, images are first grouped according to face attributes, then the images in each group are further sub-grouped into different mask groups (along with their masks).
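The attribute grouping rule can be sketched in plain Python as follows. This is only an illustration under stated assumptions: it requires every positive index to be present and every negative index to be absent, and it uses a made-up min_pixels value for what counts as "enough" pixels. The function names are hypothetical, and the actual rule and threshold used by the package may differ.

```python
from collections import Counter


def matches_group(label_map, indices, min_pixels=3):
    """Check one labeled face image against one attribute group.

    label_map:  2D list of per-pixel attribute indices.
    indices:    group definition, e.g. [18, -6] for "hats, no glasses".
    min_pixels: hypothetical pixel count that counts as "enough".
    """
    counts = Counter(label for row in label_map for label in row)
    for index in indices:
        present = counts[abs(index)] >= min_pixels
        if index > 0 and not present:
            return False  # a required attribute is missing
        if index < 0 and present:
            return False  # an excluded attribute is visible
    return True


def assign_groups(label_map, attr_groups, min_pixels=3):
    """Return the names of all groups that the labeled image falls into."""
    return [
        name
        for name, indices in attr_groups.items()
        if matches_group(label_map, indices, min_pixels)
    ]


attr_groups = {
    "glasses": [6],
    "hats_no_glasses": [18, -6],
    "no_accessories": [-6, -9, -15, -18],
}

# Toy 4x4 label map: five eyeglasses pixels (6), two hat pixels (18)
label_map = [
    [1, 1, 6, 6],
    [1, 6, 6, 6],
    [17, 17, 1, 1],
    [0, 0, 18, 18],
]

print(assign_groups(label_map, attr_groups))  # prints ['glasses']
```

With only two hat pixels, the hat does not reach the assumed threshold, so the image lands in the glasses group only.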
Here are the 19 possible face attributes (with 0 representing the neutral category):
1 - skin | 7 - left ear | 13 - lower lip
2 - left eyebrow | 8 - right ear | 14 - neck
3 - right eyebrow | 9 - earrings | 15 - necklace
4 - left eye | 10 - nose | 16 - clothes
5 - right eye | 11 - mouth | 17 - hair
6 - eyeglasses | 12 - upper lip | 18 - hat
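As an illustration of how the attribute indices relate to masks, the sketch below maps indices to the names listed above and builds a binary mask (255 where a pixel belongs to the group, 0 elsewhere) from a toy label map. The helper names are hypothetical and the code is not taken from the package; it only mirrors the behavior described for mask_groups.

```python
# Attribute indices as listed above (0 is the neutral category)
FACE_ATTRIBUTES = {
    0: "neutral", 1: "skin", 2: "left eyebrow", 3: "right eyebrow",
    4: "left eye", 5: "right eye", 6: "eyeglasses", 7: "left ear",
    8: "right ear", 9: "earrings", 10: "nose", 11: "mouth",
    12: "upper lip", 13: "lower lip", 14: "neck", 15: "necklace",
    16: "clothes", 17: "hair", 18: "hat",
}


def describe_group(indices):
    """Translate attribute indices into readable attribute names."""
    return [FACE_ATTRIBUTES[i] for i in indices]


def build_mask(label_map, indices):
    """Return a 2D mask with 255 at pixels whose label is in the group."""
    if any(i < 0 for i in indices):
        raise ValueError("mask groups accept only non-negative attribute indices")
    wanted = set(indices)
    return [[255 if label in wanted else 0 for label in row] for row in label_map]


# Toy 3x4 label map: two eye pixels, two nose pixels, two mouth pixels
label_map = [
    [1, 4, 5, 1],
    [1, 10, 10, 1],
    [1, 11, 11, 1],
]

eyes_mask = build_mask(label_map, [4, 5])
nose_mask = build_mask(label_map, [10])
```

For example, describe_group([2, 3, 4, 5]) names the attributes behind the "eyes and eyebrows" group.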
You can run the package via command line by providing the arguments as follows:
face-crop-plus -i path/to/images --output-size 200 300 --face-factor 0.75 -d cuda:0
You can specify the command-line arguments via a JSON config file and provide the path to it. Any further command-line arguments overwrite the values taken from the JSON file.
face-crop-plus --config path/to/json --attr-groups '{"glasses": [6]}'
An example JSON config file is demo.json. If you've cloned the repository, you can run from it:
face-crop-plus --config demo/demo.json --device cuda:0 # overwrite device
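A config file like this can also be generated from Python. The sketch below is an assumption-laden illustration: it assumes the JSON keys mirror the Cropper argument names (check demo.json for the actual keys), and the dictionary merge at the end only illustrates the stated precedence of command-line values over JSON values; it is not the package's own parsing code.

```python
import json
import tempfile
from pathlib import Path

# Assumed keys: same names as the arguments described above
config = {
    "input_dir": "path/to/images",
    "output_size": [200, 300],
    "face_factor": 0.75,
    "strategy": "largest",
    "attr_groups": {"glasses": [6]},
    "device": "cpu",
}

# Write the config to a temporary location
config_path = Path(tempfile.mkdtemp()) / "config.json"
config_path.write_text(json.dumps(config, indent=4))

# Values given on the command line take precedence over the JSON values
cli_overrides = {"device": "cuda:0"}
effective = {**json.loads(config_path.read_text()), **cli_overrides}

print(f"face-crop-plus --config {config_path} --device cuda:0")
```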
To list all the available command-line arguments, run the following (refer to the documentation for more details):
face-crop-plus -h
Note: you can use fcp as an alias for face-crop-plus, e.g., fcp -c config.json
If your image file names contain non-ASCII symbols, OS-reserved characters, or are very long, it may be better to standardize them. To do so, you can rename the image files before processing them:
face-crop-plus -i path/to/images --clean-names # --clean-names-inplace (avoids temp dir)
It is possible to specify more arguments via a Python script. The function can be used with any file types in general:
from face_crop_plus.utils import clean_names
clean_names(
    input_dir="path/to/input/dir",
    output_dir=None, # will rename in-place
    max_chars=250,
)
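To give an idea of what standardizing a file name can involve, here is a small stand-alone sketch. It is not the implementation of clean_names: the standardize_name function, the set of replaced characters, and the "unnamed" fallback are all assumptions made for illustration.

```python
import re
import unicodedata


def standardize_name(name, max_chars=250):
    """Return an ASCII-only, length-limited version of a file name."""
    stem, dot, ext = name.rpartition(".")
    if not dot:
        stem, ext = name, ""  # no extension present
    # Decompose accented characters and drop anything that is not ASCII
    stem = unicodedata.normalize("NFKD", stem).encode("ascii", "ignore").decode("ascii")
    # Replace characters that are reserved on common operating systems
    stem = re.sub(r'[<>:"/\\|?*\x00-\x1f]', "_", stem).strip(" .")
    stem = stem or "unnamed"
    suffix = f".{ext}" if dot else ""
    # Truncate the stem so that the whole name fits into max_chars
    return stem[: max(1, max_chars - len(suffix))] + suffix


cleaned = standardize_name("Zoë: portrait?.jpg")
shortened = standardize_name("a" * 300 + ".png", max_chars=250)
```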
If you already have aligned and center-cropped face images, you can perform quality enhancement and face parsing without re-cropping them. Here is an example of enhancing the quality of every face and parsing the faces into attribute and mask groups (note that none of the parameters described in the Alignment and Cropping section have any effect here):
from face_crop_plus import Cropper
cropper = Cropper(
    det_threshold=None,
    enh_threshold=1, # enhance every image
    attr_groups={"hats": [18], "no_hats": [-18]},
    mask_groups={"hats": [18], "ears": [7, 8, 9]},
    device="cuda:0",
)
cropper.process_dir(input_dir="path/to/images")
This would result in the following output directory structure:
└── path/to/images_faces
    ├── hats
    |    ├── hats           # Images with hats
    |    ├── hats_mask      # Hat masks for images in upper dir
    |    ├── ears           # Images with hats and visible ears
    |    └── ears_mask      # Ears masks for images in upper dir
    |
    └── no_hats
         ├── ears           # Images with no hats and visible ears
         └── ears_mask      # Ears masks for images in upper dir
To just enhance the quality of images (e.g., if you have blurry photos), you can run the enhancement feature separately:
face-crop-plus -i path/to/images -dt -1 -et 1 --device cuda:0
To just generate masks for images (e.g., as part of your research pipeline), you can run the segmentation feature separately. This will only consider images for which the masks are actually present.
face-crop-plus -i path/to/images -dt -1 -et -1 -mg '{"glasses": [6]}'
Please beware of the following:
- While you can perform quality enhancement on images of different sizes (because, due to the large amount of computation, images are processed one by one), you cannot perform face parsing (attribute-based grouping/segmentation) if images have different dimensions (though a possible workaround is to set the batch size to 1).
- It is not advised to perform quality enhancement after cropping the images since there is not enough information for the model on how to improve the quality. If you still need to enhance the quality after cropping, using larger image sizes, e.g., 512×512, may help. Regardless of whether you use it before or after cropping, do not use input images of spatial size over 2000×2000, unless you have a powerful GPU.
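Besides setting the batch size to 1, one option that is not part of the package is to split images by their dimensions beforehand, so that each sub-set is uniform and oversized images are set aside. The sketch below assumes the image sizes are already known (e.g., read with an image library of your choice) and treats any side longer than 2000 pixels as oversized; the function name and that cut-off rule are hypothetical.

```python
from collections import defaultdict


def group_by_size(sizes, max_side=2000):
    """Split images into uniform-size groups and a list of oversized ones.

    sizes: dictionary mapping file name -> (width, height).
    """
    groups = defaultdict(list)
    oversized = []
    for name, (width, height) in sizes.items():
        if max(width, height) > max_side:
            oversized.append(name)  # risky for enhancement on most GPUs
        else:
            groups[(width, height)].append(name)
    return dict(groups), oversized


# Hypothetical image sizes
sizes = {
    "a.jpg": (256, 256),
    "b.jpg": (256, 256),
    "c.jpg": (512, 512),
    "d.jpg": (4000, 3000),
}
groups, oversized = group_by_size(sizes)
```

Each group could then be placed in its own directory and parsed separately.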
Here is an example pipeline showing how to pre-process the CelebA dataset. It is useful if you want to customize the cropped face properties, e.g., face factor, output size. It only takes a few minutes to pre-process the whole dataset using multiple processes and the provided landmarks:
- Download the following files from Google Drive:
- Unzip the data:
7z x data/img_celeba.7z/img_celeba.7z.001 -o./data
unzip data/annotations.zip -d data
- Create a script file, e.g., preprocess_celeba.py, in the same directory:
from face_crop_plus import Cropper
from multiprocessing import cpu_count

cropper = Cropper(
    output_size=256,
    face_factor=0.7,
    landmarks="data/landmark.txt",
    enh_threshold=None,
    num_processes=cpu_count(),
)

cropper.process_dir("data/img_celeba")
- Run the script to pre-process the data:
python preprocess_celeba.py
- Clean up the data dir (remove the original images and the annotations):
rm -r data/img_celeba.7z data/img_celeba
rm data/annotations.zip data/*.txt
- When using num_processes, only set it to a larger value if you have enough GPU memory, or reduce batch_size. The exception is when you only perform face cropping with already known landmarks and perform neither quality enhancement nor face parsing, in which case set it to the number of CPU cores you have.
- If you experience any of the following:
  - RuntimeError: CUDA error: an illegal memory access was encountered.
  - torch.cuda.OutOfMemoryError: CUDA out of memory.
  - cuDNN error: CUDNN_STATUS_MAPPING_ERROR.

  This is likely because you are processing images on too many processes or have a large batch size. If you run all 3 models on GPU, it may be helpful to just run on a single process with a larger batch size.
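The advice above can be condensed into a small helper for picking num_processes. This is a conservative sketch, not something provided by the package: the function name and parameters are hypothetical, and it falls back to a single process whenever any model would run.

```python
import os


def suggest_num_processes(known_landmarks, use_enhancement, use_parsing, cpu_cores=None):
    """Suggest a num_processes value based on which features are active."""
    cores = cpu_cores or os.cpu_count() or 1
    if known_landmarks and not use_enhancement and not use_parsing:
        # Pure cropping with known landmarks needs no model: use all cores
        return cores
    # At least one model is running: stay on one process to limit GPU memory use
    return 1


cropping_only = suggest_num_processes(True, False, False, cpu_cores=8)
all_models = suggest_num_processes(False, True, True, cpu_cores=8)
```

If GPU memory allows, the single-process suggestion can be raised gradually while watching for the errors listed above.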
This package uses the code and the pre-trained models from the following repositories:
- PyTorch RetinaFace - 5-point landmark prediction
- BSRGAN - super resolution and quality enhancement
- Face Parsing PyTorch - grouping by face attributes and segmentation
If you find this package helpful in your research, you can cite the following:
@misc{face-crop-plus,
author = {Mantas Birškus},
title = {Face Crop Plus},
year = {2023},
publisher = {GitHub},
journal = {GitHub repository},
howpublished = {\url{https://github.com/mantasu/face-crop-plus}},
doi = {10.5281/zenodo.7856749}
}