The environment is tested on Ubuntu 20.04 with Python 3.8, an NVIDIA GPU, and CUDA enabled. Anaconda or Miniconda is recommended for setting up the running environment. All package dependencies are listed in `e4s_env.yaml`, and the environment can be conveniently created with the `conda env create -f e4s_env.yaml` command.
💡 Hint: If you run into problems when installing dlib, consider installing it from conda-forge or building it manually.
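For reference, a typical setup sequence might look like the sketch below. The environment name is read from the `name:` field inside `e4s_env.yaml`; `e4s` here is only a placeholder.

```bash
# Create the conda environment from the provided dependency file
conda env create -f e4s_env.yaml

# Activate it (assuming the environment is named "e4s"; check the name: field in e4s_env.yaml)
conda activate e4s

# Optional: if the dlib install fails, try the conda-forge build instead
conda install -c conda-forge dlib
```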
If you plan to use SegNeXt-FaceParser as described in section 1.3.1 below, some extra effort is needed. Click >>this link<< for the installation guide of SegNeXt-FaceParser. The combination of mmcv-full==1.5.1, mmcls==0.20.1, and the latest SegNeXt-FaceParser version has been tested.
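As an illustrative sketch (not from the SegNeXt-FaceParser guide itself), the tested MMLab packages could be installed roughly as follows. Note that mmcv-full builds are tied to your exact PyTorch/CUDA combination, so you may need OpenMMLab's `mim` installer (used below) or a matching find-links URL rather than a plain pip install.

```bash
# Pinned versions reported as tested; adjust to your CUDA/PyTorch build if needed
pip install -U openmim
mim install mmcv-full==1.5.1
pip install mmcls==0.20.1
```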
We provide a pre-trained RGI model that was trained on the FFHQ dataset for 300K iterations. Please fetch the model from this Google Drive link and place it in the `pretrained_ckpts/e4s` folder.
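For example, assuming the downloaded checkpoint is named `iteration_300000.pt` (as in the folder layout below) and sits in your Downloads directory, placing it could look like:

```bash
# Paths are assumptions; adjust to wherever you saved the Google Drive download
mkdir -p pretrained_ckpts/e4s
mv ~/Downloads/iteration_300000.pt pretrained_ckpts/e4s/
```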
1.3.1 Face Parser
We use a face parser to estimate the facial segmentation. Currently, we provide the following two pre-trained face parsers (a placement sketch follows the list):

- face-parsing.PyTorch (the default one): repo
  Please download the pre-trained model here and place it in the `pretrained_ckpts/face_parsing` folder.
- SegNeXt-FaceParser: repo
  Please download the pre-trained SegNeXt model (small | base) and place it in the `pretrained_ckpts/face_parsing` folder. The corresponding configuration files are already included in the `pretrained_ckpts/face_parsing` folder.
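As an example, for the default face-parsing.PyTorch parser (checkpoint `79999_iter.pth` in the layout below), placement could look like this; the download location is an assumption:

```bash
mkdir -p pretrained_ckpts/face_parsing
mv ~/Downloads/79999_iter.pth pretrained_ckpts/face_parsing/
```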
💡 Hint: The following FaceVid2Vid and GPEN models are only needed for face swapping. If only face editing is needed, just skip directly to Section 2.
1.3.2 FaceVid2Vid: paper | unofficial-repo
This face reenactment model is applied to drive the source face so that it shows a pose and expression similar to the target. Currently, we use zhanglonghao's implementation of FaceVid2Vid, whose pre-trained model can be downloaded here (Vox-256-New). Similarly, please put it in the `pretrained_ckpts/facevid2vid` folder.
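Based on the folder layout below, placing the FaceVid2Vid checkpoint could look like the following; the filename matches the layout, while the download location is an assumption:

```bash
mkdir -p pretrained_ckpts/facevid2vid
mv ~/Downloads/00000189-checkpoint.pth.tar pretrained_ckpts/facevid2vid/
# the vox-256.yaml config shown in the folder layout below also belongs in this folder
```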
1.3.3 GPEN
A face restoration model (GPEN) is used to improve the resolution of the intermediate driven face. You can execute the following script to fetch the required models automatically:
```bash
cd pretrained_ckpts/gpen
sh ./fetch_gpen_models.sh
```
Alternatively, you can download the pre-trained models manually as follows:
| Model | Download link | Purpose |
|---|---|---|
| RetinaFace-R50 | https://public-vigen-video.oss-cn-shanghai.aliyuncs.com/robin/models/RetinaFace-R50.pth | face detection |
| RealESRNet_x4 | https://public-vigen-video.oss-cn-shanghai.aliyuncs.com/robin/models/realesrnet_x4.pth | x4 super resolution |
| GPEN-BFR-512 | https://public-vigen-video.oss-cn-shanghai.aliyuncs.com/robin/models/GPEN-BFR-512.pth | GPEN pre-trained model |
| ParseNet | https://public-vigen-video.oss-cn-shanghai.aliyuncs.com/robin/models/ParseNet-latest.pth | face parsing |
Make sure to place these checkpoint files in the `pretrained_ckpts/gpen/weights` folder.
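If you prefer the manual route, a minimal sketch using wget with the URLs listed above would be:

```bash
mkdir -p pretrained_ckpts/gpen/weights
cd pretrained_ckpts/gpen/weights
wget https://public-vigen-video.oss-cn-shanghai.aliyuncs.com/robin/models/RetinaFace-R50.pth
wget https://public-vigen-video.oss-cn-shanghai.aliyuncs.com/robin/models/realesrnet_x4.pth
wget https://public-vigen-video.oss-cn-shanghai.aliyuncs.com/robin/models/GPEN-BFR-512.pth
wget https://public-vigen-video.oss-cn-shanghai.aliyuncs.com/robin/models/ParseNet-latest.pth
```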
After fetching these checkpoints, your `pretrained_ckpts` folder should look like this:
```
pretrained_ckpts/
├── auxiliray (optional for training)
│   ├── model_ir_se50.pth
│   └── model.pth
├── e4s
│   └── iteration_300000.pt
├── face_parsing
│   ├── 79999_iter.pth
│   ├── segnext.tiny.512x512.celebamaskhq.160k.py
│   ├── segnext.tiny.best_mIoU_iter_160000.pth (optional)
│   ├── segnext.base.512x512.celebamaskhq.160k.py
│   ├── segnext.base.best_mIoU_iter_140000.pth (optional)
│   ├── segnext.small.512x512.celebamaskhq.160k.py
│   ├── segnext.small.best_mIoU_iter_140000.pth (optional)
│   ├── segnext.large.512x512.celebamaskhq.160k.py
│   └── segnext.large.best_mIoU_iter_150000.pth (optional)
├── facevid2vid
│   ├── 00000189-checkpoint.pth.tar
│   └── vox-256.yaml
├── gpen
│   ├── fetch_gpen_models.sh
│   └── weights
│       ├── GPEN-BFR-512.pth
│       ├── ParseNet-latest.pth
│       ├── realesrnet_x4.pth
│       └── RetinaFace-R50.pth
├── put_ckpts_accordingly.txt
└── stylegan2 (optional for training)
    └── stylegan2-ffhq-config-f.pt
```
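As an optional sanity check (not part of the original instructions), you can list the checkpoint files that were actually placed and compare the output against the tree above:

```bash
# List every checkpoint file under pretrained_ckpts and compare with the layout above
find pretrained_ckpts -type f \( -name "*.pth" -o -name "*.pt" -o -name "*.pth.tar" \) | sort
```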