The model is implemented in PyTorch and available in an open repository on GitHub. The repository contains the project code, pre-trained MobileNet-V1 networks, and a preprocessed dataset for training and testing. At inference time, 3DDFA processes an image in 0.27 milliseconds on a GeForce GTX TITAN X.
3DDFA — Network architecture
3DDFA combines cascaded regression with convolutional networks: a CNN serves as the regressor at each iteration of the cascade. The framework consists of four components: the regression objective, the image features, the CNN structure, and the cost function used to train the model.
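The cascade described above refines a parameter estimate over several iterations, each one feeding pose-conditioned features into a regressor. A minimal sketch of that loop, with a linear map standing in for the CNN regressor and a toy feature extractor (all names and shapes here are illustrative, not from the released code):

```python
import numpy as np

# Cascaded regression sketch: p_{k+1} = p_k + Net_k(Fea(I, p_k)).
# A linear map W stands in for the CNN regressor Net_k.

def extract_features(image, p):
    """Toy stand-in for pose-conditioned features (e.g. a PNCC rendering
    produced from the current parameter estimate p)."""
    return np.concatenate([image.ravel(), p])

def cascade(image, p0, regressors):
    """Run the cascade: each stage predicts a residual parameter update."""
    p = p0.copy()
    for W in regressors:                 # one regressor per cascade iteration
        feat = extract_features(image, p)
        p = p + W @ feat                 # residual update to the parameters
    return p
```

With zero-weight regressors the estimate is left unchanged, which makes the residual-update structure easy to verify.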
The neural network works in two streams:
- In the first stream, the Projected Normalized Coordinate Code (PNCC) is rendered from the intermediate parameter estimate and stacked with the input image to form the CNN input;
- In the second stream, the model extracts feature anchors with consistent semantics and applies Pose Adaptive Convolution (PAC) to them.
Outputs from the two streams are combined using an additional fully connected layer that predicts the intermediate parameter update.
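The two-stream design above can be sketched as follows; the channel counts, feature sizes, and function names are assumptions for illustration, not taken from the released code:

```python
import numpy as np

# Hedged sketch of the two-stream fusion described above.

def build_cnn_input(image, pncc):
    """Stream 1: stack the 3-channel image crop with the 3-channel PNCC
    rendering, giving a 6-channel input for the CNN regressor."""
    return np.concatenate([image, pncc], axis=0)

def fuse_streams(pncc_features, pac_features, W_fc, b_fc):
    """Combine the outputs of both streams with one fully connected layer
    that predicts the update to the intermediate parameters."""
    fused = np.concatenate([pncc_features, pac_features])
    return W_fc @ fused + b_fc
```

The fully connected layer sees both feature vectors at once, so the predicted update can draw on the appearance evidence from the PNCC stream and the semantically aligned anchors from the PAC stream.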
3DDFA — Approach performance evaluation
The researchers compared the proposed 3DDFA with state-of-the-art techniques for 3D face alignment. The models were evaluated on the AFLW and AFLW2000-3D datasets, with NME (%) as the metric (lower is better). As the results show, the proposed approach performs comparably to or better than the state of the art.
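The NME metric used in this comparison can be sketched as below: the mean per-landmark Euclidean error, divided by a normalization factor. The exact normalizer varies by benchmark (for AFLW2000-3D it is commonly the ground-truth bounding-box size); treating it as a single scalar here is a simplifying assumption:

```python
import numpy as np

def nme(pred, gt, norm_factor):
    """Normalized Mean Error (%): mean L2 distance between predicted and
    ground-truth landmarks, divided by a per-face normalization factor
    (e.g. the bounding-box size) and expressed as a percentage."""
    errors = np.linalg.norm(pred - gt, axis=1)   # per-landmark L2 error
    return errors.mean() / norm_factor * 100.0
```

Because the error is normalized per face, results are comparable across faces of different scales in the test set.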