Trueface SDK Benchmarks
A typical 1 to N face recognition pipeline involves the following steps:
- Preprocess Image
- Face and landmark detection
- Face recognition template extraction
- 1 to N identification search
To estimate the total pipeline time, add up the time for each individual step. You may also need to account for the time taken to decode your video stream.
If you choose to run face recognition on every face in your image, you will need to repeat steps 3 and 4 (template extraction and identification search) for every detected face.
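The estimate described above can be sketched as a short calculation. This is a minimal sketch and not part of the Trueface SDK: the function name `estimate_pipeline_ms` and its parameters are hypothetical, and the inputs are the i9-11900H CPU figures from the CPU benchmark table (JPG from disk, face and landmark detection, TFV7 model, search with N = 1,000,000). Video decoding time is not included.

```python
def estimate_pipeline_ms(preprocess_ms, detect_ms, extract_ms, search_ms, num_faces=1):
    """Estimate total pipeline time in milliseconds (hypothetical helper).

    Preprocessing and detection run once per image; template extraction
    and identification search are repeated once per detected face.
    """
    if num_faces < 0:
        raise ValueError("num_faces must be non-negative")
    per_image_ms = preprocess_ms + detect_ms
    per_face_ms = extract_ms + search_ms
    return per_image_ms + num_faces * per_face_ms


# Timings taken from the i9-11900H column of the CPU benchmark table.
single_face_ms = estimate_pipeline_ms(4.6, 7.9, 67.2, 26.9, num_faces=1)
three_faces_ms = estimate_pipeline_ms(4.6, 7.9, 67.2, 26.9, num_faces=3)
print(f"1 face: {single_face_ms:.1f} ms, 3 faces: {three_faces_ms:.1f} ms")
```

With these inputs the estimate is about 106.6 ms for one face and about 294.8 ms for three faces, because only the per-face steps are multiplied.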
Trueface SDK Benchmarks: CPU
Operation | RAM Req. | i9-11900H | AWS c6i.2xlarge (x86) |
Preprocess image (JPG from disk) | - | 4.6 ms | 6.4 ms |
Preprocess image (encoded JPG in memory) | - | 4.2 ms | 5.3 ms |
Preprocess image (RGB pixels array in memory) | - | 0.08 ms | 0.16 ms |
Face image orientation detection | 68 Mb | 44.6 ms | 48.0 ms |
Face and landmark detection | 42 Mb | 7.9 ms | 12.5 ms |
106 face landmark detection | 39 Mb | 2.2 ms | 2.7 ms |
Head orientation detection (yaw, pitch, roll) | 52 Mb | 0.4 ms | 0.5 ms |
Face image blur detection | 109 Mb | 8.4 ms | 12.2 ms |
Face template quality estimation | 115 Mb | 2.5 ms | 3.2 ms |
Blink detection | 100 Mb | 6.1 ms | 9.7 ms |
Mask detection | 118 Mb | 3.0 ms | 3.6 ms |
Eyeglasses detection | 40 Mb | 0.3 ms | 0.3 ms |
Passive spoof detection | 140 Mb | 24.1 ms | 30.7 ms |
Object detection (fast) | 70 Mb | 29.3 ms | 39.8 ms |
Object detection (accurate) | 80 Mb | 121.5 ms | 155.2 ms |
Face recognition LITE model | 54 Mb | 3.2 ms | 3.4 ms |
Face recognition LITE_V2 model | 76 Mb | 6.3 ms | 10.2 ms |
Face recognition LITE_V3 model | 100 Mb | 6.5 ms | 8.5 ms |
Face recognition TFV5_2 model | 260 Mb | 36.0 ms | 49.9 ms |
Face recognition TFV6 model | 260 Mb | 36.1 ms | 49.9 ms |
Face recognition TFV7 model | 500 Mb | 67.2 ms | 94.2 ms |
1 to N identification search (N = 1,000,000) TFV7 | 1250 bytes / template | 26.9 ms | 21.4 ms |
1 to N batch identification search (N = 1,000,000) TFV7 | 1250 bytes / template | 10.9 ms | 15.5 ms |
1 to N identification search times scale linearly with collection size (e.g. for a collection of size 10,000, divide the above reported times by 100). The frVectorCompression flag was set to true for the 1 to N benchmarks.
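The linear scaling rule can be written as a one-line conversion from the reported N = 1,000,000 figure. This is a minimal sketch that assumes the scaling stays strictly linear for the collection size of interest; the function name `scaled_search_ms` is hypothetical and not an SDK call.

```python
REFERENCE_COLLECTION_SIZE = 1_000_000  # N used for the reported search times


def scaled_search_ms(reference_ms, collection_size):
    """Scale a reported search time linearly to a different collection size."""
    if collection_size < 0:
        raise ValueError("collection_size must be non-negative")
    return reference_ms * collection_size / REFERENCE_COLLECTION_SIZE


# 26.9 ms is the i9-11900H search time reported for N = 1,000,000 (TFV7).
small_collection_ms = scaled_search_ms(26.9, 10_000)
print(f"Estimated search time for N = 10,000: {small_collection_ms:.3f} ms")
```

For N = 10,000 this divides the reported time by 100, giving roughly 0.269 ms.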
All benchmarks were performed using 1280x720 pixel images containing 1 face or object, with CPU only. RAM usage refers to maximum resident memory. Batch identification was tested with 100 probe templates and is used to increase throughput. The enrollment template size represents a conservative average case; the actual size varies with the length of the identity string. All benchmarks were run on Ubuntu 20.04. A Licensee’s results may vary depending on the SDK model and the nature of its input images and data.
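The per-template figure in the table (1250 bytes per TFV7 template) can be turned into a rough memory estimate for a collection. This is a sketch only: the helper name `collection_memory_bytes` is hypothetical, the result covers template storage alone, and real usage may differ because the template size varies with the identity string length.

```python
BYTES_PER_TEMPLATE = 1250  # conservative average from the CPU benchmark table (TFV7)


def collection_memory_bytes(num_templates, bytes_per_template=BYTES_PER_TEMPLATE):
    """Rough memory needed to hold a collection of enrollment templates."""
    if num_templates < 0:
        raise ValueError("num_templates must be non-negative")
    return num_templates * bytes_per_template


million_bytes = collection_memory_bytes(1_000_000)
print(f"1,000,000 templates: about {million_bytes / 1e9:.2f} GB")
```

Under these assumptions a collection of 1,000,000 templates needs about 1.25 GB for the templates themselves.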
Trueface SDK Benchmarks: GPU
Operation | VRAM Usage | RTX 4090 | AWS g4dn.xlarge (T4) |
Preprocess image (JPG from disk) | - | 6.3 ms | 7.7 ms |
Preprocess image (encoded JPG in memory) | - | 6.3 ms | 7.6 ms |
Preprocess image (RGB pixels array in memory) | - | 0.12 ms | 0.14 ms |
Face image orientation detection | 506 Mb | 0.61 ms | 2.2 ms |
Face image orientation detection, batch size = 16 | 506 Mb | 0.47 ms | 1.8 ms |
Face and landmark detection | 130 Mb | 1.6 ms | 1.8 ms |
106 face landmark detection | 148 Mb | 0.36 ms | 0.47 ms |
106 face landmark detection, batch size = 16 | 148 Mb | 0.08 ms | 0.20 ms |
Head orientation detection (yaw, pitch, roll) | 160 Mb | 0.54 ms | 0.80 ms |
Face image blur detection | 254 Mb | 0.27 ms | 0.82 ms |
Face image blur detection, batch size = 16 | 254 Mb | 0.09 ms | 0.41 ms |
Face template quality estimation | 250 Mb | 0.23 ms | 0.69 ms |
Face template quality estimation, batch size = 16 | 250 Mb | 0.08 ms | 0.33 ms |
Blink detection | 154 Mb | 0.65 ms | 0.93 ms |
Blink detection, batch size = 16 | 154 Mb | 0.35 ms | 0.55 ms |
Mask detection | 160 Mb | 0.50 ms | 0.9 ms |
Mask detection, batch size = 16 | 160 Mb | 0.07 ms | 0.17 ms |
Passive spoof detection | 278 Mb | 5.3 ms | 6.9 ms |
Object detection (fast) | 160 Mb | 4.2 ms | 6.3 ms |
Object detection (accurate) | 234 Mb | 13.6 ms | 21.9 ms |
Face recognition LITE_V2 model | 144 Mb | 0.66 ms | 0.95 ms |
Face recognition LITE_V2 model, batch size = 16 | 144 Mb | 0.12 ms | 0.43 ms |
Face recognition LITE_V3 model | 214 Mb | 0.55 ms | 1.5 ms |
Face recognition LITE_V3 model, batch size = 16 | 214 Mb | 0.17 ms | 0.77 ms |
Face recognition TFV5_2 model | 332 Mb | 1.1 ms | 3.1 ms |
Face recognition TFV5_2 model, batch size = 16 | 332 Mb | 0.36 ms | 1.8 ms |
Face recognition TFV6 model | 328 Mb | 1.2 ms | 3.2 ms |
Face recognition TFV6 model, batch size = 16 | 328 Mb | 0.37 ms | 1.8 ms |
Face recognition TFV7 model | 430 Mb | 2.2 ms | 6.0 ms |
Face recognition TFV7 model, batch size = 16 | 430 Mb | 0.69 ms | 3.2 ms |
1 to N identification search is performed on CPU only, so refer to CPU times.
How batching times are reported:
With a batch size of 4, we generate 4 face recognition templates at the same time. The total time taken to generate those templates is 16.4 ms, meaning the average time per template is 4.1 ms.
With a batch size of 64, we generate 64 face recognition templates at the same time. The total time taken to generate those templates is 136.96 ms, meaning the average time per template is 2.14 ms.
Operations which do not show a batch size only support a batch size of 1 at this time.
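The reporting convention above amounts to dividing the total batch time by the batch size. The sketch below reproduces the two worked examples and also derives a throughput figure; the function names are hypothetical, and the throughput calculation assumes batches are processed back to back with no other overhead.

```python
def per_item_ms(total_batch_ms, batch_size):
    """Average time per item, which is the value reported for batched operations."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    return total_batch_ms / batch_size


def items_per_second(item_ms):
    """Throughput implied by an average per-item time in milliseconds."""
    if item_ms <= 0:
        raise ValueError("item_ms must be positive")
    return 1000.0 / item_ms


avg_batch_4 = per_item_ms(16.4, 4)       # example with batch size 4
avg_batch_64 = per_item_ms(136.96, 64)   # example with batch size 64
print(f"batch 4: {avg_batch_4:.2f} ms per template")
print(f"batch 64: {avg_batch_64:.2f} ms per template, "
      f"about {items_per_second(avg_batch_64):.0f} templates per second")
```

The larger batch has a higher total time but a lower average time per template, which is why batching is used to increase throughput.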
All benchmarks were performed using 1280x720 pixel images containing 1 face or object, with GPU enabled. GPU benchmarks use FP16 inference. A Licensee’s results may vary depending on the SDK model and the nature of its input images and data.
Trueface On-Prem Benchmarks
Operation | RAM Req. | AWS c6i.2xlarge (CPU) | AWS g4dn.xlarge (GPU) |
Templatize Face | 2.1 Gb | 104 ms | 31 ms |
Match Faces | 2.1 Gb | 249 ms | 60 ms |
Enroll Face | 2.1 Gb | 93 ms | 30 ms |
Identify Face (N = 1000) | 2.1 Gb | 119 ms | 31 ms |
Spoof Detection | 2.5 Gb | 78 ms | 35 ms |
All benchmarks were performed using a 1280x720 px JPG image on disk, with the default smallest face height of 100 and the TFV5_2 face recognition model. Requests were sent from the same machine running the PTOP server in order to avoid any network overhead in the measurements. A Licensee’s results may vary depending on the SDK model and the nature of its input images and data.