GPU 性能数据¶
测试条件¶
- 测试模型 - MobileNetV1 
- MobileNetV2 
- ResNet50 
- mask_rcnn_r50_vd_fpn_1x_coco 
- ssdlite_mobilenet_v1_300_coco 
- yolov3_darknet53_270e_coco 
- deeplabv3p_resnet50 
- bert 
- ViT_base_patch32_384 
- SwinTransformer_base_patch4_window12_384 
 
- 测试机器信息 - NVIDIA® T4 GPU 
- Intel(R) Xeon(R) Gold 6271C CPU @ 2.60GHz 
- CUDA 11.2.2 
- cuDNN 8.2.1 
- TensorRT 8.0.3.4 
 
- 测试说明 - 测试 PaddlePaddle 版本:v2.3 
- warmup=10,repeats=1000,统计平均时间,单位为 ms。 
- cpu_math_library_num_threads=1,num_samples=1000。 
 
数据¶
| model_name | precision | batch_size | avg_latency | 
|---|---|---|---|
| MobileNetV1 | fp16 | 1 | 0.4925 | 
| MobileNetV1 | fp16 | 2 | 0.7485 | 
| MobileNetV1 | fp16 | 4 | 1.2914 | 
| MobileNetV1 | fp32 | 1 | 0.8737 | 
| MobileNetV1 | fp32 | 2 | 1.4106 | 
| MobileNetV1 | fp32 | 4 | 2.5238 | 
| MobileNetV2 | fp16 | 1 | 0.5926 | 
| MobileNetV2 | fp16 | 2 | 0.9131 | 
| MobileNetV2 | fp16 | 4 | 1.4491 | 
| MobileNetV2 | fp32 | 1 | 1.1125 | 
| MobileNetV2 | fp32 | 2 | 1.6682 | 
| MobileNetV2 | fp32 | 4 | 2.819 | 
| ResNet50 | fp16 | 1 | 1.3045 | 
| ResNet50 | fp16 | 2 | 1.8964 | 
| ResNet50 | fp16 | 4 | 3.1821 | 
| ResNet50 | fp32 | 1 | 3.5244 | 
| ResNet50 | fp32 | 2 | 5.2147 | 
| ResNet50 | fp32 | 4 | 9.3702 | 
| SwinTransformer_base_patch4_window12_384 | fp16 | 1 | 23.0886 | 
| SwinTransformer_base_patch4_window12_384 | fp16 | 2 | 42.2748 | 
| SwinTransformer_base_patch4_window12_384 | fp16 | 4 | 87.3252 | 
| SwinTransformer_base_patch4_window12_384 | fp32 | 1 | 43.5075 | 
| SwinTransformer_base_patch4_window12_384 | fp32 | 2 | 87.5455 | 
| SwinTransformer_base_patch4_window12_384 | fp32 | 4 | 173.796 | 
| ViT_base_patch32_384 | fp16 | 1 | 4.923 | 
| ViT_base_patch32_384 | fp16 | 2 | 7.5347 | 
| ViT_base_patch32_384 | fp16 | 4 | 12.899 | 
| ViT_base_patch32_384 | fp32 | 1 | 10.8246 | 
| ViT_base_patch32_384 | fp32 | 2 | 18.5213 | 
| ViT_base_patch32_384 | fp32 | 4 | 34.7381 | 
| deeplabv3p_resnet50 | fp16 | 1 | 26.1575 | 
| deeplabv3p_resnet50 | fp16 | 2 | 47.9256 | 
| deeplabv3p_resnet50 | fp16 | 4 | 95.9487 | 
| deeplabv3p_resnet50 | fp32 | 1 | 66.8809 | 
| deeplabv3p_resnet50 | fp32 | 2 | 133.6688 | 
| deeplabv3p_resnet50 | fp32 | 4 | 266.9613 | 
| mask_rcnn_r50_vd_fpn_1x_coco | fp16 | 1 | 40.6577 | 
| mask_rcnn_r50_vd_fpn_1x_coco | fp32 | 1 | 101.93 | 
| yolov3_darknet53_270e_coco | fp16 | 1 | 20.6326 | 
| yolov3_darknet53_270e_coco | fp16 | 2 | 41.5202 | 
| yolov3_darknet53_270e_coco | fp16 | 4 | 80.3059 | 
| yolov3_darknet53_270e_coco | fp32 | 1 | 44.1216 | 
| yolov3_darknet53_270e_coco | fp32 | 2 | 85.4666 | 
| yolov3_darknet53_270e_coco | fp32 | 4 | 183.9448 |