GFPGAN to TorchScript/TensorRT #263

Open
opened 2026-01-29 21:46:17 +00:00 by claunia · 10 comments

Originally created by @lschaupp on GitHub (Sep 27, 2022).

Hello, I am trying to convert the GFPGAN model to TorchScript/TensorRT to improve inference performance. Have any efforts been made on this yet?

So far I have made a successful conversion to ONNX (including the StyleGAN decoder).
However, conversion to TorchScript (or even just tracing) results in errors in the StyleGAN decoder part.
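
For reference, a rough sketch of the ONNX export path (the constructor arguments, checkpoint key and file paths below are assumptions for the GFPGANv1Clean variant, not the exact script I used, and will likely need adjusting):

```python
import torch
from gfpgan.archs.gfpganv1_clean_arch import GFPGANv1Clean


class ImageOnly(torch.nn.Module):
    """Wrapper so the exported graph returns only the restored image.

    GFPGANv1Clean.forward also returns intermediate RGB outputs, which gets in the
    way of a clean single-output ONNX graph. randomize_noise=False keeps the trace
    deterministic.
    """

    def __init__(self, net):
        super().__init__()
        self.net = net

    def forward(self, x):
        return self.net(x, return_rgbs=False, randomize_noise=False)[0]


# Assumed arguments for the 512x512 "clean" models (v1.3/v1.4); adjust to your checkpoint.
net = GFPGANv1Clean(
    out_size=512, num_style_feat=512, channel_multiplier=2,
    decoder_load_path=None, fix_decoder=False, num_mlp=8,
    input_is_latent=True, different_w=True, narrow=1, sft_half=True)

ckpt = torch.load('GFPGANv1.4.pth', map_location='cpu')  # placeholder path
net.load_state_dict(ckpt['params_ema'], strict=False)    # released checkpoints store weights under 'params_ema' (assumption)
model = ImageOnly(net).eval()

dummy = torch.randn(1, 3, 512, 512)
torch.onnx.export(
    model, dummy, 'gfpgan.onnx',
    input_names=['input'], output_names=['output'],
    opset_version=16, do_constant_folding=True)
```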

@NingNanXin commented on GitHub (Apr 3, 2023):

Hi, I successfully exported a TensorRT engine with TensorRT 8.5, but the ONNX result != the TensorRT result.
Did you run into this problem?
This is my Polygraphy output.
![Polygraphy comparison output](https://user-images.githubusercontent.com/56398622/229459760-40babdd3-6d68-40c0-93c6-356374c36a80.png)
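
In case it is useful, a sketch of that comparison with the Polygraphy Python API (the model path and tolerances are placeholders):

```python
from polygraphy.backend.onnxrt import OnnxrtRunner, SessionFromOnnx
from polygraphy.backend.trt import CreateConfig, EngineFromNetwork, NetworkFromOnnxPath, TrtRunner
from polygraphy.comparator import Comparator, CompareFunc

onnx_path = "gfpgan.onnx"  # placeholder

# One runner per backend; the Comparator feeds both the same generated input data.
runners = [
    OnnxrtRunner(SessionFromOnnx(onnx_path)),
    TrtRunner(EngineFromNetwork(NetworkFromOnnxPath(onnx_path), config=CreateConfig(fp16=False))),
]

results = Comparator.run(runners)
# Compare with explicit tolerances rather than the defaults.
ok = bool(Comparator.compare_accuracy(results, compare_func=CompareFunc.simple(atol=1e-3, rtol=1e-3)))
print("ONNX Runtime vs TensorRT outputs match:", ok)
```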

@bychen7 commented on GitHub (May 17, 2023):

@NingNanXin Hi, you can try this https://github.com/bychen7/Face-Restoration-TensorRT.

@topninja commented on GitHub (May 17, 2023):

@NingNanXin
https://github.com/bychen7/Face-Restoration-TensorRT
Can this repo be run on Windows 10?

@bychen7 commented on GitHub (May 17, 2023):

> @NingNanXin https://github.com/bychen7/Face-Restoration-TensorRT Can this repo be run on Windows 10?

I think it is possible, and the code does not need to be changed. However, I have not tested it on Win10, and it would require replacing all dependencies with their Windows versions.

@NingNanXin commented on GitHub (May 18, 2023):

> @NingNanXin Hi, you can try this https://github.com/bychen7/Face-Restoration-TensorRT.

Thanks for your work. I think you could push it to tensorrtx by wangxinyu so that more people benefit from it.
I tested it on Ubuntu 18.04 with TensorRT 8.2 and got good performance, but you didn't set FP16 precision, so does all of your ONNX stay in FP32?

![ONNX precision screenshot](https://github.com/TencentARC/GFPGAN/assets/56398622/056ab074-b5b0-408c-b58b-2261fb328bde)

You can also try TensorRT > 8.5; it can handle a convolution with two inputs.
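
For what it's worth, a sketch of enabling FP16 when building the engine with the TensorRT Python API (paths are placeholders; the accuracy impact has to be re-checked afterwards):

```python
import tensorrt as trt

logger = trt.Logger(trt.Logger.INFO)
builder = trt.Builder(logger)
network = builder.create_network(1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
parser = trt.OnnxParser(network, logger)

with open("gfpgan.onnx", "rb") as f:  # placeholder path
    if not parser.parse(f.read()):
        for i in range(parser.num_errors):
            print(parser.get_error(i))
        raise RuntimeError("failed to parse the ONNX model")

config = builder.create_builder_config()
config.set_flag(trt.BuilderFlag.FP16)  # allow FP16 kernels

# TensorRT >= 8.0: build_serialized_network returns the serialized engine directly.
engine_bytes = builder.build_serialized_network(network, config)
with open("gfpgan_fp16.engine", "wb") as f:
    f.write(engine_bytes)
```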

@bychen7 commented on GitHub (May 18, 2023):

> > @NingNanXin Hi, you can try this https://github.com/bychen7/Face-Restoration-TensorRT.
>
> Thanks for your work. I think you could push it to tensorrtx by wangxinyu so that more people benefit from it. I tested it on Ubuntu 18.04 with TensorRT 8.2 and got good performance, but you didn't set FP16 precision, so does all of your ONNX stay in FP32?
>
> You can also try TensorRT > 8.5; it can handle a convolution with two inputs.

Thank you for your suggestion. The quality of fp16 generation may decrease. This work was done a year ago, and I reviewed the records from that time. Please refer to the following comparison results.
![FP32 vs FP16 comparison](https://github.com/TencentARC/GFPGAN/assets/55865490/304505ac-48da-481f-8478-8c4b38c8e61e)

@NingNanXin commented on GitHub (May 18, 2023):

@bychen7
Yes, I converted GFPGAN to FP16 and got the same problem. Even retraining the model did not solve it. However, GPEN converts to FP16 correctly.
Another question: do you find that TensorRT inference is slower than PyTorch when the resolution is 512x512? For example, the GFPGAN JIT model averages 14 ms/frame, but the FP32 TensorRT engine averages 36 ms/frame. My test device is an RTX 2080 Ti.
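
For anyone reproducing these numbers, the PyTorch side has to be timed with CUDA synchronization, otherwise the kernel launch queueing is measured instead of the actual work. A minimal sketch (the model and input shape are placeholders):

```python
import time
import torch


def benchmark(model, shape=(1, 3, 512, 512), device="cuda", warmup=10, iters=100):
    """Return the average latency in ms/frame."""
    x = torch.randn(*shape, device=device)
    model = model.to(device).eval()
    with torch.no_grad():
        for _ in range(warmup):       # warm-up excludes one-off CUDA init / autotuning cost
            model(x)
        torch.cuda.synchronize()
        start = time.perf_counter()
        for _ in range(iters):
            model(x)
        torch.cuda.synchronize()      # wait for all queued kernels before stopping the clock
    return (time.perf_counter() - start) / iters * 1000.0


# jit_model = torch.jit.load("gfpgan_jit.pt")  # placeholder path
# print(f"{benchmark(jit_model):.1f} ms/frame")
```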

@debasishaimonk commented on GitHub (Oct 26, 2023):

@NingNanXin did you solve it?

@NingNanXin commented on GitHub (Oct 27, 2023):

@debasishaimonk No. I tried keeping some layers in FP32 and the others in FP16, but a TensorRT engine exported that way did not work.
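
A sketch of what pinning layers looks like with the TensorRT Python API (the name filter is only a placeholder heuristic to illustrate the approach, not the exact export that was tried):

```python
import tensorrt as trt


def pin_layers_to_fp32(network, config, name_filter="style"):
    """Enable FP16 globally but keep layers whose name matches `name_filter` in FP32."""
    config.set_flag(trt.BuilderFlag.FP16)
    # Make TensorRT respect the per-layer requests instead of treating them as hints.
    config.set_flag(trt.BuilderFlag.OBEY_PRECISION_CONSTRAINTS)
    for i in range(network.num_layers):
        layer = network.get_layer(i)
        if name_filter in layer.name.lower():
            layer.precision = trt.float32
            layer.set_output_type(0, trt.float32)


# Usage: build `network` and `config` from the ONNX model as usual, call
# pin_layers_to_fp32(network, config), then builder.build_serialized_network(network, config).
```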

@ListonQH commented on GitHub (Nov 29, 2023):

> @bychen7 Yes, I converted GFPGAN to FP16 and got the same problem. Even retraining the model did not solve it. However, GPEN converts to FP16 correctly. Another question: do you find that TensorRT inference is slower than PyTorch when the resolution is 512x512? For example, the GFPGAN JIT model averages 14 ms/frame, but the FP32 TensorRT engine averages 36 ms/frame. My test device is an RTX 2080 Ti.

My device is a laptop RTX 4070 with input [1, 3, 512, 512]; FP16 inference averages 16-18 ms/frame and FP32 about 42 ms/frame. What's more, the FP32 inference result is correct, but FP16 just gives a black image. Were your FP16 results normal?
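
A black FP16 output is often a sign of NaN/Inf overflow somewhere in the network. A quick way to check the raw engine outputs with Polygraphy (input name and shape are assumptions):

```python
import numpy as np
from polygraphy.backend.trt import CreateConfig, EngineFromNetwork, NetworkFromOnnxPath, TrtRunner

build_fp16 = EngineFromNetwork(NetworkFromOnnxPath("gfpgan.onnx"),  # placeholder path
                               config=CreateConfig(fp16=True))

with TrtRunner(build_fp16) as runner:
    feed = {"input": np.random.rand(1, 3, 512, 512).astype(np.float32)}  # assumed input name/shape
    outputs = runner.infer(feed)
    for name, arr in outputs.items():
        print(name, "NaN:", bool(np.isnan(arr).any()), "Inf:", bool(np.isinf(arr).any()),
              "min/max:", float(arr.min()), float(arr.max()))
```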

Reference: TencentARC/GFPGAN#263