v0.2.4

fix bug in inference: RealESRGAN model is None
update utils and unittest
2026-02-19 05:50:28 +00:00 · 2021-12-12 22:54:36 +08:00 · 2021-12-12 22:46:07 +08:00 · 2021-11-28 23:09:38 +08:00 · 2021-11-27 19:59:23 +08:00 · 2021-10-22 17:06:29 +08:00
46 changed files with 1620 additions and 438 deletions
--- a/.github/workflows/no-response.yml
+++ b/.github/workflows/no-response.yml
@@ -0,0 +1,33 @@
+name: No Response
+
+# TODO: it seems not to work
+# Modified from: https://raw.githubusercontent.com/github/docs/main/.github/workflows/no-response.yaml
+
+# **What it does**: Closes issues that don't have enough information to be actionable.
+# **Why we have it**: To remove the need for maintainers to remember to check back on issues periodically
+#                     to see if contributors have responded.
+# **Who does it impact**: Everyone that works on docs or docs-internal.
+
+on:
+  issue_comment:
+    types: [created]
+
+  schedule:
+    # Schedule for five minutes after the hour every hour
+    - cron: '5 * * * *'
+
+jobs:
+  noResponse:
+    runs-on: ubuntu-latest
+    steps:
+      - uses: lee-dohm/no-response@v0.5.0
+        with:
+          token: ${{ github.token }}
+          closeComment: >
+            This issue has been automatically closed because there has been no response
+            to our request for more information from the original author. With only the
+            information that is currently in the issue, we don't have enough information
+            to take action. Please reach out if you have or find the answers we need so
+            that we can investigate further.
+            If you still have questions, please improve your description and re-open it.
+            Thanks :-)
--- a/.github/workflows/publish-pip.yml
+++ b/.github/workflows/publish-pip.yml
@@ -0,0 +1,29 @@
+name: PyPI Publish
+
+on: push
+
+jobs:
+  build-n-publish:
+    runs-on: ubuntu-latest
+    if: startsWith(github.event.ref, 'refs/tags')
+
+    steps:
+      - uses: actions/checkout@v2
+      - name: Set up Python 3.8
+        uses: actions/setup-python@v1
+        with:
+          python-version: 3.8
+      - name: Upgrade pip
+        run: pip install pip --upgrade
+      - name: Install PyTorch (cpu)
+        run: pip install torch==1.7.0+cpu torchvision==0.8.1+cpu -f https://download.pytorch.org/whl/torch_stable.html
+      - name: Install dependencies
+        run: pip install -r requirements.txt
+      - name: Build and install
+        run: rm -rf .eggs && pip install -e .
+      - name: Build for distribution
+        run: python setup.py sdist bdist_wheel
+      - name: Publish distribution to PyPI
+        uses: pypa/gh-action-pypi-publish@master
+        with:
+          password: ${{ secrets.PYPI_API_TOKEN }}
--- a/.github/workflows/pylint.yml
+++ b/.github/workflows/pylint.yml
@@ -20,10 +20,11 @@ jobs:
    - name: Install dependencies
      run: |
        python -m pip install --upgrade pip
-        pip install flake8 yapf isort
+        pip install codespell flake8 isort yapf

    - name: Lint
      run: |
+        codespell
        flake8 .
-        isort --check-only --diff data/ archs/ models/ train.py inference_gfpgan_full.py
-        yapf -r -d data/ archs/ models/ train.py inference_gfpgan_full.py
+        isort --check-only --diff gfpgan/ scripts/ inference_gfpgan.py setup.py
+        yapf -r -d gfpgan/ scripts/ inference_gfpgan.py setup.py
--- a/.gitignore
+++ b/.gitignore
@@ -1,22 +1,13 @@
-.vscode
+# ignored folders
 datasets/*
 experiments/*
+results/*
 tb_logger/*
+wandb/*
+tmp/*

-# ignored files
 version.py

-# ignored files with suffix
-*.html
-*.png
-*.jpeg
-*.jpg
-*.gif
-*.pth
-*.zip
-
-# template
-
 # Byte-compiled / optimized / DLL files
 __pycache__/
 *.py[cod]
@@ -39,6 +30,8 @@ parts/
 sdist/
 var/
 wheels/
+pip-wheel-metadata/
+share/python-wheels/
 *.egg-info/
 .installed.cfg
 *.egg
@@ -57,12 +50,14 @@ pip-delete-this-directory.txt
 # Unit test / coverage reports
 htmlcov/
 .tox/
+.nox/
 .coverage
 .coverage.*
 .cache
 nosetests.xml
 coverage.xml
 *.cover
+*.py,cover
 .hypothesis/
 .pytest_cache/

@@ -74,6 +69,7 @@ coverage.xml
 *.log
 local_settings.py
 db.sqlite3
+db.sqlite3-journal

 # Flask stuff:
 instance/
@@ -91,11 +87,26 @@ target/
 # Jupyter Notebook
 .ipynb_checkpoints

+# IPython
+profile_default/
+ipython_config.py
+
 # pyenv
 .python-version

-# celery beat schedule file
+# pipenv
+#   According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
+#   However, in case of collaboration, if having platform-specific dependencies or dependencies
+#   having no cross-platform support, pipenv may install dependencies that don't work, or not
+#   install all needed dependencies.
+#Pipfile.lock
+
+# PEP 582; used by e.g. github.com/David-OConnor/pyflow
+__pypackages__/
+
+# Celery stuff
 celerybeat-schedule
+celerybeat.pid

 # SageMath parsed files
 *.sage.py
@@ -121,3 +132,8 @@ venv.bak/

 # mypy
 .mypy_cache/
+.dmypy.json
+dmypy.json
+
+# Pyre type checker
+.pyre/
--- a/.pre-commit-config.yaml
+++ b/.pre-commit-config.yaml
@@ -24,6 +24,12 @@ repos:
    hooks:
      - id: yapf

+  # codespell
+  - repo: https://github.com/codespell-project/codespell
+    rev: v2.1.0
+    hooks:
+      - id: codespell
+
  # pre-commit-hooks
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v3.2.0
--- a/.vscode/settings.json
+++ b/.vscode/settings.json
@@ -0,0 +1,19 @@
+{
+    "files.trimTrailingWhitespace": true,
+    "editor.wordWrap": "on",
+    "editor.rulers": [
+        80,
+        120
+    ],
+    "editor.renderWhitespace": "all",
+    "editor.renderControlCharacters": true,
+    "python.formatting.provider": "yapf",
+    "python.formatting.yapfArgs": [
+        "--style",
+        "{BASED_ON_STYLE = pep8, BLANK_LINE_BEFORE_NESTED_CLASS_OR_DEF = true, SPLIT_BEFORE_EXPRESSION_AFTER_OPENING_PAREN = true, COLUMN_LIMIT = 120}"
+    ],
+    "python.linting.flake8Enabled": true,
+    "python.linting.flake8Args": [
+        "--max-line-length=120"
+    ],
+}
--- a/MANIFEST.in
+++ b/MANIFEST.in
@@ -0,0 +1,8 @@
+include assets/*
+include inputs/*
+include scripts/*.py
+include inference_gfpgan.py
+include VERSION
+include LICENSE
+include requirements.txt
+include gfpgan/weights/README.md
--- a/PaperModel.md
+++ b/PaperModel.md
@@ -27,6 +27,7 @@ If you want want to use the original model in our paper, please follow the instr
    pip install facexlib

    pip install -r requirements.txt
+    python setup.py develop

    # remember to set BASICSR_JIT=True before your running commands
    ```
@@ -45,6 +46,7 @@ If you want want to use the original model in our paper, please follow the instr
    pip install facexlib

    pip install -r requirements.txt
+    python setup.py develop
    ```

 ## :zap: Quick Inference
@@ -58,17 +60,17 @@ wget https://github.com/TencentARC/GFPGAN/releases/download/v0.1.0/GFPGANv1.pth
 - Option 1: Load extensions just-in-time(JIT)

    ```bash
-    BASICSR_JIT=True python inference_gfpgan_full.py --model_path experiments/pretrained_models/GFPGANv1.pth --test_path inputs/whole_imgs --save_root results --arch original --channel 1
+    BASICSR_JIT=True python inference_gfpgan.py --model_path experiments/pretrained_models/GFPGANv1.pth --test_path inputs/whole_imgs --save_root results --arch original --channel 1

    # for aligned images
-    BASICSR_JIT=True python inference_gfpgan_full.py --model_path experiments/pretrained_models/GFPGANv1.pth --test_path inputs/whole_imgs --save_root results --arch original --channel 1 --aligned
+    BASICSR_JIT=True python inference_gfpgan.py --model_path experiments/pretrained_models/GFPGANv1.pth --test_path inputs/cropped_faces --save_root results --arch original --channel 1 --aligned
    ```

 - Option 2: Have successfully compiled extensions during installation

    ```bash
-    python inference_gfpgan_full.py --model_path experiments/pretrained_models/GFPGANv1.pth --test_path inputs/whole_imgs --save_root results --arch original --channel 1
+    python inference_gfpgan.py --model_path experiments/pretrained_models/GFPGANv1.pth --test_path inputs/whole_imgs --save_root results --arch original --channel 1

    # for aligned images
-    python inference_gfpgan_full.py --model_path experiments/pretrained_models/GFPGANv1.pth --test_path inputs/whole_imgs --save_root results --arch original --channel 1 --aligned
+    python inference_gfpgan.py --model_path experiments/pretrained_models/GFPGANv1.pth --test_path inputs/cropped_faces --save_root results --arch original --channel 1 --aligned
    ```
--- a/README.md
+++ b/README.md
@@ -1,21 +1,36 @@
 # GFPGAN (CVPR 2021)

 [![download](https://img.shields.io/github/downloads/TencentARC/GFPGAN/total.svg)](https://github.com/TencentARC/GFPGAN/releases)
-[![Open issue](https://isitmaintained.com/badge/open/TencentARC/GFPGAN.svg)](https://github.com/TencentARC/GFPGAN/issues)
+[![PyPI](https://img.shields.io/pypi/v/gfpgan)](https://pypi.org/project/gfpgan/)
+[![Open issue](https://img.shields.io/github/issues/TencentARC/GFPGAN)](https://github.com/TencentARC/GFPGAN/issues)
+[![Closed issue](https://img.shields.io/github/issues-closed/TencentARC/GFPGAN)](https://github.com/TencentARC/GFPGAN/issues)
 [![LICENSE](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://github.com/TencentARC/GFPGAN/blob/master/LICENSE)
 [![python lint](https://github.com/TencentARC/GFPGAN/actions/workflows/pylint.yml/badge.svg)](https://github.com/TencentARC/GFPGAN/blob/master/.github/workflows/pylint.yml)
+[![Publish-pip](https://github.com/TencentARC/GFPGAN/actions/workflows/publish-pip.yml/badge.svg)](https://github.com/TencentARC/GFPGAN/blob/master/.github/workflows/publish-pip.yml)

-1. [Colab Demo](https://colab.research.google.com/drive/1sVsoBd9AjckIXThgtZhGrHRfFI6UUYOo) for GFPGAN <a href="https://colab.research.google.com/drive/1sVsoBd9AjckIXThgtZhGrHRfFI6UUYOo"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="google colab logo"></a>
+1. [Colab Demo](https://colab.research.google.com/drive/1sVsoBd9AjckIXThgtZhGrHRfFI6UUYOo) for GFPGAN <a href="https://colab.research.google.com/drive/1sVsoBd9AjckIXThgtZhGrHRfFI6UUYOo"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="google colab logo"></a>; (Another [Colab Demo](https://colab.research.google.com/drive/1Oa1WwKB4M4l1GmR7CtswDVgOCOeSLChA?usp=sharing) for the original paper model)
 1. We provide a *clean* version of GFPGAN, which can run without CUDA extensions. So that it can run in **Windows** or on **CPU mode**.

 GFPGAN aims at developing **Practical Algorithm for Real-world Face Restoration**.<br>
 It leverages rich and diverse priors encapsulated in a pretrained face GAN (*e.g.*, StyleGAN2) for blind face restoration.

 :triangular_flag_on_post: **Updates**
-
-- :white_check_mark: We provide a *clean* version of GFPGAN, which does not require CUDA extensionts.
+- :white_check_mark: Integrated into [Hugging Face Spaces](https://huggingface.co/spaces) with [Gradio](https://github.com/gradio-app/gradio). See [Gradio Web Demo](https://huggingface.co/spaces/akhaliq/GFPGAN).
+- :white_check_mark: Support enhancing non-face regions (background) with [Real-ESRGAN](https://github.com/xinntao/Real-ESRGAN).
+- :white_check_mark: We provide a *clean* version of GFPGAN, which does not require CUDA extensions.
 - :white_check_mark: We provide an updated model without colorizing faces.

+---
+
+If GFPGAN is helpful for your photos/projects, please help to :star: this repo or recommend it to your friends. Thanks :blush:
+Other recommended projects:<br>
+:arrow_forward: [Real-ESRGAN](https://github.com/xinntao/Real-ESRGAN): A practical algorithm for general image restoration<br>
+:arrow_forward: [BasicSR](https://github.com/xinntao/BasicSR): An open-source image and video restoration toolbox<br>
+:arrow_forward: [facexlib](https://github.com/xinntao/facexlib): A collection that provides useful face-related functions.<br>
+:arrow_forward: [HandyView](https://github.com/xinntao/HandyView): A PyQt5-based image viewer that is handy for viewing and comparison. <br>
+
+---
+
 ### :book: GFP-GAN: Towards Real-World Blind Face Restoration with Generative Facial Prior

 > [[Paper](https://arxiv.org/abs/2101.04061)] &emsp; [[Project Page](https://xinntao.github.io/projects/gfpgan)] &emsp; [Demo] <br>
@@ -33,7 +48,7 @@ It leverages rich and diverse priors encapsulated in a pretrained face GAN (*e.g
 - Python >= 3.7 (Recommend to use [Anaconda](https://www.anaconda.com/download/#linux) or [Miniconda](https://docs.conda.io/en/latest/miniconda.html))
 - [PyTorch >= 1.7](https://pytorch.org/)
 - Option: NVIDIA GPU + [CUDA](https://developer.nvidia.com/cuda-downloads)
-- Option: Linux (We have not tested on Windows)
+- Option: Linux

 ### Installation

@@ -59,6 +74,11 @@ If you want want to use the original model in our paper, please see [PaperModel.
    pip install facexlib

    pip install -r requirements.txt
+    python setup.py develop
+
+    # If you want to enhance the background (non-face) regions with Real-ESRGAN,
+    # you also need to install the realesrgan package
+    pip install realesrgan
    ```

 ## :zap: Quick Inference
@@ -72,13 +92,17 @@ wget https://github.com/TencentARC/GFPGAN/releases/download/v0.2.0/GFPGANCleanv1
 **Inference!**

 ```bash
-python inference_gfpgan_full.py --upscale_factor 2 --test_path inputs/whole_imgs --save_root results
+python inference_gfpgan.py --upscale 2 --test_path inputs/whole_imgs --save_root results
 ```

+If you want to use the original model in our paper, please see [PaperModel.md](PaperModel.md) for installation and inference.
+
 ## :european_castle: Model Zoo

-- [GFPGANCleanv1-NoCE-C2.pth](https://github.com/TencentARC/GFPGAN/releases/download/v0.2.0/GFPGANCleanv1-NoCE-C2.pth)
-- [GFPGANv1.pth](https://github.com/TencentARC/GFPGAN/releases/download/v0.1.0/GFPGANv1.pth)
+- [GFPGANCleanv1-NoCE-C2.pth](https://github.com/TencentARC/GFPGAN/releases/download/v0.2.0/GFPGANCleanv1-NoCE-C2.pth): No colorization; no CUDA extensions are required. It is still in training. Trained with more data with pre-processing.
+- [GFPGANv1.pth](https://github.com/TencentARC/GFPGAN/releases/download/v0.1.0/GFPGANv1.pth): The paper model, with colorization.
+
+You can find **more models (such as the discriminators)** here: [[Google Drive](https://drive.google.com/drive/folders/17rLiFzcUMoQuhLnptDsKolegHWwJOnHu?usp=sharing)], OR [[Tencent Cloud 腾讯微云](https://share.weiyun.com/ShYoCCoc)]

 ## :computer: Training

@@ -90,10 +114,9 @@ You could improve it according to your own needs.
 1. More high quality faces can improve the restoration quality.
 2. You may need to perform some pre-processing, such as beauty makeup.

-
 **Procedures**

-(You can try a simple version ( `train_gfpgan_v1_simple.yml`) that does not require face component landmarks.)
+(You can try a simple version ( `options/train_gfpgan_v1_simple.yml`) that does not require face component landmarks.)

 1. Dataset preparation: [FFHQ](https://github.com/NVlabs/ffhq-dataset)

@@ -102,11 +125,11 @@ You could improve it according to your own needs.
    1. [Component locations of FFHQ: FFHQ_eye_mouth_landmarks_512.pth](https://github.com/TencentARC/GFPGAN/releases/download/v0.1.0/FFHQ_eye_mouth_landmarks_512.pth)
    1. [A simple ArcFace model: arcface_resnet18.pth](https://github.com/TencentARC/GFPGAN/releases/download/v0.1.0/arcface_resnet18.pth)

-1. Modify the configuration file `train_gfpgan_v1.yml` accordingly.
+1. Modify the configuration file `options/train_gfpgan_v1.yml` accordingly.

 1. Training

-> python -m torch.distributed.launch --nproc_per_node=4 --master_port=22021 train.py -opt train_gfpgan_v1.yml --launcher pytorch
+> python -m torch.distributed.launch --nproc_per_node=4 --master_port=22021 gfpgan/train.py -opt options/train_gfpgan_v1.yml --launcher pytorch

 ## :scroll: License and Acknowledgement

--- a/1
+++ b/1
@@ -0,0 +1 @@
+0.2.4
--- a/gfpgan/__init__.py
+++ b/gfpgan/__init__.py
@@ -0,0 +1,6 @@
+# flake8: noqa
+from .archs import *
+from .data import *
+from .models import *
+from .utils import *
+from .version import *
--- a/gfpgan/archs/__init__.py
+++ b/gfpgan/archs/__init__.py
@@ -1,12 +1,10 @@
 import importlib
+from basicsr.utils import scandir
 from os import path as osp

-from basicsr.utils import scandir
-
 # automatically scan and import arch modules for registry
-# scan all the files under the 'archs' folder and collect files ending with
-# '_arch.py'
+# scan all the files that end with '_arch.py' under the archs folder
 arch_folder = osp.dirname(osp.abspath(__file__))
 arch_filenames = [osp.splitext(osp.basename(v))[0] for v in scandir(arch_folder) if v.endswith('_arch.py')]
 # import all the arch modules
-_arch_modules = [importlib.import_module(f'archs.{file_name}') for file_name in arch_filenames]
+_arch_modules = [importlib.import_module(f'gfpgan.archs.{file_name}') for file_name in arch_filenames]
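Note on the hunk above: it changes the import prefix from `archs.` to `gfpgan.archs.`, presumably so that the automatic scan still resolves once the code is installed as a package instead of being run from the repository root. Below is a minimal sketch of that scan-and-import pattern. It assumes only the standard library: `os.listdir` stands in for `basicsr.utils.scandir`, and the helper name `collect_arch_modules` and the throwaway `demo_pkg` package are hypothetical, not part of GFPGAN.

```python
import importlib
import os
import sys
import tempfile
from os import path as osp


def collect_arch_modules(folder, package):
    """Import every '*_arch.py' file in `folder` as a submodule of `package`.

    Hypothetical helper: os.listdir stands in for basicsr.utils.scandir.
    """
    names = sorted(osp.splitext(f)[0] for f in os.listdir(folder) if f.endswith('_arch.py'))
    # A fully qualified module name resolves from any working directory,
    # as long as the top-level package is importable.
    return [importlib.import_module(f'{package}.{name}') for name in names]


# Build a throwaway package so the sketch runs without GFPGAN installed.
root = tempfile.mkdtemp()
pkg_dir = osp.join(root, 'demo_pkg', 'archs')
os.makedirs(pkg_dir)
for init_dir in (osp.join(root, 'demo_pkg'), pkg_dir):
    open(osp.join(init_dir, '__init__.py'), 'w').close()
for fname in ('foo_arch.py', 'bar_arch.py', 'helper.py'):
    with open(osp.join(pkg_dir, fname), 'w') as f:
        f.write('NAME = __name__\n')

sys.path.insert(0, root)
importlib.invalidate_caches()  # make sure the freshly written files are seen

modules = collect_arch_modules(pkg_dir, 'demo_pkg.archs')
loaded_names = [m.NAME for m in modules]
print(loaded_names)
```

Only the files ending in `_arch.py` are imported; `helper.py` is left alone, which mirrors how the registry scan picks up architecture modules without importing unrelated files.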
--- a/gfpgan/archs/arcface_arch.py
+++ b/gfpgan/archs/arcface_arch.py
@@ -1,15 +1,28 @@
 import torch.nn as nn
-
 from basicsr.utils.registry import ARCH_REGISTRY


-def conv3x3(in_planes, out_planes, stride=1):
-    """3x3 convolution with padding"""
-    return nn.Conv2d(in_planes, out_planes, kernel_size=3, stride=stride, padding=1, bias=False)
+def conv3x3(inplanes, outplanes, stride=1):
+    """A simple wrapper for 3x3 convolution with padding.
+
+    Args:
+        inplanes (int): Channel number of inputs.
+        outplanes (int): Channel number of outputs.
+        stride (int): Stride in convolution. Default: 1.
+    """
+    return nn.Conv2d(inplanes, outplanes, kernel_size=3, stride=stride, padding=1, bias=False)


 class BasicBlock(nn.Module):
-    expansion = 1
+    """Basic residual block used in the ResNetArcFace architecture.
+
+    Args:
+        inplanes (int): Channel number of inputs.
+        planes (int): Channel number of outputs.
+        stride (int): Stride in convolution. Default: 1.
+        downsample (nn.Module): The downsample module. Default: None.
+    """
+    expansion = 1  # output channel expansion ratio

    def __init__(self, inplanes, planes, stride=1, downsample=None):
        super(BasicBlock, self).__init__()
@@ -41,7 +54,16 @@ class BasicBlock(nn.Module):


 class IRBlock(nn.Module):
-    expansion = 1
+    """Improved residual block (IR Block) used in the ResNetArcFace architecture.
+
+    Args:
+        inplanes (int): Channel number of inputs.
+        planes (int): Channel number of outputs.
+        stride (int): Stride in convolution. Default: 1.
+        downsample (nn.Module): The downsample module. Default: None.
+        use_se (bool): Whether to use the SEBlock (squeeze and excitation block). Default: True.
+    """
+    expansion = 1  # output channel expansion ratio

    def __init__(self, inplanes, planes, stride=1, downsample=None, use_se=True):
        super(IRBlock, self).__init__()
@@ -79,7 +101,15 @@ class IRBlock(nn.Module):


 class Bottleneck(nn.Module):
-    expansion = 4
+    """Bottleneck block used in the ResNetArcFace architecture.
+
+    Args:
+        inplanes (int): Channel number of inputs.
+        planes (int): Channel number of outputs.
+        stride (int): Stride in convolution. Default: 1.
+        downsample (nn.Module): The downsample module. Default: None.
+    """
+    expansion = 4  # output channel expansion ratio

    def __init__(self, inplanes, planes, stride=1, downsample=None):
        super(Bottleneck, self).__init__()
@@ -117,10 +147,16 @@ class Bottleneck(nn.Module):


 class SEBlock(nn.Module):
+    """The squeeze-and-excitation block (SEBlock) used in the IRBlock.
+
+    Args:
+        channel (int): Channel number of inputs.
+        reduction (int): Channel reduction ratio. Default: 16.
+    """

    def __init__(self, channel, reduction=16):
        super(SEBlock, self).__init__()
-        self.avg_pool = nn.AdaptiveAvgPool2d(1)
+        self.avg_pool = nn.AdaptiveAvgPool2d(1)  # pool to 1x1 without spatial information
        self.fc = nn.Sequential(
            nn.Linear(channel, channel // reduction), nn.PReLU(), nn.Linear(channel // reduction, channel),
            nn.Sigmoid())
@@ -134,6 +170,15 @@ class SEBlock(nn.Module):

 @ARCH_REGISTRY.register()
 class ResNetArcFace(nn.Module):
+    """ArcFace with ResNet architectures.
+
+    Ref: ArcFace: Additive Angular Margin Loss for Deep Face Recognition.
+
+    Args:
+        block (str): Block used in the ArcFace architecture.
+        layers (tuple(int)): Block numbers in each layer.
+        use_se (bool): Whether to use the SEBlock (squeeze and excitation block). Default: True.
+    """

    def __init__(self, block, layers, use_se=True):
        if block == 'IRBlock':
@@ -141,6 +186,7 @@ class ResNetArcFace(nn.Module):
        self.inplanes = 64
        self.use_se = use_se
        super(ResNetArcFace, self).__init__()
+
        self.conv1 = nn.Conv2d(1, 64, kernel_size=3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(64)
        self.prelu = nn.PReLU()
@@ -154,6 +200,7 @@ class ResNetArcFace(nn.Module):
        self.fc5 = nn.Linear(512 * 8 * 8, 512)
        self.bn5 = nn.BatchNorm1d(512)

+        # initialization
        for m in self.modules():
            if isinstance(m, nn.Conv2d):
                nn.init.xavier_normal_(m.weight)
@@ -164,7 +211,7 @@ class ResNetArcFace(nn.Module):
                nn.init.xavier_normal_(m.weight)
                nn.init.constant_(m.bias, 0)

-    def _make_layer(self, block, planes, blocks, stride=1):
+    def _make_layer(self, block, planes, num_blocks, stride=1):
        downsample = None
        if stride != 1 or self.inplanes != planes * block.expansion:
            downsample = nn.Sequential(
@@ -174,7 +221,7 @@ class ResNetArcFace(nn.Module):
        layers = []
        layers.append(block(self.inplanes, planes, stride, downsample, use_se=self.use_se))
        self.inplanes = planes
-        for _ in range(1, blocks):
+        for _ in range(1, num_blocks):
            layers.append(block(self.inplanes, planes, use_se=self.use_se))

        return nn.Sequential(*layers)
--- a/gfpgan/archs/gfpganv1_arch.py
+++ b/gfpgan/archs/gfpganv1_arch.py
@@ -1,28 +1,27 @@
 import math
 import random
 import torch
-from torch import nn
-from torch.nn import functional as F
-
 from basicsr.archs.stylegan2_arch import (ConvLayer, EqualConv2d, EqualLinear, ResBlock, ScaledLeakyReLU,
                                          StyleGAN2Generator)
 from basicsr.ops.fused_act import FusedLeakyReLU
 from basicsr.utils.registry import ARCH_REGISTRY
+from torch import nn
+from torch.nn import functional as F


 class StyleGAN2GeneratorSFT(StyleGAN2Generator):
-    """StyleGAN2 Generator.
+    """StyleGAN2 Generator with SFT modulation (Spatial Feature Transform).

    Args:
        out_size (int): The spatial size of outputs.
        num_style_feat (int): Channel number of style features. Default: 512.
        num_mlp (int): Layer number of MLP style layers. Default: 8.
-        channel_multiplier (int): Channel multiplier for large networks of
-            StyleGAN2. Default: 2.
-        resample_kernel (list[int]): A list indicating the 1D resample kernel
-            magnitude. A cross production will be applied to extent 1D resample
-            kenrel to 2D resample kernel. Default: [1, 3, 3, 1].
+        channel_multiplier (int): Channel multiplier for large networks of StyleGAN2. Default: 2.
+        resample_kernel (list[int]): A list indicating the 1D resample kernel magnitude. An outer product is
+            applied to extend the 1D resample kernel to a 2D resample kernel. Default: (1, 3, 3, 1).
        lr_mlp (float): Learning rate multiplier for mlp layers. Default: 0.01.
+        narrow (float): The narrow ratio for channels. Default: 1.
+        sft_half (bool): Whether to apply SFT on half of the input channels. Default: False.
    """

    def __init__(self,
@@ -54,21 +53,18 @@ class StyleGAN2GeneratorSFT(StyleGAN2Generator):
                truncation_latent=None,
                inject_index=None,
                return_latents=False):
-        """Forward function for StyleGAN2Generator.
+        """Forward function for StyleGAN2GeneratorSFT.

        Args:
            styles (list[Tensor]): Sample codes of styles.
-            input_is_latent (bool): Whether input is latent style.
-                Default: False.
+            conditions (list[Tensor]): SFT conditions to generators.
+            input_is_latent (bool): Whether input is latent style. Default: False.
            noise (Tensor | None): Input noise or None. Default: None.
-            randomize_noise (bool): Randomize noise, used when 'noise' is
-                False. Default: True.
-            truncation (float): TODO. Default: 1.
-            truncation_latent (Tensor | None): TODO. Default: None.
-            inject_index (int | None): The injection index for mixing noise.
-                Default: None.
-            return_latents (bool): Whether to return style latents.
-                Default: False.
+            randomize_noise (bool): Randomize noise, used when 'noise' is None. Default: True.
+            truncation (float): The truncation ratio. Default: 1.
+            truncation_latent (Tensor | None): The truncation latent tensor. Default: None.
+            inject_index (int | None): The injection index for mixing noise. Default: None.
+            return_latents (bool): Whether to return style latents. Default: False.
        """
        # style codes -> latents with Style MLP layer
        if not input_is_latent:
@@ -85,7 +81,7 @@ class StyleGAN2GeneratorSFT(StyleGAN2Generator):
            for style in styles:
                style_truncation.append(truncation_latent + truncation * (style - truncation_latent))
            styles = style_truncation
-        # get style latent with injection
+        # get style latents with injection
        if len(styles) == 1:
            inject_index = self.num_latent

@@ -114,15 +110,15 @@ class StyleGAN2GeneratorSFT(StyleGAN2Generator):
            # the conditions may have fewer levels
            if i < len(conditions):
                # SFT part to combine the conditions
-                if self.sft_half:
+                if self.sft_half:  # only apply SFT to half of the channels
                    out_same, out_sft = torch.split(out, int(out.size(1) // 2), dim=1)
                    out_sft = out_sft * conditions[i - 1] + conditions[i]
                    out = torch.cat([out_same, out_sft], dim=1)
-                else:
+                else:  # apply SFT to all the channels
                    out = out * conditions[i - 1] + conditions[i]

            out = conv2(out, latent[:, i + 1], noise=noise2)
-            skip = to_rgb(out, latent[:, i + 2], skip)
+            skip = to_rgb(out, latent[:, i + 2], skip)  # feature back to the rgb space
            i += 2

        image = skip
@@ -134,17 +130,15 @@ class StyleGAN2GeneratorSFT(StyleGAN2Generator):


 class ConvUpLayer(nn.Module):
-    """Conv Up Layer. Bilinear upsample + Conv.
+    """Convolutional upsampling layer. It uses bilinear upsampler + Conv.

    Args:
        in_channels (int): Channel number of the input.
        out_channels (int): Channel number of the output.
        kernel_size (int): Size of the convolving kernel.
        stride (int): Stride of the convolution. Default: 1
-        padding (int): Zero-padding added to both sides of the input.
-            Default: 0.
-        bias (bool): If ``True``, adds a learnable bias to the output.
-            Default: ``True``.
+        padding (int): Zero-padding added to both sides of the input. Default: 0.
+        bias (bool): If ``True``, adds a learnable bias to the output. Default: ``True``.
        bias_init_val (float): Bias initialized value. Default: 0.
        activate (bool): Whether use activateion. Default: True.
    """
@@ -164,6 +158,7 @@ class ConvUpLayer(nn.Module):
        self.kernel_size = kernel_size
        self.stride = stride
        self.padding = padding
+        # self.scale is used to scale the convolution weights, which is related to the common initializations.
        self.scale = 1 / math.sqrt(in_channels * kernel_size**2)

        self.weight = nn.Parameter(torch.randn(out_channels, in_channels, kernel_size, kernel_size))
@@ -224,7 +219,26 @@ class ResUpBlock(nn.Module):

 @ARCH_REGISTRY.register()
 class GFPGANv1(nn.Module):
-    """Unet + StyleGAN2 decoder with SFT."""
+    """The GFPGAN architecture: Unet + StyleGAN2 decoder with SFT.
+
+    Ref: GFP-GAN: Towards Real-World Blind Face Restoration with Generative Facial Prior.
+
+    Args:
+        out_size (int): The spatial size of outputs.
+        num_style_feat (int): Channel number of style features. Default: 512.
+        channel_multiplier (int): Channel multiplier for large networks of StyleGAN2. Default: 2.
+        resample_kernel (list[int]): A list indicating the 1D resample kernel magnitude. An outer product is
+            applied to extend the 1D resample kernel to a 2D resample kernel. Default: (1, 3, 3, 1).
+        decoder_load_path (str): The path to the pre-trained decoder model (usually, the StyleGAN2). Default: None.
+        fix_decoder (bool): Whether to fix the decoder. Default: True.
+
+        num_mlp (int): Layer number of MLP style layers. Default: 8.
+        lr_mlp (float): Learning rate multiplier for mlp layers. Default: 0.01.
+        input_is_latent (bool): Whether input is latent style. Default: False.
+        different_w (bool): Whether to use different latent w for different layers. Default: False.
+        narrow (float): The narrow ratio for channels. Default: 1.
+        sft_half (bool): Whether to apply SFT on half of the input channels. Default: False.
+    """

    def __init__(
            self,
@@ -247,7 +261,7 @@ class GFPGANv1(nn.Module):
        self.different_w = different_w
        self.num_style_feat = num_style_feat

-        unet_narrow = narrow * 0.5
+        unet_narrow = narrow * 0.5  # by default, use half of the input channels
        channels = {
            '4': int(512 * unet_narrow),
            '8': int(512 * unet_narrow),
@@ -296,6 +310,7 @@ class GFPGANv1(nn.Module):
        self.final_linear = EqualLinear(
            channels['4'] * 4 * 4, linear_out_channel, bias=True, bias_init_val=0, lr_mul=1, activation=None)

+        # the decoder: stylegan2 generator with SFT modulations
        self.stylegan_decoder = StyleGAN2GeneratorSFT(
            out_size=out_size,
            num_style_feat=num_style_feat,
@@ -306,14 +321,16 @@ class GFPGANv1(nn.Module):
            narrow=narrow,
            sft_half=sft_half)

+        # load pre-trained stylegan2 model if necessary
        if decoder_load_path:
            self.stylegan_decoder.load_state_dict(
                torch.load(decoder_load_path, map_location=lambda storage, loc: storage)['params_ema'])
+        # fix decoder without updating params
        if fix_decoder:
            for _, param in self.stylegan_decoder.named_parameters():
                param.requires_grad = False

-        # for SFT
+        # for SFT modulations (scale and shift)
        self.condition_scale = nn.ModuleList()
        self.condition_shift = nn.ModuleList()
        for i in range(3, self.log_size + 1):
@@ -333,13 +350,15 @@ class GFPGANv1(nn.Module):
                    ScaledLeakyReLU(0.2),
                    EqualConv2d(out_channels, sft_out_channels, 3, stride=1, padding=1, bias=True, bias_init_val=0)))

-    def forward(self,
-                x,
-                return_latents=False,
-                save_feat_path=None,
-                load_feat_path=None,
-                return_rgb=True,
-                randomize_noise=True):
+    def forward(self, x, return_latents=False, return_rgb=True, randomize_noise=True):
+        """Forward function for GFPGANv1.
+
+        Args:
+            x (Tensor): Input images.
+            return_latents (bool): Whether to return style latents. Default: False.
+            return_rgb (bool): Whether to return intermediate RGB images. Default: True.
+            randomize_noise (bool): Randomize noise, used when 'noise' is False. Default: True.
+        """
        conditions = []
        unet_skips = []
        out_rgbs = []
@@ -363,7 +382,7 @@ class GFPGANv1(nn.Module):
            feat = feat + unet_skips[i]
            # ResUpLayer
            feat = self.conv_body_up[i](feat)
-            # generate scale and shift for SFT layer
+            # generate scale and shift for SFT layers
            scale = self.condition_scale[i](feat)
            conditions.append(scale.clone())
            shift = self.condition_shift[i](feat)
@@ -372,12 +391,6 @@ class GFPGANv1(nn.Module):
            if return_rgb:
                out_rgbs.append(self.toRGB[i](feat))

-        if save_feat_path is not None:
-            torch.save(conditions, save_feat_path)
-        if load_feat_path is not None:
-            conditions = torch.load(load_feat_path)
-            conditions = [v.cuda() for v in conditions]
-
        # decoder
        image, _ = self.stylegan_decoder([style_code],
                                         conditions,
@@ -390,10 +403,12 @@ class GFPGANv1(nn.Module):

@ARCH_REGISTRY.register()
 class FacialComponentDiscriminator(nn.Module):
+    """Facial component (eyes, mouth) discriminator used in GFPGAN.
+    """

    def __init__(self):
        super(FacialComponentDiscriminator, self).__init__()
-
+        # It now uses a VGG-style architecture with fixed model size
        self.conv1 = ConvLayer(3, 64, 3, downsample=False, resample_kernel=(1, 3, 3, 1), bias=True, activate=True)
        self.conv2 = ConvLayer(64, 128, 3, downsample=True, resample_kernel=(1, 3, 3, 1), bias=True, activate=True)
        self.conv3 = ConvLayer(128, 128, 3, downsample=False, resample_kernel=(1, 3, 3, 1), bias=True, activate=True)
@@ -402,6 +417,12 @@ class FacialComponentDiscriminator(nn.Module):
        self.final_conv = ConvLayer(256, 1, 3, bias=True, activate=False)

    def forward(self, x, return_feats=False):
+        """Forward function for FacialComponentDiscriminator.
+
+        Args:
+            x (Tensor): Input images.
+            return_feats (bool): Whether to return intermediate features. Default: False.
+        """
        feat = self.conv1(x)
        feat = self.conv3(self.conv2(feat))
        rlt_feats = []
--- a/gfpgan/archs/gfpganv1_clean_arch.py
+++ b/gfpgan/archs/gfpganv1_clean_arch.py
@@ -1,6 +1,7 @@
 import math
 import random
 import torch
+from basicsr.utils.registry import ARCH_REGISTRY
 from torch import nn
 from torch.nn import functional as F

@@ -8,14 +9,17 @@ from .stylegan2_clean_arch import StyleGAN2GeneratorClean


 class StyleGAN2GeneratorCSFT(StyleGAN2GeneratorClean):
-    """StyleGAN2 Generator.
+    """StyleGAN2 Generator with SFT modulation (Spatial Feature Transform).
+
+    It is the clean version without custom compiled CUDA extensions used in StyleGAN2.

    Args:
        out_size (int): The spatial size of outputs.
        num_style_feat (int): Channel number of style features. Default: 512.
        num_mlp (int): Layer number of MLP style layers. Default: 8.
-        channel_multiplier (int): Channel multiplier for large networks of
-            StyleGAN2. Default: 2.
+        channel_multiplier (int): Channel multiplier for large networks of StyleGAN2. Default: 2.
+        narrow (float): The narrow ratio for channels. Default: 1.
+        sft_half (bool): Whether to apply SFT on half of the input channels. Default: False.
    """

    def __init__(self, out_size, num_style_feat=512, num_mlp=8, channel_multiplier=2, narrow=1, sft_half=False):
@@ -25,7 +29,6 @@ class StyleGAN2GeneratorCSFT(StyleGAN2GeneratorClean):
            num_mlp=num_mlp,
            channel_multiplier=channel_multiplier,
            narrow=narrow)
-
        self.sft_half = sft_half

    def forward(self,
@@ -38,21 +41,18 @@ class StyleGAN2GeneratorCSFT(StyleGAN2GeneratorClean):
                truncation_latent=None,
                inject_index=None,
                return_latents=False):
-        """Forward function for StyleGAN2Generator.
+        """Forward function for StyleGAN2GeneratorCSFT.

        Args:
            styles (list[Tensor]): Sample codes of styles.
-            input_is_latent (bool): Whether input is latent style.
-                Default: False.
+            conditions (list[Tensor]): SFT conditions to generators.
+            input_is_latent (bool): Whether input is latent style. Default: False.
            noise (Tensor | None): Input noise or None. Default: None.
-            randomize_noise (bool): Randomize noise, used when 'noise' is
-                False. Default: True.
-            truncation (float): TODO. Default: 1.
-            truncation_latent (Tensor | None): TODO. Default: None.
-            inject_index (int | None): The injection index for mixing noise.
-                Default: None.
-            return_latents (bool): Whether to return style latents.
-                Default: False.
+            randomize_noise (bool): Randomize noise, used when 'noise' is False. Default: True.
+            truncation (float): The truncation ratio. Default: 1.
+            truncation_latent (Tensor | None): The truncation latent tensor. Default: None.
+            inject_index (int | None): The injection index for mixing noise. Default: None.
+            return_latents (bool): Whether to return style latents. Default: False.
        """
        # style codes -> latents with Style MLP layer
        if not input_is_latent:
@@ -69,7 +69,7 @@ class StyleGAN2GeneratorCSFT(StyleGAN2GeneratorClean):
            for style in styles:
                style_truncation.append(truncation_latent + truncation * (style - truncation_latent))
            styles = style_truncation
-        # get style latent with injection
+        # get style latents with injection
        if len(styles) == 1:
            inject_index = self.num_latent

@@ -98,15 +98,15 @@ class StyleGAN2GeneratorCSFT(StyleGAN2GeneratorClean):
            # the conditions may have fewer levels
            if i < len(conditions):
                # SFT part to combine the conditions
-                if self.sft_half:
+                if self.sft_half:  # only apply SFT to half of the channels
                    out_same, out_sft = torch.split(out, int(out.size(1) // 2), dim=1)
                    out_sft = out_sft * conditions[i - 1] + conditions[i]
                    out = torch.cat([out_same, out_sft], dim=1)
-                else:
+                else:  # apply SFT to all the channels
                    out = out * conditions[i - 1] + conditions[i]

            out = conv2(out, latent[:, i + 1], noise=noise2)
-            skip = to_rgb(out, latent[:, i + 2], skip)
+            skip = to_rgb(out, latent[:, i + 2], skip)  # feature back to the rgb space
            i += 2

        image = skip
@@ -118,11 +118,12 @@ class StyleGAN2GeneratorCSFT(StyleGAN2GeneratorClean):


 class ResBlock(nn.Module):
-    """Residual block with upsampling/downsampling.
+    """Residual block with bilinear upsampling/downsampling.

    Args:
        in_channels (int): Channel number of the input.
        out_channels (int): Channel number of the output.
+        mode (str): Upsampling/downsampling mode. Options: down | up. Default: down.
    """

    def __init__(self, in_channels, out_channels, mode='down'):
@@ -148,8 +149,27 @@ class ResBlock(nn.Module):
        return out


+@ARCH_REGISTRY.register()
 class GFPGANv1Clean(nn.Module):
-    """GFPGANv1 Clean version."""
+    """The GFPGAN architecture: Unet + StyleGAN2 decoder with SFT.
+
+    It is the clean version without custom compiled CUDA extensions used in StyleGAN2.
+
+    Ref: GFP-GAN: Towards Real-World Blind Face Restoration with Generative Facial Prior.
+
+    Args:
+        out_size (int): The spatial size of outputs.
+        num_style_feat (int): Channel number of style features. Default: 512.
+        channel_multiplier (int): Channel multiplier for large networks of StyleGAN2. Default: 2.
+        decoder_load_path (str): The path to the pre-trained decoder model (usually, the StyleGAN2). Default: None.
+        fix_decoder (bool): Whether to fix the decoder. Default: True.
+
+        num_mlp (int): Layer number of MLP style layers. Default: 8.
+        input_is_latent (bool): Whether input is latent style. Default: False.
+        different_w (bool): Whether to use different latent w for different layers. Default: False.
+        narrow (float): The narrow ratio for channels. Default: 1.
+        sft_half (bool): Whether to apply SFT on half of the input channels. Default: False.
+    """

    def __init__(
            self,
@@ -170,7 +190,7 @@ class GFPGANv1Clean(nn.Module):
        self.different_w = different_w
        self.num_style_feat = num_style_feat

-        unet_narrow = narrow * 0.5
+        unet_narrow = narrow * 0.5  # by default, use half of the input channels
        channels = {
            '4': int(512 * unet_narrow),
            '8': int(512 * unet_narrow),
@@ -218,6 +238,7 @@ class GFPGANv1Clean(nn.Module):

        self.final_linear = nn.Linear(channels['4'] * 4 * 4, linear_out_channel)

+        # the decoder: stylegan2 generator with SFT modulations
        self.stylegan_decoder = StyleGAN2GeneratorCSFT(
            out_size=out_size,
            num_style_feat=num_style_feat,
@@ -226,14 +247,16 @@ class GFPGANv1Clean(nn.Module):
            narrow=narrow,
            sft_half=sft_half)

+        # load pre-trained stylegan2 model if necessary
        if decoder_load_path:
            self.stylegan_decoder.load_state_dict(
                torch.load(decoder_load_path, map_location=lambda storage, loc: storage)['params_ema'])
+        # fix decoder without updating params
        if fix_decoder:
-            for name, param in self.stylegan_decoder.named_parameters():
+            for _, param in self.stylegan_decoder.named_parameters():
                param.requires_grad = False

-        # for SFT
+        # for SFT modulations (scale and shift)
        self.condition_scale = nn.ModuleList()
        self.condition_shift = nn.ModuleList()
        for i in range(3, self.log_size + 1):
@@ -251,13 +274,15 @@ class GFPGANv1Clean(nn.Module):
                    nn.Conv2d(out_channels, out_channels, 3, 1, 1), nn.LeakyReLU(0.2, True),
                    nn.Conv2d(out_channels, sft_out_channels, 3, 1, 1)))

-    def forward(self,
-                x,
-                return_latents=False,
-                save_feat_path=None,
-                load_feat_path=None,
-                return_rgb=True,
-                randomize_noise=True):
+    def forward(self, x, return_latents=False, return_rgb=True, randomize_noise=True):
+        """Forward function for GFPGANv1Clean.
+
+        Args:
+            x (Tensor): Input images.
+            return_latents (bool): Whether to return style latents. Default: False.
+            return_rgb (bool): Whether to return intermediate RGB images. Default: True.
+            randomize_noise (bool): Randomize noise, used when 'noise' is False. Default: True.
+        """
        conditions = []
        unet_skips = []
        out_rgbs = []
@@ -273,13 +298,14 @@ class GFPGANv1Clean(nn.Module):
        style_code = self.final_linear(feat.view(feat.size(0), -1))
        if self.different_w:
            style_code = style_code.view(style_code.size(0), -1, self.num_style_feat)
+
        # decode
        for i in range(self.log_size - 2):
            # add unet skip
            feat = feat + unet_skips[i]
            # ResUpLayer
            feat = self.conv_body_up[i](feat)
-            # generate scale and shift for SFT layer
+            # generate scale and shift for SFT layers
            scale = self.condition_scale[i](feat)
            conditions.append(scale.clone())
            shift = self.condition_shift[i](feat)
@@ -288,12 +314,6 @@ class GFPGANv1Clean(nn.Module):
            if return_rgb:
                out_rgbs.append(self.toRGB[i](feat))

-        if save_feat_path is not None:
-            torch.save(conditions, save_feat_path)
-        if load_feat_path is not None:
-            conditions = torch.load(load_feat_path)
-            conditions = [v.cuda() for v in conditions]
-
        # decoder
        image, _ = self.stylegan_decoder([style_code],
                                         conditions,
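
Reviewer note: the `sft_half` branch in `StyleGAN2GeneratorCSFT.forward` above splits the feature channels in two, leaves the first half untouched, and applies `out * scale + shift` to the second half. The sketch below mirrors that logic with plain Python lists instead of tensors so it runs without PyTorch. The function name `sft_modulate` and the list-of-channels layout are assumptions made for illustration; they are not part of the GFPGAN code.

```python
def sft_modulate(features, scale, shift, sft_half=False):
    """Apply SFT (feature * scale + shift) to a list of channels.

    Hypothetical helper: each channel is a flat list of floats. When
    sft_half is True, scale and shift only cover the second half of
    the channels, and the first half is passed through unchanged.
    """

    def modulate(channels):
        # Element-wise affine transform, channel by channel.
        return [[v * s + b for v, s, b in zip(ch, sc, sh)]
                for ch, sc, sh in zip(channels, scale, shift)]

    if sft_half:  # only apply SFT to half of the channels
        half = len(features) // 2
        out_same, out_sft = features[:half], features[half:]
        return out_same + modulate(out_sft)
    # apply SFT to all the channels
    return modulate(features)


if __name__ == '__main__':
    feats = [[1.0, 2.0], [3.0, 4.0]]  # two channels, two values each
    print(sft_modulate(feats, scale=[[2.0, 2.0]], shift=[[1.0, 1.0]], sft_half=True))
```

With `sft_half=True` the condition branches only need to produce half as many channels, which is presumably why the diff tracks a separate `sft_out_channels` value.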
--- a/gfpgan/archs/stylegan2_clean_arch.py
+++ b/gfpgan/archs/stylegan2_clean_arch.py
@@ -1,11 +1,10 @@
 import math
 import random
 import torch
-from torch import nn
-from torch.nn import functional as F
-
 from basicsr.archs.arch_util import default_init_weights
 from basicsr.utils.registry import ARCH_REGISTRY
+from torch import nn
+from torch.nn import functional as F


 class NormStyleCode(nn.Module):
@@ -32,12 +31,9 @@ class ModulatedConv2d(nn.Module):
        out_channels (int): Channel number of the output.
        kernel_size (int): Size of the convolving kernel.
        num_style_feat (int): Channel number of style features.
-        demodulate (bool): Whether to demodulate in the conv layer.
-            Default: True.
-        sample_mode (str | None): Indicating 'upsample', 'downsample' or None.
-            Default: None.
-        eps (float): A value added to the denominator for numerical stability.
-            Default: 1e-8.
+        demodulate (bool): Whether to demodulate in the conv layer. Default: True.
+        sample_mode (str | None): Indicating 'upsample', 'downsample' or None. Default: None.
+        eps (float): A value added to the denominator for numerical stability. Default: 1e-8.
    """

    def __init__(self,
@@ -88,6 +84,7 @@ class ModulatedConv2d(nn.Module):

        weight = weight.view(b * self.out_channels, c, self.kernel_size, self.kernel_size)

+        # upsample or downsample if necessary
        if self.sample_mode == 'upsample':
            x = F.interpolate(x, scale_factor=2, mode='bilinear', align_corners=False)
        elif self.sample_mode == 'downsample':
@@ -102,14 +99,12 @@ class ModulatedConv2d(nn.Module):
        return out

    def __repr__(self):
-        return (f'{self.__class__.__name__}(in_channels={self.in_channels}, '
-                f'out_channels={self.out_channels}, '
-                f'kernel_size={self.kernel_size}, '
-                f'demodulate={self.demodulate}, sample_mode={self.sample_mode})')
+        return (f'{self.__class__.__name__}(in_channels={self.in_channels}, out_channels={self.out_channels}, '
+                f'kernel_size={self.kernel_size}, demodulate={self.demodulate}, sample_mode={self.sample_mode})')


 class StyleConv(nn.Module):
-    """Style conv.
+    """Style conv used in StyleGAN2.

    Args:
        in_channels (int): Channel number of the input.
@@ -117,8 +112,7 @@ class StyleConv(nn.Module):
        kernel_size (int): Size of the convolving kernel.
        num_style_feat (int): Channel number of style features.
        demodulate (bool): Whether demodulate in the conv layer. Default: True.
-        sample_mode (str | None): Indicating 'upsample', 'downsample' or None.
-            Default: None.
+        sample_mode (str | None): Indicating 'upsample', 'downsample' or None. Default: None.
    """

    def __init__(self, in_channels, out_channels, kernel_size, num_style_feat, demodulate=True, sample_mode=None):
@@ -145,7 +139,7 @@ class StyleConv(nn.Module):


 class ToRGB(nn.Module):
-    """To RGB from features.
+    """To RGB (image space) from features.

    Args:
        in_channels (int): Channel number of input.
@@ -205,8 +199,7 @@ class StyleGAN2GeneratorClean(nn.Module):
        out_size (int): The spatial size of outputs.
        num_style_feat (int): Channel number of style features. Default: 512.
        num_mlp (int): Layer number of MLP style layers. Default: 8.
-        channel_multiplier (int): Channel multiplier for large networks of
-            StyleGAN2. Default: 2.
+        channel_multiplier (int): Channel multiplier for large networks of StyleGAN2. Default: 2.
        narrow (float): Narrow ratio for channels. Default: 1.0.
    """

@@ -223,6 +216,7 @@ class StyleGAN2GeneratorClean(nn.Module):
        # initialization
        default_init_weights(self.style_mlp, scale=1, bias_fill=0, a=0.2, mode='fan_in', nonlinearity='leaky_relu')

+        # channel list
        channels = {
            '4': int(512 * narrow),
            '8': int(512 * narrow),
@@ -310,21 +304,17 @@ class StyleGAN2GeneratorClean(nn.Module):
                truncation_latent=None,
                inject_index=None,
                return_latents=False):
-        """Forward function for StyleGAN2Generator.
+        """Forward function for StyleGAN2GeneratorClean.

        Args:
            styles (list[Tensor]): Sample codes of styles.
-            input_is_latent (bool): Whether input is latent style.
-                Default: False.
+            input_is_latent (bool): Whether input is latent style. Default: False.
            noise (Tensor | None): Input noise or None. Default: None.
-            randomize_noise (bool): Randomize noise, used when 'noise' is
-                False. Default: True.
-            truncation (float): TODO. Default: 1.
-            truncation_latent (Tensor | None): TODO. Default: None.
-            inject_index (int | None): The injection index for mixing noise.
-                Default: None.
-            return_latents (bool): Whether to return style latents.
-                Default: False.
+            randomize_noise (bool): Randomize noise, used when 'noise' is False. Default: True.
+            truncation (float): The truncation ratio. Default: 1.
+            truncation_latent (Tensor | None): The truncation latent tensor. Default: None.
+            inject_index (int | None): The injection index for mixing noise. Default: None.
+            return_latents (bool): Whether to return style latents. Default: False.
        """
        # style codes -> latents with Style MLP layer
        if not input_is_latent:
@@ -341,7 +331,7 @@ class StyleGAN2GeneratorClean(nn.Module):
            for style in styles:
                style_truncation.append(truncation_latent + truncation * (style - truncation_latent))
            styles = style_truncation
-        # get style latent with injection
+        # get style latents with injection
        if len(styles) == 1:
            inject_index = self.num_latent

@@ -367,7 +357,7 @@ class StyleGAN2GeneratorClean(nn.Module):
                                                        noise[2::2], self.to_rgbs):
            out = conv1(out, latent[:, i], noise=noise1)
            out = conv2(out, latent[:, i + 1], noise=noise2)
-            skip = to_rgb(out, latent[:, i + 2], skip)
+            skip = to_rgb(out, latent[:, i + 2], skip)  # feature back to the rgb space
            i += 2

        image = skip
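
Reviewer note: the docstrings above now describe `truncation` as "the truncation ratio" and `truncation_latent` as "the truncation latent tensor". The forward pass combines them as `truncation_latent + truncation * (style - truncation_latent)`, which pulls each style toward the reference latent. A minimal sketch of that interpolation on plain lists follows. The helper name `truncate_styles` and the `truncation >= 1` early return are assumptions; the guard around the loop is not visible in this hunk.

```python
def truncate_styles(styles, truncation_latent, truncation=1.0):
    """Interpolate each style toward truncation_latent.

    Hypothetical helper: styles is a list of flat float lists and
    truncation_latent is a flat float list of the same length.
    """
    if truncation >= 1:  # assumed guard: a ratio of 1 leaves styles unchanged
        return styles
    return [[t + truncation * (s - t) for s, t in zip(style, truncation_latent)]
            for style in styles]


if __name__ == '__main__':
    # A ratio of 0.5 moves every value halfway toward the reference latent.
    print(truncate_styles([[2.0, 0.0]], [1.0, 1.0], truncation=0.5))
```

A ratio of 0 would collapse every style onto the reference latent, while a ratio of 1 keeps the sampled styles as they are.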
--- a/gfpgan/data/__init__.py
+++ b/gfpgan/data/__init__.py
@@ -1,11 +1,10 @@
 import importlib
+from basicsr.utils import scandir
 from os import path as osp

-from basicsr.utils import scandir
-
 # automatically scan and import dataset modules for registry
-# scan all the files under the data folder with '_dataset' in file names
+# scan all the files that end with '_dataset.py' under the data folder
 data_folder = osp.dirname(osp.abspath(__file__))
 dataset_filenames = [osp.splitext(osp.basename(v))[0] for v in scandir(data_folder) if v.endswith('_dataset.py')]
 # import all the dataset modules
-_dataset_modules = [importlib.import_module(f'data.{file_name}') for file_name in dataset_filenames]
+_dataset_modules = [importlib.import_module(f'gfpgan.data.{file_name}') for file_name in dataset_filenames]
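
Reviewer note: the change above switches the import path from `data.<name>` to `gfpgan.data.<name>`, so the auto-import works when `gfpgan` is installed as a package rather than run from the repository root. The sketch below shows the scan-and-qualify step in isolation. It swaps `basicsr.utils.scandir` for the standard library's `os.listdir`, and the helper names `find_module_names` and `qualified_names` are assumptions made for illustration.

```python
import os
import tempfile


def find_module_names(folder, suffix='_dataset.py'):
    """Return module names (without .py) for files in folder ending with suffix."""
    return sorted(os.path.splitext(name)[0] for name in os.listdir(folder) if name.endswith(suffix))


def qualified_names(package, module_names):
    """Prefix each module name with its package, ready for importlib.import_module."""
    return [f'{package}.{name}' for name in module_names]


if __name__ == '__main__':
    with tempfile.TemporaryDirectory() as folder:
        # Create empty placeholder files to scan.
        for name in ('ffhq_degradation_dataset.py', 'utils.py', '__init__.py'):
            open(os.path.join(folder, name), 'w').close()
        print(qualified_names('gfpgan.data', find_module_names(folder)))
```

Each resulting name could then be passed to `importlib.import_module`, which triggers the `@DATASET_REGISTRY.register()` decorators as a side effect of the import.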
--- a/gfpgan/data/ffhq_degradation_dataset.py
+++ b/gfpgan/data/ffhq_degradation_dataset.py
@@ -4,18 +4,30 @@ import numpy as np
 import os.path as osp
 import torch
 import torch.utils.data as data
-from torchvision.transforms.functional import (adjust_brightness, adjust_contrast, adjust_hue, adjust_saturation,
-                                               normalize)
-
 from basicsr.data import degradations as degradations
 from basicsr.data.data_util import paths_from_folder
 from basicsr.data.transforms import augment
 from basicsr.utils import FileClient, get_root_logger, imfrombytes, img2tensor
 from basicsr.utils.registry import DATASET_REGISTRY
+from torchvision.transforms.functional import (adjust_brightness, adjust_contrast, adjust_hue, adjust_saturation,
+                                               normalize)


@DATASET_REGISTRY.register()
 class FFHQDegradationDataset(data.Dataset):
+    """FFHQ dataset for GFPGAN.
+
+    It reads high-resolution images, and then generates low-quality (LQ) images on-the-fly.
+
+    Args:
+        opt (dict): Config for train datasets. It contains the following keys:
+            dataroot_gt (str): Data root path for gt.
+            io_backend (dict): IO backend type and other kwargs.
+            mean (list | tuple): Image mean.
+            std (list | tuple): Image std.
+            use_hflip (bool): Whether to horizontally flip.
+            Please see the code for more options.
+    """

    def __init__(self, opt):
        super(FFHQDegradationDataset, self).__init__()
@@ -30,11 +42,13 @@ class FFHQDegradationDataset(data.Dataset):
        self.out_size = opt['out_size']

        self.crop_components = opt.get('crop_components', False)  # facial components
-        self.eye_enlarge_ratio = opt.get('eye_enlarge_ratio', 1)
+        self.eye_enlarge_ratio = opt.get('eye_enlarge_ratio', 1)  # ratio for enlarging eye regions

        if self.crop_components:
+            # load the component list from a pre-processed pth file
            self.components_list = torch.load(opt.get('component_path'))

+        # file client (lmdb io backend)
        if self.io_backend_opt['type'] == 'lmdb':
            self.io_backend_opt['db_paths'] = self.gt_folder
            if not self.gt_folder.endswith('.lmdb'):
@@ -42,9 +56,10 @@ class FFHQDegradationDataset(data.Dataset):
            with open(osp.join(self.gt_folder, 'meta_info.txt')) as fin:
                self.paths = [line.split('.')[0] for line in fin]
        else:
+            # disk backend: scan file list from a folder
            self.paths = paths_from_folder(self.gt_folder)

-        # degradations
+        # degradation configurations
        self.blur_kernel_size = opt['blur_kernel_size']
        self.kernel_list = opt['kernel_list']
        self.kernel_prob = opt['kernel_prob']
@@ -61,22 +76,20 @@ class FFHQDegradationDataset(data.Dataset):
        self.gray_prob = opt.get('gray_prob')

        logger = get_root_logger()
-        logger.info(f'Blur: blur_kernel_size {self.blur_kernel_size}, '
-                    f'sigma: [{", ".join(map(str, self.blur_sigma))}]')
+        logger.info(f'Blur: blur_kernel_size {self.blur_kernel_size}, sigma: [{", ".join(map(str, self.blur_sigma))}]')
        logger.info(f'Downsample: downsample_range [{", ".join(map(str, self.downsample_range))}]')
        logger.info(f'Noise: [{", ".join(map(str, self.noise_range))}]')
        logger.info(f'JPEG compression: [{", ".join(map(str, self.jpeg_range))}]')

        if self.color_jitter_prob is not None:
-            logger.info(f'Use random color jitter. Prob: {self.color_jitter_prob}, '
-                        f'shift: {self.color_jitter_shift}')
+            logger.info(f'Use random color jitter. Prob: {self.color_jitter_prob}, shift: {self.color_jitter_shift}')
        if self.gray_prob is not None:
            logger.info(f'Use random gray. Prob: {self.gray_prob}')
-
        self.color_jitter_shift /= 255.

    @staticmethod
    def color_jitter(img, shift):
+        """jitter color: randomly jitter the RGB values, in numpy format"""
        jitter_val = np.random.uniform(-shift, shift, 3).astype(np.float32)
        img = img + jitter_val
        img = np.clip(img, 0, 1)
@@ -84,6 +97,7 @@ class FFHQDegradationDataset(data.Dataset):

    @staticmethod
    def color_jitter_pt(img, brightness, contrast, saturation, hue):
+        """jitter color: randomly jitter the brightness, contrast, saturation, and hue, in torch Tensor format"""
        fn_idx = torch.randperm(4)
        for fn_id in fn_idx:
            if fn_id == 0 and brightness is not None:
@@ -104,6 +118,7 @@ class FFHQDegradationDataset(data.Dataset):
        return img

    def get_component_coordinates(self, index, status):
+        """Get facial component (left_eye, right_eye, mouth) coordinates from a pre-loaded pth file"""
        components_bbox = self.components_list[f'{index:08d}']
        if status[0]:  # hflip
            # exchange right and left eye
@@ -132,6 +147,7 @@ class FFHQDegradationDataset(data.Dataset):
            self.file_client = FileClient(self.io_backend_opt.pop('type'), **self.io_backend_opt)

        # load gt image
+        # Shape: (h, w, c); channel order: BGR; image range: [0, 1], float32.
        gt_path = self.paths[index]
        img_bytes = self.file_client.get(gt_path)
        img_gt = imfrombytes(img_bytes, float32=True)
@@ -140,6 +156,7 @@ class FFHQDegradationDataset(data.Dataset):
        img_gt, status = augment(img_gt, hflip=self.opt['use_hflip'], rotation=False, return_status=True)
        h, w, _ = img_gt.shape

+        # get facial component coordinates
        if self.crop_components:
            locations = self.get_component_coordinates(index, status)
            loc_left_eye, loc_right_eye, loc_mouth = locations
@@ -174,9 +191,9 @@ class FFHQDegradationDataset(data.Dataset):
        if self.gray_prob and np.random.uniform() < self.gray_prob:
            img_lq = cv2.cvtColor(img_lq, cv2.COLOR_BGR2GRAY)
            img_lq = np.tile(img_lq[:, :, None], [1, 1, 3])
-            if self.opt.get('gt_gray'):
+            if self.opt.get('gt_gray'):  # whether to convert GT to gray images
                img_gt = cv2.cvtColor(img_gt, cv2.COLOR_BGR2GRAY)
-                img_gt = np.tile(img_gt[:, :, None], [1, 1, 3])
+                img_gt = np.tile(img_gt[:, :, None], [1, 1, 3])  # repeat the color channels

        # BGR to RGB, HWC to CHW, numpy to tensor
        img_gt, img_lq = img2tensor([img_gt, img_lq], bgr2rgb=True, float32=True)
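
Reviewer note: the new `color_jitter` docstring says the method jitters RGB values in numpy format. In the diff it draws one offset per channel from `uniform(-shift, shift)`, adds it to the whole image, and clips to [0, 1]. The sketch below reproduces that behaviour on a list of pixels using only the standard library in place of NumPy. The pixel-list layout and the `rng` parameter are assumptions made for illustration.

```python
import random


def color_jitter(pixels, shift, rng=random):
    """Add one random offset per channel to every pixel, then clip to [0, 1].

    Hypothetical stand-in for the numpy version: pixels is a list of
    3-value lists with values in [0, 1]; shift is the maximum offset.
    """
    # One offset per channel, shared by all pixels in the image.
    jitter = [rng.uniform(-shift, shift) for _ in range(3)]
    return [[min(max(v + j, 0.0), 1.0) for v, j in zip(px, jitter)] for px in pixels]


if __name__ == '__main__':
    image = [[0.5, 0.5, 0.5], [0.4, 0.4, 0.4]]
    # The dataset divides the configured shift by 255, so 20 becomes 20 / 255.
    print(color_jitter(image, shift=20 / 255.))
```

Because the offset is shared across pixels, the jitter changes the overall color cast without adding per-pixel noise.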
--- a/gfpgan/models/__init__.py
+++ b/gfpgan/models/__init__.py
@@ -1,12 +1,10 @@
 import importlib
+from basicsr.utils import scandir
 from os import path as osp

-from basicsr.utils import scandir
-
 # automatically scan and import model modules for registry
-# scan all the files under the 'models' folder and collect files ending with
-# '_model.py'
+# scan all the files that end with '_model.py' under the model folder
 model_folder = osp.dirname(osp.abspath(__file__))
 model_filenames = [osp.splitext(osp.basename(v))[0] for v in scandir(model_folder) if v.endswith('_model.py')]
 # import all the model modules
-_model_modules = [importlib.import_module(f'models.{file_name}') for file_name in model_filenames]
+_model_modules = [importlib.import_module(f'gfpgan.models.{file_name}') for file_name in model_filenames]
--- a/gfpgan/models/gfpgan_model.py
+++ b/gfpgan/models/gfpgan_model.py
@@ -1,11 +1,6 @@
 import math
 import os.path as osp
 import torch
-from collections import OrderedDict
-from torch.nn import functional as F
-from torchvision.ops import roi_align
-from tqdm import tqdm
-
 from basicsr.archs import build_network
 from basicsr.losses import build_loss
 from basicsr.losses.losses import r1_penalty
@@ -13,15 +8,19 @@ from basicsr.metrics import calculate_metric
 from basicsr.models.base_model import BaseModel
 from basicsr.utils import get_root_logger, imwrite, tensor2img
 from basicsr.utils.registry import MODEL_REGISTRY
+from collections import OrderedDict
+from torch.nn import functional as F
+from torchvision.ops import roi_align
+from tqdm import tqdm


@MODEL_REGISTRY.register()
 class GFPGANModel(BaseModel):
-    """GFPGAN model for <Towards real-world blind face restoratin with generative facial prior>"""
+    """The GFPGAN model for Towards real-world blind face restoration with generative facial prior"""

    def __init__(self, opt):
        super(GFPGANModel, self).__init__(opt)
-        self.idx = 0
+        self.idx = 0  # it is used for saving data for checking

        # define network
        self.net_g = build_network(opt['network_g'])
@@ -52,8 +51,7 @@ class GFPGANModel(BaseModel):
            self.load_network(self.net_d, load_path, self.opt['path'].get('strict_load_d', True))

        # ----------- define net_g with Exponential Moving Average (EMA) ----------- #
-        # net_g_ema only used for testing on one GPU and saving
-        # There is no need to wrap with DistributedDataParallel
+        # net_g_ema is only used for testing on one GPU and saving. There is no need to wrap with DistributedDataParallel
        self.net_g_ema = build_network(self.opt['network_g']).to(self.device)
        # load pretrained model
        load_path = self.opt['path'].get('pretrain_network_g', None)
@@ -66,7 +64,7 @@ class GFPGANModel(BaseModel):
        self.net_d.train()
        self.net_g_ema.eval()

-        # ----------- facial components networks ----------- #
+        # ----------- facial component networks ----------- #
        if ('network_d_left_eye' in self.opt and 'network_d_right_eye' in self.opt and 'network_d_mouth' in self.opt):
            self.use_facial_disc = True
        else:
@@ -103,17 +101,19 @@ class GFPGANModel(BaseModel):
            self.cri_component = build_loss(train_opt['gan_component_opt']).to(self.device)

        # ----------- define losses ----------- #
+        # pixel loss
        if train_opt.get('pixel_opt'):
            self.cri_pix = build_loss(train_opt['pixel_opt']).to(self.device)
        else:
            self.cri_pix = None

+        # perceptual loss
        if train_opt.get('perceptual_opt'):
            self.cri_perceptual = build_loss(train_opt['perceptual_opt']).to(self.device)
        else:
            self.cri_perceptual = None

-        # L1 loss used in pyramid loss, component style loss and identity loss
+        # L1 loss is used in pyramid loss, component style loss and identity loss
        self.cri_l1 = build_loss(train_opt['L1_opt']).to(self.device)

        # gan loss (wgan)
@@ -180,6 +180,7 @@ class GFPGANModel(BaseModel):
        self.optimizer_d = self.get_optimizer(optim_type, optim_params_d, lr, betas=betas)
        self.optimizers.append(self.optimizer_d)

+        # ----------- optimizers for facial component networks ----------- #
        if self.use_facial_disc:
            # setup optimizers for facial component discriminators
            optim_type = train_opt['optim_component'].pop('type')
@@ -222,6 +223,7 @@ class GFPGANModel(BaseModel):
            #     self.idx = self.idx + 1

    def construct_img_pyramid(self):
+        """Construct image pyramid for intermediate restoration loss"""
        pyramid_gt = [self.gt]
        down_img = self.gt
        for _ in range(0, self.log_size - 3):
@@ -230,7 +232,6 @@ class GFPGANModel(BaseModel):
        return pyramid_gt

    def get_roi_regions(self, eye_out_size=80, mouth_out_size=120):
-        # hard code
        face_ratio = int(self.opt['network_g']['out_size'] / 512)
        eye_out_size *= face_ratio
        mouth_out_size *= face_ratio
@@ -289,6 +290,7 @@ class GFPGANModel(BaseModel):
            p.requires_grad = False
        self.optimizer_g.zero_grad()

+        # do not update facial component net_d
        if self.use_facial_disc:
            for p in self.net_d_left_eye.parameters():
                p.requires_grad = False
@@ -420,11 +422,12 @@ class GFPGANModel(BaseModel):
        real_d_pred = self.net_d(self.gt)
        l_d = self.cri_gan(real_d_pred, True, is_disc=True) + self.cri_gan(fake_d_pred, False, is_disc=True)
        loss_dict['l_d'] = l_d
-        # In wgan, real_score should be positive and fake_score should benegative
+        # In WGAN, real_score should be positive and fake_score should be negative
        loss_dict['real_score'] = real_d_pred.detach().mean()
        loss_dict['fake_score'] = fake_d_pred.detach().mean()
        l_d.backward()

+        # regularization loss
        if current_iter % self.net_d_reg_every == 0:
            self.gt.requires_grad = True
            real_pred = self.net_d(self.gt)
@@ -435,8 +438,9 @@ class GFPGANModel(BaseModel):

        self.optimizer_d.step()

+        # optimize facial component discriminators
        if self.use_facial_disc:
-            # lefe eye
+            # left eye
            fake_d_pred, _ = self.net_d_left_eye(self.left_eyes.detach())
            real_d_pred, _ = self.net_d_left_eye(self.left_eyes_gt)
            l_d_left_eye = self.cri_component(
@@ -486,22 +490,32 @@ class GFPGANModel(BaseModel):
    def nondist_validation(self, dataloader, current_iter, tb_logger, save_img):
        dataset_name = dataloader.dataset.opt['name']
        with_metrics = self.opt['val'].get('metrics') is not None
+        use_pbar = self.opt['val'].get('pbar', False)
+
        if with_metrics:
-            self.metric_results = {metric: 0 for metric in self.opt['val']['metrics'].keys()}
-        pbar = tqdm(total=len(dataloader), unit='image')
+            if not hasattr(self, 'metric_results'):  # only execute in the first run
+                self.metric_results = {metric: 0 for metric in self.opt['val']['metrics'].keys()}
+            # initialize the best metric results for each dataset_name (supporting multiple validation datasets)
+            self._initialize_best_metric_results(dataset_name)
+            # zero self.metric_results
+            self.metric_results = {metric: 0 for metric in self.metric_results}
+
+        metric_data = dict()
+        if use_pbar:
+            pbar = tqdm(total=len(dataloader), unit='image')

        for idx, val_data in enumerate(dataloader):
            img_name = osp.splitext(osp.basename(val_data['lq_path'][0]))[0]
            self.feed_data(val_data)
            self.test()

-            visuals = self.get_current_visuals()
-            sr_img = tensor2img([visuals['sr']], min_max=(-1, 1))
-            gt_img = tensor2img([visuals['gt']], min_max=(-1, 1))
-
-            if 'gt' in visuals:
-                gt_img = tensor2img([visuals['gt']], min_max=(-1, 1))
+            sr_img = tensor2img(self.output.detach().cpu(), min_max=(-1, 1))
+            metric_data['img'] = sr_img
+            if hasattr(self, 'gt'):
+                gt_img = tensor2img(self.gt.detach().cpu(), min_max=(-1, 1))
+                metric_data['img2'] = gt_img
                del self.gt
+
            # tentative for out of GPU memory
            del self.lq
            del self.output
@@ -523,35 +537,38 @@ class GFPGANModel(BaseModel):
            if with_metrics:
                # calculate metrics
                for name, opt_ in self.opt['val']['metrics'].items():
-                    metric_data = dict(img1=sr_img, img2=gt_img)
                    self.metric_results[name] += calculate_metric(metric_data, opt_)
-            pbar.update(1)
-            pbar.set_description(f'Test {img_name}')
-        pbar.close()
+            if use_pbar:
+                pbar.update(1)
+                pbar.set_description(f'Test {img_name}')
+        if use_pbar:
+            pbar.close()

        if with_metrics:
            for metric in self.metric_results.keys():
                self.metric_results[metric] /= (idx + 1)
+                # update the best metric result
+                self._update_best_metric_result(dataset_name, metric, self.metric_results[metric], current_iter)

            self._log_validation_metric_values(current_iter, dataset_name, tb_logger)

    def _log_validation_metric_values(self, current_iter, dataset_name, tb_logger):
        log_str = f'Validation {dataset_name}\n'
        for metric, value in self.metric_results.items():
-            log_str += f'\t # {metric}: {value:.4f}\n'
+            log_str += f'\t # {metric}: {value:.4f}'
+            if hasattr(self, 'best_metric_results'):
+                log_str += (f'\tBest: {self.best_metric_results[dataset_name][metric]["val"]:.4f} @ '
+                            f'{self.best_metric_results[dataset_name][metric]["iter"]} iter')
+            log_str += '\n'
+
        logger = get_root_logger()
        logger.info(log_str)
        if tb_logger:
            for metric, value in self.metric_results.items():
-                tb_logger.add_scalar(f'metrics/{metric}', value, current_iter)
-
-    def get_current_visuals(self):
-        out_dict = OrderedDict()
-        out_dict['gt'] = self.gt.detach().cpu()
-        out_dict['sr'] = self.output.detach().cpu()
-        return out_dict
+                tb_logger.add_scalar(f'metrics/{dataset_name}/{metric}', value, current_iter)

    def save(self, epoch, current_iter):
+        # save net_g and net_d
        self.save_network([self.net_g, self.net_g_ema], 'net_g', current_iter, param_key=['params', 'params_ema'])
        self.save_network(self.net_d, 'net_d', current_iter)
        # save component discriminators
@@ -559,4 +576,5 @@ class GFPGANModel(BaseModel):
            self.save_network(self.net_d_left_eye, 'net_d_left_eye', current_iter)
            self.save_network(self.net_d_right_eye, 'net_d_right_eye', current_iter)
            self.save_network(self.net_d_mouth, 'net_d_mouth', current_iter)
+        # save training state
        self.save_training_state(epoch, current_iter)
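
Review note on the validation changes above: the new `_initialize_best_metric_results` and `_update_best_metric_result` calls are inherited from the base model and are not shown in this diff. The sketch below is only a guess at the bookkeeping they imply, based on the `["val"]` and `["iter"]` keys read in `_log_validation_metric_values`. The class name `BestMetricTracker` and the "higher is better" rule are assumptions made for illustration, not the library's implementation.

```python
class BestMetricTracker:
    """Hypothetical stand-in for per-dataset best-metric bookkeeping."""

    def __init__(self):
        # layout assumed from the diff: best[dataset_name][metric] -> {'val', 'iter'}
        self.best = {}

    def initialize(self, dataset_name, metric_names):
        # only create the record once, so results survive repeated validations
        if dataset_name in self.best:
            return
        self.best[dataset_name] = {name: {'val': float('-inf'), 'iter': -1} for name in metric_names}

    def update(self, dataset_name, metric, value, current_iter):
        # assumption: a higher value is better (true for PSNR, not for every metric)
        record = self.best[dataset_name][metric]
        if value >= record['val']:
            record['val'] = value
            record['iter'] = current_iter


tracker = BestMetricTracker()
tracker.initialize('val_set', ['psnr'])
tracker.update('val_set', 'psnr', 25.0, 1000)
tracker.update('val_set', 'psnr', 24.0, 2000)  # worse, so the record is kept
print(tracker.best['val_set']['psnr'])
```

Keying the record by dataset name is what lets several validation sets share one model without overwriting each other's best values, which matches the `metrics/{dataset_name}/{metric}` TensorBoard tag introduced in the same hunk.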
--- a/gfpgan/train.py
+++ b/gfpgan/train.py
@@ -0,0 +1,11 @@
+# flake8: noqa
+import os.path as osp
+from basicsr.train import train_pipeline
+
+import gfpgan.archs
+import gfpgan.data
+import gfpgan.models
+
+if __name__ == '__main__':
+    root_path = osp.abspath(osp.join(__file__, osp.pardir, osp.pardir))
+    train_pipeline(root_path)
--- a/gfpgan/utils.py
+++ b/gfpgan/utils.py
@@ -0,0 +1,130 @@
+import cv2
+import os
+import torch
+from basicsr.utils import img2tensor, tensor2img
+from basicsr.utils.download_util import load_file_from_url
+from facexlib.utils.face_restoration_helper import FaceRestoreHelper
+from torchvision.transforms.functional import normalize
+
+from gfpgan.archs.gfpganv1_arch import GFPGANv1
+from gfpgan.archs.gfpganv1_clean_arch import GFPGANv1Clean
+
+ROOT_DIR = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
+
+
+class GFPGANer():
+    """Helper for restoration with GFPGAN.
+
+    It will detect and crop faces, and then resize the faces to 512x512.
+    GFPGAN is used to restore the resized faces.
+    The background is upsampled with the bg_upsampler.
+    Finally, the faces will be pasted back to the upsampled background image.
+
+    Args:
+        model_path (str): The path to the GFPGAN model. It can be a URL (the model is then downloaded automatically).
+        upscale (float): The upscale of the final output. Default: 2.
+        arch (str): The GFPGAN architecture. Option: clean | original. Default: clean.
+        channel_multiplier (int): Channel multiplier for large networks of StyleGAN2. Default: 2.
+        bg_upsampler (nn.Module): The upsampler for the background. Default: None.
+    """
+
+    def __init__(self, model_path, upscale=2, arch='clean', channel_multiplier=2, bg_upsampler=None):
+        self.upscale = upscale
+        self.bg_upsampler = bg_upsampler
+
+        # initialize model
+        self.device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
+        # initialize the GFP-GAN
+        if arch == 'clean':
+            self.gfpgan = GFPGANv1Clean(
+                out_size=512,
+                num_style_feat=512,
+                channel_multiplier=channel_multiplier,
+                decoder_load_path=None,
+                fix_decoder=False,
+                num_mlp=8,
+                input_is_latent=True,
+                different_w=True,
+                narrow=1,
+                sft_half=True)
+        else:
+            self.gfpgan = GFPGANv1(
+                out_size=512,
+                num_style_feat=512,
+                channel_multiplier=channel_multiplier,
+                decoder_load_path=None,
+                fix_decoder=True,
+                num_mlp=8,
+                input_is_latent=True,
+                different_w=True,
+                narrow=1,
+                sft_half=True)
+        # initialize face helper
+        self.face_helper = FaceRestoreHelper(
+            upscale,
+            face_size=512,
+            crop_ratio=(1, 1),
+            det_model='retinaface_resnet50',
+            save_ext='png',
+            device=self.device)
+
+        if model_path.startswith('https://'):
+            model_path = load_file_from_url(
+                url=model_path, model_dir=os.path.join(ROOT_DIR, 'gfpgan/weights'), progress=True, file_name=None)
+        loadnet = torch.load(model_path, map_location=self.device)  # allow loading GPU checkpoints on CPU
+        if 'params_ema' in loadnet:
+            keyname = 'params_ema'
+        else:
+            keyname = 'params'
+        self.gfpgan.load_state_dict(loadnet[keyname], strict=True)
+        self.gfpgan.eval()
+        self.gfpgan = self.gfpgan.to(self.device)
+
+    @torch.no_grad()
+    def enhance(self, img, has_aligned=False, only_center_face=False, paste_back=True):
+        self.face_helper.clean_all()
+
+        if has_aligned:  # the inputs are already aligned
+            img = cv2.resize(img, (512, 512))
+            self.face_helper.cropped_faces = [img]
+        else:
+            self.face_helper.read_image(img)
+            # get face landmarks for each face
+            self.face_helper.get_face_landmarks_5(only_center_face=only_center_face, eye_dist_threshold=5)
+            # eye_dist_threshold=5: skip faces whose eye distance is smaller than 5 pixels
+            # TODO: even with eye_dist_threshold, it will still introduce wrong detections and restorations.
+            # align and warp each face
+            self.face_helper.align_warp_face()
+
+        # face restoration
+        for cropped_face in self.face_helper.cropped_faces:
+            # prepare data
+            cropped_face_t = img2tensor(cropped_face / 255., bgr2rgb=True, float32=True)
+            normalize(cropped_face_t, (0.5, 0.5, 0.5), (0.5, 0.5, 0.5), inplace=True)
+            cropped_face_t = cropped_face_t.unsqueeze(0).to(self.device)
+
+            try:
+                output = self.gfpgan(cropped_face_t, return_rgb=False)[0]
+                # convert to image
+                restored_face = tensor2img(output.squeeze(0), rgb2bgr=True, min_max=(-1, 1))
+            except RuntimeError as error:
+                print(f'\tFailed inference for GFPGAN: {error}.')
+                restored_face = cropped_face
+
+            restored_face = restored_face.astype('uint8')
+            self.face_helper.add_restored_face(restored_face)
+
+        if not has_aligned and paste_back:
+            # upsample the background
+            if self.bg_upsampler is not None:
+                # Currently, only RealESRGAN is supported for background upsampling
+                bg_img = self.bg_upsampler.enhance(img, outscale=self.upscale)[0]
+            else:
+                bg_img = None
+
+            self.face_helper.get_inverse_affine(None)
+            # paste each restored face to the input image
+            restored_img = self.face_helper.paste_faces_to_input_image(upsample_img=bg_img)
+            return self.face_helper.cropped_faces, self.face_helper.restored_faces, restored_img
+        else:
+            return self.face_helper.cropped_faces, self.face_helper.restored_faces, None
--- a/gfpgan/weights/README.md
+++ b/gfpgan/weights/README.md
@@ -0,0 +1,3 @@
+# Weights
+
+Put the downloaded weights in this folder.
--- a/inference_gfpgan.py
+++ b/inference_gfpgan.py
@@ -0,0 +1,116 @@
+import argparse
+import cv2
+import glob
+import numpy as np
+import os
+import torch
+from basicsr.utils import imwrite
+
+from gfpgan import GFPGANer
+
+
+def main():
+    """Inference demo for GFPGAN.
+    """
+    parser = argparse.ArgumentParser()
+    parser.add_argument('--upscale', type=int, default=2, help='The final upsampling scale of the image')
+    parser.add_argument('--arch', type=str, default='clean', help='The GFPGAN architecture. Option: clean | original')
+    parser.add_argument('--channel', type=int, default=2, help='Channel multiplier for large networks of StyleGAN2')
+    parser.add_argument('--model_path', type=str, default='experiments/pretrained_models/GFPGANCleanv1-NoCE-C2.pth')
+    parser.add_argument('--bg_upsampler', type=str, default='realesrgan', help='background upsampler')
+    parser.add_argument(
+        '--bg_tile', type=int, default=400, help='Tile size for background upsampler, 0 for no tile during testing')
+    parser.add_argument('--test_path', type=str, default='inputs/whole_imgs', help='Input folder')
+    parser.add_argument('--suffix', type=str, default=None, help='Suffix of the restored faces')
+    parser.add_argument('--only_center_face', action='store_true', help='Only restore the center face')
+    parser.add_argument('--aligned', action='store_true', help='Inputs are aligned faces')
+    parser.add_argument('--paste_back', action='store_false', help='Disable pasting the restored faces back to images')
+    parser.add_argument('--save_root', type=str, default='results', help='Path to save root')
+    parser.add_argument(
+        '--ext',
+        type=str,
+        default='auto',
+        help='Image extension. Options: auto | jpg | png, auto means using the same extension as inputs')
+    args = parser.parse_args()
+
+    # normalize the input path and create the output folder
+    if args.test_path.endswith('/'):
+        args.test_path = args.test_path[:-1]
+    os.makedirs(args.save_root, exist_ok=True)
+
+    # background upsampler
+    if args.bg_upsampler == 'realesrgan':
+        if not torch.cuda.is_available():  # CPU
+            import warnings
+            warnings.warn('The unoptimized RealESRGAN is very slow on CPU. We do not use it. '
+                          'If you really want to use it, please modify the corresponding code.')
+            bg_upsampler = None
+        else:
+            from basicsr.archs.rrdbnet_arch import RRDBNet
+            from realesrgan import RealESRGANer
+            model = RRDBNet(num_in_ch=3, num_out_ch=3, num_feat=64, num_block=23, num_grow_ch=32, scale=2)
+            bg_upsampler = RealESRGANer(
+                scale=2,
+                model_path='https://github.com/xinntao/Real-ESRGAN/releases/download/v0.2.1/RealESRGAN_x2plus.pth',
+                model=model,
+                tile=args.bg_tile,
+                tile_pad=10,
+                pre_pad=0,
+                half=True)  # must be set to False in CPU mode
+    else:
+        bg_upsampler = None
+    # set up GFPGAN restorer
+    restorer = GFPGANer(
+        model_path=args.model_path,
+        upscale=args.upscale,
+        arch=args.arch,
+        channel_multiplier=args.channel,
+        bg_upsampler=bg_upsampler)
+
+    img_list = sorted(glob.glob(os.path.join(args.test_path, '*')))
+    for img_path in img_list:
+        # read image
+        img_name = os.path.basename(img_path)
+        print(f'Processing {img_name} ...')
+        basename, ext = os.path.splitext(img_name)
+        input_img = cv2.imread(img_path, cv2.IMREAD_COLOR)
+
+        # restore faces and background if necessary
+        cropped_faces, restored_faces, restored_img = restorer.enhance(
+            input_img, has_aligned=args.aligned, only_center_face=args.only_center_face, paste_back=args.paste_back)
+
+        # save faces
+        for idx, (cropped_face, restored_face) in enumerate(zip(cropped_faces, restored_faces)):
+            # save cropped face
+            save_crop_path = os.path.join(args.save_root, 'cropped_faces', f'{basename}_{idx:02d}.png')
+            imwrite(cropped_face, save_crop_path)
+            # save restored face
+            if args.suffix is not None:
+                save_face_name = f'{basename}_{idx:02d}_{args.suffix}.png'
+            else:
+                save_face_name = f'{basename}_{idx:02d}.png'
+            save_restore_path = os.path.join(args.save_root, 'restored_faces', save_face_name)
+            imwrite(restored_face, save_restore_path)
+            # save comparison image
+            cmp_img = np.concatenate((cropped_face, restored_face), axis=1)
+            imwrite(cmp_img, os.path.join(args.save_root, 'cmp', f'{basename}_{idx:02d}.png'))
+
+        # save restored img
+        if restored_img is not None:
+            if args.ext == 'auto':
+                extension = ext[1:]
+            else:
+                extension = args.ext
+
+            if args.suffix is not None:
+                save_restore_path = os.path.join(args.save_root, 'restored_imgs',
+                                                 f'{basename}_{args.suffix}.{extension}')
+            else:
+                save_restore_path = os.path.join(args.save_root, 'restored_imgs', f'{basename}.{extension}')
+            imwrite(restored_img, save_restore_path)
+
+    print(f'Results are in the [{args.save_root}] folder.')
+
+
+if __name__ == '__main__':
+    main()
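
Review note on the output naming in `inference_gfpgan.py` above: the way `--ext` and `--suffix` combine into the saved file name can be isolated into a small pure function. This is a minimal sketch that mirrors the branch logic in the loop; the helper name `restored_image_path` is hypothetical and does not exist in the script.

```python
import os


def restored_image_path(save_root, img_path, ext='auto', suffix=None):
    """Build the output path for a restored whole image (hypothetical helper)."""
    basename, input_ext = os.path.splitext(os.path.basename(img_path))
    # 'auto' reuses the input extension without its leading dot
    extension = input_ext[1:] if ext == 'auto' else ext
    # the optional suffix is appended to the base name with an underscore
    name = f'{basename}_{suffix}' if suffix is not None else basename
    return os.path.join(save_root, 'restored_imgs', f'{name}.{extension}')


print(restored_image_path('results', 'inputs/whole_imgs/Blake_Lively.jpg'))
print(restored_image_path('results', 'inputs/whole_imgs/Blake_Lively.jpg', ext='png', suffix='v1'))
```

Note that with `ext='auto'` an input file without an extension yields a name ending in a bare dot, which is also how the loop in the script behaves.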
--- a/inference_gfpgan_full.py
+++ b/inference_gfpgan_full.py
@@ -1,153 +0,0 @@
-import argparse
-import cv2
-import glob
-import numpy as np
-import os
-import torch
-from facexlib.utils.face_restoration_helper import FaceRestoreHelper
-from torchvision.transforms.functional import normalize
-
-from archs.gfpganv1_arch import GFPGANv1
-from archs.gfpganv1_clean_arch import GFPGANv1Clean
-from basicsr.utils import img2tensor, imwrite, tensor2img
-
-
-def restoration(gfpgan,
-                face_helper,
-                img_path,
-                save_root,
-                has_aligned=False,
-                only_center_face=True,
-                suffix=None,
-                paste_back=False,
-                device='cuda'):
-    # read image
-    img_name = os.path.basename(img_path)
-    print(f'Processing {img_name} ...')
-    basename, _ = os.path.splitext(img_name)
-    input_img = cv2.imread(img_path, cv2.IMREAD_COLOR)
-    face_helper.clean_all()
-
-    if has_aligned:
-        input_img = cv2.resize(input_img, (512, 512))
-        face_helper.cropped_faces = [input_img]
-    else:
-        face_helper.read_image(input_img)
-        # get face landmarks for each face
-        face_helper.get_face_landmarks_5(only_center_face=only_center_face)
-        # align and warp each face
-        save_crop_path = os.path.join(save_root, 'cropped_faces', img_name)
-        face_helper.align_warp_face(save_crop_path)
-
-    # face restoration
-    for idx, cropped_face in enumerate(face_helper.cropped_faces):
-        # prepare data
-        cropped_face_t = img2tensor(cropped_face / 255., bgr2rgb=True, float32=True)
-        normalize(cropped_face_t, (0.5, 0.5, 0.5), (0.5, 0.5, 0.5), inplace=True)
-        cropped_face_t = cropped_face_t.unsqueeze(0).to(device)
-
-        try:
-            with torch.no_grad():
-                output = gfpgan(cropped_face_t, return_rgb=False)[0]
-                # convert to image
-                restored_face = tensor2img(output.squeeze(0), rgb2bgr=True, min_max=(-1, 1))
-        except RuntimeError as error:
-            print(f'\tFailed inference for GFPGAN: {error}.')
-            restored_face = cropped_face
-
-        restored_face = restored_face.astype('uint8')
-        face_helper.add_restored_face(restored_face)
-
-        if suffix is not None:
-            save_face_name = f'{basename}_{idx:02d}_{suffix}.png'
-        else:
-            save_face_name = f'{basename}_{idx:02d}.png'
-        save_restore_path = os.path.join(save_root, 'restored_faces', save_face_name)
-        imwrite(restored_face, save_restore_path)
-
-        # save cmp image
-        cmp_img = np.concatenate((cropped_face, restored_face), axis=1)
-        imwrite(cmp_img, os.path.join(save_root, 'cmp', f'{basename}_{idx:02d}.png'))
-
-    if not has_aligned and paste_back:
-        face_helper.get_inverse_affine(None)
-        save_restore_path = os.path.join(save_root, 'restored_imgs', img_name)
-        # paste each restored face to the input image
-        face_helper.paste_faces_to_input_image(save_restore_path)
-
-
-if __name__ == '__main__':
-    device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
-
-    parser = argparse.ArgumentParser()
-
-    parser.add_argument('--upscale_factor', type=int, default=2)
-    parser.add_argument('--arch', type=str, default='clean')
-    parser.add_argument('--channel', type=int, default=2)
-    parser.add_argument('--model_path', type=str, default='experiments/pretrained_models/GFPGANCleanv1-NoCE-C2.pth')
-    parser.add_argument('--test_path', type=str, default='inputs/whole_imgs')
-    parser.add_argument('--suffix', type=str, default=None, help='Suffix of the restored faces')
-    parser.add_argument('--only_center_face', action='store_true')
-    parser.add_argument('--aligned', action='store_true')
-    parser.add_argument('--paste_back', action='store_false')
-    parser.add_argument('--save_root', type=str, default='results')
-
-    args = parser.parse_args()
-    if args.test_path.endswith('/'):
-        args.test_path = args.test_path[:-1]
-    os.makedirs(args.save_root, exist_ok=True)
-
-    # initialize the GFP-GAN
-    if args.arch == 'clean':
-        gfpgan = GFPGANv1Clean(
-            out_size=512,
-            num_style_feat=512,
-            channel_multiplier=args.channel,
-            decoder_load_path=None,
-            fix_decoder=False,
-            # for stylegan decoder
-            num_mlp=8,
-            input_is_latent=True,
-            different_w=True,
-            narrow=1,
-            sft_half=True)
-    else:
-        gfpgan = GFPGANv1(
-            out_size=512,
-            num_style_feat=512,
-            channel_multiplier=args.channel,
-            decoder_load_path=None,
-            fix_decoder=True,
-            # for stylegan decoder
-            num_mlp=8,
-            input_is_latent=True,
-            different_w=True,
-            narrow=1,
-            sft_half=True)
-
-    gfpgan.load_state_dict(torch.load(args.model_path, map_location=lambda storage, loc: storage)['params_ema'])
-    gfpgan.to(device).eval()
-
-    # initialize face helper
-    face_helper = FaceRestoreHelper(
-        args.upscale_factor,
-        face_size=512,
-        crop_ratio=(1, 1),
-        det_model='retinaface_resnet50',
-        save_ext='png',
-        device=device)
-
-    img_list = sorted(glob.glob(os.path.join(args.test_path, '*')))
-    for img_path in img_list:
-        restoration(
-            gfpgan,
-            face_helper,
-            img_path,
-            args.save_root,
-            has_aligned=args.aligned,
-            only_center_face=args.only_center_face,
-            suffix=args.suffix,
-            paste_back=args.paste_back,
-            device=device)
-
-    print(f'Results are in the [{args.save_root}] folder.')
--- a/inputs/whole_imgs/Blake_Lively.jpg
+++ b/inputs/whole_imgs/Blake_Lively.jpg
--- a/options/train_gfpgan_v1.yml
+++ b/options/train_gfpgan_v1.yml
@@ -1,7 +1,7 @@
 # general settings
 name: train_GFPGANv1_512
 model_type: GFPGANModel
-num_gpu: 4
+num_gpu: auto  # officially, we use 4 GPUs
 manual_seed: 0

 # dataset and data loader settings
@@ -194,7 +194,7 @@ val:
  save_img: true

  metrics:
-    psnr: # metric name, can be arbitrary
+    psnr: # metric name
      type: calculate_psnr
      crop_border: 0
      test_y_channel: false
--- a/options/train_gfpgan_v1_simple.yml
+++ b/options/train_gfpgan_v1_simple.yml
@@ -1,7 +1,7 @@
 # general settings
 name: train_GFPGANv1_512_simple
 model_type: GFPGANModel
-num_gpu: 4
+num_gpu: auto  # officially, we use 4 GPUs
 manual_seed: 0

 # dataset and data loader settings
@@ -40,10 +40,6 @@ datasets:
    # gray_prob: 0.01
    # gt_gray: True

-    # crop_components: false
-    # component_path: experiments/pretrained_models/FFHQ_eye_mouth_landmarks_512.pth
-    # eye_enlarge_ratio: 1.4
-
    # data loader
    use_shuffle: true
    num_worker_per_gpu: 6
@@ -86,20 +82,6 @@ network_d:
  channel_multiplier: 1
  resample_kernel: [1, 3, 3, 1]

-# network_d_left_eye:
-#   type: FacialComponentDiscriminator
-
-# network_d_right_eye:
-#   type: FacialComponentDiscriminator
-
-# network_d_mouth:
-#   type: FacialComponentDiscriminator
-
-network_identity:
-  type: ResNetArcFace
-  block: IRBlock
-  layers: [2, 2, 2, 2]
-  use_se: False

 # path
 path:
@@ -107,13 +89,7 @@ path:
  param_key_g: params_ema
  strict_load_g: ~
  pretrain_network_d: ~
-  # pretrain_network_d_left_eye: ~
-  # pretrain_network_d_right_eye: ~
-  # pretrain_network_d_mouth: ~
-  pretrain_network_identity: experiments/pretrained_models/arcface_resnet18.pth
-  # resume
  resume_state: ~
-  ignore_resume_networks: ['network_identity']

 # training settings
 train:
@@ -173,16 +149,6 @@ train:
    loss_weight: !!float 1e-1
  # r1 regularization for discriminator
  r1_reg_weight: 10
-  # facial component loss
-  # gan_component_opt:
-  #   type: GANLoss
-  #   gan_type: vanilla
-  #   real_label_val: 1.0
-  #   fake_label_val: 0.0
-  #   loss_weight: !!float 1
-  # comp_style_weight: 200
-  # identity loss
-  identity_weight: 10

  net_d_iters: 1
  net_d_init_iters: 0
@@ -194,7 +160,7 @@ val:
  save_img: true

  metrics:
-    psnr: # metric name, can be arbitrary
+    psnr: # metric name
      type: calculate_psnr
      crop_border: 0
      test_y_channel: false
--- a/requirements.txt
+++ b/requirements.txt
@@ -1,10 +1,12 @@
-facexlib
-lmdb
-numpy
+torch>=1.7
+numpy<1.21  # numba requires numpy<1.21,>=1.17
 opencv-python
+torchvision
+scipy
+tqdm
+basicsr>=1.3.4.0
+facexlib>=0.2.0.3
+lmdb
 pyyaml
 tb-nightly
-torch>=1.7
-torchvision
-tqdm
 yapf
--- a/scripts/parse_landmark.py
+++ b/scripts/parse_landmark.py
@@ -1,25 +1,31 @@
 import cv2
 import json
 import numpy as np
+import os
 import torch
+from basicsr.utils import FileClient, imfrombytes
 from collections import OrderedDict

-from basicsr.utils import FileClient, imfrombytes
+# ---------------------------- This script is used to parse facial landmarks ------------------------------------- #
+# Configurations
+save_img = False
+scale = 0.5  # 0.5 for official FFHQ (512x512), 1 for others
+enlarge_ratio = 1.4  # only for eyes
+json_path = 'ffhq-dataset-v2.json'
+face_path = 'datasets/ffhq/ffhq_512.lmdb'
+save_path = './FFHQ_eye_mouth_landmarks_512.pth'

 print('Load JSON metadata...')
-# use the json file in FFHQ dataset
-with open('ffhq-dataset-v2.json', 'rb') as f:
+# use the official json file in FFHQ dataset
+with open(json_path, 'rb') as f:
    json_data = json.load(f, object_pairs_hook=OrderedDict)

 print('Open LMDB file...')
 # read ffhq images
-file_client = FileClient('lmdb', db_paths='datasets/ffhq/ffhq_512.lmdb')
-with open('datasets/ffhq/ffhq_512.lmdb/meta_info.txt') as fin:
+file_client = FileClient('lmdb', db_paths=face_path)
+with open(os.path.join(face_path, 'meta_info.txt')) as fin:
    paths = [line.split('.')[0] for line in fin]

-save_img = False
-scale = 0.5  # 0.5 for official FFHQ (512x512), 1 for others
-enlarge_ratio = 1.4  # only for eyes
 save_dict = {}

 for item_idx, item in enumerate(json_data.values()):
@@ -35,6 +41,7 @@ for item_idx, item in enumerate(json_data.values()):
        img_bytes = file_client.get(paths[item_idx])
        img = imfrombytes(img_bytes, float32=True)

+    # get landmarks for each component
    map_left_eye = list(range(36, 42))
    map_right_eye = list(range(42, 48))
    map_mouth = list(range(48, 68))
@@ -75,4 +82,4 @@ for item_idx, item in enumerate(json_data.values()):
    save_dict[f'{item_idx:08d}'] = item_dict

 print('Save...')
-torch.save(save_dict, './FFHQ_eye_mouth_landmarks_512.pth')
+torch.save(save_dict, save_path)
--- a/setup.cfg
+++ b/setup.cfg
@@ -16,7 +16,18 @@ split_before_expression_after_opening_paren = true
 line_length = 120
 multi_line_output = 0
 known_standard_library = pkg_resources,setuptools
-known_first_party = basicsr
-known_third_party = cv2,facexlib,numpy,torch,torchvision,tqdm
+known_first_party = gfpgan
+known_third_party = basicsr,cv2,facexlib,numpy,pytest,torch,torchvision,tqdm,yaml
 no_lines_before = STDLIB,LOCALFOLDER
 default_section = THIRDPARTY
+
+[codespell]
+skip = .git,./docs/build
+count =
+quiet-level = 3
+
+[aliases]
+test=pytest
+
+[tool:pytest]
+addopts=tests/
--- a/setup.py
+++ b/setup.py
@@ -0,0 +1,107 @@
+#!/usr/bin/env python
+
+from setuptools import find_packages, setup
+
+import os
+import subprocess
+import time
+
+version_file = 'gfpgan/version.py'
+
+
+def readme():
+    with open('README.md', encoding='utf-8') as f:
+        content = f.read()
+    return content
+
+
+def get_git_hash():
+
+    def _minimal_ext_cmd(cmd):
+        # construct minimal environment
+        env = {}
+        for k in ['SYSTEMROOT', 'PATH', 'HOME']:
+            v = os.environ.get(k)
+            if v is not None:
+                env[k] = v
+        # LANGUAGE is used on win32
+        env['LANGUAGE'] = 'C'
+        env['LANG'] = 'C'
+        env['LC_ALL'] = 'C'
+        out = subprocess.Popen(cmd, stdout=subprocess.PIPE, env=env).communicate()[0]
+        return out
+
+    try:
+        out = _minimal_ext_cmd(['git', 'rev-parse', 'HEAD'])
+        sha = out.strip().decode('ascii')
+    except OSError:
+        sha = 'unknown'
+
+    return sha
+
+
+def get_hash():
+    if os.path.exists('.git'):
+        sha = get_git_hash()[:7]
+    else:
+        sha = 'unknown'
+
+    return sha
+
+
+def write_version_py():
+    content = """# GENERATED VERSION FILE
+# TIME: {}
+__version__ = '{}'
+__gitsha__ = '{}'
+version_info = ({})
+"""
+    sha = get_hash()
+    with open('VERSION', 'r') as f:
+        SHORT_VERSION = f.read().strip()
+    VERSION_INFO = ', '.join([x if x.isdigit() else f'"{x}"' for x in SHORT_VERSION.split('.')])
+
+    version_file_str = content.format(time.asctime(), SHORT_VERSION, sha, VERSION_INFO)
+    with open(version_file, 'w') as f:
+        f.write(version_file_str)
+
+
+def get_version():
+    with open(version_file, 'r') as f:
+        exec(compile(f.read(), version_file, 'exec'))
+    return locals()['__version__']
+
+
+def get_requirements(filename='requirements.txt'):
+    here = os.path.dirname(os.path.realpath(__file__))
+    with open(os.path.join(here, filename), 'r') as f:
+        requires = [line.replace('\n', '') for line in f.readlines()]
+    return requires
+
+
+if __name__ == '__main__':
+    write_version_py()
+    setup(
+        name='gfpgan',
+        version=get_version(),
+        description='GFPGAN aims at developing Practical Algorithms for Real-world Face Restoration',
+        long_description=readme(),
+        long_description_content_type='text/markdown',
+        author='Xintao Wang',
+        author_email='xintao.wang@outlook.com',
+        keywords='computer vision, pytorch, image restoration, super-resolution, face restoration, gan, gfpgan',
+        url='https://github.com/TencentARC/GFPGAN',
+        include_package_data=True,
+        packages=find_packages(exclude=('options', 'datasets', 'experiments', 'results', 'tb_logger', 'wandb')),
+        classifiers=[
+            'Development Status :: 4 - Beta',
+            'License :: OSI Approved :: Apache Software License',
+            'Operating System :: OS Independent',
+            'Programming Language :: Python :: 3',
+            'Programming Language :: Python :: 3.7',
+            'Programming Language :: Python :: 3.8',
+        ],
+        license='Apache License Version 2.0',
+        setup_requires=['cython', 'numpy'],
+        install_requires=get_requirements(),
+        zip_safe=False)
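
Review note on `write_version_py()` above: the `VERSION_INFO` expression turns the dotted version string into the body of a tuple literal, quoting any non-numeric component. The sketch below reproduces only that expression so its output is easy to inspect; the function name `build_version_info` is an illustrative assumption.

```python
def build_version_info(short_version):
    """Mirror the VERSION_INFO expression used in write_version_py()."""
    parts = short_version.split('.')
    # numeric components stay bare, anything else is wrapped in double quotes
    return ', '.join(x if x.isdigit() else f'"{x}"' for x in parts)


print(build_version_info('0.2.4'))     # 0, 2, 4
print(build_version_info('1.0.0rc1'))  # 1, 0, "0rc1"
```

One edge case follows from the template `version_info = ({})`: a single-component version such as `1` would render as `(1)`, which Python evaluates as an integer rather than a one-element tuple.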
--- a/tests/data/ffhq_gt.lmdb/data.mdb
+++ b/tests/data/ffhq_gt.lmdb/data.mdb
--- a/tests/data/ffhq_gt.lmdb/lock.mdb
+++ b/tests/data/ffhq_gt.lmdb/lock.mdb
--- a/tests/data/ffhq_gt.lmdb/meta_info.txt
+++ b/tests/data/ffhq_gt.lmdb/meta_info.txt
@@ -0,0 +1 @@
+00000000.png (512,512,3) 1
--- a/tests/data/gt/00000000.png
+++ b/tests/data/gt/00000000.png
--- a/tests/data/test_eye_mouth_landmarks.pth
+++ b/tests/data/test_eye_mouth_landmarks.pth
--- a/tests/data/test_ffhq_degradation_dataset.yml
+++ b/tests/data/test_ffhq_degradation_dataset.yml
@@ -0,0 +1,24 @@
+name: UnitTest
+type: FFHQDegradationDataset
+dataroot_gt: tests/data/gt
+io_backend:
+  type: disk
+
+use_hflip: true
+mean: [0.5, 0.5, 0.5]
+std: [0.5, 0.5, 0.5]
+out_size: 512
+
+blur_kernel_size: 41
+kernel_list: ['iso', 'aniso']
+kernel_prob: [0.5, 0.5]
+blur_sigma: [0.1, 10]
+downsample_range: [0.8, 8]
+noise_range: [0, 20]
+jpeg_range: [60, 100]
+
+# color jitter and gray
+color_jitter_prob: 1
+color_jitter_shift: 20
+color_jitter_pt_prob: 1
+gray_prob: 1
--- a/tests/data/test_gfpgan_model.yml
+++ b/tests/data/test_gfpgan_model.yml
@@ -0,0 +1,140 @@
+num_gpu: 1
+manual_seed: 0
+is_train: True
+dist: False
+
+# network structures
+network_g:
+  type: GFPGANv1
+  out_size: 512
+  num_style_feat: 512
+  channel_multiplier: 1
+  resample_kernel: [1, 3, 3, 1]
+  decoder_load_path: ~
+  fix_decoder: true
+  num_mlp: 8
+  lr_mlp: 0.01
+  input_is_latent: true
+  different_w: true
+  narrow: 0.5
+  sft_half: true
+
+network_d:
+  type: StyleGAN2Discriminator
+  out_size: 512
+  channel_multiplier: 1
+  resample_kernel: [1, 3, 3, 1]
+
+network_d_left_eye:
+  type: FacialComponentDiscriminator
+
+network_d_right_eye:
+  type: FacialComponentDiscriminator
+
+network_d_mouth:
+  type: FacialComponentDiscriminator
+
+network_identity:
+  type: ResNetArcFace
+  block: IRBlock
+  layers: [2, 2, 2, 2]
+  use_se: False
+
+# path
+path:
+  pretrain_network_g: ~
+  param_key_g: params_ema
+  strict_load_g: ~
+  pretrain_network_d: ~
+  pretrain_network_d_left_eye: ~
+  pretrain_network_d_right_eye: ~
+  pretrain_network_d_mouth: ~
+  pretrain_network_identity: ~
+  # resume
+  resume_state: ~
+  ignore_resume_networks: ['network_identity']
+
+# training settings
+train:
+  optim_g:
+    type: Adam
+    lr: !!float 2e-3
+  optim_d:
+    type: Adam
+    lr: !!float 2e-3
+  optim_component:
+    type: Adam
+    lr: !!float 2e-3
+
+  scheduler:
+    type: MultiStepLR
+    milestones: [600000, 700000]
+    gamma: 0.5
+
+  total_iter: 800000
+  warmup_iter: -1  # no warm up
+
+  # losses
+  # pixel loss
+  pixel_opt:
+    type: L1Loss
+    loss_weight: !!float 1e-1
+    reduction: mean
+  # L1 loss used in pyramid loss, component style loss and identity loss
+  L1_opt:
+    type: L1Loss
+    loss_weight: 1
+    reduction: mean
+
+  # image pyramid loss
+  pyramid_loss_weight: 1
+  remove_pyramid_loss: 50000
+  # perceptual loss (content and style losses)
+  perceptual_opt:
+    type: PerceptualLoss
+    layer_weights:
+      # before relu
+      'conv1_2': 0.1
+      'conv2_2': 0.1
+      'conv3_4': 1
+      'conv4_4': 1
+      'conv5_4': 1
+    vgg_type: vgg19
+    use_input_norm: true
+    perceptual_weight: !!float 1
+    style_weight: 50
+    range_norm: true
+    criterion: l1
+  # gan loss
+  gan_opt:
+    type: GANLoss
+    gan_type: wgan_softplus
+    loss_weight: !!float 1e-1
+  # r1 regularization for discriminator
+  r1_reg_weight: 10
+  # facial component loss
+  gan_component_opt:
+    type: GANLoss
+    gan_type: vanilla
+    real_label_val: 1.0
+    fake_label_val: 0.0
+    loss_weight: !!float 1
+  comp_style_weight: 200
+  # identity loss
+  identity_weight: 10
+
+  net_d_iters: 1
+  net_d_init_iters: 0
+  net_d_reg_every: 1
+
+# validation settings
+val:
+  val_freq: !!float 5e3
+  save_img: True
+  use_pbar: True
+
+  metrics:
+    psnr: # metric name
+      type: calculate_psnr
+      crop_border: 0
+      test_y_channel: false
--- a/tests/test_arcface_arch.py
+++ b/tests/test_arcface_arch.py
@@ -0,0 +1,49 @@
+import torch
+
+from gfpgan.archs.arcface_arch import BasicBlock, Bottleneck, ResNetArcFace
+
+
+def test_resnetarcface():
+    """Test arch: ResNetArcFace."""
+
+    # model init and forward (gpu)
+    if torch.cuda.is_available():
+        net = ResNetArcFace(block='IRBlock', layers=(2, 2, 2, 2), use_se=True).cuda().eval()
+        img = torch.rand((1, 1, 128, 128), dtype=torch.float32).cuda()
+        output = net(img)
+        assert output.shape == (1, 512)
+
+        # -------------------- without SE block ----------------------- #
+        net = ResNetArcFace(block='IRBlock', layers=(2, 2, 2, 2), use_se=False).cuda().eval()
+        output = net(img)
+        assert output.shape == (1, 512)
+
+
+def test_basicblock():
+    """Test the BasicBlock in arcface_arch"""
+    # model init and forward (gpu)
+    if torch.cuda.is_available():
+        block = BasicBlock(1, 3, stride=1, downsample=None).cuda()
+        img = torch.rand((1, 1, 12, 12), dtype=torch.float32).cuda()
+        output = block(img)
+        assert output.shape == (1, 3, 12, 12)
+
+        # ----------------- use the downsample module --------------- #
+        downsample = torch.nn.UpsamplingNearest2d(scale_factor=0.5).cuda()
+        block = BasicBlock(1, 3, stride=2, downsample=downsample).cuda()
+        img = torch.rand((1, 1, 12, 12), dtype=torch.float32).cuda()
+        output = block(img)
+        assert output.shape == (1, 3, 6, 6)
+
+
+def test_bottleneck():
+    """Test the Bottleneck in arcface_arch"""
+    # model init and forward (gpu)
+    if torch.cuda.is_available():
+        block = Bottleneck(1, 1, stride=1, downsample=None).cuda()
+        img = torch.rand((1, 1, 12, 12), dtype=torch.float32).cuda()
+        output = block(img)
+        assert output.shape == (1, 4, 12, 12)
+
+        # ----------------- use the downsample module --------------- #
+        downsample = torch.nn.UpsamplingNearest2d(scale_factor=0.5).cuda()
+        block = Bottleneck(1, 1, stride=2, downsample=downsample).cuda()
+        img = torch.rand((1, 1, 12, 12), dtype=torch.float32).cuda()
+        output = block(img)
+        assert output.shape == (1, 4, 6, 6)
--- a/tests/test_ffhq_degradation_dataset.py
+++ b/tests/test_ffhq_degradation_dataset.py
@@ -0,0 +1,96 @@
+import pytest
+import yaml
+
+from gfpgan.data.ffhq_degradation_dataset import FFHQDegradationDataset
+
+
+def test_ffhq_degradation_dataset():
+
+    with open('tests/data/test_ffhq_degradation_dataset.yml', mode='r') as f:
+        opt = yaml.load(f, Loader=yaml.FullLoader)
+
+    dataset = FFHQDegradationDataset(opt)
+    assert dataset.io_backend_opt['type'] == 'disk'  # io backend
+    assert len(dataset) == 1  # meta info is read correctly
+    assert dataset.kernel_list == ['iso', 'aniso']  # degradation configurations are initialized correctly
+    assert dataset.color_jitter_prob == 1
+
+    # test __getitem__
+    result = dataset.__getitem__(0)
+    # check returned keys
+    expected_keys = ['gt', 'lq', 'gt_path']
+    assert set(expected_keys).issubset(set(result.keys()))
+    # check shape and contents
+    assert result['gt'].shape == (3, 512, 512)
+    assert result['lq'].shape == (3, 512, 512)
+    assert result['gt_path'] == 'tests/data/gt/00000000.png'
+
+    # ------------------ test with probability = 0 -------------------- #
+    opt['color_jitter_prob'] = 0
+    opt['color_jitter_pt_prob'] = 0
+    opt['gray_prob'] = 0
+    opt['io_backend'] = dict(type='disk')
+    dataset = FFHQDegradationDataset(opt)
+    assert dataset.io_backend_opt['type'] == 'disk'  # io backend
+    assert len(dataset) == 1  # meta info is read correctly
+    assert dataset.kernel_list == ['iso', 'aniso']  # degradation configurations are initialized correctly
+    assert dataset.color_jitter_prob == 0
+
+    # test __getitem__
+    result = dataset.__getitem__(0)
+    # check returned keys
+    expected_keys = ['gt', 'lq', 'gt_path']
+    assert set(expected_keys).issubset(set(result.keys()))
+    # check shape and contents
+    assert result['gt'].shape == (3, 512, 512)
+    assert result['lq'].shape == (3, 512, 512)
+    assert result['gt_path'] == 'tests/data/gt/00000000.png'
+
+    # ------------------ test lmdb backend -------------------- #
+    opt['dataroot_gt'] = 'tests/data/ffhq_gt.lmdb'
+    opt['io_backend'] = dict(type='lmdb')
+
+    dataset = FFHQDegradationDataset(opt)
+    assert dataset.io_backend_opt['type'] == 'lmdb'  # io backend
+    assert len(dataset) == 1  # meta info is read correctly
+    assert dataset.kernel_list == ['iso', 'aniso']  # degradation configurations are initialized correctly
+    assert dataset.color_jitter_prob == 0
+
+    # test __getitem__
+    result = dataset.__getitem__(0)
+    # check returned keys
+    expected_keys = ['gt', 'lq', 'gt_path']
+    assert set(expected_keys).issubset(set(result.keys()))
+    # check shape and contents
+    assert result['gt'].shape == (3, 512, 512)
+    assert result['lq'].shape == (3, 512, 512)
+    assert result['gt_path'] == '00000000'
+
+    # ------------------ test with crop_components -------------------- #
+    opt['crop_components'] = True
+    opt['component_path'] = 'tests/data/test_eye_mouth_landmarks.pth'
+    opt['eye_enlarge_ratio'] = 1.4
+    opt['gt_gray'] = True
+    opt['io_backend'] = dict(type='lmdb')
+
+    dataset = FFHQDegradationDataset(opt)
+    assert dataset.crop_components is True
+
+    # test __getitem__
+    result = dataset.__getitem__(0)
+    # check returned keys
+    expected_keys = ['gt', 'lq', 'gt_path', 'loc_left_eye', 'loc_right_eye', 'loc_mouth']
+    assert set(expected_keys).issubset(set(result.keys()))
+    # check shape and contents
+    assert result['gt'].shape == (3, 512, 512)
+    assert result['lq'].shape == (3, 512, 512)
+    assert result['gt_path'] == '00000000'
+    assert result['loc_left_eye'].shape == (4, )
+    assert result['loc_right_eye'].shape == (4, )
+    assert result['loc_mouth'].shape == (4, )
+
+    # ------------------ lmdb backend requires a path ending with .lmdb -------------------- #
+    with pytest.raises(ValueError):
+        opt['dataroot_gt'] = 'tests/data/gt'
+        opt['io_backend'] = dict(type='lmdb')
+        dataset = FFHQDegradationDataset(opt)
--- a/tests/test_gfpgan_arch.py
+++ b/tests/test_gfpgan_arch.py
@@ -0,0 +1,203 @@
+import torch
+
+from gfpgan.archs.gfpganv1_arch import FacialComponentDiscriminator, GFPGANv1, StyleGAN2GeneratorSFT
+from gfpgan.archs.gfpganv1_clean_arch import GFPGANv1Clean, StyleGAN2GeneratorCSFT
+
+
+def test_stylegan2generatorsft():
+    """Test arch: StyleGAN2GeneratorSFT."""
+
+    # model init and forward (gpu)
+    if torch.cuda.is_available():
+        net = StyleGAN2GeneratorSFT(
+            out_size=32,
+            num_style_feat=512,
+            num_mlp=8,
+            channel_multiplier=1,
+            resample_kernel=(1, 3, 3, 1),
+            lr_mlp=0.01,
+            narrow=1,
+            sft_half=False).cuda().eval()
+        style = torch.rand((1, 512), dtype=torch.float32).cuda()
+        condition1 = torch.rand((1, 512, 8, 8), dtype=torch.float32).cuda()
+        condition2 = torch.rand((1, 512, 16, 16), dtype=torch.float32).cuda()
+        condition3 = torch.rand((1, 512, 32, 32), dtype=torch.float32).cuda()
+        conditions = [condition1, condition1, condition2, condition2, condition3, condition3]
+        output = net([style], conditions)
+        assert output[0].shape == (1, 3, 32, 32)
+        assert output[1] is None
+
+        # -------------------- with return_latents ----------------------- #
+        output = net([style], conditions, return_latents=True)
+        assert output[0].shape == (1, 3, 32, 32)
+        assert len(output[1]) == 1
+        # check latent
+        assert output[1][0].shape == (8, 512)
+
+        # -------------------- with randomize_noise = False ----------------------- #
+        output = net([style], conditions, randomize_noise=False)
+        assert output[0].shape == (1, 3, 32, 32)
+        assert output[1] is None
+
+        # -------------------- with truncation = 0.5 and mixing ----------------------- #
+        output = net([style, style], conditions, truncation=0.5, truncation_latent=style)
+        assert output[0].shape == (1, 3, 32, 32)
+        assert output[1] is None
+
+
+def test_gfpganv1():
+    """Test arch: GFPGANv1."""
+
+    # model init and forward (gpu)
+    if torch.cuda.is_available():
+        net = GFPGANv1(
+            out_size=32,
+            num_style_feat=512,
+            channel_multiplier=1,
+            resample_kernel=(1, 3, 3, 1),
+            decoder_load_path=None,
+            fix_decoder=True,
+            # for stylegan decoder
+            num_mlp=8,
+            lr_mlp=0.01,
+            input_is_latent=False,
+            different_w=False,
+            narrow=1,
+            sft_half=True).cuda().eval()
+        img = torch.rand((1, 3, 32, 32), dtype=torch.float32).cuda()
+        output = net(img)
+        assert output[0].shape == (1, 3, 32, 32)
+        assert len(output[1]) == 3
+        # check out_rgbs for intermediate loss
+        assert output[1][0].shape == (1, 3, 8, 8)
+        assert output[1][1].shape == (1, 3, 16, 16)
+        assert output[1][2].shape == (1, 3, 32, 32)
+
+        # -------------------- with different_w = True ----------------------- #
+        net = GFPGANv1(
+            out_size=32,
+            num_style_feat=512,
+            channel_multiplier=1,
+            resample_kernel=(1, 3, 3, 1),
+            decoder_load_path=None,
+            fix_decoder=True,
+            # for stylegan decoder
+            num_mlp=8,
+            lr_mlp=0.01,
+            input_is_latent=False,
+            different_w=True,
+            narrow=1,
+            sft_half=True).cuda().eval()
+        img = torch.rand((1, 3, 32, 32), dtype=torch.float32).cuda()
+        output = net(img)
+        assert output[0].shape == (1, 3, 32, 32)
+        assert len(output[1]) == 3
+        # check out_rgbs for intermediate loss
+        assert output[1][0].shape == (1, 3, 8, 8)
+        assert output[1][1].shape == (1, 3, 16, 16)
+        assert output[1][2].shape == (1, 3, 32, 32)
+
+
+def test_facialcomponentdiscriminator():
+    """Test arch: FacialComponentDiscriminator."""
+
+    # model init and forward (gpu)
+    if torch.cuda.is_available():
+        net = FacialComponentDiscriminator().cuda().eval()
+        img = torch.rand((1, 3, 32, 32), dtype=torch.float32).cuda()
+        output = net(img)
+        assert len(output) == 2
+        assert output[0].shape == (1, 1, 8, 8)
+        assert output[1] is None
+
+        # -------------------- return intermediate features ----------------------- #
+        output = net(img, return_feats=True)
+        assert len(output) == 2
+        assert output[0].shape == (1, 1, 8, 8)
+        assert len(output[1]) == 2
+        assert output[1][0].shape == (1, 128, 16, 16)
+        assert output[1][1].shape == (1, 256, 8, 8)
+
+
+def test_stylegan2generatorcsft():
+    """Test arch: StyleGAN2GeneratorCSFT."""
+
+    # model init and forward (gpu)
+    if torch.cuda.is_available():
+        net = StyleGAN2GeneratorCSFT(
+            out_size=32, num_style_feat=512, num_mlp=8, channel_multiplier=1, narrow=1, sft_half=False).cuda().eval()
+        style = torch.rand((1, 512), dtype=torch.float32).cuda()
+        condition1 = torch.rand((1, 512, 8, 8), dtype=torch.float32).cuda()
+        condition2 = torch.rand((1, 512, 16, 16), dtype=torch.float32).cuda()
+        condition3 = torch.rand((1, 512, 32, 32), dtype=torch.float32).cuda()
+        conditions = [condition1, condition1, condition2, condition2, condition3, condition3]
+        output = net([style], conditions)
+        assert output[0].shape == (1, 3, 32, 32)
+        assert output[1] is None
+
+        # -------------------- with return_latents ----------------------- #
+        output = net([style], conditions, return_latents=True)
+        assert output[0].shape == (1, 3, 32, 32)
+        assert len(output[1]) == 1
+        # check latent
+        assert output[1][0].shape == (8, 512)
+
+        # -------------------- with randomize_noise = False ----------------------- #
+        output = net([style], conditions, randomize_noise=False)
+        assert output[0].shape == (1, 3, 32, 32)
+        assert output[1] is None
+
+        # -------------------- with truncation = 0.5 and mixing ----------------------- #
+        output = net([style, style], conditions, truncation=0.5, truncation_latent=style)
+        assert output[0].shape == (1, 3, 32, 32)
+        assert output[1] is None
+
+
+def test_gfpganv1clean():
+    """Test arch: GFPGANv1Clean."""
+
+    # model init and forward (gpu)
+    if torch.cuda.is_available():
+        net = GFPGANv1Clean(
+            out_size=32,
+            num_style_feat=512,
+            channel_multiplier=1,
+            decoder_load_path=None,
+            fix_decoder=True,
+            # for stylegan decoder
+            num_mlp=8,
+            input_is_latent=False,
+            different_w=False,
+            narrow=1,
+            sft_half=True).cuda().eval()
+
+        img = torch.rand((1, 3, 32, 32), dtype=torch.float32).cuda()
+        output = net(img)
+        assert output[0].shape == (1, 3, 32, 32)
+        assert len(output[1]) == 3
+        # check out_rgbs for intermediate loss
+        assert output[1][0].shape == (1, 3, 8, 8)
+        assert output[1][1].shape == (1, 3, 16, 16)
+        assert output[1][2].shape == (1, 3, 32, 32)
+
+        # -------------------- with different_w = True ----------------------- #
+        net = GFPGANv1Clean(
+            out_size=32,
+            num_style_feat=512,
+            channel_multiplier=1,
+            decoder_load_path=None,
+            fix_decoder=True,
+            # for stylegan decoder
+            num_mlp=8,
+            input_is_latent=False,
+            different_w=True,
+            narrow=1,
+            sft_half=True).cuda().eval()
+        img = torch.rand((1, 3, 32, 32), dtype=torch.float32).cuda()
+        output = net(img)
+        assert output[0].shape == (1, 3, 32, 32)
+        assert len(output[1]) == 3
+        # check out_rgbs for intermediate loss
+        assert output[1][0].shape == (1, 3, 8, 8)
+        assert output[1][1].shape == (1, 3, 16, 16)
+        assert output[1][2].shape == (1, 3, 32, 32)
--- a/tests/test_gfpgan_model.py
+++ b/tests/test_gfpgan_model.py
@@ -0,0 +1,132 @@
+import tempfile
+import torch
+import yaml
+from basicsr.archs.stylegan2_arch import StyleGAN2Discriminator
+from basicsr.data.paired_image_dataset import PairedImageDataset
+from basicsr.losses.losses import GANLoss, L1Loss, PerceptualLoss
+
+from gfpgan.archs.arcface_arch import ResNetArcFace
+from gfpgan.archs.gfpganv1_arch import FacialComponentDiscriminator, GFPGANv1
+from gfpgan.models.gfpgan_model import GFPGANModel
+
+
+def test_gfpgan_model():
+    with open('tests/data/test_gfpgan_model.yml', mode='r') as f:
+        opt = yaml.load(f, Loader=yaml.FullLoader)
+
+    # build model
+    model = GFPGANModel(opt)
+    # test attributes
+    assert model.__class__.__name__ == 'GFPGANModel'
+    assert isinstance(model.net_g, GFPGANv1)  # generator
+    assert isinstance(model.net_d, StyleGAN2Discriminator)  # discriminator
+    # facial component discriminators
+    assert isinstance(model.net_d_left_eye, FacialComponentDiscriminator)
+    assert isinstance(model.net_d_right_eye, FacialComponentDiscriminator)
+    assert isinstance(model.net_d_mouth, FacialComponentDiscriminator)
+    # identity network
+    assert isinstance(model.network_identity, ResNetArcFace)
+    # losses
+    assert isinstance(model.cri_pix, L1Loss)
+    assert isinstance(model.cri_perceptual, PerceptualLoss)
+    assert isinstance(model.cri_gan, GANLoss)
+    assert isinstance(model.cri_l1, L1Loss)
+    # optimizer
+    assert isinstance(model.optimizers[0], torch.optim.Adam)
+    assert isinstance(model.optimizers[1], torch.optim.Adam)
+
+    # prepare data
+    gt = torch.rand((1, 3, 512, 512), dtype=torch.float32)
+    lq = torch.rand((1, 3, 512, 512), dtype=torch.float32)
+    loc_left_eye = torch.rand((1, 4), dtype=torch.float32)
+    loc_right_eye = torch.rand((1, 4), dtype=torch.float32)
+    loc_mouth = torch.rand((1, 4), dtype=torch.float32)
+    data = dict(gt=gt, lq=lq, loc_left_eye=loc_left_eye, loc_right_eye=loc_right_eye, loc_mouth=loc_mouth)
+    model.feed_data(data)
+    # check data shape
+    assert model.lq.shape == (1, 3, 512, 512)
+    assert model.gt.shape == (1, 3, 512, 512)
+    assert model.loc_left_eyes.shape == (1, 4)
+    assert model.loc_right_eyes.shape == (1, 4)
+    assert model.loc_mouths.shape == (1, 4)
+
+    # ----------------- test optimize_parameters -------------------- #
+    model.feed_data(data)
+    model.optimize_parameters(1)
+    assert model.output.shape == (1, 3, 512, 512)
+    assert isinstance(model.log_dict, dict)
+    # check returned keys
+    expected_keys = [
+        'l_g_pix', 'l_g_percep', 'l_g_style', 'l_g_gan', 'l_g_gan_left_eye', 'l_g_gan_right_eye', 'l_g_gan_mouth',
+        'l_g_comp_style_loss', 'l_identity', 'l_d', 'real_score', 'fake_score', 'l_d_r1', 'l_d_left_eye',
+        'l_d_right_eye', 'l_d_mouth'
+    ]
+    assert set(expected_keys).issubset(set(model.log_dict.keys()))
+
+    # ----------------- remove pyramid_loss_weight -------------------- #
+    model.feed_data(data)
+    model.optimize_parameters(100000)  # larger than remove_pyramid_loss = 50000
+    assert model.output.shape == (1, 3, 512, 512)
+    assert isinstance(model.log_dict, dict)
+    # check returned keys
+    expected_keys = [
+        'l_g_pix', 'l_g_percep', 'l_g_style', 'l_g_gan', 'l_g_gan_left_eye', 'l_g_gan_right_eye', 'l_g_gan_mouth',
+        'l_g_comp_style_loss', 'l_identity', 'l_d', 'real_score', 'fake_score', 'l_d_r1', 'l_d_left_eye',
+        'l_d_right_eye', 'l_d_mouth'
+    ]
+    assert set(expected_keys).issubset(set(model.log_dict.keys()))
+
+    # ----------------- test save -------------------- #
+    with tempfile.TemporaryDirectory() as tmpdir:
+        model.opt['path']['models'] = tmpdir
+        model.opt['path']['training_states'] = tmpdir
+        model.save(0, 1)
+
+    # ----------------- test the test function -------------------- #
+    model.test()
+    assert model.output.shape == (1, 3, 512, 512)
+    # delete net_g_ema
+    model.__delattr__('net_g_ema')
+    model.test()
+    assert model.output.shape == (1, 3, 512, 512)
+    assert model.net_g.training is True  # should be back in training mode after testing
+
+    # ----------------- test nondist_validation -------------------- #
+    # construct dataloader
+    dataset_opt = dict(
+        name='Demo',
+        dataroot_gt='tests/data/gt',
+        dataroot_lq='tests/data/gt',
+        io_backend=dict(type='disk'),
+        scale=4,
+        phase='val')
+    dataset = PairedImageDataset(dataset_opt)
+    dataloader = torch.utils.data.DataLoader(dataset=dataset, batch_size=1, shuffle=False, num_workers=0)
+    assert model.is_train is True
+    with tempfile.TemporaryDirectory() as tmpdir:
+        model.opt['path']['visualization'] = tmpdir
+        model.nondist_validation(dataloader, 1, None, save_img=True)
+        assert model.is_train is True
+        # check metric_results
+        assert 'psnr' in model.metric_results
+        assert isinstance(model.metric_results['psnr'], float)
+
+    # validation
+    with tempfile.TemporaryDirectory() as tmpdir:
+        model.opt['is_train'] = False
+        model.opt['val']['suffix'] = 'test'
+        model.opt['path']['visualization'] = tmpdir
+        model.opt['val']['pbar'] = True
+        model.nondist_validation(dataloader, 1, None, save_img=True)
+        # check metric_results
+        assert 'psnr' in model.metric_results
+        assert isinstance(model.metric_results['psnr'], float)
+
+        # if opt['val']['suffix'] is None
+        model.opt['val']['suffix'] = None
+        model.opt['name'] = 'demo'
+        model.opt['path']['visualization'] = tmpdir
+        model.nondist_validation(dataloader, 1, None, save_img=True)
+        # check metric_results
+        assert 'psnr' in model.metric_results
+        assert isinstance(model.metric_results['psnr'], float)
--- a/tests/test_stylegan2_clean_arch.py
+++ b/tests/test_stylegan2_clean_arch.py
@@ -0,0 +1,52 @@
+import torch
+
+from gfpgan.archs.stylegan2_clean_arch import StyleGAN2GeneratorClean
+
+
+def test_stylegan2generatorclean():
+    """Test arch: StyleGAN2GeneratorClean."""
+
+    # model init and forward (gpu)
+    if torch.cuda.is_available():
+        net = StyleGAN2GeneratorClean(
+            out_size=32, num_style_feat=512, num_mlp=8, channel_multiplier=1, narrow=0.5).cuda().eval()
+        style = torch.rand((1, 512), dtype=torch.float32).cuda()
+        output = net([style], input_is_latent=False)
+        assert output[0].shape == (1, 3, 32, 32)
+        assert output[1] is None
+
+        # -------------------- with return_latents ----------------------- #
+        output = net([style], input_is_latent=True, return_latents=True)
+        assert output[0].shape == (1, 3, 32, 32)
+        assert len(output[1]) == 1
+        # check latent
+        assert output[1][0].shape == (8, 512)
+
+        # -------------------- with randomize_noise = False ----------------------- #
+        output = net([style], randomize_noise=False)
+        assert output[0].shape == (1, 3, 32, 32)
+        assert output[1] is None
+
+        # -------------------- with truncation = 0.5 and mixing ----------------------- #
+        output = net([style, style], truncation=0.5, truncation_latent=style)
+        assert output[0].shape == (1, 3, 32, 32)
+        assert output[1] is None
+
+        # ------------------ test make_noise ----------------------- #
+        out = net.make_noise()
+        assert len(out) == 7
+        assert out[0].shape == (1, 1, 4, 4)
+        assert out[1].shape == (1, 1, 8, 8)
+        assert out[2].shape == (1, 1, 8, 8)
+        assert out[3].shape == (1, 1, 16, 16)
+        assert out[4].shape == (1, 1, 16, 16)
+        assert out[5].shape == (1, 1, 32, 32)
+        assert out[6].shape == (1, 1, 32, 32)
+
+        # ------------------ test get_latent ----------------------- #
+        out = net.get_latent(style)
+        assert out.shape == (1, 512)
+
+        # ------------------ test mean_latent ----------------------- #
+        out = net.mean_latent(2)
+        assert out.shape == (1, 512)
--- a/tests/test_utils.py
+++ b/tests/test_utils.py
@@ -0,0 +1,43 @@
+import cv2
+from facexlib.utils.face_restoration_helper import FaceRestoreHelper
+
+from gfpgan.archs.gfpganv1_arch import GFPGANv1
+from gfpgan.archs.gfpganv1_clean_arch import GFPGANv1Clean
+from gfpgan.utils import GFPGANer
+
+
+def test_gfpganer():
+    # initialize with the clean model
+    restorer = GFPGANer(
+        model_path='experiments/pretrained_models/GFPGANCleanv1-NoCE-C2.pth',
+        upscale=2,
+        arch='clean',
+        channel_multiplier=2,
+        bg_upsampler=None)
+    # test attribute
+    assert isinstance(restorer.gfpgan, GFPGANv1Clean)
+    assert isinstance(restorer.face_helper, FaceRestoreHelper)
+
+    # initialize with the original model
+    restorer = GFPGANer(
+        model_path='experiments/pretrained_models/GFPGANv1.pth',
+        upscale=2,
+        arch='original',
+        channel_multiplier=1,
+        bg_upsampler=None)
+    # test attribute
+    assert isinstance(restorer.gfpgan, GFPGANv1)
+    assert isinstance(restorer.face_helper, FaceRestoreHelper)
+
+    # ------------------ test enhance ---------------- #
+    img = cv2.imread('tests/data/gt/00000000.png', cv2.IMREAD_COLOR)
+    result = restorer.enhance(img, has_aligned=False, paste_back=True)
+    assert result[0][0].shape == (512, 512, 3)
+    assert result[1][0].shape == (512, 512, 3)
+    assert result[2].shape == (1024, 1024, 3)
+
+    # with has_aligned=True
+    result = restorer.enhance(img, has_aligned=True, paste_back=False)
+    assert result[0][0].shape == (512, 512, 3)
+    assert result[1][0].shape == (512, 512, 3)
+    assert result[2] is None
--- a/train.py
+++ b/train.py
@@ -1,10 +0,0 @@
-import os.path as osp
-
-import archs  # noqa: F401
-import data  # noqa: F401
-import models  # noqa: F401
-from basicsr.train import train_pipeline
-
-if __name__ == '__main__':
-    root_path = osp.abspath(osp.join(__file__, osp.pardir))
-    train_pipeline(root_path)
Author	SHA1	Message	Date
Xintao	ee3e556f18	v0.2.4	2021-12-12 22:54:36 +08:00
Xintao	ad1397180d	fix bug in inference: RealESRGAN model is None	2021-12-12 22:46:07 +08:00
Xintao	37237da798	update utils and unittest	2021-11-28 23:09:38 +08:00
Xintao	be73d6d9a4	clean and add more comments	2021-11-27 19:59:23 +08:00
Xintao	0ff1cf7215	update setup.py, V0.2.3	2021-10-22 17:06:29 +08:00
Xintao	4f0562df64	fix setup bug	2021-10-22 16:11:17 +08:00
Xintao	eadf03cac8	ReadMe: format discriminator download link	2021-10-06 02:10:27 +08:00
Xintao	a070b88e9e	ReadMe: add discriminator download link	2021-10-06 02:08:44 +08:00
Xintao	c2e88f8eb8	add eye_dist_threshold=5	2021-09-27 15:55:20 +08:00
Xintao	69bcfff4ef	add codespell	2021-09-27 15:50:44 +08:00
Xintao	e5adc0dd06	add ext option	2021-09-10 18:46:45 +08:00
Xintao	f6d3f70646	update: format and standards	2021-09-08 11:28:21 +08:00
Vincent	ad70ce4653	Reordered requirements (#54 ) * Reordered requirements It had the wrong order, resulting in `basicsr` not wanting to install. * update order * add comments Co-authored-by: Xintao <xintao.alpha@gmail.com>	2021-09-02 00:58:44 +08:00
Xintao	d7cb9f77f1	update readme	2021-08-30 00:09:42 +08:00
Xintao	06ea21690c	update open issues	2021-08-29 11:51:56 +08:00
AK391	1d5963b2e6	Add Gradio Web Demo in README.md (#52 )	2021-08-29 11:50:17 +08:00
Xintao	7176e63809	update publish-pip	2021-08-28 13:27:10 +08:00
Xintao	3da90f924e	add vscode format setting	2021-08-18 10:21:26 +08:00
Xintao	250b75c364	add no-response action workflow	2021-08-18 10:04:42 +08:00
Xintao	1e1c863dae	change bg_tile default to 400, update readme	2021-08-18 09:38:35 +08:00
Muhammad Danish	a75e39e323	Update README.md (#43 ) upscale_factor is not a recognized argument.	2021-08-18 08:06:36 +08:00
Xintao	11c3957a8f	fix save bugs in inference	2021-08-09 20:56:09 +08:00
Xintao	4a7b2cc325	update readme	2021-08-09 02:09:15 +08:00
Xintao	99eda83cce	Update README	2021-08-09 02:06:39 +08:00
Xintao	a87388fd2f	Update README	2021-08-09 01:59:19 +08:00
Xintao	262ee3399f	Update README	2021-08-09 01:57:46 +08:00
Xintao	996d1e3df9	Major revision: Support Pypi (#37 ) * reorganize * update inference * update inference * format	2021-08-09 01:28:10 +08:00
Xintao	77dc85b882	fix typo	2021-08-06 22:00:21 +08:00
Xintao	805e62af29	Merge branch 'readme'	2021-08-06 21:09:49 +08:00
Xintao	b529c9789d	update README	2021-08-06 21:09:01 +08:00