Assignment 2: Image-to-Image Translation



1 Introduction

In this assignment, you will implement generative adversarial networks (GANs) and apply them to image-to-image translation tasks. The resulting system translates an input image into a corresponding image in another domain.

1.1 What’s image-to-image translation?

Image-to-image translation is the task of taking images from one domain and translating them to another domain, so that they take on the style (or characteristics) of images from the target domain. For example, in the following pictures, image-to-image translation converts semantic images into street-scene images, or aerial images into map images.

1.2 What will you learn from this assignment?

This assignment will walk you through one specific task: aerial-to-maps translation. Refer to the following picture for an intuitive illustration.

You will also train a GAN model from scratch and evaluate it with the standard SSIM metric.

The goals of this assignment are as follows:

  • Understand the architecture of generative adversarial networks (GANs) and how they generate realistic photos.
  • Understand how the generative model and the discriminative model compete with each other in a GAN.
  • Implement a GAN model and train it on the maps-to-aerial dataset.
  • Understand and use the SSIM metric for evaluation.
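
The SSIM metric named in the goals above can be illustrated with a minimal sketch. This is a simplified, single-window variant that computes the statistics over the whole image at once; the standard SSIM averages the same formula over local sliding windows, so the numbers will not match a library implementation exactly. The function name `global_ssim` and the default `data_range=255.0` are assumptions made for illustration, not part of the assignment's starter code.

```python
import numpy as np


def global_ssim(img_a, img_b, data_range=255.0):
    """Simplified SSIM using whole-image statistics (hypothetical helper).

    The standard metric applies this formula in local windows and averages
    the results; here the whole image is treated as one window.
    """
    a = np.asarray(img_a, dtype=np.float64)
    b = np.asarray(img_b, dtype=np.float64)
    if a.shape != b.shape:
        raise ValueError("images must have the same shape")

    # Stabilising constants from the SSIM definition (K1 = 0.01, K2 = 0.03).
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2

    mu_a, mu_b = a.mean(), b.mean()
    var_a, var_b = a.var(), b.var()
    cov_ab = ((a - mu_a) * (b - mu_b)).mean()

    numerator = (2 * mu_a * mu_b + c1) * (2 * cov_ab + c2)
    denominator = (mu_a ** 2 + mu_b ** 2 + c1) * (var_a + var_b + c2)
    return float(numerator / denominator)
```

An image compared with itself scores 1 (up to floating-point rounding), and the score drops as the generated image departs from the ground truth in brightness, contrast, or structure.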

2 Setup

You can work on the assignment in one of two ways: locally on your own machine, or on a virtual machine on HKU GPU Farm.

2.1 Working remotely on HKU GPU Farm

Note: after following these instructions, go straight to Working on the assignment below (i.e., you can skip the Working locally section).

As part of this course, you can use HKU GPU Farm for your assignments. We recommend you follow the quickstart provided by the official website to get familiar with HKU GPU Farm.

After checking the quickstart document, make sure you have gained the following skills:

  • Knowing how to access the GPU Farm and use GPUs in interactive mode. We recommend using GPU support for this assignment, since your training will go much, much faster.
  • Getting familiar with running Jupyter Lab without starting a web browser.
  • Knowing how to use tmux to cope with unstable network connections.

2.2 Working locally

If you have GPU resources on your own PC or laptop, here is how to install the necessary dependencies:

Installing GPU drivers (recommended if working locally): If you choose to work locally, you are at no disadvantage for the first parts of the assignment, but having a GPU is a significant advantage for training. If you have your own NVIDIA GPU and wish to use it, you will need to install the GPU drivers, CUDA, and cuDNN, and then install PyTorch. You could in principle do the entire assignment without a GPU, though training the GAN model will be much slower.

Installing Python 3.6+: To use Python 3, make sure version 3.6 or later is installed on your local machine.

Virtual environment: If you decide to work locally, we recommend using a virtual environment via Anaconda for the project. If you choose not to use a virtual environment, it is up to you to make sure that all dependencies for the code are installed globally on your machine. To set up a conda virtual environment, run the following:

conda create -n gan_env python=3.6 -y
conda activate gan_env

Install PyTorch following the official instructions. Here we use PyTorch 1.7.0 with CUDA 11.0. You may switch to another version by changing the version numbers in the command.

conda install pytorch==1.7.0 torchvision==0.8.0 torchaudio==0.7.0 cudatoolkit=11.0 -c pytorch

Install the remaining dependencies listed in the provided requirements.txt file.

pip install -r requirements.txt
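
Once the dependencies are installed, a quick sanity check along the following lines can confirm that the interpreter meets the 3.6+ requirement, that PyTorch is importable, and whether a CUDA device is visible. The helper name `check_environment` is hypothetical and not part of the provided code; it only uses `torch.__version__` and `torch.cuda.is_available()`.

```python
import importlib.util
import sys


def check_environment():
    """Report basic facts about the local setup (hypothetical helper)."""
    report = {
        "python_ok": sys.version_info >= (3, 6),
        "torch_installed": importlib.util.find_spec("torch") is not None,
        "torch_version": None,
        "cuda_available": False,
    }
    if report["torch_installed"]:
        # Import lazily so the script still runs when PyTorch is missing.
        import torch

        report["torch_version"] = torch.__version__
        report["cuda_available"] = torch.cuda.is_available()
    return report


if __name__ == "__main__":
    for key, value in check_environment().items():
        print(f"{key}: {value}")
```

If `cuda_available` is reported as False on a machine with an NVIDIA GPU, the usual suspects are the driver installation or a mismatch between the installed PyTorch build and the CUDA toolkit version.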

3 Working on the assignment

3.1 Basis of generative models

Before starting, you need to learn some basics of GANs. We recommend the Google document for a general introduction to GANs, including an overview of the GAN structure, the generator and the discriminator, and GAN training.
Note that some of the basic knowledge in this Google document may appear in the final quiz.

If you are interested, please read the related papers (e.g., GAN and pix2pix) for more details.
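
For orientation, the training objectives from those two papers can be written as below. The notation is an assumption made here for illustration: $G$ is the generator, $D$ the discriminator, $x$ the input image, $y$ the real target image, $z$ a noise vector, and $\lambda$ a weight on the L1 term. These are the formulations from the original papers; the code for this assignment may use a variant.

```latex
% Original GAN: the discriminator maximises, the generator minimises.
\min_G \max_D \; V(D, G) =
  \mathbb{E}_{y \sim p_{\text{data}}}\big[\log D(y)\big]
  + \mathbb{E}_{z \sim p_z}\big[\log\big(1 - D(G(z))\big)\big]

% pix2pix: a conditional GAN loss plus an L1 reconstruction term.
\mathcal{L}_{\text{cGAN}}(G, D) =
  \mathbb{E}_{x, y}\big[\log D(x, y)\big]
  + \mathbb{E}_{x, z}\big[\log\big(1 - D(x, G(x, z))\big)\big]

\mathcal{L}_{L1}(G) = \mathbb{E}_{x, y, z}\big[\lVert y - G(x, z) \rVert_1\big]

G^{*} = \arg\min_G \max_D \; \mathcal{L}_{\text{cGAN}}(G, D) + \lambda \, \mathcal{L}_{L1}(G)
```

In the conditional form the discriminator sees the input image together with either the real or the generated output, which is what ties the generated map to the specific aerial image it was produced from.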

3.2 Task descriptions

Image-to-image translation (I2I) aims to transfer images from a source domain to a target domain while preserving the content representations. I2I has drawn increasing attention and made tremendous progress in recent years because of its wide range of applications in many computer vision and image processing problems, such as image synthesis, segmentation, style transfer, restoration, and pose estimation.