This GitHub repo features a Cycle GAN model for Zebra and Horse Dataset image translation. It uses Wasserstein loss instead of L1 norm loss for improved stability, realism, and fidelity in the generated images. The goal is to achieve accurate and high-quality transformations between the two domains.

Sameer-Ahmed7/Wasserstein-CycleGAN

Project Title:

CycleGAN Image Translation

Overview:

This repository contains an implementation of Wasserstein CycleGAN, a CycleGAN variant for image translation. CycleGAN is a generative adversarial network (GAN) that translates images in both directions between the horse and zebra domains without paired examples: horse images are turned into zebra images and vice versa. The project was developed for the Vision and Perception course, taught by Professors Irene Amerini and Paolo Russo, as part of my Master’s in Artificial Intelligence and Robotics at Sapienza University of Rome.

What is CycleGAN?

CycleGAN is a deep learning model designed for unpaired image-to-image translation. It learns mappings between two domains without paired training data, meaning that training does not require corresponding images in the two domains.

  • Unpaired (unsupervised learning): CycleGAN
  • Paired (supervised learning): Pix2Pix
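To make the paired/unpaired distinction concrete, here is a minimal sketch in plain Python. The file names and the `sample_unpaired` and `sample_paired` helpers are hypothetical and not taken from this repository. The point is only that, in the unpaired setting, a horse and a zebra are drawn independently, so no horse image has a designated zebra counterpart.

```python
import random


def sample_unpaired(domain_a, domain_b, rng):
    """Draw one item from each domain independently (no pairing)."""
    return rng.choice(domain_a), rng.choice(domain_b)


def sample_paired(pairs, rng):
    """Draw one aligned (input, target) pair, as Pix2Pix-style training needs."""
    return rng.choice(pairs)


# Hypothetical file names, for illustration only.
horses = ["horse_001.jpg", "horse_002.jpg", "horse_003.jpg"]
zebras = ["zebra_001.jpg", "zebra_002.jpg"]  # the two domains may differ in size

rng = random.Random(0)
horse, zebra = sample_unpaired(horses, zebras, rng)
print(horse, zebra)
```

Note that the two lists do not need the same length, which would be impossible with aligned pairs.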

Advantages of CycleGAN over Pix2Pix:

Why use CycleGAN when a supervised technique like Pix2Pix exists?

For many tasks, paired training data is simply not available. CycleGAN learns to translate an image from a source domain X to a target domain Y in the absence of paired examples.

Mapping G : X ➡ Y
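In the notation of the original paper, a second generator F is learned alongside G, and the two are tied together by the cycle-consistency requirement. Only G is named here, so F follows the paper's notation rather than this repository's code:

```latex
\begin{aligned}
G &: X \to Y, \qquad F : Y \to X \\
F(G(x)) &\approx x \quad \text{for } x \in X \\
G(F(y)) &\approx y \quad \text{for } y \in Y
\end{aligned}
```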

Dataset:

The dataset used for this implementation consists of unpaired horse and zebra images. The dataset is not included in this repository due to size and licensing constraints. You can obtain it from public sources or repositories and organize it accordingly, or download the Horse2zebra Dataset from Kaggle.
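One possible way to organize and read the images is sketched below. It assumes the `trainA` / `trainB` folder naming commonly used for CycleGAN datasets. That layout, the `list_images` and `load_domains` helpers, and the `data/horse2zebra` path are assumptions, not something this repository confirms.

```python
from pathlib import Path

# File extensions treated as images (an assumption for this sketch).
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def list_images(root, split):
    """Return the sorted image paths found directly under root/split."""
    folder = Path(root) / split
    return sorted(p for p in folder.iterdir() if p.suffix.lower() in IMAGE_SUFFIXES)


def load_domains(root):
    """Return (domain A paths, domain B paths), e.g. horses and zebras."""
    return list_images(root, "trainA"), list_images(root, "trainB")


# Example usage, assuming the dataset was extracted to data/horse2zebra:
# horses, zebras = load_domains("data/horse2zebra")
```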

Losses:

In CycleGAN, three distinct loss functions are utilized: adversarial loss, cycle consistency loss, and optional identity loss.

  • The adversarial loss trains the generator to produce images that the discriminator cannot distinguish from real ones. It is computed with Mean Squared Error (MSE) loss.
  • For the cycle consistency loss, the Wasserstein loss is used instead of the traditional L1 norm loss. This loss ensures that translating an image from one domain to the other and back again yields a reconstruction that closely resembles the original.
  • The identity loss is optional. It encourages the generator to leave an image unchanged when that image already belongs to the target domain. The original paper applies it only in some experiments (for example, painting-to-photo translation), so it is left as an optional component here.
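The three terms can be sketched in plain Python on flat lists of numbers standing in for image tensors and discriminator outputs. This is a hedged illustration, not the repository's code: how the Wasserstein loss is applied to the cycle term is not spelled out here, so the sketch uses the L1 cycle loss of the original paper as a stand-in. The function names and the default weights `lambda_cyc=10.0` and `lambda_id=5.0` are illustrative assumptions.

```python
def mse(pred, target):
    """Mean squared error between two equal-length sequences."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def l1(a, b):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def generator_adversarial_loss(d_fake):
    """MSE adversarial loss: the generator wants D to output 1 on fakes."""
    return mse(d_fake, [1.0] * len(d_fake))


def discriminator_loss(d_real, d_fake):
    """MSE adversarial loss for D: output 1 on real images, 0 on fakes."""
    real_term = mse(d_real, [1.0] * len(d_real))
    fake_term = mse(d_fake, [0.0] * len(d_fake))
    return 0.5 * (real_term + fake_term)


def cycle_loss(x, x_rec, y, y_rec):
    """L1 stand-in for the cycle term: x vs F(G(x)) and y vs G(F(y))."""
    return l1(x, x_rec) + l1(y, y_rec)


def identity_loss(y, g_of_y, x, f_of_x):
    """Optional: G should leave target-domain images unchanged, likewise F."""
    return l1(y, g_of_y) + l1(x, f_of_x)


def generator_total_loss(adv, cyc, idt=0.0, lambda_cyc=10.0, lambda_id=5.0):
    """Weighted sum of the three terms (weights are illustrative assumptions)."""
    return adv + lambda_cyc * cyc + lambda_id * idt
```

A perfect reconstruction gives a cycle loss of zero, and a discriminator that outputs 1 on every generated image gives a generator adversarial loss of zero.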

Results:

Here are some samples of the translated images generated by the trained CycleGAN model:

  • Zebra-to-Horse
  • Horse-to-Zebra

Training Limitations:

Due to limitations in the training environment (e.g., computational resources, time constraints, or restrictions of the Google Colab platform), training stopped at 128 epochs instead of the originally intended 200. This is the main reason the generated images are not as good as they could be.

Acknowledgments:

This implementation is based on the original CycleGAN paper:

Zhu, J., Park, T., Isola, P., & Efros, A. A. (2017). Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks. In Proceedings of the IEEE International Conference on Computer Vision (ICCV).
