hyeonsieun/Text-to-Image_Generation

Text-to-Image_Generation

This is the final project for the 2nd OUTTA AI Bootcamp of OUTTA, the non-profit AI education organization that I run as overall leader.

This project is designed to generate images based on text input.

Along with the OUTTA members, I created this project, set it as the final team project assignment for the 2023 2nd OUTTA AI Bootcamp, and evaluated the submissions to select the top-performing teams.

If you're interested in undertaking this project yourself, you can download the skeleton code from here.

This repository contains the solution for the project.

For a more detailed explanation about this project, please refer to the uploaded '2023_final_project_guideline.pdf'.

To complete this project from the skeleton code, you need to modify only the 'network.py' and 'train.py' files; changing other files is not recommended.

A brief explanatory video about this project is available at the following link.

CelebA-Dialog

Dataset can be downloaded from here.

You can see the source of the dataset at the following link.

Command for data preprocessing:

python preproc_datasets_celeba_zip_train.py --source=./multimodal_celeba_hq.zip \
                                            --dest train_data_6cap.zip --width 256 --height 256 \
                                            --transform center-crop --emb_dim 512

The contents of ./multimodal_celeba_hq.zip should be laid out as follows:

./multimodal_celeba_hq.zip
  ├── image
  │   ├── 0.jpg
  │   ├── 1.jpg
  │   ├── 2.jpg
  │   └── ...
  └── celea-caption
      ├── 0.txt
      ├── 1.txt
      ├── 2.txt
      └── ...
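Before running the preprocessing command, it can help to confirm that the archive matches the layout above. The following is a minimal sketch and not part of this repository: the function name `summarize_archive` is hypothetical, and it assumes that each image `N.jpg` is paired with a caption file `N.txt` and that the folder names are exactly `image` and `celea-caption` as shown in the tree.

```python
import zipfile
from pathlib import PurePosixPath


def summarize_archive(zip_path, image_dir="image", caption_dir="celea-caption"):
    """Count images and captions in the archive and report unpaired entries.

    Assumption: files are paired by stem, i.e. image/0.jpg <-> celea-caption/0.txt.
    """
    images, captions = set(), set()
    with zipfile.ZipFile(zip_path) as archive:
        for name in archive.namelist():
            path = PurePosixPath(name)
            # Skip directory entries and files that sit at the archive root.
            if name.endswith("/") or len(path.parts) < 2:
                continue
            parent = path.parts[-2]  # immediate parent folder of the file
            if parent == image_dir and path.suffix.lower() == ".jpg":
                images.add(path.stem)
            elif parent == caption_dir and path.suffix.lower() == ".txt":
                captions.add(path.stem)
    return {
        "num_images": len(images),
        "num_captions": len(captions),
        "images_without_caption": sorted(images - captions),
        "captions_without_image": sorted(captions - images),
    }
```

Calling `summarize_archive("./multimodal_celeba_hq.zip")` returns a dictionary of counts and unpaired file stems. Two empty lists indicate that every image has a caption file and vice versa, under the pairing assumption stated above.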

Reference

This repository is implemented based on LAFITE, StackGAN++ and AttnGAN.