First upload (major version update)

steveefemsc · May 4, 2018 · ee75b52
1 parent dbe4747
commit ee75b52

Showing 48 changed files with 3,630 additions and 246 deletions.
diff --git a/.gitattributes b/.gitattributes
@@ -0,0 +1 @@
+v1/ linguist-detectable=false
diff --git a/.gitignore b/.gitignore
@@ -101,4 +101,11 @@ ENV/
 .mypy_cache/
 
 # vscode
-.vscode/*
+/.vscode/
+
+# Experiments
+/data/
+/preprocessing/
+/exp/
+logs/
+BACKUP/
diff --git a/README.md b/README.md
@@ -16,68 +16,84 @@ drums, guitar, piano and strings tracks.
 
 Sample results are available [here](https://salu133445.github.io/musegan/results).
 
-## Papers
+## BinaryMuseGAN
 
-Hao-Wen Dong\*, Wen-Yi Hsiao\*, Li-Chia Yang and Yi-Hsuan Yang,
-"**MuseGAN: Multi-track Sequential Generative Adversarial Networks for
-Symbolic Music Generation and Accompaniment**,"
-in *AAAI Conference on Artificial Intelligence* (AAAI), 2018.
-[[arxiv](http://arxiv.org/abs/1709.06298)]
-[[slides](https://salu133445.github.io/musegan/pdf/musegan-aaai2018-slides.pdf)]
-
-Hao-Wen Dong\*, Wen-Yi Hsiao\*, Li-Chia Yang and Yi-Hsuan Yang,
-"**MuseGAN: Demonstration of a Convolutional GAN Based Model for Generating
-Multi-track Piano-rolls**,"
-in *ISMIR Late-Breaking and Demo Session*, 2017.
-(non-peer reviewed two-page extended abstract)
-[[paper](https://salu133445.github.io/musegan/pdf/musegan-ismir2017-lbd-paper.pdf)]
-[[poster](https://salu133445.github.io/musegan/pdf/musegan-ismir2017-lbd-poster.pdf)]
+[BinaryMuseGAN](https://salu133445.github.io/bmusegan/) is a follow-up to the
+[MuseGAN](https://salu133445.github.io/musegan/) project.
 
-\* *These authors contributed equally to this work.*
+In this project, we first investigate how the real-valued piano-rolls generated
+by the generator may lead to difficulties in training the discriminator for
+CNN-based models. To overcome the binarization issue, we propose to append to
+the generator an additional refiner network, which tries to refine the real-valued
+predictions generated by the pretrained generator to binary-valued ones. The
+proposed model is able to directly generate binary-valued piano-rolls at test
+time.
 
-## Usage
+We train the network on the
+[Lakh Pianoroll Dataset](https://salu133445.github.io/lakh-pianoroll-dataset/)
+(LPD). We use the model to generate four-bar musical phrases consisting of eight
+tracks: *Drums*, *Piano*, *Guitar*, *Bass*, *Ensemble*, *Reed*, *Synth Lead* and
+*Synth Pad*. Audio samples are available
+[here](https://salu133445.github.io/bmusegan/samples).
 
-```python
-import tensorflow as tf
-from musegan.core import MuseGAN
-from musegan.components import NowbarHybrid
-from config import *
+## Run the code
 
-# Initialize a tensorflow session
-with tf.Session() as sess:
+### Configuration
 
-    # === Prerequisites ===
-    # Step 1 - Initialize the training configuration
-    t_config = TrainingConfig
+Modify `config.py` for configuration.
 
-    # Step 2 - Select the desired model
-    model = NowbarHybrid(NowBarHybridConfig)
+- Quick setup
 
-    # Step 3 - Initialize the input data object
-    input_data = InputDataNowBarHybrid(model)
+  Change the values in the dictionary `SETUP` for a quick setup. Documentation
+  is provided right after each key.
 
-    # Step 4 - Load training data
-    path_train = 'train.npy'
-    input_data.add_data(path_train, key='train')
+- More configuration options
 
-    # Step 5 - Initialize a museGAN object
-    musegan = MuseGAN(sess, t_config, model)
+  Four dictionaries `EXP_CONFIG`, `DATA_CONFIG`, `MODEL_CONFIG` and
+  `TRAIN_CONFIG` define experiment-, data-, model- and training-related
+  configuration variables, respectively.
 
-    # === Training ===
-    musegan.train(input_data)
+  > The automatically determined experiment name depends only on the values in
+  > `SETUP`, so set the experiment name manually when you change other options
+  > (otherwise you may overwrite a trained model).
 
-    # === Load a Pretrained Model ===
-    musegan.load(musegan.dir_ckpt)
+### Run
 
-    # === Generate Samples ===
-    path_test = 'train.npy'
-    input_data.add_data(path_test, key='test')
-    musegan.gen_test(input_data, is_eval=True)
+```sh
+python main.py
 ```
 
 ## Training Data
 
-- [tra_phr.npy](https://drive.google.com/uc?id=1-bQCO6ZxpIgdMM7zXhNJViovHjtBKXde&export=download)
-  (7.54 GB) contains 50,266 four-bar phrases. The shape is (50266, 384, 84, 5).
-- [tra_bar.npy](https://drive.google.com/uc?id=1Xxj6WU82fcgY9UtBpXJGOspoUkMu58xC&export=download)
-  (4.79 GB) contains 127,734 bars. The shape is (127734, 96, 84, 5).
+- Prepare your own data
+
+  The array will be reshaped to (-1, `num_bar`, `num_timestep`, `num_pitch`,
+  `num_track`). These variables are defined in `config.py`.
+
+- Download our training data with this [script](training_data/download.sh) or
+  download it manually [here](https://salu133445.github.io/musegan/data).
+
+## Papers
+
+- Hao-Wen Dong and Yi-Hsuan Yang,
+  "Convolutional Generative Adversarial Networks with Binary Neurons for
+  Polyphonic Music Generation,"
+  *arXiv preprint, arXiv:1804.09399*, 2018.
+  [[arxiv](https://arxiv.org/abs/1804.09399)]
+
+- Hao-Wen Dong\*, Wen-Yi Hsiao\*, Li-Chia Yang and Yi-Hsuan Yang,
+  "MuseGAN: Multi-track Sequential Generative Adversarial Networks for
+  Symbolic Music Generation and Accompaniment,"
+  in *AAAI Conference on Artificial Intelligence* (AAAI), 2018.
+  [[arxiv](http://arxiv.org/abs/1709.06298)]
+  [[slides](https://salu133445.github.io/musegan/pdf/musegan-aaai2018-slides.pdf)]
+
+- Hao-Wen Dong\*, Wen-Yi Hsiao\*, Li-Chia Yang and Yi-Hsuan Yang,
+  "MuseGAN: Demonstration of a Convolutional GAN Based Model for Generating
+  Multi-track Piano-rolls,"
+  in *ISMIR Late-Breaking and Demo Session*, 2017.
+  (non-peer reviewed two-page extended abstract)
+  [[paper](https://salu133445.github.io/musegan/pdf/musegan-ismir2017-lbd-paper.pdf)]
+  [[poster](https://salu133445.github.io/musegan/pdf/musegan-ismir2017-lbd-poster.pdf)]
+
+\* *These authors contributed equally to this work.*
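
Note on the "Prepare your own data" entry in the README.md diff above: the README states that the input array is reshaped to (-1, `num_bar`, `num_timestep`, `num_pitch`, `num_track`). The following is a minimal NumPy sketch of what that reshape implies for custom data, not the repository's actual loading code. The function name `reshape_phrases` is hypothetical, and the constant values (4 bars, 96 time steps, 84 pitches, 8 tracks) are assumptions for illustration; the real values come from `config.py`.

```python
import numpy as np

# Assumed values for illustration only; the real ones are defined in config.py.
num_bar, num_timestep, num_pitch, num_track = 4, 96, 84, 8


def reshape_phrases(data):
    """Reshape an array to (-1, num_bar, num_timestep, num_pitch, num_track).

    Hypothetical helper: raises ValueError when the array cannot be split
    into whole phrases, instead of letting the reshape fail later.
    """
    phrase_size = num_bar * num_timestep * num_pitch * num_track
    if data.size % phrase_size != 0:
        raise ValueError(
            "array of size %d is not a multiple of the phrase size %d"
            % (data.size, phrase_size)
        )
    return data.reshape(-1, num_bar, num_timestep, num_pitch, num_track)


# Ten phrases stored with the bars concatenated along the time axis.
raw = np.zeros((10, num_bar * num_timestep, num_pitch, num_track), dtype=bool)
phrases = reshape_phrases(raw)
print(phrases.shape)  # (10, 4, 96, 84, 8)
```

Checking the total size up front gives a clearer error for data whose time resolution, pitch range, or track count does not match the configured values.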