Error when training starts. ZeroDivisionError: integer division or modulo by zero #9

Open
@mpwsh

Description

I know you have pretty much abandoned this project, but I'm trying to make it work with TensorFlow 0.12.1, and I get the following when training "starts" (it actually freezes at 0% and then shows this error):

Model creation...
WARNING:tensorflow:From /media/sata/MusicGenerator/deepmusic/model.py:246 in _build_network.: scalar_summary (from tensorflow.python.ops.logging_ops) is deprecated and will be removed after 2016-11-30.
Instructions for updating:
Please switch to tf.summary.scalar. Note that tf.summary.scalar uses the node name instead of the tag. This means that TensorFlow will automatically de-duplicate summary names based on the scope they are created in. Also, passing a tensor or list of tags to a scalar summary op is no longer supported.
E tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:923] could not open file to read NUMA node: /sys/bus/pci/devices/0000:00:00.0/numa_node
Your kernel may have been built without NUMA support.
I tensorflow/core/common_runtime/gpu/gpu_device.cc:885] Found device 0 with properties: 
name: NVIDIA Tegra X1
major: 5 minor: 3 memoryClockRate (GHz) 0.9984
pciBusID 0000:00:00.0
Total memory: 3.89GiB
Free memory: 2.85GiB
I tensorflow/core/common_runtime/gpu/gpu_device.cc:906] DMA: 0 
I tensorflow/core/common_runtime/gpu/gpu_device.cc:916] 0:   Y 
I tensorflow/core/common_runtime/gpu/gpu_device.cc:975] Creating TensorFlow device (/gpu:0) -> (device: 0, name: NVIDIA Tegra X1, pci bus id: 0000:00:00.0)
E tensorflow/core/common_runtime/gpu/gpu_device.cc:586] Could not identify NUMA node of /job:localhost/replica:0/task:0/gpu:0, defaulting to 0.  Your kernel may not have been built with NUMA support.
Initialize variables...
WARNING: No previous model found, but some files/folders found at /media/sata/MusicGenerator/save/model. Cleaning...
Removing /media/sata/MusicGenerator/save/model/train/events.out.tfevents.1511396622.tegra-ubuntu
Start training (press Ctrl+C to save and exit)...

------- Epoch 1 (lr=0.0001) -------
Subsampling the songs (train)...
Shuffling the dataset...
Generating batches...
Subsampling the songs (test)...
Shuffling the dataset...
Generating batches...
Training:   0%|                                                                                                | 0/2 [00:00<?, ?it/s]Traceback (most recent call last):
  File "main.py", line 29, in <module>
    composer.main()
  File "/media/sata/MusicGenerator/deepmusic/composer.py", line 197, in main
    self._main_train()
  File "/media/sata/MusicGenerator/deepmusic/composer.py", line 255, in _main_train
    next_batch_test = batches_test[self.glob_step % len(batches_test)]  # Generate test batches in a cycling way (test set smaller than train set)
ZeroDivisionError: integer division or modulo by zero

Any ideas?
Is this related to the tags warning?
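
The traceback narrows this down a little: in Python, `x % 0` raises `ZeroDivisionError`, so the failing expression `self.glob_step % len(batches_test)` implies that `batches_test` is empty at that point. The progress bar shows `0/2`, so two training batches were generated but apparently no test batches. The `scalar_summary` message is only a deprecation warning and is not part of this traceback. Below is a minimal sketch of a guard around the cycling lookup. The helper name `pick_test_batch` is hypothetical and not part of deepmusic; it also assumes `batches_test` is a plain list.

```python
def pick_test_batch(batches_test, glob_step):
    """Return the test batch for this step, cycling through the list.

    Hypothetical helper (not from deepmusic): returns None when there are
    no test batches, instead of raising ZeroDivisionError on `% 0`.
    """
    if not batches_test:  # empty list: nothing to cycle through
        return None
    return batches_test[glob_step % len(batches_test)]


# Usage sketch: skip the test step when no test batch is available.
batch = pick_test_batch([], glob_step=0)
if batch is None:
    print("No test batches were generated; skipping the test step.")
```

A guard like this only avoids the crash. Why the test set yields no batches (for example, too few songs for the configured batch size) would still need to be checked in the batch-generation code.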
