-
Notifications
You must be signed in to change notification settings - Fork 63
mnist and mnistm dataloaders #20
base: master
Are you sure you want to change the base?
Conversation
Hello @arikanev! Thanks for updating the PR. Cheers ! There are no PEP8 issues in this Pull Request. 🍻 Comment last updated on January 15, 2018 at 16:55 Hours UTC |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for your PR!
There are a few things to fix, but LGTM overall.
Once you fix these it can be merged! Thanks
dataset_loaders/images/mnist.py
Outdated
import errno | ||
from PIL import Image | ||
from dataset_loaders.parallel_loader import ThreadedDataset | ||
from timeit import default_timer as timer |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you reorder the imports so that they are grouped according to https://www.python.org/dev/peps/pep-0008/#imports and each chunk is alphabetically ordered?
dataset_loaders/images/mnist.py
Outdated
class MnistDataset(ThreadedDataset): | ||
"""The mnist handwritten digit dataset | ||
|
||
The dataset should be downloaded from [1] into the `shared_path` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please replace [1]
with [MNIST]_
.
dataset_loaders/images/mnist.py
Outdated
|
||
References | ||
---------- | ||
[1] Mnist dataset pickle file: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please replace [1]
with [MNIST]
dataset_loaders/images/mnist.py
Outdated
References | ||
---------- | ||
[1] Mnist dataset pickle file: | ||
your_username_here@elisa1.iro.umontreal.ca:/data/lisa/data/mnist/mnist_seg/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please don't make this MILA specific. Replace with the MNIST website or the download link.
dataset_loaders/images/mnist.py
Outdated
# optional arguments | ||
data_shape = (28, 28, 3) | ||
mean = [0, 0, 0] | ||
std = [1, 1, 1] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No need to specify mean and std unless you actually compute them. Please either compute them or remove it.
dataset_loaders/images/mnist.py
Outdated
max_files = 50000 | ||
|
||
GTclasses = range(2) | ||
mapping_type = 'mnist' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is never used. Please remove.
dataset_loaders/images/mnist.py
Outdated
non_void_nclasses = 2 | ||
_void_labels = [] | ||
GTclasses = range(2) | ||
# GTclasses = GTclasses + [-1] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please remove the commented line
dataset_loaders/images/mnist.py
Outdated
your_username_here@elisa1.iro.umontreal.ca:/data/lisa/data/mnist/mnist_seg/ | ||
|
||
""" | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove the empty line please
dataset_loaders/images/mnist.py
Outdated
it also creates/copies the dataset in self.path if not already there | ||
mnist data is in 3 directories train, test, valid) | ||
""" | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove the empty line please
dataset_loaders/images/mnistm.py
Outdated
from dataset_loaders.parallel_loader import ThreadedDataset | ||
|
||
|
||
class MnistMDataset(ThreadedDataset): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please inherit from MnistDataset
and modify only the relevant parts.
Implemented changes as requested, let me know if I should modify anything |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for your new commit. There is still a lot of repeated code in the MnistmDataset
class, please remove it. I also noticed a couple of new things that should be fixed.
When you are done, please also add the new datasets to the __init__.py
in the root (follow what was done for the other datasets).
Thank you!
dataset_loaders/images/mnist.py
Outdated
|
||
The dataset should be downloaded from [1] into the `shared_path` | ||
The dataset should be downloaded from [MNIST] into the `shared_path` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"[MNIST]" should be "[MNIST]_"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok
dataset_loaders/images/mnistm.py
Outdated
class MnistMDataset(MnistDataset): | ||
"""The mnist-m handwritten digit dataset. | ||
|
||
The dataset should be downloaded from [1] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fix the link as done for the other file. Also please add a small description of the dataset.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will do. I want to add the download to mnistm via my website akanev.com. Should I wait to push changes until I link it, or should I push a change to just the mnistm download link later on? Or neither and link via google drive.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will push changes now and then push that later
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there any advantage to download through your website rather than the original download link? I think it's best to link to an established website to avoid to potentially generate confusion in the users and to make sure the link is up to date in case the original authors notice some problem with the data.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is no really established website for mnistm. I believe Yaroslav, (a MILA phd student), created it for previous research. There is one link to the old mnistm data via google drive, which I think could look sketchy. The new mnistm that I generated using a script is not available via the web right now, which is why I wanted to upload it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, theres some errors in the code when inheriting mnistm from mnist that I need to work out. So I will just push mnist for now, and fix the mnistm issues in the meantime.
} | ||
|
||
_cmap = { | ||
0: (0), # background |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cmap should return RGB triples. Please fix this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok. To confirm: (0, 0, 0)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Exactly.
name = 'mnist' | ||
|
||
# optional arguments | ||
data_shape = (28, 28, 3) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe MNIST data_shape should be (28, 28)
. Can you verify this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah yes sorry.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No problem :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
:)
dataset_loaders/images/mnistm.py
Outdated
# optional arguments | ||
data_shape = (28, 28, 3) | ||
|
||
GTclasses = range(2) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This isn't necessary, as is already defined in MnistDataset
. The same goes for all the other properties that are not different between the two.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok.
dataset_loaders/images/mnistm.py
Outdated
|
||
_filenames = None | ||
|
||
def __init__(self, which_set='train', *args, **kwargs): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This method is the same as the one in MnistDataset
. There is no need to redefine it.
The same goes with all the other methods: remove the parts that do not change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok.
Removed mnistm.py, mnist.py should have all of the necessary changes you requested. |
Mnist segmentations are useful for quick training when testing models