Commit

update Danbooru2017 for pre-train models.
Former-commit-id: 94b8d1e
Former-commit-id: ce546e2fb7a762c38d724f37b2b315fe25bbb3ca
yu45020 committed Aug 19, 2018
1 parent 89ce2d8 commit dd0bcef
Showing 4 changed files with 5 additions and 1 deletion.
Binary file added Danbooru2017/113k_imgs_512tags_encoded.7z
Binary file not shown.
6 changes: 5 additions & 1 deletion Danbooru2017/ReadMe.md
@@ -1,3 +1,7 @@
[Danbooru 2017 database](https://www.gwern.net/Danbooru2017)

-The txt file contains the file names of 113k 512x512 images for training a CNN-LSTM classifier. You may download them with `rsync` and `--files-from`.
+The ```[Danbooru2017] training image list_``` file contains the file names of 113k 512x512 images for training a CNN-LSTM classifier. You may download them with `rsync` and `--files-from`.
+
+The ```113k_imgs_512tags_encoded.7z``` archive holds a JSON file containing the tags for the 113k images. The ```sk-LabelEncoder_512tags.pk``` file is the scikit-learn label encoder used for one-hot encoding.
+
+If you need a complete list of tags, please open an issue; I might be able to provide it. For reference, if you process the large metadata file from Danbooru2017, you may need around 8 GB to unzip it and around 40 minutes with 8 cores to run through all the JSON files. Good luck.
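
The README's note about running through all the metadata JSON files with 8 cores could look roughly like the sketch below. It is not the author's script. It assumes each metadata file holds one JSON object per line and that every object carries a `tags` list of objects with a `name` field; the function names and the idea of keeping the 512 most frequent tags are hypothetical and only illustrate the parallel pass.

```python
import json
from collections import Counter
from multiprocessing import Pool


def count_tags_in_file(path):
    """Count tag names in one metadata file (ASSUMED: one JSON object per line)."""
    counts = Counter()
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            record = json.loads(line)
            # ASSUMED layout: {"tags": [{"name": "..."}, ...]}
            counts.update(tag["name"] for tag in record.get("tags", []))
    return counts


def count_tags(paths, workers=8):
    """Merge per-file counts computed by a pool of worker processes."""
    total = Counter()
    with Pool(processes=workers) as pool:
        for partial in pool.imap_unordered(count_tags_in_file, paths):
            total.update(partial)
    return total


def top_tags(counts, n=512):
    """Return the n most frequent tag names, most frequent first."""
    return [name for name, _ in counts.most_common(n)]
```

Call `count_tags` from under an `if __name__ == "__main__":` guard so that it also works on platforms where worker processes are spawned and not forked.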
Binary file not shown.
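
The README suggests fetching the listed images with `rsync` and `--files-from`. A minimal sketch of that call, wrapped in Python, is shown here. The function names are hypothetical, and the source URL has to come from the Danbooru2017 page; it is not given in this commit. The sketch also assumes the paths in the list file are relative to the rsync source directory, which is how `--files-from` interprets them.

```python
import subprocess


def build_rsync_command(file_list, source, dest):
    """Return the rsync argument list that fetches only the listed files."""
    return [
        "rsync",
        "--verbose",
        "--times",
        f"--files-from={file_list}",  # one path per line, relative to `source`
        source,
        dest,
    ]


def download(file_list, source, dest):
    """Run rsync; raises CalledProcessError if rsync exits with an error."""
    subprocess.run(build_rsync_command(file_list, source, dest), check=True)


# Example call (placeholder URL, substitute the real Danbooru2017 rsync source):
# download("training_image_list.txt",
#          "rsync://example.invalid/danbooru2017/",
#          "./danbooru2017/")
```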
Binary file added Danbooru2017/sk-LabelEncoder_512tags.pk
Binary file not shown.
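
For readers who want to use the encoded tags and the label encoder together, the sketch below shows one possible way to load them and build a 512-wide target vector. The layout of the JSON file is not documented in this commit, so the sketch assumes it maps each image file name to a list of integer tag ids, and it assumes the `.pk` file is a standard pickle. All function names are hypothetical.

```python
import json
import pickle


def load_encoder(path):
    """Load the pickled encoder (ASSUMED: a pickle of a scikit-learn LabelEncoder).

    Unpickling requires scikit-learn to be installed. Only unpickle files
    from a source you trust.
    """
    with open(path, "rb") as f:
        return pickle.load(f)


def load_tags(path):
    """Load the extracted JSON (ASSUMED: {file_name: [tag_id, ...]})."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def multi_hot(tag_ids, num_tags=512):
    """Turn a list of integer tag ids into a 0/1 vector of length num_tags."""
    vec = [0] * num_tags
    for i in tag_ids:
        vec[i] = 1
    return vec
```

If the assumption about integer ids holds, `encoder.inverse_transform(ids)` should map them back to the tag strings.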
